test score results: Topics by WorldWideScience.org

Sample records for test score results

What Do Test Scores Really Mean? A Latent Class Analysis of Danish Test Score Performance

DEFF Research Database (Denmark)

Munk, Martin D.; McIntosh, James

2014-01-01

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55, tested in 1968, and followed until 2011. The procedure takes account of unobservable effects as well as excessive zeros in the data. We show that the test scores...... of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture and possible incentive problems make it more difficult to understand what the tests measure....
Predicting occupational personality test scores.

Science.gov (United States)

Furnham, A; Drakeley, R

2000-01-01

The relationship between students' actual test scores and their self-estimated scores on the Hogan Personality Inventory (HPI; R. Hogan & J. Hogan, 1992), an omnibus personality questionnaire, was examined. Despite being given descriptive statistics and explanations of each of the dimensions measured, the students tended to overestimate their scores; yet all correlations between actual and estimated scores were positive and significant. Correlations between self-estimates and actual test scores were highest for sociability, ambition, and adjustment (r = .62 to r = .67). The results are discussed in terms of employers' use and abuse of personality assessment for job recruitment.
Test/score/report: Simulation techniques for automating the test process

Science.gov (United States)

Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.

1994-01-01

A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, the task of acceptance testing, regression testing, and repeatability of specific test procedures of a ground data system can be a simpler task. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no recommendation to the test team. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary
Prediction of true test scores from observed item scores and ancillary data.

Science.gov (United States)

Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

2015-05-01

In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
Scoring in genetically modified organism proficiency tests based on log-transformed results.

Science.gov (United States)

Thompson, Michael; Ellison, Stephen L R; Owen, Linda; Mathieson, Kenneth; Powell, Joanne; Key, Pauline; Wood, Roger; Damant, Andrew P

2006-01-01

The study considers data from 2 UK-based proficiency schemes and includes data from a total of 29 rounds and 43 test materials over a period of 3 years. The results from the 2 schemes are similar and reinforce each other. The amplification process used in quantitative polymerase chain reaction determinations predicts a mixture of normal, binomial, and lognormal distributions dominated by the latter 2. As predicted, the study results consistently follow a positively skewed distribution. Log-transformation prior to calculating z-scores is effective in establishing near-symmetric distributions that are sufficiently close to normal to justify interpretation on the basis of the normal distribution.
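The log-transformation step described in this abstract can be illustrated with a minimal sketch. It is not the scheme organisers' scoring procedure: the function name `log_z_scores`, the choice of base-10 logarithms, and the fallback to the median and sample standard deviation of the logged results are all assumptions made for illustration, since the abstract does not state how the assigned value or the target standard deviation were set.

```python
import math
import statistics


def log_z_scores(results, assigned_value=None, sigma_log=None):
    """Return z-scores computed on log10-transformed results.

    assigned_value and sigma_log are hypothetical parameters: when they are
    omitted, the median and sample standard deviation of the logged results
    are used instead.
    """
    if any(x <= 0 for x in results):
        raise ValueError("log-transformation requires strictly positive results")
    logs = [math.log10(x) for x in results]
    # Centre on the log of the assigned value, or on the median of the logs.
    center = math.log10(assigned_value) if assigned_value is not None else statistics.median(logs)
    # Scale by a supplied target SD (in log10 units) or by the observed SD.
    spread = sigma_log if sigma_log is not None else statistics.stdev(logs)
    return [(value - center) / spread for value in logs]


# Results at half, equal to, and double the assigned value are symmetric
# about zero on the log scale, which is the point of the transformation.
z = log_z_scores([0.5, 1.0, 2.0], assigned_value=1.0, sigma_log=math.log10(2))
```

On the raw scale the same three results would sit at unequal distances from the assigned value, so a positively skewed set of results would yield asymmetric z-scores.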
Towards reporting standards for neuropsychological study results: A proposal to minimize communication errors with standardized qualitative descriptors for normalized test scores.

Science.gov (United States)

Schoenberg, Mike R; Rum, Ruba S

2017-11-01

Rapid, clear and efficient communication of neuropsychological results is essential to benefit patient care. Errors in communication are a leading cause of medical errors; nevertheless, there remains a lack of consistency in how neuropsychological scores are communicated. A major limitation in the communication of neuropsychological results is the inconsistent use of qualitative descriptors for standardized test scores and the use of vague terminology. A PubMed search from 1 Jan 2007 to 1 Aug 2016 was conducted to identify guidelines or consensus statements for the description and reporting of qualitative terms to communicate neuropsychological test scores. The review found the use of confusing and overlapping terms to describe various ranges of percentile standardized test scores. In response, we propose a simplified set of qualitative descriptors for normalized test scores (Q-Simple) as a means to reduce errors in communicating test results. The Q-Simple qualitative terms are: 'very superior', 'superior', 'high average', 'average', 'low average', 'borderline' and 'abnormal/impaired'. A case example illustrates the proposed Q-Simple qualitative classification system to communicate neuropsychological results for neurosurgical planning. The Q-Simple qualitative descriptor system is intended as a means to improve and standardize communication of standardized neuropsychological test scores. Research is needed to further evaluate neuropsychological communication errors. Conveying the clinical implications of neuropsychological results in a manner that minimizes risk for communication errors is a quintessential component of evidence-based practice. Copyright © 2017 Elsevier B.V. All rights reserved.
Do Test Scores Buy Happiness?

Science.gov (United States)

McCluskey, Neal

2017-01-01

Since at least the enactment of No Child Left Behind in 2002, standardized test scores have served as the primary measures of public school effectiveness. Yet, such scores fail to measure the ultimate goal of education: maximizing happiness. This exploratory analysis assesses nation level associations between test scores and happiness, controlling…
Exploring a Source of Uneven Score Equity across the Test Score Range

Science.gov (United States)

Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D.

2018-01-01

Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
A Human Capital Model of Educational Test Scores

DEFF Research Database (Denmark)

McIntosh, James; D. Munk, Martin

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelated...... with observable parental attributes and, thus, are environmental rather than genetic in origin. We show that the test scores measure manifest or measured ability as it has evolved over the life of the respondent and is, thus, more a product of the human capital formation process than some latent or fundamental...... measure of pure cognitive ability. We find that variables which are not closely associated with traditional notions of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture, attitudes...
The Health Professions Admission Test (HPAT) score and leaving certificate results can independently predict academic performance in medical school: do we need both tests?

LENUS (Irish Health Repository)

Halpenny, D

2010-11-01

A recent study raised concerns regarding the ability of the health professions admission test (HPAT) Ireland to improve the selection process in Irish medical schools. We aimed to establish whether performance in a mock HPAT correlated with academic success in medicine. A modified HPAT examination and a questionnaire were administered to a group of doctors and medical students. There was a significant correlation between HPAT score and college results (r2: 0.314, P = 0.018, Spearman Rank) and between leaving cert score and college results (r2: 0.306, P = 0.049, Spearman Rank). There was no correlation between leaving cert points score and HPAT score. There was no difference in HPAT score across a number of other variables including gender, age and medical speciality. Our results suggest that both the HPAT Ireland and the leaving certificate examination could act as independent predictors of academic achievement in medicine.
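For readers unfamiliar with the Spearman rank statistic cited in this abstract, the following is a minimal sketch of how a rank correlation is computed. The function names `average_ranks` and `spearman_rho` are illustrative, the data in the usage line are invented, and nothing here reproduces the study's data or its reported values.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Invented scores: any strictly increasing relationship gives rho = 1.
rho = spearman_rho([410, 455, 480, 520], [55.0, 61.5, 62.0, 70.0])
```

Because only the ordering of scores matters, the statistic is insensitive to the different scales on which admission-test points and college results are reported.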
Validating the Interpretations and Uses of Test Scores

Science.gov (United States)

Kane, Michael T.

2013-01-01

To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
What do educational test scores really measure?

DEFF Research Database (Denmark)

McIntosh, James; D. Munk, Martin

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelate......, and possible incentive problems make it more difficult to elicit true values of what the tests measure....
A process dissociation approach to objective-projective test score interrelationships.

Science.gov (United States)

Bornstein, Robert F

2002-02-01

Even when self-report and projective measures of a given trait or motive both predict theoretically related features of behavior, scores on the 2 tests correlate modestly with each other. This article describes a process dissociation framework for personality assessment, derived from research on implicit memory and learning, which can resolve these ostensibly conflicting results. Research on interpersonal dependency is used to illustrate 3 key steps in the process dissociation approach: (a) converging behavioral predictions, (b) modest test score intercorrelations, and (c) delineation of variables that differentially affect self-report and projective test scores. Implications of the process dissociation framework for personality assessment and test development are discussed.
Effects of white noise on Callsign Acquisition Test and Modified Rhyme Test scores.

Science.gov (United States)

Blue-Terry, Misty; Letowski, Tomasz

2011-02-01

The Callsign Acquisition Test (CAT) is a speech intelligibility test developed by the US Army Research Laboratory. The test has been used to evaluate speech transmission through various communication systems but has not yet been sufficiently standardised and validated. The aim of this study was to compare CAT and Modified Rhyme Test (MRT) performance in the presence of white noise across a range of signal-to-noise ratios (SNRs). A group of 16 normal-hearing listeners participated in the study. The speech items were presented at 65 dB(A) in the background of white noise at SNRs of -18, -15, -12, -9 and -6 dB. The results showed a strong positive association (75.14%) between the two tests, but significant differences between the CAT and MRT absolute scores in the range of investigated SNRs. Based on the data, a function to predict CAT scores based on existing MRT scores and vice versa was formulated. STATEMENT OF RELEVANCE: This work compares performance data of a common speech intelligibility test (MRT) with a new test (CAT) in the presence of white noise. The results here can be used as a part of the standardisation procedures and provide insights to the predictive capabilities of the CAT to quantify speech intelligibility communication in high-noise military environments.
Adaptive testing with equated number-correct scoring

NARCIS (Netherlands)

van der Linden, Willem J.

1999-01-01

A constrained CAT algorithm is presented that automatically equates the number-correct scores on adaptive tests. The algorithm can be used to equate number-correct scores across different administrations of the same adaptive test as well as to an external reference test. The constraints are derived
The Effect of Mock Tests on Iranian EFL learners’ Test Scores

Directory of Open Access Journals (Sweden)

Hossein Khodabakhshzadeh

2016-07-01

The effect of using tests in test preparation courses has been subject to debate. While some scholars such as Yang and Badger (2015) believe it is a cause of positive washback effect, others argue that this issue is tentative and context-bound (Green, 2007). Therefore, this study investigated the effect of using mock tests in International English Language Testing System (IELTS) preparation courses on students’ overall IELTS scores. Fifty-one IELTS students were selected non-randomly through the quota sampling approach out of 76 students at Mahan Language Institute in Birjand, Iran. These participants were distributed into Group 1 (n=25) and Group 2 (n=26). A complete IELTS test was administered to ensure that the groups were homogeneous and to serve as a pretest. After 10 sessions of intervention, a different IELTS test was administered as a posttest. The results of between-subject analysis through an independent samples t-test revealed that using mock tests in IELTS preparation courses can positively affect the participants’ scores on the IELTS exam. Pedagogical implications are discussed.
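The independent samples t-test named in this abstract can be sketched as follows. This is a generic pooled-variance (Student) version under the assumption of equal group variances; the abstract does not say which variant the authors used, and the function name `pooled_t_statistic` and the sample values are invented for illustration.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Weight each sample variance by its own degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if std_error == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    return (mean_a - mean_b) / std_error, df


# Invented posttest band scores for two small groups.
t, df = pooled_t_statistic([6.0, 6.5, 7.0, 6.5], [5.5, 6.0, 6.0, 6.5])
```

The resulting statistic would then be compared against a t distribution with `df` degrees of freedom; with group sizes of 25 and 26, as in the study, that would be 49.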
ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores

Science.gov (United States)

Allalouf, Avi

2014-01-01

The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…
The Effect of Pretest Exercise on Baseline Computerized Neurocognitive Test Scores.

Science.gov (United States)

Pawlukiewicz, Alec; Yengo-Kahn, Aaron M; Solomon, Gary

2017-10-01

Baseline neurocognitive assessment plays a critical role in return-to-play decision making following sport-related concussions. Prior studies have assessed the effect of a variety of modifying factors on neurocognitive baseline test scores. However, relatively little investigation has been conducted regarding the effect of pretest exercise on baseline testing. The aim of our investigation was to determine the effect of pretest exercise on baseline Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores in adolescent and young adult athletes. We hypothesized that athletes undergoing self-reported strenuous exercise within 3 hours of baseline testing would perform more poorly on neurocognitive metrics and would report a greater number of symptoms than those who had not completed such exercise. Cross-sectional study; Level of evidence, 3. The ImPACT records of 18,245 adolescent and young adult athletes were retrospectively analyzed. After application of inclusion and exclusion criteria, participants were dichotomized into groups based on a positive (n = 664) or negative (n = 6609) self-reported history of strenuous exercise within 3 hours of the baseline test. Participants with a positive history of exercise were then randomly matched, based on age, sex, education level, concussion history, and hours of sleep prior to testing, on a 1:2 basis with individuals who had reported no pretest exercise. The baseline ImPACT composite scores of the 2 groups were then compared. Significant differences were observed for the ImPACT composite scores of verbal memory, visual memory, reaction time, and impulse control as well as for the total symptom score. No significant between-group difference was detected for the visual motor composite score. Furthermore, pretest exercise was associated with a significant increase in the overall frequency of invalid test results. Our results suggest a statistically significant difference in ImPACT composite scores between
Summary of Score Changes (in other Tests).

Science.gov (United States)

Cleary, T. Anne; McCandless, Sam A.

Scholastic Aptitude Test (SAT) scores have declined during the last 14 years. Similar score declines have been observed in many different testing programs, many groups, and tested areas. The declines, while not large in any given year, have been consistent over time, area, and group. The period around 1965 is critical for the interpretation of…
Gender, Stereotype Threat and Mathematics Test Scores

OpenAIRE

Ming Tsui; Xiao Y. Xu; Edmond Venator

2011-01-01

Problem statement: Stereotype threat has repeatedly been shown to depress women's scores on difficult math tests. An attempt to replicate these findings in China found no support for the stereotype threat hypothesis. Our math test was characterized as being personally important for the student participants, an atypical condition in most stereotype threat laboratory research. Approach: To evaluate the effects of this personal demand, we conducted three experiments. Results: ...

Data-driven efficient score tests for deconvolution hypotheses

NARCIS (Netherlands)

Langovoy, M.

2008-01-01

We consider testing statistical hypotheses about densities of signals in deconvolution models. A new approach to this problem is proposed. We constructed score tests for the deconvolution density testing with the known noise density and efficient score tests for the case of unknown density. The
Facilitating the Interpretation of English Language Proficiency Scores: Combining Scale Anchoring and Test Score Mapping Methodologies

Science.gov (United States)

Powers, Donald; Schedl, Mary; Papageorgiou, Spiros

2017-01-01

The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…
The Truth about Scores Children Achieve on Tests.

Science.gov (United States)

Brown, Jonathan R.

1989-01-01

The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)
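The abstract does not reproduce the article's calculation procedure, so the sketch below uses the standard classical test theory formula, SEm = SD × sqrt(1 − reliability). The function names and the example figures (a scale with SD 15 and reliability 0.91) are assumptions chosen for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEm: sd * sqrt(1 - reliability)."""
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Return (low, high): the observed score plus or minus z SEm."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


# Hypothetical scale: SD of 15 and reliability of 0.91 give an SEm near 4.5,
# so an observed score of 100 is reported as roughly 95.5 to 104.5.
low, high = score_band(100, sd=15, reliability=0.91)
```

Reporting a band rather than a single number makes explicit that two children whose observed scores differ by less than about one SEm may not differ in true score.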
Biering-Sorensen test scores in coal miners

Energy Technology Data Exchange (ETDEWEB)

Tekin, Y.; Ortancil, O.; Ankarali, H.; Basaran, A.; Sarikaya, S.; Ozdolap, S. [Zonguldak Karaelmas University, Zonguldak (Turkey)

2009-05-15

Biering-Sorensen test is an isometric back endurance test. Biering-Sorensen test scores have varied in different cultural and occupational groups. The aims of this study were to collect normative data on Biering-Sorensen holding times, to determine the discriminative ability of the Biering-Sorensen test in Turkish coal miners, and to examine the association between Biering-Sorensen test result and functional disability. One hundred and fifty male coal miners participated in this study. Trunk extensor muscle strength was measured using the Biering-Sorensen test. Oswestry disability index was used to measure the functional disability level of low back pain. The mean Biering-Sorensen holding time for the total subject group was 107.3 ± 22.5 s. The mean time of Biering-Sorensen test of the subjects with and without low back pain were 99.9 ± 19.8 and 128.6 ± 15.2 s, respectively. The difference between the subjects with and without low back pain was statistically significant (p < 0.001). There was a statistically significant negative correlation between Oswestry functional disability score and Biering-Sorensen holding time (R = -0.824, p < 0.001). Turkish coal miners have low mean back extensor endurance holding times. Biering-Sorensen test had a good discriminative ability in our study group. Trunk muscle strength has a significant effect on the disability level of low back pain. Thus trunk muscle endurance training exercise therapy may be effective for the reduction of disability in patients with low back pain.
Evaluating the Predictive Validity of Graduate Management Admission Test Scores

Science.gov (United States)

Sireci, Stephen G.; Talento-Miller, Eileen

2006-01-01

Admissions data and first-year grade point average (GPA) data from 11 graduate management schools were analyzed to evaluate the predictive validity of Graduate Management Admission Test® (GMAT®) scores and the extent to which predictive validity held across sex and race/ethnicity. The results indicated GMAT verbal and quantitative scores had…
Relative Merits of Four Methods for Scoring Cloze Tests.

Science.gov (United States)

Brown, James Dean

1980-01-01

Describes study comparing merits of exact answer, acceptable answer, clozentropy and multiple choice methods for scoring tests. Results show differences among reliability, mean item facility, discrimination and usability, but not validity. (BK)
Effects of Test Media on Different EFL Test-Takers in Writing Scores and in the Cognitive Writing Process

Science.gov (United States)

Zou, Xiao-Ling; Chen, Yan-Min

2016-01-01

The effects of computer and paper test media on EFL test-takers with different computer familiarity in writing scores and in the cognitive writing process have been comprehensively explored from the learners' aspect as well as on the basis of related theories and practice. The results indicate significant differences in test scores among the…
Accountancy, teaching methods, sex, and American College Test scores.

Science.gov (United States)

Heritage, J; Harper, B S; Harper, J P

1990-10-01

This study examines the significance of sex, methodology, academic preparation, and age as related to development of judgmental and problem-solving skills. Sex, American College Test (ACT) Mathematics scores, Composite ACT scores, grades in course work, grade point average (GPA), and age were used in studying the effects of teaching method on 96 students' ability to analyze data in financial statements. Results reflect positively on accounting students compared to the general college population and the women students in particular.
Validation of new prognostic and predictive scores by sequential testing approach

International Nuclear Information System (INIS)

Nieder, Carsten; Haukland, Ellinor; Pawinski, Adam; Dalhaug, Astrid

2010-01-01

Background and Purpose: For practitioners, the question arises how their own patient population differs from that used in large-scale analyses resulting in new scores and nomograms and whether such tools actually are valid at a local level and thus can be implemented. A recent article proposed an easy-to-use method for the in-clinic validation of new prediction tools with a limited number of patients, a so-called sequential testing approach. The present study evaluates this approach in scores related to radiation oncology. Material and Methods: Three different scores were used, each predicting short overall survival after palliative radiotherapy (bone metastases, brain metastases, metastatic spinal cord compression). For each scenario, a limited number of consecutive patients entered the sequential testing approach. The positive predictive value (PPV) was used for validation of the respective score and it was required that the PPV exceeded 80%. Results: For two scores, validity in the own local patient population could be confirmed after entering 13 and 17 patients, respectively. For the third score, no decision could be reached even after increasing the sample size to 30. Conclusion: In-clinic validation of new predictive tools with sequential testing approach should be preferred over uncritical adoption of tools which provide no significant benefit to local patient populations. Often the necessary number of patients can be reached within reasonable time frames even in small oncology practices. In addition, validation is performed continuously as the data are collected. (orig.)
Validation of new prognostic and predictive scores by sequential testing approach

Energy Technology Data Exchange (ETDEWEB)

Nieder, Carsten [Radiation Oncology Unit, Nordland Hospital, Bodo (Norway); Inst. of Clinical Medicine, Univ. of Tromso (Norway); Haukland, Ellinor; Pawinski, Adam; Dalhaug, Astrid [Radiation Oncology Unit, Nordland Hospital, Bodo (Norway)

2010-03-15

Background and Purpose: For practitioners, the question arises how their own patient population differs from that used in large-scale analyses resulting in new scores and nomograms and whether such tools actually are valid at a local level and thus can be implemented. A recent article proposed an easy-to-use method for the in-clinic validation of new prediction tools with a limited number of patients, a so-called sequential testing approach. The present study evaluates this approach in scores related to radiation oncology. Material and Methods: Three different scores were used, each predicting short overall survival after palliative radiotherapy (bone metastases, brain metastases, metastatic spinal cord compression). For each scenario, a limited number of consecutive patients entered the sequential testing approach. The positive predictive value (PPV) was used for validation of the respective score and it was required that the PPV exceeded 80%. Results: For two scores, validity in the own local patient population could be confirmed after entering 13 and 17 patients, respectively. For the third score, no decision could be reached even after increasing the sample size to 30. Conclusion: In-clinic validation of new predictive tools with sequential testing approach should be preferred over uncritical adoption of tools which provide no significant benefit to local patient populations. Often the necessary number of patients can be reached within reasonable time frames even in small oncology practices. In addition, validation is performed continuously as the data are collected. (orig.)
The Formalization of Fairness: Issues in Testing for Measurement Invariance Using Subtest Scores

Science.gov (United States)

Molenaar, Dylan; Borsboom, Denny

2013-01-01

Measurement invariance is an important prerequisite for the adequate comparison of group differences in test scores. In psychology, measurement invariance is typically investigated by means of linear factor analyses of subtest scores. These subtest scores typically result from summing the item scores. In this paper, we discuss 4 possible problems…
Explaining the black-white gap in cognitive test scores: Toward a theory of adverse impact.

Science.gov (United States)

Cottrell, Jonathan M; Newman, Daniel A; Roisman, Glenn I

2015-11-01

In understanding the causes of adverse impact, a key parameter is the Black-White difference in cognitive test scores. To advance theory on why Black-White cognitive ability/knowledge test score gaps exist, and on how these gaps develop over time, the current article proposes an inductive explanatory model derived from past empirical findings. According to this theoretical model, Black-White group mean differences in cognitive test scores arise from the following racially disparate conditions: family income, maternal education, maternal verbal ability/knowledge, learning materials in the home, parenting factors (maternal sensitivity, maternal warmth and acceptance, and safe physical environment), child birth order, and child birth weight. Results from a 5-wave longitudinal growth model estimated on children in the NICHD Study of Early Child Care and Youth Development from ages 4 through 15 years show significant Black-White cognitive test score gaps throughout early development that did not grow significantly over time (i.e., significant intercept differences, but not slope differences). Importantly, the racially disparate conditions listed above can account for the relation between race and cognitive test scores. We propose a parsimonious 3-Step Model that explains how cognitive test score gaps arise, in which race relates to maternal disadvantage, which in turn relates to parenting factors, which in turn relate to cognitive test scores. This model and results offer to fill a need for theory on the etiology of the Black-White ethnic group gap in cognitive test scores, and attempt to address a missing link in the theory of adverse impact. (c) 2015 APA, all rights reserved.
Validity of GRE General Test scores and TOEFL scores for graduate admission to a technical university in Western Europe

Science.gov (United States)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the Master's programme grade point average (GGPA) with and without the addition of the undergraduate GPA (UGPA) and the TOEFL score, and of GRE scores for study completion and Master's thesis performance. GRE scores explained 20% of the variation in the GGPA, while an additional 7% was explained by the TOEFL score and 3% by the UGPA. Contrary to common belief, the GRE quantitative reasoning score showed only little explanatory power. GRE scores were also weakly related to study progress but not to thesis performance. Nevertheless, GRE and TOEFL scores were found to be sensible admissions instruments. Rigorous methodology was used to obtain highly reliable results.
Relationships between spatial activities and scores on the mental rotation test as a function of sex.

Science.gov (United States)

Ginn, Sheryl R; Pickens, Stefanie J

2005-06-01

Previous results suggested that female college students' scores on the Mental Rotations Test might be related to their prior experience with spatial tasks. For example, women who played video games scored better on the test than their non-game-playing peers, whereas playing video games was not related to men's scores. The present study examined whether participation in different types of spatial activities would be related to women's performance on the Mental Rotations Test. 31 men and 59 women enrolled at a small, private church-affiliated university and majoring in art or music as well as students who participated in intercollegiate athletics completed the Mental Rotations Test. Women's scores on the Mental Rotations Test benefitted from experience with spatial activities; the more types of experience the women had, the better their scores. Thus women who were athletes, musicians, or artists scored better than those women who had no experience with these activities. The opposite results were found for the men. Efforts are currently underway to assess how length of experience and which types of experience are related to scores.
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach.

Science.gov (United States)

Xu, Jian

2017-01-01

The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers' listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms.
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach

Directory of Open Access Journals (Sweden)

Jian Xu

2017-12-01

The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers’ listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms.
The Performance of the Upper Limb scores correlate with pulmonary function test measures and Egen Klassifikation scores in Duchenne muscular dystrophy.

Science.gov (United States)

Lee, Ha Neul; Sawnani, Hemant; Horn, Paul S; Rybalsky, Irina; Relucio, Lani; Wong, Brenda L

2016-01-01

The Performance of the Upper Limb scale was developed as an outcome measure specifically for ambulant and non-ambulant patients with Duchenne muscular dystrophy and is implemented in clinical trials needing longitudinal data. The aim of this study is to determine whether this novel tool correlates with functional ability using pulmonary function test, cardiac function test and Egen Klassifikation scale scores as clinical measures. In this cross-sectional study, 43 non-ambulatory Duchenne males from ages 10 to 30 years and on long-term glucocorticoid treatment were enrolled. Cardiac and pulmonary function test results were analyzed to assess cardiopulmonary function, and Egen Klassifikation scores were analyzed to assess functional ability. The Performance of the Upper Limb scores correlated with pulmonary function measures and had inverse correlation with Egen Klassifikation scores. There was no correlation with left ventricular ejection fraction and left ventricular dysfunction. Body mass index and decreased joint range of motion affected total Performance of the Upper Limb scores and should be considered in clinical trial designs. Copyright © 2016 Elsevier B.V. All rights reserved.
Improving personality facet scores with multidimensional computer adaptive testing

DEFF Research Database (Denmark)

Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A W

2013-01-01

personality tests contain many highly correlated facets. This article investigates the possibility of increasing the precision of the NEO PI-R facet scores by scoring items with multidimensional item response theory and by efficiently administering and scoring items with multidimensional computer adaptive...
Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Science.gov (United States)

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.

2010-01-01

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Testing statistical significance scores of sequence comparison methods with structure similarity

Directory of Open Access Journals (Sweden)

Leunissen Jack AM

2006-10-01

Background: In the past years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search is not only determined by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested if this could be validated when applied to existing, evolutionarily related protein sequences. Results: All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion: The compute-intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
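As a rough illustration of a Z-score based on Monte-Carlo statistics, the following Python sketch scores a pair of sequences, rescores the pair many times after shuffling one sequence, and expresses the observed score in standard deviations above the shuffled mean. It is not the procedure used for CluSTr or Protein World: the function names, the number of shuffles, and above all `toy_score` (a plain count of identical positions standing in for a Smith-Waterman alignment score) are assumptions made for this example.

```python
import random
import statistics


def toy_score(a, b):
    """Hypothetical stand-in for an alignment score: count identical positions."""
    return sum(1 for x, y in zip(a, b) if x == y)


def monte_carlo_z(query, subject, n_shuffles=200, seed=0, score=toy_score):
    """Z-score of the observed score against scores for shuffled subjects."""
    rng = random.Random(seed)          # seeded so the sketch is repeatable
    observed = score(query, subject)
    letters = list(subject)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)           # keeps composition, destroys order
        null_scores.append(score(query, "".join(letters)))
    mu = statistics.mean(null_scores)
    sigma = statistics.stdev(null_scores)
    if sigma == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mu) / sigma


if __name__ == "__main__":
    seq = "ACDEFGHIKLMNPQRSTVWY"
    print(monte_carlo_z(seq, seq))
```

Each shuffle requires a full rescoring, which is why the abstract describes the Z-score as compute-intensive when the score is a real alignment.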

Relationship between substances in seminal plasma and Acrobeads Test results.

Science.gov (United States)

Komori, Kazuhiko; Tsujimura, Akira; Okamoto, Yoshio; Matsuoka, Yasuhiro; Takao, Tetsuya; Miyagawa, Yasushi; Takada, Shingo; Nonomura, Norio; Okuyama, Akihiko

2009-01-01

To assess the effects of seminal plasma on sperm function. Retrospective case-control study. University hospital. One hundred fourteen infertile men. Acrobeads Test scores (0-4) and measurement of interleukin (IL)-6, soluble IL-6 receptor, epidermal growth factor, insulin-like growth factor-I (IGF-I), transforming growth factor-beta I, superoxide dismutase, calcitonin, and macrophage migration inhibitory factor (MIF) levels in seminal plasma. Kruskal-Wallis test to compare the concentrations of substances as a nonparametric test for differences among Acrobeads Test scores and a multivariable logistic regression model to find independent risk factors associated with abnormal Acrobeads Test results. The Acrobeads Test score was 0 for 7 samples, 1 for 20 samples, 2 for 18 samples, 3 for 28 samples, and 4 for 41 samples. Age, abstinence period, and semen parameters, except for sperm motility and percentage of sperm with abnormal morphology, had no effect on the Acrobeads Test results. Concentrations of IGF-I and MIF were significantly higher in patients with abnormal Acrobeads Test results. Multivariate analysis indicated that MIF and IGF-I were significantly associated with abnormal Acrobeads Test results (scores 0 to 1). Although further studies are needed, IGF-I and MIF in seminal plasma may have negative effects on sperm function.
A prognostic scoring system for arm exercise stress testing.

Science.gov (United States)

Xie, Yan; Xian, Hong; Chandiramani, Pooja; Bainter, Emily; Wan, Leping; Martin, Wade H

2016-01-01

Arm exercise stress testing may be an equivalent or better predictor of mortality outcome than pharmacological stress imaging for the ≥50% of patients unable to perform leg exercise. Thus, our objective was to develop an arm exercise ECG stress test scoring system, analogous to the Duke Treadmill Score, for predicting outcome in these individuals. In this retrospective observational cohort study, arm exercise ECG stress tests were performed in 443 consecutive veterans aged 64.1 (11.1) years (mean (SD)) between 1997 and 2002. From multivariate Cox models, arm exercise scores were developed for prediction of 5-year and 12-year all-cause and cardiovascular mortality and 5-year cardiovascular mortality or myocardial infarction (MI). Arm exercise capacity in resting metabolic equivalents (METs), 1 min heart rate recovery (HRR) and ST segment depression ≥1 mm were the stress test variables independently associated with all-cause and cardiovascular mortality by step-wise Cox analysis (all p […] C-statistic of 0.81 before and 0.88 after adjustment for significant demographic and clinical covariates. Arm exercise scores for the other outcome end points yielded C-statistic values of 0.77-0.79 before and 0.82-0.86 after adjustment for significant covariates versus 0.64-0.72 for best fit pharmacological myocardial perfusion imaging models in a cohort of 1730 veterans who were evaluated over the same time period. Arm exercise scores, analogous to the Duke Treadmill Score, have good power for prediction of mortality or MI in patients who cannot perform leg exercise.
Increased correlation coefficient between the written test score and tutors' performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia.

Science.gov (United States)

Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

2016-03-01

This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
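For readers unfamiliar with the statistic, the Pearson correlation coefficient between two sets of paired scores can be computed as in the minimal Python sketch below. The function name and the sample scores are hypothetical and are not taken from the study; the sketch only illustrates the standard formula, the covariance of the pairs divided by the product of their standard deviations.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical written-test and tutor-assigned scores for six students.
    written = [55, 62, 70, 48, 81, 66]
    tutor = [6, 7, 7, 5, 9, 6]
    print(round(pearson_r(written, tutor), 3))
```

Computing the coefficient separately for each group, as the study did, lets the two values be compared directly.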
Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

Science.gov (United States)

Haverinen-Shaughnessy, Ulla; Shaughnessy, Richard J

2015-01-01

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.
ANOVA Analysis of Student Daily Test Scores in Multi-Day Test Periods

Science.gov (United States)

Mouritsen, Matthew L.; Davis, Jefferson T.; Jones, Steven C.

2016-01-01

Instructors are often concerned when giving multiple-day tests because students taking the test later in the exam period may have an advantage over students taking the test early in the exam period due to information leakage. However, exam scores seemed to decline as students took the same test later in a multi-day exam period (Mouritsen and…
The Effect of Mock Tests on Iranian EFL learners’ Test Scores

OpenAIRE

Hossein Khodabakhshzadeh; Reza Zardkanloo

2016-01-01

The effect of using tests in test preparation courses has been subject to debate. While some scholars such as Yang and Badger (2015) believe it is a cause of positive washback effect, others argue that this issue is tentative and context-bound (Green, 2007). Therefore, this study investigated the effect of using Mock tests in International English Language Testing System (IELTS) preparation courses on students’ overall IELTS scores. Fifty-one IELTS students were selected non-randomly through ...
Increased correlation coefficient between the written test score and tutors’ performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia

Directory of Open Access Journals (Sweden)

Heethal Jaiprakash

2016-03-01

This paper is aimed at finding if there was a change of correlation between the written test score and tutors’ performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group’s tutors did not receive tutor training; while the second group’s tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors’ performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors’ scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
Individual Differences in Digit Span, Susceptibility to Proactive Interference, and Aptitude/Achievement Test Scores.

Science.gov (United States)

Dempster, Frank N.; Cooney, John B.

1982-01-01

Individual differences in digit span, susceptibility to proactive interference, and various aptitude/achievement test scores were investigated in two experiments with college students. Results indicated that digit span was strongly correlated with aptitude/achievement scores, but did not indicate that susceptibility to proactive interference…
Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

Directory of Open Access Journals (Sweden)

Ulla Haverinen-Shaughnessy

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.
Effects of Classroom Ventilation Rate and Temperature on Students’ Test Scores

Science.gov (United States)

2015-01-01

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students’ mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9–7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12–13 points per each 1°C decrease in temperature within the observed range of 20–25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students. PMID:26317643
Decision making under internal uncertainty: the case of multiple-choice tests with different scoring rules.

Science.gov (United States)

Bereby-Meyer, Yoella; Meyer, Joachim; Budescu, David V

2003-02-01

This paper assesses framing effects on decision making with internal uncertainty, i.e., partial knowledge, by focusing on examinees' behavior in multiple-choice (MC) tests with different scoring rules. In two experiments participants answered a general-knowledge MC test that consisted of 34 solvable and 6 unsolvable items. Experiment 1 studied two scoring rules involving Positive (only gains) and Negative (only losses) scores. Although answering all items was the dominating strategy for both rules, the results revealed a greater tendency to answer under the Negative scoring rule. These results are in line with the predictions derived from Prospect Theory (PT) [Econometrica 47 (1979) 263]. The second experiment studied two scoring rules, which allowed respondents to exhibit partial knowledge. Under the Inclusion-scoring rule the respondents mark all answers that could be correct, and under the Exclusion-scoring rule they exclude all answers that might be incorrect. As predicted by PT, respondents took more risks under the Inclusion rule than under the Exclusion rule. The results illustrate that the basic process that underlies choice behavior under internal uncertainty and especially the effect of framing is similar to the process of choice under external uncertainty and can be described quite accurately by PT. Copyright 2002 Elsevier Science B.V.
Clock Drawing Test and the diagnosis of amnestic mild cognitive impairment: can more detailed scoring systems do the work?

Science.gov (United States)

Rubínová, Eva; Nikolai, Tomáš; Marková, Hana; Siffelová, Kamila; Laczó, Jan; Hort, Jakub; Vyhnálek, Martin

2014-01-01

The Clock Drawing Test is a frequently used cognitive screening test with several scoring systems in elderly populations. We compare simple and complex scoring systems and evaluate the usefulness of the combination of the Clock Drawing Test with the Mini-Mental State Examination to detect patients with mild cognitive impairment. Patients with amnestic mild cognitive impairment (n = 48) and age- and education-matched controls (n = 48) underwent neuropsychological examinations, including the Clock Drawing Test and the Mini-Mental State Examination. Clock drawings were scored by three blinded raters using one simple (6-point scale) and two complex (17- and 18-point scales) systems. The sensitivity and specificity of these scoring systems used alone and in combination with the Mini-Mental State Examination were determined. Complex scoring systems, but not the simple scoring system, were significant predictors of the amnestic mild cognitive impairment diagnosis in logistic regression analysis. At equal levels of sensitivity (87.5%), the Mini-Mental State Examination showed higher specificity (31.3%, compared with 12.5% for the 17-point Clock Drawing Test scoring scale). The combination of Clock Drawing Test and Mini-Mental State Examination scores increased the area under the curve (0.72; p […] Drawing Test did not differentiate between healthy elderly and patients with amnestic mild cognitive impairment in our sample. Complex scoring systems were slightly more efficient, yet still were characterized by high rates of false-positive results. We found psychometric improvement using combined scores from the Mini-Mental State Examination and the Clock Drawing Test when complex scoring systems were used. The results of this study support the benefit of using combined scores from simple methods.
The Weighted Airman Promotion System: Standardizing Test Scores

Science.gov (United States)

2008-01-01

[Figure: Top 3/E6 ratio, inventory] ... AFHRL convened a panel to identify the relevant factors to consider, and then sit as a promotion board and rank ... Costs: If the Air Force decided to standardize test scores, there would be three basic types of costs: implementation costs, marketing costs, and
Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

Science.gov (United States)

Kim, Seonghoon

2013-01-01

With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
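The recursion attributed to Lord and Wingersky can be sketched in Python as follows. This covers only the classic dichotomous (number-correct) case, not the real-number generalization the article presents. The function name is hypothetical, and the per-item probabilities of a correct response at a fixed proficiency are assumed to be supplied by the caller, for example from an IRT model with known item parameters.

```python
def lord_wingersky(p_correct):
    """Distribution of number-correct scores at one proficiency level.

    p_correct: per-item probabilities of a correct response (assumed given).
    Returns a list dist where dist[x] = P(summed score == x).
    """
    dist = [1.0]  # with zero items, a score of 0 is certain
    for p in p_correct:
        new = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - p)   # item answered incorrectly
            new[x + 1] += prob * p       # item answered correctly
        dist = new
    return dist


if __name__ == "__main__":
    # Two items, each with a 0.5 chance of a correct response.
    print(lord_wingersky([0.5, 0.5]))  # [0.25, 0.5, 0.25]
```

Adding items one at a time keeps the cost quadratic in the number of items, rather than enumerating every response pattern.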
Online pre-race education improves test scores for volunteers at a marathon.

Science.gov (United States)

Maxwell, Shane; Renier, Colleen; Sikka, Robby; Widstrom, Luke; Paulson, William; Christensen, Trent; Olson, David; Nelson, Benjamin

2017-09-01

This study examined whether an online course would lead to increased knowledge about the medical issues volunteers encounter during a marathon. Health care professionals who volunteered to provide medical coverage for an annual marathon were eligible for the study. Demographic information about medical volunteers including profession, specialty, education level and number of marathons they had volunteered for was collected. A 15-question test about the most commonly encountered medical issues was created by the authors and administered before and after the volunteers took the online educational course and compared to a pilot study the previous year. Seventy-four subjects completed the pre-test. Those who participated in the pilot study last year (N = 15) had pre-test scores that were an average of 2.4 points higher than those who did not (mean ranks: pilot study = 51.6 vs. non-pilot = 33.9, p = 0.004). Of the 74 subjects who completed the pre-test, 54 also completed the post-test. The overall post-pre mean score difference was 3.8 ± 2.7 (t = 10.5, df = 53, p […] online education demonstrated a long-term (one-year) increase in test scores. Testing also continued to show short-term improvement in post-course test scores, compared to pre-course test scores. In general, marathon medical volunteers who had no volunteer experience demonstrated greater improvement than those who had prior volunteer experience.
Power and sample size evaluation for the Cochran-Mantel-Haenszel mean score (Wilcoxon rank sum) test and the Cochran-Armitage test for trend.

Science.gov (United States)

Lachin, John M

2011-11-10

The power of a chi-square test, and thus the required sample size, are a function of the noncentrality parameter that can be obtained as the limiting expectation of the test statistic under an alternative hypothesis specification. Herein, we apply this principle to derive simple expressions for two tests that are commonly applied to discrete ordinal data. The Wilcoxon rank sum test for the equality of distributions in two groups is algebraically equivalent to the Mann-Whitney test. The Kruskal-Wallis test applies to multiple groups. These tests are equivalent to a Cochran-Mantel-Haenszel mean score test using rank scores for a set of C-discrete categories. Although various authors have assessed the power function of the Wilcoxon and Mann-Whitney tests, herein it is shown that the power of these tests with discrete observations, that is, with tied ranks, is readily provided by the power function of the corresponding Cochran-Mantel-Haenszel mean scores test for two and R > 2 groups. These expressions yield results virtually identical to those derived previously for rank scores and also apply to other score functions. The Cochran-Armitage test for trend assesses whether there is a monotonically increasing or decreasing trend in the proportions with a positive outcome or response over the C-ordered categories of an ordinal independent variable, for example, dose. Herein, it is shown that the power of the test is a function of the slope of the response probabilities over the ordinal scores assigned to the groups that yields simple expressions for the power of the test. Copyright © 2011 John Wiley & Sons, Ltd.
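The algebraic equivalence of the Wilcoxon rank sum and Mann-Whitney statistics mentioned above can be checked numerically, including for discrete data with ties. The Python sketch below uses hypothetical function names and made-up ordinal data. It computes the rank sum W of the first group using midranks and the Mann-Whitney count U with ties counted as one half, so that W = U + n1(n1 + 1)/2, where n1 is the size of the first group.

```python
def midranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def wilcoxon_rank_sum(group1, group2):
    """Sum of the midranks of group1 within the pooled sample."""
    ranks = midranks(list(group1) + list(group2))
    return sum(ranks[:len(group1)])


def mann_whitney_u(group1, group2):
    """Pairs in which group1 exceeds group2, counting ties as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in group1 for b in group2)


if __name__ == "__main__":
    g1 = [1, 2, 2, 3]      # hypothetical ordinal responses, group 1
    g2 = [0, 1, 2, 2, 4]   # hypothetical ordinal responses, group 2
    n1 = len(g1)
    w = wilcoxon_rank_sum(g1, g2)
    u = mann_whitney_u(g1, g2)
    print(w, u, w - n1 * (n1 + 1) / 2)  # 21.5 11.5 11.5
```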
Critique of the Watson-Glaser Critical Thinking Appraisal Test: The More You Know, the Lower Your Score

Directory of Open Access Journals (Sweden)

Kevin Possin

2014-12-01

The Watson-Glaser Critical Thinking Appraisal Test is one of the oldest, most frequently used, multiple-choice critical-thinking tests on the market in business, government, and legal settings for purposes of hiring and promotion. I demonstrate, however, that the test has serious construct-validity issues, stemming primarily from its ambiguous, unclear, misleading, and sometimes mysterious instructions, which have remained unaltered for decades. Erroneously scored items further diminish the test’s validity. As a result, having enhanced knowledge of formal and informal logic could well result in test subjects receiving lower scores on the test. That’s not how things should work for a CT assessment test.
Using Raters from India to Score a Large-Scale Speaking Test

Science.gov (United States)

Xi, Xiaoming; Mollaun, Pam

2011-01-01

We investigated the scoring of the Speaking section of the Test of English as a Foreign Language™ Internet-based (TOEFL iBT®) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…
Group differences in the heritability of items and test scores

NARCIS (Netherlands)

Wicherts, J.M.; Johnson, W.

2009-01-01

It is important to understand potential sources of group differences in the heritability of intelligence test scores. On the basis of a basic item response model we argue that heritabilities which are based on dichotomous item scores normally do not generalize from one sample to the next. If groups
Testing the applicability of the SASS5 scoring procedure for ...

African Journals Online (AJOL)

A study was undertaken between 29th January and 17th February 2004 to test the applicability of the South African Scoring System Version 5 (SASS5) scoring and calculation procedure in nutrient-enriched palustrine wetlands in the midlands of KwaZulu-Natal, South Africa. Four reference wetlands and three dairy-effluent ...

International Test Score Comparisons and Educational Policy: A Review of the Critiques

Science.gov (United States)

Carnoy, Martin

2015-01-01

Stanford education professor Martin Carnoy examines four main critiques of how international test results are used in policymaking. Of particular interest are critiques of the policy analyses published by the Program for International Student Assessment (PISA). Using average PISA scores as a comparative measure of student achievement is misleading…
The Impact of the Use of Hierarchical Teaching on Test Scores of Students’ Technology

Directory of Open Access Journals (Sweden)

Zhao Guorong

2015-01-01

Students' technique test scores are the main basis for evaluating the physical fitness of college students. When the uniform teaching mode is replaced with a stratified (hierarchical) teaching method, students' technical level in specialized movements improves significantly.
Does breastfeeding contribute to the racial gap in reading and math test scores?

Science.gov (United States)

Peters, Kristen E; Huang, Jin; Vaughn, Michael G; Witko, Christopher

2013-10-01

The aim of this study was to examine the impact of divergent breastfeeding practices between Caucasian and African American mothers on the lingering achievement test gap between Caucasian and African American children. The Child Development Supplement of the Panel Study of Income Dynamics, beginning in 1997, followed a cohort of 3563 children aged 0-12 years. Reading and math test scores from 2002 for 1928 children were linked with breastfeeding history. Regression analysis was used to examine associations between ever having been breastfed and duration of breastfeeding and test scores, controlling for characteristics of child, mother, and household. African American students scored significantly lower than Caucasian children by 10.6 and 10.9 points on reading and math tests, respectively. After accounting for the impact of having been breastfed during infancy, the racial test gap decreased by 17% for reading scores and 9% for math scores. Study findings indicate that breastfeeding explains 17% and 9% of the observed gaps in reading and math scores, respectively, between African Americans and Caucasians, an effect larger than most recent educational policy interventions. Renewed efforts around policies and clinical practices that promote and remove barriers for African American mothers to breastfeed should be implemented. Copyright © 2013 Elsevier Inc. All rights reserved.
Reduce, Reuse, Recycle: The Longitudinal Value of Local Cut Scores Using State Test Data

Science.gov (United States)

Nelson, Peter M.; Van Norman, Ethan R.; VanDerHeyden, Amanda

2017-01-01

We used existing reading (n = 1,498) and math (n = 2,260) data to evaluate state test scores for screening middle school students. In Phase 1, state test data were used to create a research-derived cut score that was optimal for predicting state test performance the following year. In Phase 2, those cut scores were applied with future cohorts.…
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

Science.gov (United States)

Kosinski, Andrzej S

2013-03-15

Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
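For readers unfamiliar with the quantities being compared, the sketch below computes positive and negative predictive values for two diagnostic tests from their 2x2 counts. It illustrates only the definitions; it does not implement the Wald, generalized score, or WGS statistics from the paper, and the function name and counts are hypothetical.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from true/false positive and negative counts."""
    ppv = tp / (tp + fp)  # share of positive results that are true disease
    npv = tn / (tn + fn)  # share of negative results that are truly disease-free
    return ppv, npv


# Hypothetical counts for two tests applied to the same 200 patients.
ppv_a, npv_a = predictive_values(tp=80, fp=20, tn=90, fn=10)
ppv_b, npv_b = predictive_values(tp=70, fp=10, tn=100, fn=20)
print(ppv_a - ppv_b, npv_a - npv_b)
```

Because both tests are applied to the same patients, the two PPVs (and the two NPVs) are correlated, which is why a paired-design statistic is needed rather than a naive comparison of two independent proportions.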
Do later wake times and increased sleep duration of 12th graders result in more studying, higher grades, and improved SAT/ACT test scores?

Science.gov (United States)

Cole, James S

2016-09-01

The aim of this study was to investigate the relationship between sleep duration, wake time, and hours studying on high school grades and performance on the Scholastic Aptitude Test (SAT)/American College Testing (ACT) college entrance exams. Data were collected from 13,071 recently graduated high school seniors who were entering college in the fall of 2014. A column proportions z test with a Bonferroni adjustment was used to analyze proportional differences. Analysis of covariance (ANCOVA) was used to examine mean group differences. Students who woke up prior to 6 a.m. and got less than 8 h of sleep (27%) were significantly more likely to report studying 11 or more hours per week (30%), almost double the rate compared to students who got more than 8 h of sleep and woke up the latest (16%). Post hoc results revealed students who woke up at 7 a.m. or later reported significantly higher high school grades than all other groups (p < …) … students who woke up between 6:01 a.m. and 7:00 a.m. and got eight or more hours of sleep. The highest reported SAT/ACT scores were from the group that woke up after 7 a.m. but got less than 8 h sleep (M = 1099.5). Their scores were significantly higher than all other groups. This study provides additional evidence that increased sleep and later wake time are associated with increased high school grades. However, this study also found that students who sleep the longest also reported less studying and lower SAT/ACT scores.
Effects of correcting for prematurity on cognitive test scores in childhood.

Science.gov (United States)

Wilson-Ching, Michelle; Pascoe, Leona; Doyle, Lex W; Anderson, Peter J

2014-03-01

The American Academy of Pediatrics recommends that test scores should be corrected for prematurity up to 3 years of age, but this practice varies greatly in both clinical and research settings. The aim of this study was to contrast the effects of using chronological age and those of using corrected age on measures of cognitive outcome across childhood. A theoretical model was constructed using norms from the Bayley Scales of Infant and Toddler Development, Third Edition; the Wechsler Preschool and Primary Scale of Intelligence, Third Edition Australian; and the Wechsler Intelligence Scales for Children, Fourth Edition Australian. Baseline scores representing different levels of functioning (70, below average; 85, borderline; and 100, average) were recalculated using the normative data for ages 6 months to 16 years to account for 1, 2, 3 and 4 months of prematurity. The model created depicted the difference in standardised scores between chronological and corrected age. Compared with scores corrected for prematurity, the absolute reduction in scores using chronological age was greater for increasing degree of prematurity, younger ages at assessment and higher baseline scores and was substantial even beyond 3 years of age. However, the pattern was erratic, with considerable fluctuation evident across different ages and baseline scores. Chronological age results in a lowering of scores at all ages for preterm-born subjects that is greater in the first few years and in those born at earlier gestational ages. Whether or not to correct for prematurity depends upon the context of the assessment. © 2014 The Authors. Journal of Paediatrics and Child Health © 2014 Paediatrics and Child Health Division (Royal Australasian College of Physicians).
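The age adjustment at the center of this abstract is simple arithmetic, sketched below under the usual definition that corrected age equals chronological age minus the degree of prematurity. The function names are hypothetical, and no normative tables from the named instruments are reproduced; the second function only shows why the same number of months matters more at younger assessment ages.

```python
def corrected_age_months(chronological_months, months_premature):
    """Corrected age: chronological age minus months of prematurity."""
    if months_premature < 0 or months_premature > chronological_months:
        raise ValueError("prematurity must lie between 0 and chronological age")
    return chronological_months - months_premature


def prematurity_share(chronological_months, months_premature):
    """Prematurity as a fraction of chronological age at assessment."""
    return months_premature / chronological_months


# A child born 3 months early, assessed at 1 year and again at 8 years.
for age in (12, 96):
    print(age, corrected_age_months(age, 3), prematurity_share(age, 3))
```

At 12 months the 3-month offset is a quarter of the child's age; at 96 months it is about 3%, although the abstract reports that the effect on standardized scores can remain substantial beyond 3 years.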
The Dental Hygiene Aptitude Tests and the American College Testing Program Tests as Predictors of Scores on the National Board Dental Hygiene Examination.

Science.gov (United States)

Longenbecker, Sueann; Wood, Peter H.

1984-01-01

Scores from the National Board Dental Hygiene Examination (NBDHE) served as the criterion variable in a comparison of the predictive validity of the Dental Hygiene Aptitude Tests (DHAT) and the ACT Assessment tests. The DHAT-Science and Verbal tests combined to produce the highest multiple correlation with NBDHE scores. (Author/DWH)
Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography.

Science.gov (United States)

Mallett, Susan; Halligan, Steve; Collins, Gary S; Altman, Doug G

2014-01-01

Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
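The contrast between the two approaches can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' analysis: the reader data are invented, a case is treated as "called positive" when its confidence score is above zero, and the AUC is the empirical (Mann-Whitney) estimate with ties counted as one half.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) from binary truth and binary calls."""
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Empirical ROC AUC: P(score of positive > score of negative), ties = 1/2."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical reader: 4 cases with polyps, 6 without; a score of 0 means
# "no polyp identified", which produces many tied zero scores.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [90, 70, 0, 0, 0, 0, 0, 0, 0, 60]
calls = [s > 0 for s in scores]

print(sensitivity_specificity(truth, calls))
print(empirical_auc(truth, scores))
```

In this toy data set most of the AUC is decided by ties among zero scores and by a single false positive, which mirrors two of the problems the abstract lists for confidence-score analyses.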
Contributions of Hamstring Stiffness to Straight-Leg-Raise and Sit-and-Reach Test Scores.

Science.gov (United States)

Miyamoto, Naokazu; Hirata, Kosuke; Kimura, Noriko; Miyamoto-Mikami, Eri

2018-02-01

The passive straight-leg-raise (PSLR) and the sit-and-reach (SR) tests have been widely used to assess hamstring extensibility. However, it remains unclear to what extent hamstring stiffness (a measure of material properties) contributes to PSLR and SR test scores. Therefore, we aimed to clarify the relationship between hamstring stiffness and PSLR and SR scores using ultrasound shear wave elastography. Ninety-eight healthy subjects completed the study. Each subject completed PSLR testing, and classic and modified SR testing of the right leg. Muscle shear modulus of the biceps femoris, semitendinosus, and semimembranosus was quantified as an index of muscle stiffness. The relationships between shear modulus of each muscle and PSLR or SR scores were calculated using Pearson's product-moment correlation coefficients. Shear modulus of the semitendinosus and semimembranosus showed negative correlations with the two PSLR and two SR scores (absolute r value≤0.484). Shear modulus of the biceps femoris was significantly correlated with the PSLR score determined by the examiner and the modified SR score (absolute r value≤0.308). The present findings suggest that PSLR and SR test scores are strongly influenced by factors other than hamstring stiffness and therefore might not accurately evaluate hamstring stiffness. © Georg Thieme Verlag KG Stuttgart · New York.
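The conclusion that flexibility scores depend largely on factors other than stiffness follows from squaring the reported correlations. The sketch below shows a plain Pearson product-moment computation and the r-squared step; the function names are hypothetical, and only the bounds 0.484 and 0.308 come from the abstract.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def shared_variance(r):
    """Proportion of variance in one variable linearly associated with the other."""
    return r * r


# Upper bounds on |r| reported in the abstract.
for bound in (0.484, 0.308):
    print(bound, round(shared_variance(bound), 3))
```

With |r| at most 0.484, shear modulus shares at most about 23% of the variance in the test scores, leaving the large remainder to other factors.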
Manual for Scoring the Test of Directed Imagination.

Science.gov (United States)

Veldman, Donald J.; And Others

A scoring manual for the Directed Imagination Test, a projective technique wherein the subject is instructed to write four fictional stories (four minutes are allowed for each) about teachers and their experiences, is presented. The manual provides detailed instructions for rating each story by fifteen dimensions relevant to teacher education…
Pediatric residents' learning styles and temperaments and their relationships to standardized test scores.

Science.gov (United States)

Tuli, Sanjeev Y; Thompson, Lindsay A; Saliba, Heidi; Black, Erik W; Ryan, Kathleen A; Kelly, Maria N; Novak, Maureen; Mellott, Jane; Tuli, Sonal S

2011-12-01

Board certification is an important professional qualification and a prerequisite for credentialing, and the Accreditation Council for Graduate Medical Education (ACGME) assesses board certification rates as a component of residency program effectiveness. To date, research has shown that preresidency measures, including National Board of Medical Examiners scores, Alpha Omega Alpha Honor Medical Society membership, or medical school grades poorly predict postresidency board examination scores. However, learning styles and temperament have been identified as factors that affect test-taking performance. The purpose of this study is to characterize the learning styles and temperaments of pediatric residents and to evaluate their relationships to yearly in-service and postresidency board examination scores. This cross-sectional study analyzed the learning styles and temperaments of current and past pediatric residents by administration of 3 validated tools: the Kolb Learning Style Inventory, the Keirsey Temperament Sorter, and the Felder-Silverman Learning Style test. These results were compared with known, normative, general and medical population data and evaluated for correlation to in-service examination and postresidency board examination scores. The predominant learning style for pediatric residents was converging 44% (33 of 75 residents) and the predominant temperament was guardian 61% (34 of 56 residents). The learning style and temperament distribution of the residents was significantly different from published population data (P = .002 and .04, respectively). Learning styles, with one exception, were found to be unrelated to standardized test scores. The predominant learning style and temperament of pediatric residents is significantly different than that of the populations of general and medical trainees. However, learning styles and temperament do not predict outcomes on standardized in-service and board examinations in pediatric residents.
Measurement of ability emotional intelligence: results for two new tests.

Science.gov (United States)

Austin, Elizabeth J

2010-08-01

Emotional intelligence (EI) has attracted considerable interest amongst both individual differences researchers and those in other areas of psychology who are interested in how EI relates to criteria such as well-being and career success. Both trait (self-report) and ability EI measures have been developed; the focus of this paper is on ability EI. The associations of two new ability EI tests with psychometric intelligence, emotion perception, and the Mayer-Salovey-Caruso EI test (MSCEIT) were examined. The new EI tests were the Situational Test of Emotion Management (STEM) and the Situational Test of Emotional Understanding (STEU). Only the STEU and the MSCEIT Understanding Emotions branch were significantly correlated with psychometric intelligence, suggesting that only understanding emotions can be regarded as a candidate new intelligence component. These understanding emotions tests were also positively correlated with emotion perception tests, and STEM and STEU scores were positively correlated with MSCEIT total score and most branch scores. Neither the STEM nor the STEU were significantly correlated with trait EI tests, confirming the distinctness of trait and ability EI. Taking the present results as a starting-point, approaches to the development of new ability EI tests and models of EI are suggested.
AP Trends: Tests Soar, Scores Slip--Gaps between Groups Spur Equity Concerns

Science.gov (United States)

Cech, Scott J.

2008-01-01

More students are taking Advanced Placement tests, but the proportion of tests receiving what is deemed a passing score has dipped, and the mean score is down for the fourth year in a row. Data released here this week by the New York City-based nonprofit organization that owns the AP brand shows that a greater-than-ever proportion of students…
The Impact of the 2004 Hurricanes on Florida Comprehensive Assessment Test Scores: Implications for School Counselors

Science.gov (United States)

Baggerly, Jennifer; Ferretti, Larissa K.

2008-01-01

What is the impact of natural disasters on students' statewide assessment scores? To answer this question, Florida Comprehensive Assessment Test (FCAT) scores of 55,881 students in grades 4 through 10 were analyzed to determine if there were significant decreases after the 2004 hurricanes. Results reveal that there was statistical but no practical…
Validity of GRE General Test Scores and TOEFL Scores for Graduate Admission to a Technical University in Western Europe

Science.gov (United States)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the…
Impact of Answer-Switching Behavior on Multiple-Choice Test Scores in Higher Education

Directory of Open Access Journals (Sweden)

Ramazan BAŞTÜRK

2011-06-01

The multiple-choice format is one of the most popular selected-response item formats used in educational testing. Researchers have shown that the multiple-choice test is a useful vehicle for student assessment in core university subjects that usually have large student numbers. Even though educators, test experts, and various test resources maintain that the first answer should be retained, many researchers have argued that this advice is not supported by empirical findings. The main aim of this study is to examine how answer-switching behavior affects multiple-choice test scores. Additionally, gender differences and the relationship between the number of answer switches and item parameters (item difficulty and item discrimination) were investigated. The participants in this study were 207 upper-level College of Education students from mid-sized universities. A midterm exam consisting of 20 multiple-choice questions was used. According to the results of this study, answer-switching behavior produces a statistically significant increase in test scores. On the other hand, there is no significant gender difference in answer-switching behavior. Additionally, there is a significant negative relationship between answer-switching behavior and item difficulties.
The Impact of Correction for Guessing Formula on MC and Yes/No Vocabulary Tests' Scores

Directory of Open Access Journals (Sweden)

Abdollah Baradaran

2009-10-01

A standard correction for random guessing (cfg) formula on multiple-choice and Yes/No examinations was examined retrospectively in the scores of intermediate female EFL learners in an English language school. The correction was a weighting formula for points awarded for correct answers, incorrect answers, and unanswered questions so that the expected value of the increase in test score due to guessing was zero. The researcher compared uncorrected and corrected scores on examinations using multiple-choice and Yes/No formats. These short-answer formats eliminated or at least greatly reduced the potential for guessing the correct answer. The expectation for students to improve their grade by guessing on multiple-choice and Yes/No format examinations is well known. The researcher examined a method for correcting for random guessing (cfg), that is, "no knowledge" guessing, on multiple-choice and Yes/No vocabulary examinations by comparing application and non-application of the cfg formula to scores on these examinations. This was done to determine whether the test takers really knew the correct answer or had resorted to guessing. This study represented a unique opportunity to compare scores from multiple-choice and Yes/No examinations in a setting in which students were given the same number of questions in each of the two format types, testing their knowledge of the same subject matter. The results of this study indicated significant differences between the subjects' scores when the cfg formula was applied and when it was not.
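The abstract does not state the formula it used. A minimal sketch, assuming the classical formula-scoring rule (corrected score = right - wrong / (k - 1) for k answer options, with unanswered items scoring zero), shows how such a weighting makes the expected gain from blind guessing zero. The function names are hypothetical, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def corrected_score(right, wrong, options):
    """Classical correction for guessing; unanswered items contribute nothing."""
    return right - Fraction(wrong, options - 1)


def expected_gain_from_guessing(items, options):
    """Expected corrected score when every one of `items` is a blind guess."""
    p_right = Fraction(1, options)
    expected_right = items * p_right
    expected_wrong = items * (1 - p_right)
    return expected_right - expected_wrong / (options - 1)


# Four-option multiple choice: 30 right, 9 wrong, 1 omitted.
print(corrected_score(30, 9, 4))           # 27
# Yes/No format (two options): each wrong answer cancels one right answer.
print(corrected_score(30, 8, 2))           # 22
print(expected_gain_from_guessing(40, 4))  # 0
print(expected_gain_from_guessing(40, 2))  # 0
```

Under this rule the penalty per wrong answer is much heavier in the Yes/No format (one full point) than in a four-option format (one third of a point), which is one reason corrected and uncorrected scores can diverge noticeably.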
[Relationship between unipedal stance test score and center of pressure velocity in elderly].

Science.gov (United States)

Rodrigo Antonio, Guzmán; Rony, Silvestre; Francisco Aniceto, Rodríguez; David Andrés, Arriagada; Pablo Andrés, Ortega

2011-01-01

Frequent falls are one of the most important health problems in the elderly population. The unipedal stance test (UPST) assesses postural stability and is used in fall risk measures. Despite this, there is little information about its relationship with the posturographic parameters (PP) that characterize postural stability. Center of pressure velocity (CoPV) is one of the best PP for describing postural stability. The aim of this study was to analyze the relation between UPST score and CoPV in the elderly population. A sample of 38 healthy elderly subjects was divided into two groups according to their UPST score: low performance (LP, n=11) and high performance (HP, n=27). The correlation between UPST score and COP mean velocity (CoPmV), recorded from a posturographic test, was analyzed in both groups. An inverse correlation between UPST score and CoPmV was found in both groups. However, this was stronger in the LP group (r=-0.69, P=.02) than in the HP group (r=-0.39, P=.04). Based on the results of this investigation, it may be concluded that achievement on the UPST has an inverse relationship with CoPmV, especially in subjects with low performance in the UPST. Copyright © 2010 SEGG. Published by Elsevier Espana. All rights reserved.
The TAIGA timing array HiSCORE - first results

Directory of Open Access Journals (Sweden)

Tluczykont M.

2017-01-01

Observations of gamma rays up to several 100 TeV are particularly important to spectrally resolve the cutoff regime of the long-sought Pevatrons, the cosmic-ray PeV accelerators. One component of the TAIGA hybrid detector is the TAIGA-HiSCORE timing array, which currently consists of 28 wide angle (0.6 sr) air Cherenkov timing stations distributed on an area of 0.25 km². The HiSCORE concept is based on (non-imaging) air shower front sampling with Cherenkov light. First results are presented.

A Maturing Global Testing Regime Meets the World Economy: Test Scores and Economic Growth, 1960-2012

Science.gov (United States)

Kamens, David H.

2015-01-01

This article considers the growth of the international testing regime. It discusses sources of growth and empirically examines two related sets of issues: (1) the stability of countries' achievement scores, and (2) the influence of those national scores on subsequent economic development over different time lags. The article suggests that…
Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

Science.gov (United States)

Kolen, Michael J.; Lee, Won-Chan

2011-01-01

This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
The Apgar score has survived the test of time.

Science.gov (United States)

Finster, Mieczyslaw; Wood, Margaret

2005-04-01

In 1953, Virginia Apgar, M.D. published her proposal for a new method of evaluation of the newborn infant. The avowed purpose of this paper was to establish a simple and clear classification of newborn infants which can be used to compare the results of obstetric practices, types of maternal pain relief and the results of resuscitation. Having considered several objective signs pertaining to the condition of the infant at birth she selected five that could be evaluated and taught to the delivery room personnel without difficulty. These signs were heart rate, respiratory effort, reflex irritability, muscle tone and color. Sixty seconds after the complete birth of the baby a rating of zero, one or two was given to each sign, depending on whether it was absent or present. Virginia Apgar reviewed anesthesia records of 1025 infants born alive at Columbia Presbyterian Medical Center during the period of this report. All had been rated by her method. Infants in poor condition scored 0-2, infants in fair condition scored 3-7, while scores 8-10 were achieved by infants in good condition. The most favorable score 1 min after birth was obtained by infants delivered vaginally with the occiput the presenting part (average 8.4). Newborns delivered by version and breech extraction had the lowest score (average 6.3). Infants delivered by cesarean section were more vigorous (average score 8.0) when spinal was the method of anesthesia versus an average score of 5.0 when general anesthesia was used. Correlating the 60 s score with neonatal mortality, Virginia found that mature infants receiving 0, 1 or 2 scores had a neonatal death rate of 14%; those scoring 3, 4, 5, 6 or 7 had a death rate of 1.1%; and those in the 8-10 score group had a death rate of 0.13%. She concluded that the prognosis of an infant is excellent if he receives one of the upper three scores, and poor if one of the lowest three scores.
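The scoring scheme described in this abstract can be written down in a few lines. The sketch below follows the abstract's description (five signs, each rated 0, 1 or 2, summed to a 0-10 total, with the three condition bands it reports); the identifier names are hypothetical, and the code is an illustration rather than a clinical tool.

```python
SIGNS = (
    "heart_rate",
    "respiratory_effort",
    "reflex_irritability",
    "muscle_tone",
    "color",
)


def apgar_total(ratings):
    """Sum the five sign ratings (each must be 0, 1 or 2) into a 0-10 score."""
    for sign in SIGNS:
        if ratings[sign] not in (0, 1, 2):
            raise ValueError(f"{sign} must be rated 0, 1 or 2")
    return sum(ratings[sign] for sign in SIGNS)


def condition(total):
    """Map a total to the bands reported in the abstract."""
    if not 0 <= total <= 10:
        raise ValueError("total must be between 0 and 10")
    if total <= 2:
        return "poor"
    if total <= 7:
        return "fair"
    return "good"


# Hypothetical ratings taken 60 seconds after birth.
infant = {
    "heart_rate": 2,
    "respiratory_effort": 2,
    "reflex_irritability": 1,
    "muscle_tone": 2,
    "color": 1,
}
total = apgar_total(infant)
print(total, condition(total))  # 8 good
```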
Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success.

Science.gov (United States)

Niu, Sunny X; Tienda, Marta

2012-04-01

Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe.
Optimal Scoring Methods of Hand-Strength Tests in Patients with Stroke

Science.gov (United States)

Huang, Sheau-Ling; Hsieh, Ching-Lin; Lin, Jau-Hong; Chen, Hui-Mei

2011-01-01

The purpose of this study was to determine the optimal scoring methods for measuring strength of the more-affected hand in patients with stroke by examining the effect of reducing measurement errors. Three hand-strength tests of grip, palmar pinch, and lateral pinch were administered at two sessions in 56 patients with stroke. Five scoring methods…
A Case Study About Why It Can Be Difficult To Test Whether Propensity Score Analysis Works in Field Experiments

Directory of Open Access Journals (Sweden)

Thomas D. Cook

2012-01-01

Peikes, Moreno and Orzol (2008) sensibly caution researchers that propensity score analysis may not lead to valid causal inference in field applications. But at the same time, they made the far stronger claim to have performed an ideal test of whether propensity score matching in quasi-experimental data is capable of approximating the results of a randomized experiment in their dataset, and that this ideal test showed that such matching could not do so. In this article we show that their study does not support that conclusion because it failed to meet a number of basic criteria for an ideal test. By implication, many other purported tests of the effectiveness of propensity score analysis probably also fail to meet these criteria, and are therefore questionable contributions to the literature on the effects of propensity score analysis.
A quality score for coronary artery tree extraction results

Science.gov (United States)

Cao, Qing; Broersen, Alexander; Kitslaar, Pieter H.; Lelieveldt, Boudewijn P. F.; Dijkstra, Jouke

2018-02-01

Coronary artery trees (CATs) are often extracted to aid the fully automatic analysis of coronary artery disease on coronary computed tomography angiography (CCTA) images. Automatically extracted CATs often miss some arteries or include wrong extractions which require manual corrections before performing successive steps. For analyzing a large number of datasets, a manual quality check of the extraction results is time-consuming. This paper presents a method to automatically calculate quality scores for extracted CATs in terms of clinical significance of the extracted arteries and the completeness of the extracted CAT. Both right dominant (RD) and left dominant (LD) anatomical statistical models are generated and exploited in developing the quality score. To automatically determine which model should be used, a dominance type detection method is also designed. Experiments are performed on the automatically extracted and manually refined CATs from 42 datasets to evaluate the proposed quality score. In 39 (92.9%) cases, the proposed method is able to measure the quality of the manually refined CATs with higher scores than the automatically extracted CATs. In a 100-point scale system, the average scores for automatically and manually refined CATs are 82.0 (+/-15.8) and 88.9 (+/-5.4) respectively. The proposed quality score will assist the automatic processing of the CAT extractions for large cohorts which contain both RD and LD cases. To the best of our knowledge, this is the first time that a general quality score for an extracted CAT is presented.
Zertifikat Deutsch als Fremdsprache and the Oral Proficiency Interview: A Comparison of Test Scores and Examinations.

Science.gov (United States)

Lalande, John F.; Schweckendiek, Jurgen

1986-01-01

Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)
Postpartum body condition score and results from the first test day milk as predictors of disease, fertility, yield, and culling in commercial dairy herds.

Science.gov (United States)

Heuer, C; Schukken, Y H; Dobbelaar, P

1999-02-01

The study used field data from a regular herd health service to investigate the relationships between body condition scores or first test day milk data and disease incidence, milk yield, fertility, and culling. Path model analysis with adjustment for time at risk was applied to delineate the time sequence of events. Milk fever occurred more often in fat cows, and endometritis occurred between calving and 20 d of lactation more often in thin cows. Fat cows were less likely to conceive at first service than were cows in normal condition. Fat body condition postpartum, higher first test day milk yield, and a fat to protein ratio of > 1.5 increased body condition loss. Fat or thin condition or condition loss was not related to other lactation diseases, fertility parameters, milk yield, or culling. First test day milk yield was 1.3 kg higher after milk fever and was 7.1 kg lower after displaced abomasum. Higher first test day milk yield directly increased the risk of ovarian cyst and lameness, increased 100-d milk yield, and reduced the risk of culling and indirectly decreased reproductive performance. Cows with a fat to protein ratio of > 1.5 had higher risks for ketosis, displaced abomasum, ovarian cyst, lameness, and mastitis. Those cows produced more milk but showed poor reproductive performance. Given this type of herd health data, we concluded that the first test day milk yield and the fat to protein ratio were more reliable indicators of disease, fertility, and milk yield than was body condition score or loss of body condition score.
High Test Scores: The Wrong Road to National Economic Success

Science.gov (United States)

Baker, Keith

2011-01-01

A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…
The effect of instructional methodology on high school students natural sciences standardized tests scores

Science.gov (United States)

Powell, P. E.

Educators have recently come to consider inquiry based instruction as a more effective method of instruction than didactic instruction. Experience based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness on preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi experimental quantitative study was comprised of two stages. Stage 1 used a survey to identify teaching methods of a convenience sample of 57 teacher participants and determined level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when taught using inquiry based instruction. Subpopulation analyses indicated all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.
A Case Study About Why It Can Be Difficult To Test Whether Propensity Score Analysis Works in Field Experiments

Directory of Open Access Journals (Sweden)

William R. Shadish

2013-02-01

Peikes, Moreno and Orzol (2008) sensibly caution researchers that propensity score analysis may not lead to valid causal inference in field applications. But at the same time, they made the far stronger claim to have performed an ideal test of whether propensity score matching in quasi-experimental data is capable of approximating the results of a randomized experiment in their dataset, and that this ideal test showed that such matching could not do so. In this article we show that their study does not support that conclusion because it failed to meet a number of basic criteria for an ideal test. By implication, many other purported tests of the effectiveness of propensity score analysis probably also fail to meet these criteria, and are therefore questionable contributions to the literature on the effects of propensity score analysis. DOI: 10.2458/azu_jmmss.v3i2.16475
The Evolution of the Black-White Test Score Gap in Grades K-3: The Fragility of Results. NBER Working Paper No. 17960

Science.gov (United States)

Bond, Timothy N.; Lang, Kevin

2012-01-01

Although both economists and psychometricians typically treat them as interval scales, test scores are reported using ordinal scales. Using the Early Childhood Longitudinal Study and the Children of the National Longitudinal Survey, we examine the effect of order-preserving scale transformations on the evolution of the black-white reading test…
Reformulation of the Children's Eating Attitudes Test (ChEAT): factor structure and scoring method in a non-clinical population.

Science.gov (United States)

Anton, S D; Han, H; Newton, R L; Martin, C K; York-Crowe, E; Stewart, T M; Williamson, D A

2006-12-01

The primary aims of this study were to empirically test the factor structure of the Children's Eating Attitudes Test (ChEAT) through both exploratory and confirmatory factor analyses and to interpret the factor structure of the ChEAT within the context of a new scoring method. The ChEAT was administered to 728 children in the 2nd through 6th grades (from five schools) at two different time points. Exactly half the students were male and half were female. To the best of our knowledge, this is the first study to empirically test the merits of an alternative 6-point scoring system as compared to the traditionally used 4-point scoring system. With the new scoring procedure, the skewness for all factor scores decreased, which resulted in increased variance in the item scores, as well as the total ChEAT score. Since the internal consistency of two factors in a recently proposed model was not acceptable (ChEAT reported by previous investigations. Intercorrelations among the factors suggested three higher order constructs. These findings indicate that the ChEAT subscales may be sufficiently stable to allow use in non-clinical samples of children.
A knowledge-based theory of rising scores on "culture-free" tests.

Science.gov (United States)

Fox, Mark C; Mitchum, Ainsley L

2013-08-01

Secular gains in intelligence test scores have perplexed researchers since they were documented by Flynn (1984, 1987). Gains are most pronounced on abstract, so-called culture-free tests, prompting Flynn (2007) to attribute them to problem-solving skills availed by scientifically advanced cultures. We propose that recent-born individuals have adopted an approach to analogy that enables them to infer higher level relations requiring roles that are not intrinsic to the objects that constitute initial representations of items. This proposal is translated into item-specific predictions about differences between cohorts in pass rates and item-response patterns on the Raven's Matrices (Flynn, 1987), a seemingly culture-free test that registers the largest Flynn effect. Consistent with predictions, archival data reveal that individuals born around 1940 are less able to map objects at higher levels of relational abstraction than individuals born around 1990. Polytomous Rasch models verify predicted violations of measurement invariance, as raw scores are found to underestimate the number of analogical rules inferred by members of the earlier cohort relative to members of the later cohort who achieve the same overall score. The work provides a plausible cognitive account of the Flynn effect, furthers understanding of the cognition of matrix reasoning, and underscores the need to consider how test-takers select item responses. PsycINFO Database Record (c) 2013 APA, all rights reserved.
A Latent Class Approach to Estimating Test-Score Reliability

Science.gov (United States)

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

2011-01-01

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
Computerized scoring algorithms for the Autobiographical Memory Test.

Science.gov (United States)

Takano, Keisuke; Gutenbrunner, Charlotte; Martens, Kris; Salmon, Karen; Raes, Filip

2018-02-01

Reduced specificity of autobiographical memories is a hallmark of depressive cognition. Autobiographical memory (AM) specificity is typically measured by the Autobiographical Memory Test (AMT), in which respondents are asked to describe personal memories in response to emotional cue words. Due to this free descriptive responding format, the AMT relies on experts' hand scoring for subsequent statistical analyses. This manual coding potentially impedes research activities in big data analytics such as large epidemiological studies. Here, we propose computerized algorithms to automatically score AM specificity for the Dutch (adult participants) and English (youth participants) versions of the AMT by using natural language processing and machine learning techniques. The algorithms showed reliable performances in discriminating specific and nonspecific (e.g., overgeneralized) autobiographical memories in independent testing data sets (area under the receiver operating characteristic curve > .90). Furthermore, outcome values of the algorithms (i.e., decision values of support vector machines) showed a gradient across similar (e.g., specific and extended memories) and different (e.g., specific memory and semantic associates) categories of AMT responses, suggesting that, for both adults and youth, the algorithms well capture the extent to which a memory has features of specific memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Sequential Neighborhood Effects: The Effect of Long-Term Exposure to Concentrated Disadvantage on Children's Reading and Math Test Scores.

Science.gov (United States)

Hicks, Andrew L; Handcock, Mark S; Sastry, Narayan; Pebley, Anne R

2018-02-01

Prior research has suggested that children living in a disadvantaged neighborhood have lower achievement test scores, but these studies typically have not estimated causal effects that account for neighborhood choice. Recent studies used propensity score methods to account for the endogeneity of neighborhood exposures, comparing disadvantaged and nondisadvantaged neighborhoods. We develop an alternative propensity function approach in which cumulative neighborhood effects are modeled as a continuous treatment variable. This approach offers several advantages. We use our approach to examine the cumulative effects of neighborhood disadvantage on reading and math test scores in Los Angeles. Our substantive results indicate that recency of exposure to disadvantaged neighborhoods may be more important than average exposure for children's test scores. We conclude that studies of child development should consider both average cumulative neighborhood exposure and the timing of this exposure.
Relationships between the handball-specific complex test, non-specific field tests and the match performance score in elite professional handball players.

Science.gov (United States)

Hermassi, Souhail; Chelly, Mohamed-Souhaiel; Wollny, Rainer; Hoffmeyer, Birgit; Fieseler, Georg; Schulze, Stephan; Irlenbusch, Lars; Delank, Karl-Stefan; Shephard, Roy J; Bartels, Thomas; Schwesig, René

2018-06-01

This study assessed the validity of the handball-specific complex test (HBCT) and two non-specific field tests in professional elite handball athletes, using the match performance score (MPS) as the gold standard of performance. Thirteen elite male handball players (age: 27.4±4.8 years; premier German league) performed the HBCT, the Yo-Yo Intermittent Recovery (YYIR) test and a repeated shuttle sprint ability (RSA) test at the beginning of pre-season training. The RSA results were evaluated in terms of best time, total time, and fatigue decrement. Heart rates (HR) were assessed at selected times throughout all tests; the recovery HR was measured immediately post-test and 10 minutes later. The match performance score was based on various handball specific parameters (e.g., field goals, assists, steals, blocks, and technical mistakes) as seen during all matches of the immediately subsequent season (2015/2016). The parameters of run 1, run 2, and HR recovery at minutes 6 and 10 of the RSA test all showed a variance of more than 10% (range: 11-15%). However, the variance of scores for the YYIR test was much smaller (range: 1-7%). The resting HR (r²=0.18), HR recovery at minute 10 (r²=0.10), lactate concentration at rest (r²=0.17), recovery of heart rate from 0 to 10 minutes (r²=0.15), and velocity of second throw at first trial (r²=0.37) were the most valid HBCT parameters. Much effort is necessary to assess MPS and to develop valid tests. Speed and the rate of functional recovery seem the best predictors of competitive performance for elite handball players.
Robust joint score tests in the application of DNA methylation data analysis.

Science.gov (United States)

Li, Xuan; Fu, Yuejiao; Wang, Xiaogang; Qiu, Weiliang

2018-05-18

Recently differential variability has been shown to be valuable in evaluating the association of DNA methylation to the risks of complex human diseases. The statistical tests based on both differential methylation level and differential variability can be more powerful than those based only on differential methylation level. Anh and Wang (2013) proposed a joint score test (AW) to simultaneously detect differential methylation and differential variability. However, AW's method seems to be quite conservative and has not been fully compared with existing joint tests. We proposed three improved joint score tests, namely iAW.Lev, iAW.BF, and iAW.TM, and have made extensive comparisons with the joint likelihood ratio test (jointLRT), the Kolmogorov-Smirnov (KS) test, and the AW test. Systematic simulation studies showed that: 1) the three improved tests performed better (i.e., having larger power, while keeping nominal Type I error rates) than the other three tests for data with outliers and having different variances between cases and controls; 2) for data from normal distributions, the three improved tests had slightly lower power than jointLRT and AW. The analyses of two Illumina HumanMethylation27 data sets GSE37020 and GSE20080 and one Illumina Infinium MethylationEPIC data set GSE107080 demonstrated that the three improved tests had higher true validation rates than those from jointLRT, KS, and AW. The three proposed joint score tests are robust against the violation of normality assumption and presence of outlying observations in comparison with the other three existing tests. Among the three proposed tests, iAW.BF seems to be the most robust and effective one for all simulated scenarios and also in real data analyses.

The effect of an intervention program on functional movement screen test scores in mixed martial arts athletes.

Science.gov (United States)

Bodden, Jamie G; Needham, Robert A; Chockalingam, Nachiappan

2015-01-01

This study assessed the basic fundamental movements of mixed martial arts (MMA) athletes using the functional movement screen (FMS) assessment and determined if an intervention program was successful at improving results. Participants were placed into 1 of the 2 groups: intervention and control groups. The intervention group was required to complete a corrective exercise program 4 times per week, and all participants were asked to continue their usual MMA training routine. A mid-intervention FMS test was included to examine if successful results were noticed sooner than the 8-week period. Results highlighted differences in FMS test scores between the control group and intervention group (p = 0.006). Post hoc testing revealed a significant increase in the FMS score of the intervention group between weeks 0 and 8 (p = 0.00) and weeks 0 and 4 (p = 0.00) and no significant increase between weeks 4 and 8 (p = 1.00). A χ² analysis revealed that the intervention group participants were more likely to have an FMS score >14 than participants in the control group at week 4 (χ² = 7.29, p < 0.01) and week 8 (χ² = 5.2, p ≤ 0.05). Finally, a greater number of participants in the intervention group were free from asymmetry at week 4 and week 8 compared with the initial test period. The results of the study suggested that a 4-week intervention program was sufficient to improve FMS scores. Most, if not all, of the movements covered on the FMS relate to many aspects of MMA training. The knowledge that the FMS can identify movement dysfunctions and, furthermore, the fact that the issues can be improved through a standardized intervention program could be advantageous to MMA coaches, thus providing the opportunity to adapt and implement new additions to training programs.
Are students' impressions of improved learning through active learning methods reflected by improved test scores?

Science.gov (United States)

Everly, Marcee C

2013-02-01

To report the transformation from lecture to more active learning methods in a maternity nursing course and to evaluate whether student perception of improved learning through active-learning methods is supported by improved test scores. The process of transforming a course into an active-learning model of teaching is described. A voluntary mid-semester survey for student acceptance of the new teaching method was conducted. Course examination results, from both a standardized exam and a cumulative final exam, among students who received lecture in the classroom and students who had active learning activities in the classroom were compared. Active learning activities were very acceptable to students. The majority of students reported learning more from having active-learning activities in the classroom rather than lecture-only and this belief was supported by improved test scores. Students who had active learning activities in the classroom scored significantly higher on a standardized assessment test than students who received lecture only. The findings support the use of student reflection to evaluate the effectiveness of active-learning methods and help validate the use of student reflection of improved learning in other research projects. Copyright © 2011 Elsevier Ltd. All rights reserved.
Increasing the reliability of the fluid/crystallized difference score from the Kaufman Adolescent and Adult Intelligence Test with reliable component analysis.

Science.gov (United States)

Caruso, J C

2001-06-01

The unreliability of difference scores is a well documented phenomenon in the social sciences and has led researchers and practitioners to interpret differences cautiously, if at all. In the case of the Kaufman Adolescent and Adult Intelligence Test (KAIT), the unreliability of the difference between the Fluid IQ and the Crystallized IQ is due to the high correlation between the two scales. The consequences of the lack of precision with which differences are identified are wide confidence intervals and unpowerful significance tests (i.e., large differences are required to be declared statistically significant). Reliable component analysis (RCA) was performed on the subtests of the KAIT in order to address these problems. RCA is a new data reduction technique that results in uncorrelated component scores with maximum proportions of reliable variance. Results indicate that the scores defined by RCA have discriminant and convergent validity (with respect to the equally weighted scores) and that differences between the scores, derived from a single testing session, were more reliable than differences derived from equal weighting for each age group (11-14 years, 15-34 years, 35-85+ years). This reliability advantage results in narrower confidence intervals around difference scores and smaller differences required for statistical significance.
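The link between a high inter-scale correlation and an unreliable difference score can be illustrated with the classical test theory formula for two scores of equal variance, r_DD = ((r_xx + r_yy)/2 - r_xy) / (1 - r_xy). The sketch below is a generic illustration of that formula only, not the RCA procedure described in the abstract. The function name and all numeric inputs are hypothetical and are not KAIT statistics.

```python
def difference_score_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Classical reliability of D = X - Y, assuming X and Y have equal variances.

    r_xx, r_yy: reliabilities of the two scales; r_xy: their correlation.
    """
    if not -1.0 < r_xy < 1.0:
        raise ValueError("r_xy must lie strictly between -1 and 1")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


# Hypothetical inputs: two scales with reliability 0.95 each.
# The reliability of the difference falls as the scales correlate more highly.
for r_xy in (0.0, 0.5, 0.8):
    print(r_xy, round(difference_score_reliability(0.95, 0.95, r_xy), 3))
```

With these illustrative inputs, two individually reliable scales still yield a noticeably less reliable difference once their correlation is high, which is the problem the abstract describes.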
Effect of Mindfulness Meditation on Perceived Stress Scores and Autonomic Function Tests of Pregnant Indian Women.

Science.gov (United States)

Muthukrishnan, Shobitha; Jain, Reena; Kohli, Sangeeta; Batra, Swaraj

2016-04-01

Various pregnancy complications like hypertension, preeclampsia have been strongly correlated with maternal stress. One of the connecting links between pregnancy complications and maternal stress is mind-body intervention which can be part of Complementary and Alternative Medicine (CAM). Biologic measures of stress during pregnancy may get reduced by such interventions. To evaluate the effect of Mindfulness meditation on perceived stress scores and autonomic function tests of pregnant Indian women. Pregnant Indian women of 12 weeks gestation were randomised to two treatment groups: Test group with Mindfulness meditation and control group with their usual obstetric care. The effect of Mindfulness meditation on perceived stress scores and cardiac sympathetic functions and parasympathetic functions (Heart rate variation with respiration, lying to standing ratio, standing to lying ratio and respiratory rate) were evaluated on pregnant Indian women. There was a significant decrease in perceived stress scores, a significant decrease of blood pressure response to cold pressor test and a significant increase in heart rate variability in the test group (pwomen. The results of this study suggest that mindfulness meditation improves parasympathetic functions in pregnant women and is a powerful modulator of the sympathetic nervous system during pregnancy.
America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

Science.gov (United States)

Petrilli, Michael J.; Wright, Brandon L.

2016-01-01

At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…
The Implications of Family Size and Birth Order for Test Scores and Behavioral Development

Science.gov (United States)

Silles, Mary A.

2010-01-01

This article, using longitudinal data from the National Child Development Study, presents new evidence on the effects of family size and birth order on test scores and behavioral development at age 7, 11 and 16. Sibling size is shown to have an adverse causal effect on test scores and behavioral development. For any given family size, first-borns…
A score based on screening tests to differentiate mild cognitive impairment from subjective memory complaints

Directory of Open Access Journals (Sweden)

Fábio Henrique de Gobbi Porto

2013-09-01

It is not easy to differentiate patients with mild cognitive impairment (MCI) from subjective memory complainers (SMC). Assessments with screening cognitive tools are essential, particularly in primary care where most patients are seen. The objective of this study was to evaluate the diagnostic accuracy of screening cognitive tests and to propose a score derived from screening tests. Elderly subjects with memory complaints were evaluated using the Mini Mental State Examination (MMSE) and the Brief Cognitive Battery (BCB). We added two delayed recalls in the MMSE (a delayed recall and a late-delayed recall, LDR), and also a phonemic fluency test of letter P fluency (LPF). A score was created based on these tests. The diagnoses were made on the basis of clinical consensus and neuropsychological testing. Receiver operating characteristic curve analyses were used to determine area under the curve (AUC), the sensitivity and specificity for each test separately and for the final proposed score. MMSE, LDR, LPF and delayed recall of BCB scores reached statistically significant differences between groups (P=0.000, 0.03, 0.001 and 0.01, respectively). Sensitivity, specificity and AUC were MMSE: 64%, 79% and 0.75 (cut off <29); LDR: 56%, 62% and 0.62 (cut off <3); LPF: 71%, 71% and 0.71 (cut off <14); delayed recall of BCB: 56%, 82% and 0.68 (cut off <9). The proposed score reached a sensitivity of 88% and 76% and specificity of 62% and 75% for cut off over 1 and over 2, respectively. The AUC was 0.81. In conclusion, a score created from screening tests is capable of discriminating MCI from SMC with moderate to good accuracy.
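Sensitivity and specificity at a cut off of the form "score < c", as reported above, can be computed as in the following minimal sketch. It assumes that a score below the cut off counts as a positive screen and that a reference diagnosis is available for every subject. The function name and the sample data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, has_mci, cutoff):
    """Return (sensitivity, specificity) when score < cutoff is a positive screen.

    scores: test scores; has_mci: reference diagnosis (True = MCI).
    Assumes at least one subject with and one without MCI.
    """
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_mci):
        positive = score < cutoff
        if diseased and positive:
            tp += 1  # MCI correctly flagged
        elif diseased:
            fn += 1  # MCI missed by the screen
        elif positive:
            fp += 1  # non-MCI subject wrongly flagged
        else:
            tn += 1  # non-MCI subject correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical MMSE-like scores and reference diagnoses (illustration only).
scores = [27, 28, 29, 30, 26, 30, 29, 28]
has_mci = [True, True, True, False, True, False, False, False]
print(sensitivity_specificity(scores, has_mci, cutoff=29))
```

Repeating the calculation over every candidate cut off gives the points of the receiver operating characteristic curve from which an AUC is derived.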
A physical function test for use in the intensive care unit: validity, responsiveness, and predictive utility of the physical function ICU test (scored).

Science.gov (United States)

Denehy, Linda; de Morton, Natalie A; Skinner, Elizabeth H; Edbrooke, Lara; Haines, Kimberley; Warrillow, Stephen; Berney, Sue

2013-12-01

Several tests have recently been developed to measure changes in patient strength and functional outcomes in the intensive care unit (ICU). The original Physical Function ICU Test (PFIT) demonstrates reliability and sensitivity. The aims of this study were to further develop the original PFIT, to derive an interval score (the PFIT-s), and to test the clinimetric properties of the PFIT-s. A nested cohort study was conducted. One hundred forty-four and 116 participants performed the PFIT at ICU admission and discharge, respectively. Original test components were modified using principal component analysis. Rasch analysis examined the unidimensionality of the PFIT, and an interval score was derived. Correlations tested validity, and multiple regression analyses investigated predictive ability. Responsiveness was assessed using the effect size index (ESI), and the minimal clinically important difference (MCID) was calculated. The shoulder lift component was removed. Unidimensionality of combined admission and discharge PFIT-s scores was confirmed. The PFIT-s displayed moderate convergent validity with the Timed "Up & Go" Test (r=-.60), the Six-Minute Walk Test (r=.41), and the Medical Research Council (MRC) sum score (rho=.49). The ESI of the PFIT-s was 0.82, and the MCID was 1.5 points (interval scale range=0-10). A higher admission PFIT-s score was predictive of: an MRC score of ≥48, increased likelihood of discharge home, reduced likelihood of discharge to inpatient rehabilitation, and reduced acute care hospital length of stay. Scoring of sit-to-stand assistance required is subjective, and cadence cutpoints used may not be generalizable. The PFIT-s is a safe and inexpensive test of physical function with high clinical utility. It is valid, responsive to change, and predictive of key outcomes. It is recommended that the PFIT-s be adopted to test physical function in the ICU.
Evaluation of Factors Affecting Continuous Performance Test Identical Pairs Version Score of Schizophrenic Patients in a Japanese Clinical Sample

Directory of Open Access Journals (Sweden)

Takayoshi Koide

2012-01-01

Aim. Cognitive impairment in schizophrenia strongly relates to social outcome and is a good candidate for endophenotypes. When we accurately measure drug efficacy or effects of genes or variants relevant to schizophrenia on cognitive impairment, clinical factors that can affect scores on cognitive tests, such as age and severity of symptoms, should be considered. To elucidate the effect of clinical factors, we conducted multiple regression analysis using scores of the Continuous Performance Test Identical Pairs Version (CPT-IP), which is often used to measure attention/vigilance in schizophrenia. Methods. We conducted the CPT-IP (4-4 digit) and examined clinical information (sex, age, education years, onset age, duration of illness, chlorpromazine-equivalent dose, and Positive and Negative Symptom Scale (PANSS) scores) in 126 schizophrenia patients in a Japanese population. Multiple regression analysis was used to evaluate the effect of clinical factors. Results. Age, chlorpromazine-equivalent dose, and PANSS-negative symptom score were associated with mean d′ score in patients. These three clinical factors explained about 28% of the variance in mean d′ score. Conclusions. The CPT-IP score in schizophrenia patients is influenced by age, chlorpromazine-equivalent dose, and PANSS negative symptom score.
Evaluation of the Discrepancy between the European Pharmacopoeia Test and an Adopted United States Pharmacopoeia Test Regarding the Weight Uniformity of Scored Tablet Halves: Is Harmonization Required?

Science.gov (United States)

Zaid, Abdel Naser; Ghoush, Abeer Abu; Al-Ramahi, Rowa'; Are'r, Mohammed

2012-01-01

The aim of this study was to evaluate whether there exists any discrepancy between the European Pharmacopoeia (Ph. Eur.) and adopted United States Pharmacopeia (USP) tests concerning the weight uniformity measurements of tablet halves after splitting. The USP method does not contain provisions to evaluate split tablets, so here we adopt their whole tablet weight uniformity method. Twenty-nine different commercial scored tablets (local and imported) were divided. The split units were individually weighed and the relative standard deviation (RSD) for each product was calculated and then evaluated according to both the adopted USP and the Ph. Eur. tests of weight uniformity. Twenty out of the 29 products tested failed the USP test, while 14 of them failed the Ph. Eur. test. Nine products passed both the USP and Ph. Eur. tests. Six products passed the Ph. Eur. test but failed the USP test, with all of these products having an RSD greater than 6%. The correlation coefficient between the weight and content of split halves for three randomly selected products (corotenol 100 mg, corotenol 50 mg, and lorazepam 2.5 mg) was found to be 0.986, 0.998, and 0.72, respectively. A clear difference can be seen between outcomes obtained by the two compendial tablet splitting methods with regard to weight uniformity. Results from the USP test showed that tighter measures are needed to pass the test. Our results argue that the Ph. Eur. should revise the existing weight uniformity test on scored tablets to include the RSD parameter in it. The USP should include this adopted test as a specific test for scored tablet halves, not just whole tablets. Manufacturers in some cases will need to improve the quality of the produced scored tablets in order to pass the USP test, especially those with low therapeutic indices. Finally, harmonization between the pharmacopoeias regarding the weight uniformity testing of split tablets is warranted.
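The relative standard deviation used above to summarise each product can be computed as 100 times the sample standard deviation divided by the mean. The sketch below shows only that calculation. It does not encode the acceptance criteria of either pharmacopoeia, and the function name and half-tablet weights are hypothetical rather than data from the study.

```python
from statistics import mean, stdev


def relative_standard_deviation(weights_mg):
    """Return the RSD in percent: 100 * sample standard deviation / mean."""
    if len(weights_mg) < 2:
        raise ValueError("need at least two weights")
    return 100.0 * stdev(weights_mg) / mean(weights_mg)


# Hypothetical weights (mg) of six tablet halves from one product.
halves = [98.0, 102.0, 95.0, 105.0, 100.0, 100.0]
rsd = relative_standard_deviation(halves)
print(f"RSD = {rsd:.2f}%")
```

The resulting percentage would then be compared with whatever limit the chosen compendial test specifies.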
Allele-sharing models: LOD scores and accurate linkage tests.

Science.gov (United States)

Kong, A; Cox, N J

1997-11-01

Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.
Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test

KAUST Repository

Cai, T.

2012-06-25

In recent years, genome-wide association studies (GWAS) and gene-expression profiling have generated a large number of valuable datasets for assessing how genetic variations are related to disease outcomes. With such datasets, it is often of interest to assess the overall effect of a set of genetic markers, assembled based on biological knowledge. Genetic marker-set analyses have been advocated as more reliable and powerful approaches compared with the traditional marginal approaches (Curtis and others, 2005. Pathways to the analysis of microarray data. TRENDS in Biotechnology 23, 429-435; Efroni and others, 2007. Identification of key processes underlying cancer phenotypes using biologic pathway analysis. PLoS One 2, 425). Procedures for testing the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an Empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models. Biometrics 63, 1079-1088; Liu and others, 2008. Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models. BMC bioinformatics 9, 292-2; Wu and others, 2010. Powerful SNP-set analysis for case-control genome-wide association studies. American Journal of Human Genetics 86, 929) have been proposed as powerful alternatives to the standard Rao score test (Rao, 1948. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Mathematical Proceedings of the Cambridge Philosophical Society, 44, 50-57). The advantages of these EB-based tests are most apparent when the markers are correlated, due to the reduction in the degrees of freedom. In this paper, we propose an adaptive score test which up- or down-weights the contributions from each member of the marker-set based on the Z-scores of
Results of the Intelligence Test for Visually Impaired Children (ITVIC).

Science.gov (United States)

Dekker, R.; And Others

1991-01-01

Statistical analyses of scores on subtests of the Intelligence Test for Visually Impaired Children were done for two groups of children, either with or without usable vision. Results suggest that the battery has differential factorial and predictive validity. (Author/DB)
Comparing the Effects of Elementary Music and Visual Arts Lessons on Standardized Mathematics Test Scores

Science.gov (United States)

King, Molly Elizabeth

2016-01-01

The purpose of this quantitative, causal-comparative study was to compare the effect elementary music and visual arts lessons had on third through sixth grade standardized mathematics test scores. Inferential statistics were used to compare the differences between test scores of students who took in-school, elementary, music instruction during the…
Specific algorithm method of scoring the Clock Drawing Test applied in cognitively normal elderly

Directory of Open Access Journals (Sweden)

Liana Chaves Mendes-Santos

The Clock Drawing Test (CDT) is an inexpensive, fast and easily administered measure of cognitive function, especially in the elderly. This instrument is a popular clinical tool widely used in screening for cognitive disorders and dementia. The CDT can be applied in different ways and scoring procedures also vary. OBJECTIVE: The aims of this study were to analyze the performance of elderly on the CDT and evaluate inter-rater reliability of the CDT scored by using a specific algorithm method adapted from Sunderland et al. (1989). METHODS: We analyzed the CDT of 100 cognitively normal elderly aged 60 years or older. The CDT ("free-drawn") and Mini-Mental State Examination (MMSE) were administered to all participants. Six independent examiners scored the CDT of 30 participants to evaluate inter-rater reliability. RESULTS AND CONCLUSION: A score of 5 on the proposed algorithm ("Numbers in reverse order or concentrated"), equivalent to 5 points on the original Sunderland scale, was the most frequent (53.5%). The CDT specific algorithm method used had high inter-rater reliability (p<0.01), and mean score ranged from 5.06 to 5.96. The high frequency of an overall score of 5 points may suggest the need to create more nuanced evaluation criteria, which are sensitive to differences in levels of impairment in visuoconstructive and executive abilities during aging.
COMPARISON BETWEEN WOOD DRYING DEFECT SCORES: SPECIMEN TESTING X ANALYSIS OF KILN-DRIED BOARDS

Directory of Open Access Journals (Sweden)

Djeison Cesar Batista

2015-04-01

It is important to develop drying technologies for Eucalyptus grandis lumber, which is one of the most planted species of this genus in Brazil and plays an important role as raw material for the wood industry. The general aim of this work was to assess the conventional kiln drying of juvenile wood of three clones of Eucalyptus grandis. The specific aims were to compare the behavior between: (i) drying defects indicated by tests with wood specimens and conventional kiln-dried boards; and (ii) physical properties and the drying quality. Five 11-year-old trees of each clone were felled, and only flatsawn boards of the first log were used. Basic density and total shrinkage were determined, and the drying test with wood specimens at 100 °C was carried out. Kiln drying of boards was performed, and initial and final moisture content, moisture gradient in thickness, drying stresses and drying defects were assessed. The defect scoring method was used to verify the behavior between the defects detected by specimen testing and the defects detected in kiln-dried boards. As main results, the drying schedule was too severe for the wood, resulting in a high level of boards with defects. According to the defect scoring method, the defects in the drying test with specimens did not correspond to the defects of kiln-dried boards.
Test and Score Data Summary for TOEFL[R] Internet-Based and Paper-Based Tests. January 2008-December 2008 Test Data

Science.gov (United States)

Educational Testing Service, 2008

2008-01-01

The Test of English as a Foreign Language[TM], better known as TOEFL[R], is designed to measure the English-language proficiency of people whose native language is not English. TOEFL scores are accepted by more than 6,000 colleges, universities, and licensing agencies in 130 countries. The test is also used by governments, and scholarship and…
Use of Standardized Test Scores to Predict Success in a Computer Applications Course

Science.gov (United States)

Harris, Robert V.; King, Stephanie B.

2016-01-01

The purpose of this study was to see if a relationship existed between American College Testing (ACT) scores (i.e., English, reading, mathematics, science reasoning, and composite) and student success in a computer applications course at a Mississippi community college. The study showed that while the ACT scores were excellent predictors of…
Evaluating the RELM Test Results

Directory of Open Access Journals (Sweden)

Michael K. Sachs

2012-01-01

We consider implications of the Regional Earthquake Likelihood Models (RELM) test results with regard to earthquake forecasting. Prospective forecasts were solicited for M≥4.95 earthquakes in California during the period 2006–2010. During this period 31 earthquakes occurred in the test region with M≥4.95. We consider five forecasts that were submitted for the test. We compare the forecasts utilizing forecast verification methodology developed in the atmospheric sciences, specifically for tornadoes. We utilize a “skill score” based on the forecast scores λfi of occurrence of the test earthquakes. A perfect forecast would have λfi=1, and a random (no skill) forecast would have λfi=2.86×10^-3. The best forecasts (largest value of λfi) for the 31 earthquakes had values of λfi=1.24×10^-1 to λfi=5.49×10^-3. The best mean forecast for all earthquakes was λ̅f=2.84×10^-2. The best forecasts are about an order of magnitude better than random forecasts. We discuss the earthquakes, the forecasts, and alternative methods of evaluation of the performance of RELM forecasts. We also discuss the relative merits of alarm-based versus probability-based forecasts.
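The "order of magnitude" statement in the abstract above can be checked directly from the two scores it quotes. The snippet is only an arithmetic sketch; the variable names are illustrative and nothing here reproduces the authors' skill-score computation.

```python
import math

# Values quoted in the abstract.
random_score = 2.86e-3     # random (no skill) forecast
best_mean_score = 2.84e-2  # best mean forecast over all 31 earthquakes

# How many times better the best mean forecast is than a random one,
# and the same comparison expressed in orders of magnitude.
ratio = best_mean_score / random_score
orders_of_magnitude = math.log10(ratio)
```

The ratio comes out just under 10, which matches the abstract's wording.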
A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

Science.gov (United States)

Lee, Guemin; Park, In-Yong

2012-01-01

Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…

Racial Differences in Mathematics Test Scores for Advanced Mathematics Students

Science.gov (United States)

Minor, Elizabeth Covay

2016-01-01

Research on achievement gaps has found that achievement gaps are larger for students who take advanced mathematics courses compared to students who do not. Focusing on the advanced mathematics student achievement gap, this study found that African American advanced mathematics students have significantly lower test scores and are less likely to be…
Errors in ADAS-cog administration and scoring may undermine clinical trials results.

Science.gov (United States)

Schafer, K; De Santi, S; Schneider, L S

2011-06-01

The Alzheimer's Disease Assessment Scale - cognitive subscale (ADAS-cog) is the most widely used cognitive outcome measure in AD trials. Although errors in administration and scoring have been suggested as factors masking accurate estimates and potential effects of treatments, there have been few formal examinations of errors with the ADAS-cog. We provided ADAS-cog administration training using standard methods to raters who were designated as experienced, potential raters by sponsors or contract research organizations for two clinical trials. Training included 1 hour sessions on test administration, scoring, question periods, and required that raters individually view and score a model ADAS-cog administration. Raters' scores were compared to the criterion scores established for the model administration. A total of 108 errors were made by 80.6% of the 72 raters; 37.5% made 1 error, 25.0% made 2 errors and 18.0% made 3 or more. Errors were made in all ADAS-cog subsections. The most common were in word finding difficulty (67% of the raters), word recognition (22%), and orientation (22%). For the raters who made 1, 2, or ≥ 3 errors the ADAS-cog score was 17.5 (95% CI, 17.3 - 17.8), 17.8 (17.0 - 18.5), and 18.8 (17.6 - 20.0), respectively, compared to the criterion score of 18.3. ADAS-cog means differed significantly and the variances were more than twice as large between those who made errors on word finding and those who did not, 17.6 (SD=1.4) vs. 18.8 (SD=0.9), respectively (χ² = 37.2, P [...] ADAS-cog scores and clinical trials outcomes. These errors may undermine detection of medication effects by contributing both to a biased point estimate and increased variance of the outcome.
TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

Science.gov (United States)

Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

2012-01-01

Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…
Association between the gait pattern characteristics of older people and their two-step test scores.

Science.gov (United States)

Kobayashi, Yoshiyuki; Ogata, Toru

2018-04-27

The Two-Step test is one of three official tests authorized by the Japanese Orthopedic Association to evaluate the risk of locomotive syndrome (a condition of reduced mobility caused by an impairment of the locomotive organs). It has been reported that the Two-Step test score has a good correlation with one's walking ability; however, its association with the gait pattern of older people during normal walking is still unknown. Therefore, this study aims to clarify the associations between the gait patterns of older people observed during normal walking and their Two-Step test scores. We analyzed the whole waveforms obtained from the lower-extremity joint angles and joint moments of 26 older people in various stages of locomotive syndrome using principal component analysis (PCA). The PCA was conducted using a 260 × 2424 input matrix constructed from the participants' time-normalized pelvic and right-lower-limb-joint angles along three axes (ten trials of 26 participants, 101 time points, 4 angles, 3 axes, and 2 variable types per trial). The Pearson product-moment correlation coefficient between the scores of the principal component vectors (PCVs) and the scores of the Two-Step test revealed that only one PCV (PCV 2) among the 61 obtained relevant PCVs is significantly related to the score of the Two-Step test. We therefore concluded that the joint angles and joint moments related to PCV 2 (ankle plantar-flexion, ankle plantar-flexor moments during the late stance phase, ranges of motion and moments on the hip, knee, and ankle joints in the sagittal plane during the entire stance phase) are the motions associated with the Two-Step test.
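The 260 × 2424 shape of the PCA input matrix follows from the counts listed in the abstract above. The short check below spells out that arithmetic; the variable names are illustrative and the layout (one row per trial, one column per waveform sample) is an assumption read off the parenthetical in the abstract.

```python
# Counts taken from the abstract.
trials_per_participant = 10
participants = 26
time_points = 101   # time-normalized samples per waveform
angles = 4          # pelvis plus three right-lower-limb joints
axes = 3
variable_types = 2  # joint angles and joint moments

# Assumed layout: one row per trial, one column per waveform sample.
rows = trials_per_participant * participants
cols = time_points * angles * axes * variable_types
```

Both products reproduce the dimensions reported in the abstract.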
Validity and reliability of Abbreviated Mental Test Score (AMTS) among older Iranian.

Science.gov (United States)

Foroughan, Mahshid; Wahlund, Lars-Olof; Jafari, Zahra; Rahgozar, Mehdi; Farahani, Ida G; Rashedi, Vahid

2017-11-01

Cognitive impairment is common among older people and is associated with increased morbidity and mortality. The main aim of this study was to evaluate the validity of the Persian version of the Abbreviated Mental Test Score (AMTS) as a screening tool for dementia. Data were obtained from a cross-sectional study. One hundred and one older adults who were members of the Iranian Alzheimer Association and 101 of their siblings were entered into this study by convenience sampling. The Diagnostic and Statistical Manual of Mental Disorders, 4th edition, criteria for diagnosing dementia and the Mini-Mental State Examination were used as the study tools. The gathered data were analyzed by the Mann-Whitney U-test, the Kruskal-Wallis test, Spearman's rank correlation coefficient, and the receiver-operating characteristic. The AMTS could successfully differentiate the dementia group from the non-dementia group. Scores were significantly correlated with Diagnostic and Statistical Manual of Mental Disorders diagnosis for dementia and Mini-Mental State Examination scores (P < 0.001). Educational level (P < 0.001) and male sex (P = 0.015) were positively associated with AMTS, whereas (P < 0.001) was negatively associated with AMTS. Total Cronbach's α coefficient was 0.90. The scores 6 and 7 showed the optimum balance between sensitivity (99% and 94%, respectively) and specificity (85% and 86%, respectively). The Persian version of the AMTS is a valid cognitive assessment tool for older Iranian adults and can be used for dementia screening in Iran. © 2017 Japanese Psychogeriatric Society.
A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

Science.gov (United States)

Bersabé, Rosa; Rivas, Teresa

2010-05-01

The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
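The abstract above does not reproduce the authors' equation, so the following is only a sketch of one way such a cut-off can be obtained. It assumes the common baseline-category parameterization, log(P_j / P_ref) = a_j + b_j·x; categories j and j+1 are then equally likely where the two linear predictors are equal, which gives x = (a_j − a_{j+1}) / (b_{j+1} − b_j). The coefficients are hypothetical and are not taken from the EAT-26 example.

```python
import math


def mlr_cutoff(a_j, b_j, a_next, b_next):
    """Score x at which categories j and j+1 are equally likely.

    Assumes baseline-category logits log(P_j / P_ref) = a_j + b_j * x,
    and solves a_j + b_j * x = a_next + b_next * x for x.
    """
    if b_next == b_j:
        raise ValueError("equal slopes: the two categories never cross")
    return (a_j - a_next) / (b_next - b_j)


def category_probabilities(x, params):
    """Softmax over the linear predictors; params is a list of (a, b)."""
    logits = [a + b * x for a, b in params]
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients for three ordinal categories; the first
# category is the reference, so its intercept and slope are zero.
params = [(0.0, 0.0), (-4.0, 0.25), (-10.0, 0.5)]
cutoff_low = mlr_cutoff(*params[0], *params[1])
cutoff_high = mlr_cutoff(*params[1], *params[2])
probs_at_low = category_probabilities(cutoff_low, params)
```

With these made-up coefficients the two cut-offs fall at scores of 16 and 24, and the first two category probabilities coincide at the lower cut-off.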
School accountability and the black-white test score gap.

Science.gov (United States)

Gaddis, S Michael; Lauen, Douglas Lee

2014-03-01

Since at least the 1960s, researchers have closely examined the respective roles of families, neighborhoods, and schools in producing the black-white achievement gap. Although many researchers minimize the ability of schools to eliminate achievement gaps, the No Child Left Behind Act (NCLB) increased pressure on schools to do so by 2014. In this study, we examine the effects of NCLB's subgroup-specific accountability pressure on changes in black-white math and reading test score gaps using a school-level panel dataset on all North Carolina public elementary and middle schools between 2001 and 2009. Using difference-in-difference models with school fixed effects, we find that accountability pressure reduces black-white achievement gaps by raising mean black achievement without harming mean white achievement. We find no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black-white math achievement gap. Thus, our findings suggest that school-based interventions have the potential to close test score gaps, but differences in school composition and resources play a significant role in the ability of schools to reduce racial inequality. Copyright © 2013 Elsevier Inc. All rights reserved.
Source Country Differences in Test Score Gaps: Evidence from Denmark

Science.gov (United States)

Rangvid, Beatrice Schindler

2010-01-01

We combine data from three studies for Denmark in the PISA 2000 framework to investigate differences in the native-immigrant test score gap by country of origin. In addition to the controls available from PISA data sources, we use student-level data on home background and individual migration histories linked from administrative registers. We find…
The Effects of Group Members' Personalities on a Test Taker's L2 Group Oral Discussion Test Scores

Science.gov (United States)

Ockey, Gary J.

2009-01-01

The second language group oral is a test of second language speaking proficiency, in which a group of three or more English language learners discuss an assigned topic without interaction with interlocutors. Concerns expressed about the extent to which test takers' personal characteristics affect the scores of others in the group have limited its…
Standardised test protocol (Constant Score) for evaluation of functionality in patients with shoulder disorders

DEFF Research Database (Denmark)

Ban, Ilija; Troelsen, Anders; Christiansen, David Høyrup

2013-01-01

INTRODUCTION: The Constant Score (CS), developed as a scoring system to evaluate overall functionality of patients with shoulder disorders, is widely used but has been criticised for relying on an imprecise terminology and for lack of a standardised methodology. A modified guideline was therefore...... differences. One of the authors of the modified CS approved both the English and the Danish test protocol. CONCLUSION: A simple test protocol of the modified CS was developed in both English and Danish. With precise terminology and definitions, the test protocol is the first of its kind. We suggest its use...
Opportunity to learn: Investigating possible predictors for pre-course Test Of Astronomy STandards TOAST scores

Science.gov (United States)

Berryhill, Katie J.

As astronomy education researchers become more interested in experimentally testing innovative teaching strategies to enhance learning in introductory astronomy survey courses ("ASTRO 101"), scholars are placing increased attention toward better understanding factors impacting student gain scores on the widely used Test Of Astronomy STandards (TOAST). Usually used in a pre-test and post-test research design, one might naturally assume that the pre-course differences observed between high- and low-scoring college students might be due in large part to their pre-existing motivation, interest, experience in science, and attitudes about astronomy. To explore this notion, 11 non-science majoring undergraduates taking ASTRO 101 at west coast community colleges were interviewed in the first few weeks of the course to better understand students' pre-existing affect toward learning astronomy with an eye toward predicting student success. In answering this question, we hope to contribute to our understanding of the incoming knowledge of students taking undergraduate introductory astronomy classes, but also gain insight into how faculty can best meet those students' needs and assist them in achieving success. Perhaps surprisingly, there was only weak correlation between students' motivation toward learning astronomy and their pre-test scores. Instead, the most fruitful predictor of TOAST pre-test scores was the quantity of pre-existing, informal, self-directed astronomy learning experiences.
Construction of an Exome-Wide Risk Score for Schizophrenia Based on a Weighted Burden Test.

Science.gov (United States)

Curtis, David

2018-01-01

Polygenic risk scores obtained as a weighted sum of associated variants can be used to explore association in additional data sets and to assign risk scores to individuals. The methods used to derive polygenic risk scores from common SNPs are not suitable for variants detected in whole exome sequencing studies. Rare variants, which may have major effects, are seen too infrequently to judge whether they are associated and may not be shared between training and test subjects. A method is proposed whereby variants are weighted according to their frequency, their annotations and the genes they affect. A weighted sum across all variants provides an individual risk score. Scores constructed in this way are used in a weighted burden test and are shown to be significantly different between schizophrenia cases and controls using a five-way cross-validation procedure. This approach represents a first attempt to summarise exome sequence variation into a summary risk score, which could be combined with risk scores from common variants and from environmental factors. It is hoped that the method could be developed further. © 2017 John Wiley & Sons Ltd/University College London.
Poisson Approximation-Based Score Test for Detecting Association of Rare Variants.

Science.gov (United States)

Fang, Hongyan; Zhang, Hong; Yang, Yaning

2016-07-01

Genome-wide association study (GWAS) has achieved great success in identifying genetic variants, but the nature of GWAS has determined its inherent limitations. Under the common disease rare variants (CDRV) hypothesis, the traditional association analysis methods commonly used in GWAS for common variants do not have enough power for detecting rare variants with a limited sample size. As a solution to this problem, pooling rare variants by their functions provides an efficient way for identifying susceptible genes. Rare variants typically have low frequencies of minor alleles, and the distribution of the total number of minor alleles of the rare variants can be approximated by a Poisson distribution. Based on this fact, we propose a new test method, the Poisson Approximation-based Score Test (PAST), for association analysis of rare variants. Two testing methods, namely, ePAST and mPAST, are proposed based on different strategies of pooling rare variants. Simulation results and application to the CRESCENDO cohort data show that our methods are more powerful than the existing methods. © 2016 John Wiley & Sons Ltd/University College London.
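The Poisson approximation mentioned in the abstract above can be illustrated numerically. The sketch below is not the PAST statistic itself: it only compares the exact distribution of a sum of independent rare-allele indicators with a Poisson distribution of the same mean. The minor allele frequencies are hypothetical, and independence between allele copies and between variants is an assumption.

```python
import math


def poisson_binomial_pmf(probs):
    """Exact pmf of a sum of independent Bernoulli(p_i) indicators."""
    pmf = [1.0]
    for p in probs:
        updated = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            updated[k] += mass * (1.0 - p)  # this allele copy is major
            updated[k + 1] += mass * p      # this allele copy is minor
        pmf = updated
    return pmf


def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Hypothetical minor allele frequencies for five rare variants.
mafs = [0.001, 0.002, 0.005, 0.003, 0.004]
# Assumption: two independent allele copies per variant in one individual.
probs = [p for p in mafs for _ in range(2)]

exact = poisson_binomial_pmf(probs)
lam = sum(probs)
approx = [poisson_pmf(k, lam) for k in range(len(exact))]
max_abs_error = max(abs(e - a) for e, a in zip(exact, approx))
```

For frequencies this small the two distributions agree closely: the largest pointwise gap stays below the sum of the squared probabilities, as Le Cam's inequality guarantees.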
Spinal appearance questionnaire: factor analysis, scoring, reliability, and validity testing.

Science.gov (United States)

Carreon, Leah Y; Sanders, James O; Polly, David W; Sucato, Daniel J; Parent, Stefan; Roy-Beaudry, Marjolaine; Hopkins, Jeffrey; McClung, Anna; Bratcher, Kelly R; Diamond, Beverly E

2011-08-15

Cross sectional. This study presents the factor analysis of the Spinal Appearance Questionnaire (SAQ) and its psychometric properties. Although the SAQ has been administered to a large sample of patients with adolescent idiopathic scoliosis (AIS) treated surgically, its psychometric properties have not been fully evaluated. This study presents the factor analysis and scoring of the SAQ and evaluates its psychometric properties. The SAQ and the Scoliosis Research Society-22 (SRS-22) were administered to AIS patients who were being observed, braced or scheduled for surgery. Standard demographic data and radiographic measures including Lenke type and curve magnitude were also collected. Of the 1802 patients, 83% were female; with a mean age of 14.8 years and mean initial Cobb angle of 55.8° (range, 0°-123°). From the 32 items of the SAQ, 15 loaded on two factors with consistent and significant correlations across all Lenke types. There is an Appearance (items 1-10) and an Expectations factor (items 12-15). Responses are summed giving a range of 5 to 50 for the Appearance domain and 5 to 20 for the Expectations domain. The Cronbach's α was 0.88 for both domains and Total score with a test-retest reliability of 0.81 for Appearance and 0.91 for Expectations. Correlations with major curve magnitude were higher for the SAQ Appearance and SAQ Total scores compared to correlations between the SRS Appearance and SRS Total scores. The SAQ and SRS-22 Scores were statistically significantly different in patients who were scheduled for surgery compared to those who were observed or braced. The SAQ is a valid measure of self-image in patients with AIS with greater correlation to curve magnitude than SRS Appearance and Total score. It also discriminates between patients who require surgery from those who do not.
Associations between cadmium exposure and neurocognitive test scores in a cross-sectional study of US adults.

Science.gov (United States)

Ciesielski, Timothy; Bellinger, David C; Schwartz, Joel; Hauser, Russ; Wright, Robert O

2013-02-05

Low-level environmental cadmium exposure and neurotoxicity has not been well studied in adults. Our goal was to evaluate associations between neurocognitive exam scores and a biomarker of cumulative cadmium exposure among adults in the Third National Health and Nutrition Examination Survey (NHANES III). NHANES III is a nationally representative cross-sectional survey of the U.S. population conducted between 1988 and 1994. We analyzed data from a subset of participants, age 20-59, who participated in a computer-based neurocognitive evaluation. There were four outcome measures: the Simple Reaction Time Test (SRTT: visual motor speed), the Symbol Digit Substitution Test (SDST: attention/perception), the Serial Digit Learning Test (SDLT) trials-to-criterion, and the SDLT total-error-score (SDLT-tests: learning recall/short-term memory). We fit multivariable-adjusted models to estimate associations between urinary cadmium concentrations and test scores. 5662 participants underwent neurocognitive screening, and 5572 (98%) of these had a urinary cadmium level available. Prior to multivariable-adjustment, higher urinary cadmium concentration was associated with worse performance in each of the 4 outcomes. After multivariable-adjustment most of these relationships were not significant, and age was the most influential variable in reducing the association magnitudes. However among never-smokers with no known occupational cadmium exposure the relationship between urinary cadmium and SDST score (attention/perception) was significant: a 1 μg/L increase in urinary cadmium corresponded to a 1.93% (95%CI: 0.05, 3.81) decrement in performance. These results suggest that higher cumulative cadmium exposure in adults may be related to subtly decreased performance in tasks requiring attention and perception, particularly among those adults whose cadmium exposure is primarily through diet (no smoking or work based cadmium exposure). This association was observed among exposure levels
Your move: The effect of chess on mathematics test scores.

Science.gov (United States)

Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla

2017-01-01

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
Are WISC IQ scores in children with mathematical learning disabilities underestimated? The influence of a specialized intervention on test performance.

Science.gov (United States)

Lambert, Katharina; Spinath, Birgit

2018-01-01

Intelligence measures play a pivotal role in the diagnosis of mathematical learning disabilities (MLD). Probably as a result of math-related material in IQ tests, children with MLD often display reduced IQ scores. However, it remains unclear whether the effects of math remediation extend to IQ scores. The present study investigated the impact of a special remediation program compared to a control group receiving private tutoring (PT) on the WISC IQ scores of children with MLD. We included N=45 MLD children (7-12 years) in a study with a pre- and post-test control group design. Children received remediation for two years on average. The analyses revealed significantly greater improvements in the experimental group on the Full-Scale IQ, and the Verbal Comprehension, Perceptual Reasoning, and Working Memory indices, but not Processing Speed, compared to the PT group. Children in the experimental group showed an average WISC IQ gain of more than ten points. Results indicate that the WISC IQ scores of MLD children might be underestimated and that an effective math intervention can improve WISC IQ test performance. Taking limitations into account, we discuss the use of IQ measures more generally for defining MLD in research and practice. Copyright © 2017 Elsevier Ltd. All rights reserved.
ACER Mathematics Profile Series: Number Test. (Test Booklet, Answer and Record Sheet, Score Key, and Teachers Handbook).

Science.gov (United States)

Cornish, Greg; Wines, Robin

The Number Test of the ACER Mathematics Profile Series contains 30 items for each of three suggested grade levels: 7-8, 8-9, and 9-10. Raw scores on all tests in the ACER Mathematics Profile Series (Number, Operations, Space and Measurement) are converted to a common scale called MAPS, a major feature of the Series. Based on the Rasch Model,…
Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS.

Science.gov (United States)

Waldfogel, Jane; Zhai, Fuhua

2008-02-01

This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries -- Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S. -- using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children.
[Propensity score matching in SPSS].

Science.gov (United States)

Huang, Fuqiang; DU, Chunlin; Sun, Menghui; Ning, Bing; Luo, Ying; An, Shengli

2015-11-01

To implement propensity score matching with the PS Matching module of SPSS and interpret the analysis results. The R software and a plug-in that could link with the corresponding versions of SPSS and the propensity score matching package were installed. A PS matching module was added in the SPSS interface, and its use was demonstrated with test data. Score estimation and nearest neighbor matching were achieved with the PS matching module, and the results of qualitative and quantitative statistical description and evaluation of the matching were presented in graphical form. Propensity score matching can be accomplished conveniently using SPSS software.
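The nearest neighbor matching step mentioned in this record can be illustrated with a minimal Python sketch. It is not the SPSS PS Matching module or its R backend; it assumes the propensity scores have already been estimated, and the function name, the greedy one-to-one strategy without replacement, and the `caliper` parameter are all illustrative assumptions.

```python
def nearest_neighbor_match(treated, control, caliper=None):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: lists of already-estimated propensity scores.
    caliper: optional maximum allowed score distance for a match.
    Returns a list of (treated_index, control_index) pairs.
    """
    available = set(range(len(control)))  # controls not yet used
    pairs = []
    for i, score in enumerate(treated):
        if not available:
            break
        # Closest remaining control; ties broken by the lower index.
        j = min(available, key=lambda k: (abs(control[k] - score), k))
        if caliper is None or abs(control[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical propensity scores, for illustration only.
treated_scores = [0.80, 0.52, 0.30]
control_scores = [0.25, 0.50, 0.60, 0.90]
pairs = nearest_neighbor_match(treated_scores, control_scores, caliper=0.15)
print(pairs)
```

Greedy matching depends on the order in which treated units are processed; dedicated packages typically offer alternatives such as optimal matching or matching with replacement.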

Lack of Correlation between Severity of Clinical Symptoms, Skin Test Reactivity, and Radioallergosorbent Test Results in Venom-Allergic Patients

Directory of Open Access Journals (Sweden)

Warrington RJ

2006-06-01

Purpose: To retrospectively examine the relation between skin test reactivity, venom-specific immunoglobulin E (IgE) antibody levels, and severity of clinical reaction in patients with insect venom allergy. Method: Thirty-six patients (including 15 females) who presented with a history of allergic reactions to insect stings were assessed. The mean age at the time of the reactions was 33.4 ± 15.1 years (range, 4-76 years), and patients were evaluated 43.6 ± 90 months (range, 1-300 months) after the reactions. Clinical reactions were scored according to severity, from 1 (cutaneous manifestations only) to 3 (anaphylaxis with shock). These scores were compared to scores for skin test reactivity (0 to 5, indicating the log increase in sensitivity from 1 μg/mL to 0.0001 μg/mL) and radioallergosorbent test (RAST) levels (0 to 4, indicating venom-specific IgE levels, from undetectable to >17.5 kilounits of antigen per litre [kUA/L]). Results: No correlation was found between skin test reactivity (Spearman's coefficient = 0.15, p = .377) or RAST level (Spearman's coefficient = 0.32, p = .061) and the severity of reaction. Skin test and RAST scores both differed significantly from clinical severity (p p = .042). There was no correlation between skin test reactivity and time since reaction (Spearman's coefficient = 0.18, p = .294) nor between RAST and time since reaction (r = 0.1353, p = .438). Elimination of patients tested more than 12 months after their reaction still produced no correlation between skin test reactivity (p = .681) or RAST score (p = .183) and the severity of the clinical reaction. Conclusion: In venom-allergic patients (in contrast to reported findings in cases of inhalant IgE-mediated allergy), there appears to be no significant correlation between the degree of skin test reactivity or levels of venom-specific IgE (determined by RAST) and the severity of the clinical reaction.
Scoring CT/HRCT findings among asbestos-exposed workers: effects of patient's age, body mass index and common laboratory test results

Energy Technology Data Exchange (ETDEWEB)

Vehmas, T.; Huuskonen, M.S. [Finnish Institute of Occupational Health, Department of Radiology, Helsinki (Finland); Kivisaari, L. [Helsinki University Central Hospital, Department of Radiology, Helsinki (Finland); Jaakkola, M.S. [Finnish Institute of Occupational Health, Department of Radiology, Helsinki (Finland); University of Birmingham, Institute of Occupational and Environmental Medicine, Birmingham (United Kingdom)

2005-02-01

We studied the effects of age, body mass index (BMI) and some common laboratory test results on several pulmonary CT/HRCT signs. Five hundred twenty-eight construction workers (age 38-80, mean 63 years) were imaged with spiral and high resolution CT. Images were scored by three radiologists for solitary pulmonary nodules, signs indicative of fibrosis and emphysema, ground glass opacities, bronchial wall thickness and bronchiectasis. Multivariate statistical analyses were adjusted for smoking and asbestos exposure. Increasing age, blood haemoglobin value and erythrocyte sedimentation rate correlated positively with several HRCT signs. Increasing BMI was associated with a decrease in several signs, especially parenchymal bands, honeycombing, all kinds of emphysema and bronchiectasis. The latter finding might be due to the suboptimal image quality in obese individuals, which may cause suspicious findings to be overlooked. Background data, including patient's age and body constitution, should be considered when CT/HRCT images are interpreted. (orig.)
Linear-rank testing of a non-binary, responder-analysis, efficacy score to evaluate pharmacotherapies for substance use disorders.

Science.gov (United States)

Holmes, Tyson H; Li, Shou-Hua; McCann, David J

2016-11-23

The design of pharmacological trials for management of substance use disorders is shifting toward outcomes of successful individual-level behavior (abstinence or no heavy use). While binary success/failure analyses are common, McCann and Li (CNS Neurosci Ther 2012; 18: 414-418) introduced "number of beyond-threshold weeks of success" (NOBWOS) scores to avoid dichotomized outcomes. NOBWOS scoring employs an efficacy "hurdle" with values reflecting duration of success. Here, we evaluate NOBWOS scores rigorously. Formal analysis of mathematical structure of NOBWOS scores is followed by simulation studies spanning diverse conditions to assess operating characteristics of five linear-rank tests on NOBWOS scores. Simulations include assessment of Fisher's exact test applied to hurdle component. On average, statistical power was approximately equal for five linear-rank tests. Under none of conditions examined did Fisher's exact test exhibit greater statistical power than any of the linear-rank tests. These linear-rank tests provide good Type I and Type II error control for comparing distributions of NOBWOS scores between groups (e.g. active vs. placebo). All methods were applied to re-analyses of data from four clinical trials of differing lengths and substances of abuse. These linear-rank tests agreed across all trials in rejecting (or not) their null (equality of distributions) at ≤ 0.05. © The Author(s) 2016.
Speech-discrimination scores modeled as a binomial variable.

Science.gov (United States)

Thornton, A R; Raffin, M J

1978-09-01

Many studies have reported variability data for tests of speech discrimination, and the disparate results of these studies have not been given a simple explanation. Arguments over the relative merits of 25- vs 50-word tests have ignored the basic mathematical properties inherent in the use of percentage scores. The present study models performance on clinical tests of speech discrimination as a binomial variable. A binomial model was developed, and some of its characteristics were tested against data from 4120 scores obtained on the CID Auditory Test W-22. A table for determining significant deviations between scores was generated and compared to observed differences in half-list scores for the W-22 tests. Good agreement was found between predicted and observed values. Implications of the binomial characteristics of speech-discrimination scores are discussed.
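The 25- versus 50-word argument in this record follows from the variance of a binomial proportion. The sketch below is a textbook illustration, not the authors' published table or exact procedure; the function name and the example true score of 80% are assumptions chosen for illustration.

```python
import math


def score_sd_percent(true_proportion, n_words):
    """Standard deviation, in percentage points, of a percent-correct
    score when each of n_words items is an independent Bernoulli trial
    with success probability true_proportion (simple binomial model)."""
    variance = true_proportion * (1.0 - true_proportion) / n_words
    return 100.0 * math.sqrt(variance)


# Hypothetical listener whose true discrimination score is 80%.
sd_25 = score_sd_percent(0.80, 25)  # 25-word half list
sd_50 = score_sd_percent(0.80, 50)  # 50-word full list
print(round(sd_25, 2), round(sd_50, 2))
```

Under this model, halving the list length inflates the standard deviation by a factor of the square root of 2, and the variability is largest for true scores near 50%, so a fixed difference in percentage points does not carry the same weight at every score level.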
An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

Science.gov (United States)

Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

2013-01-01

Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…
Association testing for next-generation sequencing data using score statistics

DEFF Research Database (Denmark)

Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders

2012-01-01

...of genotype calls into account have been proposed; most require numerical optimization which for large-scale data is not always computationally feasible. We show that using a score statistic for the joint likelihood of observed phenotypes and observed sequencing data provides an attractive approach to association testing for next-generation sequencing data. The joint model accounts for the genotype classification uncertainty via the posterior probabilities of the genotypes given the observed sequencing data, which gives the approach higher power than methods based on called genotypes. This strategy remains computationally feasible due to the use of score statistics. As part of the joint likelihood, we model the distribution of the phenotypes using a generalized linear model framework, which works for both quantitative and discrete phenotypes. Thus, the method presented here is applicable to case-control studies...
Effects of Analytical and Holistic Scoring Patterns on Scorer Reliability in Biology Essay Tests

Science.gov (United States)

Ebuoh, Casmir N.

2018-01-01

Literature revealed that the patterns/methods of scoring essay tests had been criticized for not being reliable and this unreliability is more likely to be more in internal examinations than in the external examinations. The purpose of this study is to find out the effects of analytical and holistic scoring patterns on scorer reliability in…
Linkage analysis in nuclear families. 2: Relationship between affected sib-pair tests and lod score analysis.

Science.gov (United States)

Knapp, M; Seuchter, S A; Baur, M P

1994-01-01

It is believed that the main advantage of affected sib-pair tests is that their application requires no information about the underlying genetic mechanism of the disease. However, here it is proved that the mean test, which can be considered the most prominent of the affected sib-pair tests, is equivalent to lod score analysis for an assumed recessive mode of inheritance, irrespective of the true mode of the disease. Further relationships of certain sib-pair tests and lod score analysis under specific assumed genetic modes are investigated.
Your move: The effect of chess on mathematics test scores

DEFF Research Database (Denmark)

Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla Trille

2017-01-01

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1–3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
Your move: The effect of chess on mathematics test scores.

Directory of Open Access Journals (Sweden)

Michael Rosholm

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
Association of Health Sciences Reasoning Test scores with academic and experiential performance.

Science.gov (United States)

Cox, Wendy C; McLaughlin, Jacqueline E

2014-05-15

To assess the association of scores on the Health Sciences Reasoning Test (HSRT) with academic and experiential performance in a doctor of pharmacy (PharmD) curriculum. The HSRT was administered to 329 first-year (P1) PharmD students. Performance on the HSRT and its subscales was compared with academic performance in 29 courses throughout the curriculum and with performance in advanced pharmacy practice experiences (APPEs). Significant positive correlations were found between course grades in 8 courses and HSRT overall scores. All significant correlations were accounted for by pharmaceutical care laboratory courses, therapeutics courses, and a law and ethics course. There was a lack of moderate to strong correlation between HSRT scores and academic and experiential performance. The usefulness of the HSRT as a tool for predicting student success may be limited.
Do Standardized Tests Penalize Deep-Thinking, Creative, or Conscientious Students?: Some Personality Correlates of Graduate Record Examinations Test Scores

Science.gov (United States)

Powers, Donald E.; Kaufman, James C.

2004-01-01

The objective of the study reported here was to explore the relationship of Graduate Record Examinations (GRE) General Test scores to selected personality traits--conscientiousness, rationality, ingenuity, quickness, creativity, and depth. A sample of 342 GRE test takers completed short personality inventory scales for each trait. Analyses…
Intelligence Test Scores and Birth Order among Young Norwegian Men (Conscripts) Analyzed within and between Families

Science.gov (United States)

Bjerkedal, Tor; Kristensen, Petter; Skjeret, Geir A.; Brevik, John I.

2007-01-01

The present paper reports the results of a within and between family analysis of the relation between birth order and intelligence. The material comprises more than a quarter of a million test scores for intellectual performance of Norwegian male conscripts recorded during 1984-2004. Conscripts, mostly 18-19 years of age, were born to women for…
Interpreting force concept inventory scores: Normalized gain and SAT scores

Directory of Open Access Journals (Sweden)

Jeffrey J. Steinert

2007-05-01

Preinstruction SAT scores and normalized gains (G) on the force concept inventory (FCI) were examined for individual students in interactive engagement (IE) courses in introductory mechanics at one high school (N=335) and one university (N=292), and strong, positive correlations were found for both populations (r=0.57 and r=0.46, respectively). These correlations are likely due to the importance of cognitive skills and abstract reasoning in learning physics. The larger correlation coefficient for the high school population may be a result of the much shorter time interval between taking the SAT and studying mechanics, because the SAT may provide a more current measure of abilities when high school students begin the study of mechanics than it does for college students, who begin mechanics years after the test is taken. In prior research a strong correlation between FCI G and scores on Lawson’s Classroom Test of Scientific Reasoning for students from the same two schools was observed. Our results suggest that, when interpreting class average normalized FCI gains and comparing different classes, it is important to take into account the variation of students’ cognitive skills, as measured either by the SAT or by Lawson’s test. While Lawson’s test is not commonly given to students in most introductory mechanics courses, SAT scores provide a readily available alternative means of taking account of students’ reasoning abilities. Knowing the students’ cognitive level before instruction also allows one to alter instruction or to use an intervention designed to improve students’ cognitive level.
Interpreting force concept inventory scores: Normalized gain and SAT scores

Directory of Open Access Journals (Sweden)

Vincent P. Coletta

2007-05-01

Preinstruction SAT scores and normalized gains (G) on the force concept inventory (FCI) were examined for individual students in interactive engagement (IE) courses in introductory mechanics at one high school (N=335) and one university (N=292), and strong, positive correlations were found for both populations (r=0.57 and r=0.46, respectively). These correlations are likely due to the importance of cognitive skills and abstract reasoning in learning physics. The larger correlation coefficient for the high school population may be a result of the much shorter time interval between taking the SAT and studying mechanics, because the SAT may provide a more current measure of abilities when high school students begin the study of mechanics than it does for college students, who begin mechanics years after the test is taken. In prior research a strong correlation between FCI G and scores on Lawson’s Classroom Test of Scientific Reasoning for students from the same two schools was observed. Our results suggest that, when interpreting class average normalized FCI gains and comparing different classes, it is important to take into account the variation of students’ cognitive skills, as measured either by the SAT or by Lawson’s test. While Lawson’s test is not commonly given to students in most introductory mechanics courses, SAT scores provide a readily available alternative means of taking account of students’ reasoning abilities. Knowing the students’ cognitive level before instruction also allows one to alter instruction or to use an intervention designed to improve students’ cognitive level.
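The abstract does not spell out how the normalized gain G is computed. A minimal sketch, assuming the commonly used definition (actual gain divided by maximum possible gain, with pre- and post-test scores expressed in percent), might look like this; the function name and example scores are hypothetical.

```python
def normalized_gain(pre_percent, post_percent):
    """Normalized gain G = (post - pre) / (100 - pre), with both
    scores given as percentages of the maximum test score."""
    if not 0.0 <= pre_percent < 100.0:
        raise ValueError("pre-test score must be in [0, 100)")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical student: 40% before instruction, 70% after.
g = normalized_gain(40.0, 70.0)
print(g)
```

Dividing by the room left for improvement makes gains comparable between students who start at different levels, which is why the measure is used when comparing classes with different pre-instruction scores.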
CaPTHUS scoring model in primary hyperparathyroidism: can it eliminate the need for ioPTH testing?

Science.gov (United States)

Elfenbein, Dawn M; Weber, Sara; Schneider, David F; Sippel, Rebecca S; Chen, Herbert

2015-04-01

The CaPTHUS model was reported to have a positive predictive value of 100 % to correctly predict single-gland disease in patients with primary hyperparathyroidism, thus obviating the need for intraoperative parathyroid hormone (ioPTH) testing. We sought to apply the CaPTHUS scoring model in our patient population and assess its utility in predicting long-term biochemical cure. We retrospectively reviewed all parathyroidectomies for primary hyperparathyroidism performed at our university hospital from 2003 to 2012. We routinely perform ioPTH testing. Biochemical cure was defined as a normal calcium level at 6 months. A total of 1,421 patients met the inclusion criteria: 78 % of patients had a single adenoma at the time of surgery, 98 % had a normal serum calcium at 1 week postoperatively, and 96 % had a normal serum calcium level 6 months postoperatively. Using the CaPTHUS scoring model, 307 patients (22.5 %) had a score of ≥ 3, with a positive predictive value of 91 % for single adenoma. A CaPTHUS score of ≥ 3 had a positive predictive value of 98 % for biochemical cure at 1 week as well as at 6 months. In our population, where ioPTH testing is used routinely to guide use of bilateral exploration, patients with a preoperative CaPTHUS score of ≥ 3 had good long-term biochemical cure rates. However, the model only predicted adenoma in 91 % of cases. If minimally invasive parathyroidectomy without ioPTH testing had been done for these patients, the cure rate would have dropped from 98 % to an unacceptable 89 %. Even in these patients with high CaPTHUS scores, multigland disease is present in almost 10 %, and ioPTH testing is necessary.
Score Gains on g-loaded Tests: No g

NARCIS (Netherlands)

te Nijenhuis, J.; van Vianen, A.E.M.; van der Flier, H.

2007-01-01

IQ scores provide the best general predictor of success in education, job training, and work. However, there are many ways in which IQ scores can be increased, for instance by means of retesting or participation in learning potential training programs. What is the nature of these score gains? Jensen
The Implementation of Role-Playing Model in Principles of Finance Accounting Learning to Improve Students’ Enjoyment and Students’ Test Scores

Directory of Open Access Journals (Sweden)

L. Saptono

2010-01-01

This research is a classroom action research. The goal of this research is to improve students’ enjoyment level and their test scores by implementing the role-playing method. The research was conducted in the Accounting Education Study Program of Sanata Dharma University in the odd semester of academic year 2010/2011. The participants were divided into two classes. The first class received the treatment, while the second class was the control class. The result of the study showed that there was an improvement in students’ enjoyment level and test scores in the class which implemented the role-playing method.
Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS

Science.gov (United States)

Waldfogel, Jane; Zhai, Fuhua

2011-01-01

This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries -- Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S. -- using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children. PMID:21442008
Effect on intelligence test score of prenatal exposure to ionizing radiation in Hiroshima and Nagasaki

International Nuclear Information System (INIS)

Schull, W.J.; Otake, Masanori; Yoshimaru, Hiroshi.

1988-10-01

Analyses of intelligence test scores (Koga) at 10-11 years of age of individuals exposed prenatally to the atomic bombing of Hiroshima and Nagasaki using estimates of the uterine absorbed dose based on the recently introduced system of dosimetry, the Dosimetry System 1986 (DS86), reveal the following: 1) there is no evidence of a radiation-related effect on intelligence among those individuals exposed within 0-7 weeks after fertilization or in the 26th or subsequent weeks; 2) for individuals exposed at 8-15 weeks after fertilization, and to a lesser extent those exposed at 16-25 weeks, the mean test scores but not the variances are significantly heterogeneous among exposure categories; 3) the cumulative distribution of test scores suggests a progressive shift downwards in individual scores with increasing exposure; and 4) within the group most sensitive to the occurrence of clinically recognizable severe mental retardation, individuals exposed 8 through 15 weeks after fertilization, the regression of intelligence score on estimated DS86 uterine absorbed dose is more linear than with T65DR fetal dose; the diminution in intelligence score under the linear model is 21-29 points at 1 Gy. The effect is somewhat greater when the controls receiving less than 0.01 Gy are excluded, 24-33 points at 1 Gy. These findings are discussed in the light of the earlier analysis of the frequency of occurrence of mental retardation among the prenatally exposed survivors of the A-bombing of Hiroshima and Nagasaki. It is suggested that both are the consequences of the same underlying biological process or processes. (author)

Associations of maximal strength and muscular endurance test scores with cardiorespiratory fitness and body composition.

Science.gov (United States)

Vaara, Jani P; Kyröläinen, Heikki; Niemi, Jaakko; Ohrankämmen, Olli; Häkkinen, Arja; Kocay, Sheila; Häkkinen, Keijo

2012-08-01

The purpose of the present study was to assess the relationships between maximal strength and muscular endurance test scores additionally to previously widely studied measures of body composition and maximal aerobic capacity. 846 young men (25.5 ± 5.0 yrs) participated in the study. Maximal strength was measured using isometric bench press, leg extension and grip strength. Muscular endurance tests consisted of push-ups, sit-ups and repeated squats. An indirect graded cycle ergometer test was used to estimate maximal aerobic capacity (V(O2)max). Body composition was determined with bioelectrical impedance. Moreover, waist circumference (WC) and height were measured and body mass index (BMI) calculated. Maximal bench press was positively correlated with push-ups (r = 0.61, p strength (r = 0.34, p strength correlated positively (r = 0.36-0.44, p test scores were related to maximal aerobic capacity and body fat content, while fat free mass was associated with maximal strength test scores and thus is a major determinant for maximal strength. A contributive role of maximal strength to muscular endurance tests could be identified for the upper, but not the lower extremities. These findings suggest that push-up test is not only indicative of body fat content and maximal aerobic capacity but also maximal strength of upper body, whereas repeated squat test is mainly indicative of body fat content and maximal aerobic capacity, but not maximal strength of lower extremities.
Introducing Computer-Based Testing in High-Stakes Exams in Higher Education: Results of a Field Experiment.

Science.gov (United States)

Boevé, Anja J; Meijer, Rob R; Albers, Casper J; Beetsma, Yta; Bosker, Roel J

2015-01-01

The introduction of computer-based testing in high-stakes examining in higher education is developing rather slowly due to institutional barriers (the need for extra facilities, ensuring test security) and teacher and student acceptance. From the existing literature it is unclear whether computer-based exams will produce results similar to those of paper-based exams and whether student acceptance can change as a result of administering computer-based exams. In this study, we compared results from a computer-based and paper-based exam in a sample of psychology students and found no differences in total scores across the two modes. Furthermore, we investigated student acceptance and change in acceptance of computer-based examining. After taking the computer-based exam, fifty percent of the students preferred paper-and-pencil exams over computer-based exams and about a quarter preferred a computer-based exam. We conclude that computer-based exam total scores are similar to paper-based exam scores, but that for the acceptance of high-stakes computer-based exams it is important that students practice and get familiar with this new mode of test administration.
Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

Science.gov (United States)

Jacob, Brian A.

2016-01-01

Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…
Soetomo score: score model in early identification of acute haemorrhagic stroke

Directory of Open Access Journals (Sweden)

Moh Hasan Machfoed

2016-06-01

Aim of the study: Where financial or facility constraints limit brain imaging, a score model is used to predict the occurrence of acute haemorrhagic stroke. Accordingly, this study attempts to develop a new score model, called the Soetomo score. Material and methods: The researchers performed a cross-sectional study of 176 acute stroke patients with onset of ≤24 hours who visited the emergency unit of Dr. Soetomo Hospital from July 14th to December 14th, 2014. The diagnosis of haemorrhagic stroke was confirmed by head computed tomography scan. There were seven predictors of haemorrhagic stroke which were analysed by using bivariate and multivariate analyses. Furthermore, a multiple discriminant analysis resulted in an equation of the Soetomo score model. The receiver operating characteristic procedure resulted in the values of area under curve and intersection point identifying haemorrhagic stroke. Afterward, the diagnostic test value was determined. Results: The equation of the Soetomo score model was (3 × loss of consciousness) + (3.5 × headache) + (4 × vomiting) − 4.5. Area under curve value of this score was 88.5% (95% confidence interval = 83.3–93.7%). At a Soetomo score model value of ≥−0.75, the score reached a sensitivity of 82.9%, specificity of 83%, positive predictive value of 78.8%, negative predictive value of 86.5%, positive likelihood ratio of 4.88, negative likelihood ratio of 0.21, false negative of 17.1%, false positive of 17%, and accuracy of 83%. Conclusions: The Soetomo score model value of ≥−0.75 can identify acute haemorrhagic stroke properly where financial or facility constraints limit brain imaging.
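The scoring equation and cut-off reported in this abstract can be written out as a short Python sketch. The abstract does not state how the three predictors are coded; the 0/1 (absent/present) coding below is an assumption, as are the function names. The likelihood ratios are recomputed from the reported sensitivity and specificity as an arithmetic check only.

```python
def soetomo_score(loss_of_consciousness, headache, vomiting):
    """Score as reported in the abstract:
    3*LOC + 3.5*headache + 4*vomiting - 4.5.
    Assumes each predictor is coded 1 if present, 0 if absent."""
    return (3.0 * loss_of_consciousness
            + 3.5 * headache
            + 4.0 * vomiting
            - 4.5)


def suggests_haemorrhagic(score, cutoff=-0.75):
    """Apply the reported cut-off: score >= -0.75."""
    return score >= cutoff


def likelihood_ratios(sensitivity, specificity):
    """LR+ = sens / (1 - spec); LR- = (1 - sens) / spec."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


only_loc = soetomo_score(1, 0, 0)       # -1.5, below the cut-off
only_vomiting = soetomo_score(0, 0, 1)  # -0.5, at or above the cut-off
lr_pos, lr_neg = likelihood_ratios(0.829, 0.83)
print(only_loc, only_vomiting, round(lr_pos, 2), round(lr_neg, 2))
```

Under the assumed 0/1 coding, any single symptom other than isolated loss of consciousness already reaches the cut-off, and the recomputed ratios agree with the 4.88 and 0.21 given in the abstract.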
NCACO-score: An effective main-chain dependent scoring function for structure modeling

Directory of Open Access Journals (Sweden)

Dong Xiaoxi

2011-05-01

Background: Development of effective scoring functions is a critical component to the success of protein structure modeling. Previously, many efforts have been dedicated to the development of scoring functions. Despite these efforts, development of an effective scoring function that can achieve both good accuracy and fast speed still presents a grand challenge. Results: Based on a coarse-grained representation of a protein structure by using only four main-chain atoms: N, Cα, C and O, we develop a knowledge-based scoring function, called NCACO-score, that integrates different structural information to rapidly model protein structure from sequence. In testing on the Decoys'R'Us sets, we found that NCACO-score can effectively recognize native conformers from their decoys. Furthermore, we demonstrate that NCACO-score can effectively guide fragment assembly for protein structure prediction, which has achieved a good performance in building the structure models for hard targets from CASP8 in terms of both accuracy and speed. Conclusions: Although NCACO-score is developed based on a coarse-grained model, it is able to discriminate native conformers from decoy conformers with high accuracy. NCACO is a very effective scoring function for structure modeling.
College Math Assessment: SAT Scores vs. College Math Placement Scores

Science.gov (United States)

Foley-Peres, Kathleen; Poirier, Dawn

2008-01-01

Many colleges and universities use SAT math scores or math placement tests to place students in the appropriate math course. This study compares the use of math placement scores and SAT scores for 188 freshman students. The students' grades and faculty observations were analyzed to determine if the SAT scores and/or college math assessment scores…
A Comparison of Scores on the WISC-R and Lorge-Thorndike Intelligence Test for Disadvantaged Black Elementary School Children

Science.gov (United States)

Lowe, James D.; Karnes, Frances A.

1976-01-01

It is indicated that, although the scores [obtained on both tests] are significantly correlated, the tests yield significantly different scores with the Lorge-Thorndike consistently overestimating the WISC-R full scale I.Q. (Author)
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach

OpenAIRE

Xu, Jian

2017-01-01

The present study investigated test-taking motivation in an L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale, high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating...
Reliability of ultrasound grading traditional score and new global OMERACT-EULAR score system (GLOESS): results from an inter- and intra-reading exercise by rheumatologists.

Science.gov (United States)

Ventura-Ríos, Lucio; Hernández-Díaz, Cristina; Ferrusquia-Toríz, Diana; Cruz-Arenas, Esteban; Rodríguez-Henríquez, Pedro; Alvarez Del Castillo, Ana Laura; Campaña-Parra, Alfredo; Canul, Efrén; Guerrero Yeo, Gerardo; Mendoza-Ruiz, Juan Jorge; Pérez Cristóbal, Mario; Sicsik, Sandra; Silva Luna, Karina

2017-12-01

This study aims to test the reliability of ultrasound for grading synovitis in static and video images, evaluating grayscale and power Doppler (PD) separately and combined. Thirteen trained rheumatologist ultrasonographers participated in two separate rounds reading 42 images, 15 static and 27 videos, of the 7-joint count [wrist, 2nd and 3rd metacarpophalangeal (MCP), 2nd and 3rd interphalangeal (IPP), 2nd and 5th metatarsophalangeal (MTP) joints]. The images were from six patients with rheumatoid arthritis and were obtained by one ultrasonographer. Synovitis was defined according to OMERACT. The scoring systems for grayscale and PD separately, and combined (GLOESS, the Global OMERACT-EULAR Score System), were reviewed before the exercise. Intra- and inter-reading reliability was calculated with weighted Cohen's kappa and interpreted according to Landis and Koch. Kappa values for inter-reading were good to excellent. The lowest kappa was for GLOESS in static images, and the highest was for the same scoring in videos (k 0.59 and 0.85, respectively). Excellent values were obtained for static PD in the 5th MTP joint and for PD video in the 2nd MTP joint. Results for GLOESS in general were good to moderate. Poor agreement was observed in the 3rd MCP and 3rd IPP in all kinds of images. Intra-reading agreement was greater for grayscale and GLOESS in static images than in videos (k 0.86 vs. 0.77 and k 0.86 vs. 0.71, respectively), but for PD it was greater in videos than in static images (k 1.0 vs. 0.79). The reliability of synovitis scoring through static images and videos is in general good to moderate when using grayscale and PD separately or combined.
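For readers unfamiliar with the agreement statistic named above, the following is a minimal sketch of weighted Cohen's kappa for two readers scoring the same images on an ordinal scale. The abstract does not state which weighting scheme was used, so linear weights are an assumption here, as are the 0 to 3 grade labels and the function name.

```python
def weighted_kappa(reader_a, reader_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two raters on an ordered category scale.

    categories must list the ordinal grades in order, e.g. [0, 1, 2, 3].
    weights is "linear" or "quadratic" (disagreement weights).
    """
    k = len(categories)
    n = len(reader_a)
    if k < 2 or n == 0 or n != len(reader_b):
        raise ValueError("need two equally long, non-empty rating lists and >= 2 categories")
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions of (grade by reader A, grade by reader B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(reader_a, reader_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]                        # marginals of A
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]   # marginals of B

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed_disagreement = sum(
        disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k))
    expected_disagreement = sum(
        disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_disagreement / expected_disagreement


if __name__ == "__main__":
    # Hypothetical grades for four images, not data from the study.
    print(weighted_kappa([0, 1, 2, 3], [0, 1, 2, 2], categories=[0, 1, 2, 3]))
```

Weighting penalises a one-grade disagreement less than a three-grade disagreement, which is why it suits ordinal synovitis grades better than unweighted kappa.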
Parent Ratings of Impulsivity and Inhibition Predict State Testing Scores

Directory of Open Access Journals (Sweden)

Rebecca A. Lundwall

2018-03-01

One principle of cognitive development is that earlier intervention for educational difficulties tends to improve outcomes such as future educational and career success. One possible way to help students who struggle is to determine if they process information differently. Such determination might lead to clues for interventions. For example, early information processing requires attention before the information can be identified, encoded, and stored. The aim of the present study was to investigate whether parent ratings of inattention, inhibition, and impulsivity, and whether error rate on a reflexive attention task could be used to predict child scores on state standardized tests. Finding such an association could provide assistance to educators in identifying academically struggling children who might require targeted educational interventions. Children (N = 203) were invited to complete a peripheral cueing task (which measures the automatic reorienting of the brain's attentional resources from one location to another). While the children completed the task, their parents completed a questionnaire. The questionnaire gathered information on broad indicators of child functioning, including observable behaviors of impulsivity, inattention, and inhibition, as well as state academic scores (which the parent retrieved online from their school). We used sequential regression to analyze contributions of error rate and parent-rated behaviors in predicting six academic scores. In one of the six analyses (for science), we found that the improvement was significant from the simplified model (with only family income, child age, and sex as predictors) to the full model (adding error rate and three parent-rated behaviors). Two additional analyses (reading and social studies) showed near significant improvement from simplified to full models. Parent-rated behaviors were significant predictors in all three of these analyses. In the reading score analysis
The Effects of Listening to Music Just Before Reading Test on Students’ Test Score

OpenAIRE

MAHDAVI, Mojtaba

2015-01-01

Abstract. In this study the researcher examined the effect on reading comprehension of music played just before the test. Because the emotional consequences of music listening are evident in the removal of stress and anxiety, it was used as a tool to calm the minds of the test takers and boost their memory and the related cognitive processes. Experimental group did well with the mean score of) and control group (). This study confirmed that using multimedia devices such as music can not only i...
Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test

KAUST Repository

Cai, T.; Lin, X.; Carroll, R. J.

2012-01-01

the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an Empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least
Gender Gaps in High School GPA and ACT Scores: High School Grade Point Average and ACT Test Score by Subject and Gender. Information Brief 2014-12

Science.gov (United States)

ACT, Inc., 2014

2014-01-01

Female students who graduated from high school in 2013 averaged higher grades than their male counterparts in all subjects, but male graduates earned higher scores on the math and science sections of the ACT. This information brief looks at high school grade point average and ACT test score by subject and gender
A Persian version of the sustained auditory attention capacity test and its results in normal children

Directory of Open Access Journals (Sweden)

Sanaz Soltanparast

2013-03-01

Background and Aim: Sustained attention refers to the ability to maintain attention on target stimuli over a sustained period of time. This study was conducted to develop a Persian version of the sustained auditory attention capacity test and to study its results in normal children. Methods: To develop the Persian version of the sustained auditory attention capacity test, speech stimuli were used, as in the original version. The speech stimuli consisted of one hundred monosyllabic words, made up of a 20-times random repetition of the words of a 21-word list of monosyllabic words, which were randomly grouped together. The test was carried out at a comfortable hearing level using binaural, diotic presentation on 46 normal children of both genders, 7 to 11 years of age. Results: Age had a significant effect on the average impulsiveness error score (p=0.004) and on the total score of the sustained auditory attention capacity test (p=0.005). No significant effect of age was found on the average inattention error score or the attention reduction span index. Gender did not have a significant impact on the various indicators of the test. Conclusion: The results of this test on a group of normal hearing children confirmed its ability to measure sustained auditory attention capacity through speech stimuli.
Results from the Astronomy Diagnostic Test: 3 Years at MiraCosta College

Science.gov (United States)

Sirbaugh French, Rica

2007-12-01

The Astronomy Diagnostic Test 2.0 (ADT) was administered to 26 sections of ASTR 101 at MiraCosta College (MCC) from fall 2004 through summer 2007. MiraCosta is a two-year community college located in Oceanside, CA, USA (roughly 40 miles north of San Diego) with an enrollment of approximately 11,000 students. ASTR 101 is MiraCosta's introductory astronomy survey course for non-science majors and has no math prerequisite. Class sizes ranged from 10 to 38 students. Comparison with the ADT National Project results indicates similar pre- and post-course averages: 31.8% (MCC) vs. 32.4% (National) for pre-course tests and 49.3% (MCC) vs. 47.3% (National) for post-course tests. The sample sizes are 709 and 530 students for the pre- and post-tests, respectively. The normalized gain for the entire data set is 0.256 and the effect size (ES) is 1.04, meaning that approximately 85% of the post-test scores are above the average of the pre-test scores. Previous studies have shown the ADT to be a reliable indicator of pre-course misconceptions while post-course scores are useful for comparing modes of instruction and assessing student learning on a limited number of concepts. Additional analyses of the MCC data probe for trends with variables such as gender, class size, and implementation of materials designed for a more "learner-centered" approach (such as Lecture Tutorials, Ranking Tasks, and Think-Pair-Share questions), as well as gains for a particular subset of concepts.
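The two summary statistics in this record can be checked with a short calculation. The sketch below assumes the usual definition of normalized gain, (post − pre) / (100 − pre), applied to the class averages, and reads the "85% above the pre-test average" statement as the normal-distribution interpretation of effect size (equal-variance normal scores are an assumption). The function names are hypothetical.

```python
import math


def normalized_gain(pre_percent, post_percent):
    """Fraction of the possible improvement that was achieved."""
    return (post_percent - pre_percent) / (100.0 - pre_percent)


def fraction_above_pre_mean(effect_size):
    """Standard normal CDF at the effect size.

    Under equal-variance normal distributions, this is the share of
    post-test scores that lie above the pre-test mean.
    """
    return 0.5 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))


if __name__ == "__main__":
    print(normalized_gain(31.8, 49.3))      # from the rounded MCC averages
    print(fraction_above_pre_mean(1.04))    # effect size quoted in the abstract
```

Using the rounded averages gives a gain of about 0.257, in line with the reported 0.256, and an effect size of 1.04 corresponds to roughly 85% of post-test scores above the pre-test mean. Because the pre- and post-test samples differ in size (709 vs. 530), a gain computed from matched students could differ slightly.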
From Test Scores to Language Use: Emergent Bilinguals Using English to Accomplish Academic Tasks

Science.gov (United States)

Rodriguez-Mojica, Claudia

2018-01-01

Prominent discourses about emergent bilinguals' academic abilities tend to focus on performance as measured by test scores and perpetuate the message that emergent bilinguals trail far behind their peers. When we remove the constraints of formal testing situations, what can emergent bilinguals do in English as they engage in naturally occurring…
Introducing Computer-Based Testing in High-Stakes Exams in Higher Education: Results of a Field Experiment

Science.gov (United States)

Boevé, Anja J.; Meijer, Rob R.; Albers, Casper J.; Beetsma, Yta; Bosker, Roel J.

2015-01-01

The introduction of computer-based testing in high-stakes examining in higher education is developing rather slowly due to institutional barriers (the need for extra facilities, ensuring test security) and teacher and student acceptance. From the existing literature it is unclear whether computer-based exams yield results similar to those of paper-based exams and whether student acceptance can change as a result of administering computer-based exams. In this study, we compared results from a computer-based and a paper-based exam in a sample of psychology students and found no differences in total scores across the two modes. Furthermore, we investigated student acceptance and change in acceptance of computer-based examining. After taking the computer-based exam, fifty percent of the students preferred paper-and-pencil exams over computer-based exams and about a quarter preferred a computer-based exam. We conclude that computer-based exam total scores are similar to paper-based exam scores, but that for the acceptance of high-stakes computer-based exams it is important that students practice and become familiar with this new mode of test administration. PMID:26641632
Visual-Constructional Ability in Individuals with Severe Obesity: Rey Complex Figure Test Accuracy and the Q-Score

Directory of Open Access Journals (Sweden)

Hanna L. Sargénius

2017-09-01

The aims of this study were to investigate visual-construction and organizational strategy among individuals with severe obesity, as measured by the Rey Complex Figure Test (RCFT), and to examine the validity of the Q-score as a measure of the quality of performance on the RCFT. Ninety-six non-demented morbidly obese (MO) patients and 100 healthy controls (HC) completed the RCFT. Their performance was calculated by applying the standard scoring criteria. The quality of the copying process was evaluated per the directions of the Q-score scoring system. Results revealed that the MO did not perform significantly lower than the HC on Copy accuracy (mean difference −0.302, CI −1.374 to 0.769, p = 0.579). In contrast, the groups did differ statistically, with MO performing more poorly than the HC on the Q-score (mean −1.784, CI −3.237 to −0.331, p = 0.016) and the Unit points (mean −1.409, CI −2.291 to −0.528, p = 0.002), but not on the Order points score (mean −0.351, CI −0.994 to 0.293, p = 0.284). Differences on the Unit score and the Q-score were slightly reduced when adjusting for gender, age, and education. This study presents evidence supporting the presence of inefficiency in visuospatial constructional ability among MO patients. We believe we have found an indication that the Q-score captures a wider range of cognitive processes that are not described by traditional scoring methods. Rather than considering accuracy and placement of the different elements only, the Q-score focuses more on how the subject has approached the task.
Predisposing factors of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy: comparison between CT emphysema score and pulmonary function test

Energy Technology Data Exchange (ETDEWEB)

Lee, Chang Ho; Park, Kyung Joo; Park, Dong Won; Jung, Kyung Il; Suh, Jung Ho [Ajou Univ. College of Medicine, Seoul (Korea, Republic of)

1997-11-01

To compare the CT emphysema score with various factors of the pulmonary function test by simple spirometry and to use the result as a predictor of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy. The CT scans of 106 patients who had undergone percutaneous transthoracic fine needle aspiration biopsy of lung lesions within the previous 18 months were retrospectively reviewed. In 75 of these 106 cases, the results of the pulmonary function test were also reviewed. On plain chest radiography, pneumothorax was noted in 20 cases (19%). Emphysema was blindly evaluated. We divided each lung into four segments and determined the severity and involved volume of emphysema, as seen on CT. Severity was classified as one of four grades, as follows: absence of emphysema = 0; low attenuation area of less than 5 mm = 1; low attenuation area of more than 5 mm, and vascular pruning with normal lung intervening = 2; and diffuse low attenuation without intervening normal lung, and larger confluent low attenuation with vascular pruning and distortion of branching pattern occupying all or almost all the involved parenchyma = 3. The involved area was also classified as one of four grades: less than 25% = 1; 25 - 49% = 2; 51 - 74% = 3; and more than 75% = 4. The CT emphysema score was defined as the average of the grade of severity multiplied by the grade of involved area. Pulmonary function tests, consisting of simple spirometry and a pulmonologist's interpretation, were evaluated. We also evaluated depth and size of lesion as known predisposing factors in postbioptic pneumothorax. Statistical analysis was performed using the chi-square test, the Wilcoxon rank-sum test and Student's t test. A comparison between the two groups (with and without pneumothorax) showed the emphysema scores to be 1.69±2.0 and 1.11±2.9, respectively; there was thus no significant difference between the two groups (z = −0.048, p > 0.10). Nor were differences revealed by the
Predisposing factors of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy: comparison between CT emphysema score and pulmonary function test

International Nuclear Information System (INIS)

Lee, Chang Ho; Park, Kyung Joo; Park, Dong Won; Jung, Kyung Il; Suh, Jung Ho

1997-01-01

To compare the CT emphysema score with various factors of the pulmonary function test by simple spirometry and to use the result as a predictor of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy. The CT scans of 106 patients who had undergone percutaneous transthoracic fine needle aspiration biopsy of lung lesions within the previous 18 months were retrospectively reviewed. In 75 of these 106 cases, the results of the pulmonary function test were also reviewed. On plain chest radiography, pneumothorax was noted in 20 cases (19%). Emphysema was blindly evaluated. We divided each lung into four segments and determined the severity and involved volume of emphysema, as seen on CT. Severity was classified as one of four grades, as follows: absence of emphysema = 0; low attenuation area of less than 5 mm = 1; low attenuation area of more than 5 mm, and vascular pruning with normal lung intervening = 2; and diffuse low attenuation without intervening normal lung, and larger confluent low attenuation with vascular pruning and distortion of branching pattern occupying all or almost all the involved parenchyma = 3. The involved area was also classified as one of four grades: less than 25% = 1; 25 - 49% = 2; 51 - 74% = 3; and more than 75% = 4. The CT emphysema score was defined as the average of the grade of severity multiplied by the grade of involved area. Pulmonary function tests, consisting of simple spirometry and a pulmonologist's interpretation, were evaluated. We also evaluated depth and size of lesion as known predisposing factors in postbioptic pneumothorax. Statistical analysis was performed using the chi-square test, the Wilcoxon rank-sum test and Student's t test. A comparison between the two groups (with and without pneumothorax) showed the emphysema scores to be 1.69±2.0 and 1.11±2.9, respectively; there was thus no significant difference between the two groups (z = −0.048, p > 0.10). Nor were differences revealed by the pulmonary
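The score definition in this record can be sketched in a few lines. The version below assumes the "average" is taken over the lung segments (four per lung, so eight in total) of severity grade × area grade, and that a segment without emphysema contributes an area grade of 0; neither detail is spelled out in the abstract, and the function name and example grades are hypothetical.

```python
def ct_emphysema_score(segments):
    """Mean of (severity grade x involved-area grade) over lung segments.

    segments: list of (severity, area) pairs, one per segment.
    severity is graded 0-3 and area 1-4 as in the abstract; area 0 is
    accepted here for segments with no emphysema (an assumption).
    """
    if not segments:
        raise ValueError("at least one segment is required")
    products = []
    for severity, area in segments:
        if severity not in (0, 1, 2, 3):
            raise ValueError("severity grade must be 0, 1, 2 or 3")
        if area not in (0, 1, 2, 3, 4):
            raise ValueError("area grade must be 0, 1, 2, 3 or 4")
        products.append(severity * area)
    return sum(products) / len(products)


if __name__ == "__main__":
    # Hypothetical patient: right lung clear, left lung with graded emphysema.
    example = [(0, 0)] * 4 + [(1, 1), (2, 2), (3, 4), (1, 2)]
    print(ct_emphysema_score(example))
```

With these definitions the score ranges from 0 (no emphysema in any segment) to 12 (grade 3 severity over more than 75% of every segment).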

The Disaggregation of Value-Added Test Scores to Assess Learning Outcomes in Economics Courses

Science.gov (United States)

Walstad, William B.; Wagner, Jamie

2016-01-01

This study disaggregates posttest, pretest, and value-added or difference scores in economics into four types of economic learning: positive, retained, negative, and zero. The types are derived from patterns of student responses to individual items on a multiple-choice test. The micro and macro data from the "Test of Understanding in College…
[Results of applying a paediatric early warning score system as a healthcare quality improvement plan].

Science.gov (United States)

Rivero-Martín, M J; Prieto-Martínez, S; García-Solano, M; Montilla-Pérez, M; Tena-Martín, E; Ballesteros-García, M M

2016-06-01

The aims of this study were to introduce a paediatric early warning score (PEWS) into our daily clinical practice, to evaluate its ability to detect clinical deterioration in admitted children, and to train nursing staff to communicate the information and respond effectively. An analysis was performed on the implementation of PEWS in the electronic health records of children (0-15 years) in our paediatric ward from February 2014 to September 2014. The maximum score was 6. Nursing staff reviewed scores >2, and both medical and nursing staff reviewed scores >3. Monitoring indicators: % of admissions with scoring; % of complete data capture; % of scores >3; % of scores >3 reviewed by medical staff; % of changes in treatment due to the warning system; and number of patients who needed Paediatric Intensive Care Unit (PICU) admission, or died, without an increased warning score. The data were collected from all patients (931) admitted. The scale was measured 7,917 times, 78.8% of them with complete data capture. Very few (1.9%) showed scores >3, and 14% of these led to changes in clinical management (intensifying treatment or new diagnostic tests). One patient (scored 2) required PICU admission. There were no deaths. Parental or nursing staff concern was registered in 80% of cases. PEWS are useful for providing a standardised assessment of clinical status in the inpatient setting, using a single scale and implementing data capture. Because of the lack of severe complications requiring PICU admission and of deaths, other data will be needed to evaluate these scales.
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Directory of Open Access Journals (Sweden)

Yota Uno

OBJECTIVE: The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. METHODS: The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. RESULTS: The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. CONCLUSION: The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
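The post-test probability mentioned in this record follows from a likelihood ratio through the odds form of Bayes' theorem: post-test odds = pre-test odds × likelihood ratio. The sketch below applies it to the two stratum-specific likelihood ratios quoted in the abstract; the 20% pre-test probability is purely illustrative and is not taken from the study, and the function name is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability and a likelihood ratio into a post-test probability."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


if __name__ == "__main__":
    illustrative_prevalence = 0.20  # hypothetical value, not from the study
    print(post_test_probability(illustrative_prevalence, 13.8))  # BIQ <= 65 stratum
    print(post_test_probability(illustrative_prevalence, 0.1))   # BIQ >= 76 stratum
```

With that illustrative prevalence, a BIQ of 65 or below raises the probability to roughly 78%, while a BIQ of 76 or above lowers it to roughly 2%, which is the sense in which the strata can rule intellectual disability in or out.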
Automated Scoring for the "TOEFL Junior"® Comprehensive Writing and Speaking Test. Research Report. ETS RR-15-09

Science.gov (United States)

Evanini, Keelan; Heilman, Michael; Wang, Xinhao; Blanchard, Daniel

2015-01-01

This report describes the initial automated scoring results that were obtained using the constructed responses from the Writing and Speaking sections of the pilot forms of the "TOEFL Junior"® Comprehensive test administered in late 2011. For all of the items except one (the edit item in the Writing section), existing automated scoring…
Balance index score as a predictive factor for lower sports results or anterior cruciate ligament knee injuries in Croatian female athletes--preliminary study.

Science.gov (United States)

Vrbanić, Tea Schnurrer-Luke; Ravlić-Gulan, Jagoda; Gulan, Gordan; Matovinović, Damir

2007-03-01

Female athletes participating in high-risk sports suffer anterior cruciate ligament (ACL) knee injury at a 4- to 6-fold greater rate than do male athletes. ACL injuries result either from contact mechanisms or from certain unexplained non-contact mechanisms occurring during daily professional sports activities. The occurrence of non-contact injuries points to the existence of certain factors intrinsic to the knee that can lead to ACL rupture. When knee joint movement overcomes the static and the dynamic constraint systems, non-contact ACL injury may occur. Certain recent results suggest that balance and neuromuscular control play a central role in knee joint stability, protection and prevention of ACL injuries. The purpose of this study is to evaluate balance neuromuscular skills in healthy Croatian female athletes by measuring their balance index score, as well as to estimate a possible correlation between their balance index score and balance effectiveness. This study is conducted in an effort to reduce the risk of future injuries and thus prevent female athletes from withdrawing from sports prematurely. We analysed fifty-two female athletes in the high-risk sports of handball and volleyball, measuring for their static and dynamic balance index scores, using the Sport KAT 2000 testing system. This method may be used to monitor balance and coordination systems and may help to develop simpler measurements of neuromuscular control, which can be used to estimate risk predictors in athletes who withdraw from sports due to lower sports results or ruptured anterior cruciate ligament and to direct female athletes to more effective, targeted preventive interventions. The tested Croatian female athletes with lower sports results and ACL knee injury incurred after the testing were found to have a higher balance index score compared to healthy athletes. We therefore suggest that a higher balance index score can be used as an effective risk predictor for lower sports results
The TSCA interagency testing committee's approaches to screening and scoring chemicals and chemical groups: 1977-1983

Energy Technology Data Exchange (ETDEWEB)

Walker, J.D. [Environmental Protection Agency, Washington, DC (United States)

1990-12-31

This paper describes the TSCA interagency testing committee's (ITC) approaches to screening and scoring chemicals and chemical groups between 1977 and 1983. During this time the ITC conducted five scoring exercises to select chemicals and chemical groups for detailed review and to determine which of these chemicals and chemical groups should be added to the TSCA Section 4(e) Priority Testing List. 29 refs., 1 fig., 2 tabs.
The NeBoP score - a clinical prediction test for evaluation of children with Lyme Neuroborreliosis in Europe.

Science.gov (United States)

Skogman, Barbro H; Sjöwall, Johanna; Lindgren, Per-Eric

2015-12-17

The diagnosis of Lyme neuroborreliosis (LNB) in Europe is based on clinical symptoms and laboratory data, such as pleocytosis and anti-Borrelia antibodies in serum and CSF according to guidelines. However, the decision to start antibiotic treatment on admission cannot be based on Borrelia serology since results are not available at the time of lumbar puncture. Therefore, an early prediction test would be useful in clinical practice. The aim of the study was to develop and evaluate a clinical prediction test for children with LNB in a relevant European setting. Clinical and laboratory data were collected retrospectively from a cohort of children being evaluated for LNB in Southeast Sweden. A clinical neuroborreliosis prediction test, the NeBoP score, was designed to differentiate between a high and a low risk of having LNB. The NeBoP score was then prospectively validated in a cohort of children being evaluated for LNB in Central and Southeast Sweden (n = 190) and controls with other specific diagnoses (n = 49). The sensitivity of the NeBoP score was 90 % (CI 95 %; 82-99 %) and the specificity was 90 % (CI 95 %; 85-96 %). Thus, the diagnostic accuracy (i.e. how the test correctly discriminates patients from controls) was 90 % and the area under the curve in a ROC analysis was 0.95. The positive predictive value (PPV) was 0.83 (CI 95 %; 0.75-0.93) and the negative predictive value (NPV) was 0.95 (CI 95 %; 0.90-0.99). The overall diagnostic performance of the NeBoP score is high (90 %) and the test is suggested to be useful for decision-making about early antibiotic treatment in children being evaluated for LNB in European Lyme endemic areas.
Changes in Student Populations and Average Test Scores of Dutch Primary Schools

Science.gov (United States)

Luyten, Hans; de Wolf, Inge

2011-01-01

This article focuses on the relation between student population characteristics and average test scores per school in the final grade of primary education from a dynamic perspective. Aggregated data of over 5,000 Dutch primary schools covering a 6-year period were used to study the relation between changes in school populations and shifts in mean…
Lower Quarter Y-Balance Test Scores and Lower Extremity Injury in NCAA Division I Athletes.

Science.gov (United States)

Lai, Wilson C; Wang, Dean; Chen, James B; Vail, Jeremy; Rugg, Caitlin M; Hame, Sharon L

2017-08-01

Functional movement tests that are predictive of injury risk in National Collegiate Athletic Association (NCAA) athletes are useful tools for sports medicine professionals. The Lower Quarter Y-Balance Test (YBT-LQ) measures single-leg balance and reach distances in 3 directions. To assess whether the YBT-LQ predicts the laterality and risk of sports-related lower extremity (LE) injury in NCAA athletes. Case-control study; Level of evidence, 3. The YBT-LQ was administered to 294 NCAA Division I athletes from 21 sports during preparticipation physical examinations at a single institution. Athletes were followed prospectively over the course of the corresponding season. Correlation analysis was performed between the laterality of reach asymmetry and composite scores (CS) versus the laterality of injury. Receiver operating characteristic (ROC) analysis was used to determine the optimal asymmetry cutoff score for YBT-LQ. A multivariate regression analysis adjusting for sex, sport type, body mass index, and history of prior LE surgery was performed to assess predictors of earlier and higher rates of injury. Neither the laterality of reach asymmetry nor the CS correlated with the laterality of injury. ROC analysis found optimal cutoff scores of 2, 9, and 3 cm for anterior, posteromedial, and posterolateral reach, respectively. All of these potential cutoff scores, along with a cutoff score of 4 cm used in the majority of prior studies, were associated with poor sensitivity and specificity. Furthermore, none of the asymmetric cutoff scores were associated with earlier or increased rate of injury in the multivariate analyses. YBT-LQ scores alone do not predict LE injury in this collegiate athlete population. Sports medicine professionals should be cautioned against using the YBT-LQ alone to screen for injury risk in collegiate athletes.
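The abstract does not say which criterion defined the "optimal" cutoff in the ROC analysis. One common choice is Youden's J (sensitivity + specificity − 1), and the sketch below uses it purely as an assumption; the function name and the toy asymmetry values are hypothetical and are not data from the study.

```python
def youden_optimal_cutoff(values, injured):
    """Return (cutoff, J, sensitivity, specificity) maximising Youden's J.

    A measurement counts as test-positive when value >= cutoff.
    Youden's J is an assumed criterion, not one confirmed by the source.
    """
    if len(values) != len(injured) or not values:
        raise ValueError("values and injured must be non-empty and equally long")
    positives = sum(1 for y in injured if y)
    negatives = len(injured) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both injured and uninjured athletes are required")

    best = None
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, injured) if y and v >= cutoff)
        true_neg = sum(1 for v, y in zip(values, injured) if not y and v < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


if __name__ == "__main__":
    # Toy reach asymmetries in cm with injury labels, for illustration only.
    asymmetry_cm = [1, 2, 3, 4, 5, 6]
    was_injured = [False, False, False, True, True, True]
    print(youden_optimal_cutoff(asymmetry_cm, was_injured))
```

A cutoff chosen this way can still have poor sensitivity and specificity when the two groups overlap heavily, which matches the abstract's finding that none of the candidate cutoffs predicted injury well.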
Depressive status explains a significant amount of the variance in COPD assessment test (CAT) scores.

Science.gov (United States)

Miravitlles, Marc; Molina, Jesús; Quintano, José Antonio; Campuzano, Anna; Pérez, Joselín; Roncero, Carlos

2018-01-01

COPD assessment test (CAT) is a short, easy-to-complete health status tool that has been incorporated into the multidimensional assessment of COPD in order to guide therapy; therefore, it is important to understand the factors determining CAT scores. This is a post hoc analysis of a cross-sectional, observational study conducted in respiratory medicine departments and primary care centers in Spain with the aim of identifying the factors determining CAT scores, focusing particularly on the cognitive status measured by the Mini-Mental State Examination (MMSE) and levels of depression measured by the short Beck Depression Inventory (BDI). A total of 684 COPD patients were analyzed; 84.1% were men, the mean age of patients was 68.7 years, and the mean forced expiratory volume in 1 second (%) was 55.1%. Mean CAT score was 21.8. CAT scores correlated with the MMSE score (Pearson's coefficient r =-0.371) and the BDI ( r =0.620), both p CAT scores and explained 45% of the variability. However, a model including only MMSE and BDI scores explained up to 40% and BDI alone explained 38% of the CAT variance. CAT scores are associated with clinical variables of severity of COPD. However, cognitive status and, in particular, the level of depression explain a larger percentage of the variance in the CAT scores than the usual COPD clinical severity variables.
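The single-predictor figure in this record can be cross-checked against the reported correlation: in a simple linear regression with one predictor, the proportion of variance explained equals the squared Pearson coefficient. The short sketch below applies that identity to the two correlations quoted in the abstract; the function name is hypothetical.

```python
def variance_explained(pearson_r):
    """R-squared of a one-predictor linear regression equals r squared."""
    return pearson_r ** 2


if __name__ == "__main__":
    print(variance_explained(0.620))   # BDI vs. CAT
    print(variance_explained(-0.371))  # MMSE vs. CAT
```

Squaring r = 0.620 gives about 0.38, consistent with the statement that BDI alone explained 38% of the CAT variance, while the MMSE correlation on its own would account for about 14%.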
A comparison of likelihood ratio tests and Rao's score test for three separable covariance matrix structures.

Science.gov (United States)

Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha

2017-01-01

The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Detection of acute deterioration in health status visit among COPD patients by monitoring COPD assessment test score

Directory of Open Access Journals (Sweden)

Pothirat C

2015-02-01

Chaicharn Pothirat, Warawut Chaiwong, Atikun Limsukon, Athavudh Deesomchok, Chalerm Liwsrisakun, Chaiwat Bumroongkit, Theerakorn Theerakittikul, Nittaya Phetsuk. Division of Pulmonary, Critical Care and Allergy, Department of Internal Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand. Background: The Chronic Obstructive Pulmonary Disease Assessment Test (CAT) could play a role in detecting acute deterioration in health status during monitoring visits in routine clinical practice. Objective: To evaluate the discriminative property of a change in CAT score from a stable baseline visit for detecting acute deterioration in health status visits of chronic obstructive pulmonary disease (COPD) patients. Methods: The CAT questionnaire was administered to stable COPD patients routinely attending the chest clinic of Chiang Mai University Hospital who were monitored using the CAT score every 1–3 months for 15 months. Acute deterioration in health status was defined as worsening or exacerbation. CAT scores at baseline, and subsequent visits with acute deterioration in health status were analyzed using the t-test. Receiver operating characteristic curve analysis was performed to evaluate the discriminative property of change in CAT score for detecting acute deterioration during a health status visit. Results: A total of 354 follow-up visits were made by 140 patients, aged 71.1±8.4 years, with a forced expiratory volume in 1 second of 47.49%±18.2% predicted, who were monitored for 15 months. The mean CAT score changes between stable baseline visits, by patients’ and physicians’ global assessments, were 0.05 (95% confidence interval [CI], -0.37–0.46) and 0.18 (95% CI, -0.23–0.60), respectively. At worsening visits, as assessed by patients, there was a significant increase in CAT score (6.07; 95% CI, 4.95–7.19). There were also significant increases in CAT scores at visits with mild and moderate exacerbation (5.51 [95% CI, 4.39–6
Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer

Science.gov (United States)

Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Deeks, Jon

2016-01-01

Introduction: Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). Methods and analysis: ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests are evaluated and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios. PMID:27507231
Heart valve surgery: EuroSCORE vs. EuroSCORE II vs. Society of Thoracic Surgeons score

Directory of Open Access Journals (Sweden)

Muhammad Sharoz Rabbani

2014-12-01

Background: This is a validation study comparing the European System for Cardiac Operative Risk Evaluation (EuroSCORE) II with the previous additive (AES) and logistic EuroSCORE (LES) and the Society of Thoracic Surgeons’ (STS) risk prediction algorithm, for patients undergoing valve replacement with or without bypass in Pakistan. Patients and Methods: Clinical data of 576 patients undergoing valve replacement surgery between 2006 and 2013 were retrospectively collected and individual expected risks of death were calculated by all four risk prediction algorithms. Performance of these risk algorithms was evaluated in terms of discrimination and calibration. Results: There were 28 deaths (4.8%) among 576 patients, which was lower than the predicted mortality of 5.16%, 6.96% and 4.94% by AES, LES and EuroSCORE II, respectively, but higher than the 2.13% predicted by the STS scoring system. For single and double valve replacement procedures, EuroSCORE II was the best predictor of mortality, with the highest Hosmer and Lemeshow test (H-L) p value (0.346 to 0.689) and area under the receiver operating characteristic (ROC) curve (0.637 to 0.898). For valve plus concomitant coronary artery bypass grafting (CABG) patients, actual mortality was 1.88%. The STS calculator was the best predictor of mortality for this subgroup, with H-L p value (0.480 to 0.884) and ROC (0.657 to 0.775). Conclusions: For the Pakistani population, EuroSCORE II is an accurate predictor of individual operative risk in patients undergoing isolated valve surgery, whereas STS performs better in the valve plus CABG group.
Using College Admission Test Scores to Clarify High School Placement. Leading Indicator Spotlight

Science.gov (United States)

Flug, Susanna

2010-01-01

In "Beyond Test Scores: Leading Indicators for Education," Foley and colleagues (2008) define leading indicators as those that "provide early signals of progress toward academic achievement" (p. 1) and stress that educators "need leading indicators to help them see the direction their efforts are going in and to take…
Differences of Wells scores accuracy, Caprini scores and Padua scores in deep vein thrombosis diagnosis

Science.gov (United States)

Gatot, D.; Mardia, A. I.

2018-03-01

Deep vein thrombosis (DVT) is a venous thrombus in the lower limbs. Diagnosis is made using venography or compression ultrasound. However, these examinations are not yet available in some health facilities. Therefore, many scoring systems have been developed for the diagnosis of DVT. The scoring method is practical and safe to use, in addition to being efficacious and effective in terms of treatment and costs. The existing scoring systems are the Wells, Caprini and Padua scores. There have been many studies comparing the accuracy of these scores, but not in Medan. Therefore, we are interested in comparative research on the Wells, Caprini and Padua scores in Medan. An observational, analytical, case-control study was conducted to perform diagnostic tests on the Wells, Caprini and Padua scores to predict the risk of DVT. The study was conducted at H. Adam Malik Hospital in Medan. Of a total of 72 subjects, 39 (54.2%) were men and the mean age was 53.14 years. The Wells, Caprini and Padua scores had sensitivities of 80.6%, 61.1% and 50%, respectively; specificities of 80.6%, 66.7% and 75%, respectively; and accuracies of 87.5%, 64.3% and 65.7%, respectively. The Wells score has better sensitivity, specificity and accuracy than the Caprini and Padua scores in diagnosing DVT.
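As a reminder of how such figures are usually derived, the sketch below computes sensitivity, specificity and overall accuracy from a 2x2 table. The counts are hypothetical: the abstract does not report the table, and the split into 36 cases and 36 controls is only an assumption chosen because it reproduces a sensitivity of about 80.6%.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and accuracy from 2x2 confusion counts."""
    sensitivity = tp / (tp + fn)        # diseased subjects correctly flagged
    specificity = tn / (tn + fp)        # healthy subjects correctly cleared
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 36 assumed DVT cases and 36 assumed controls.
sens, spec, acc = diagnostic_metrics(tp=29, fp=7, fn=7, tn=29)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under this definition accuracy is a weighted average of sensitivity and specificity, so it always lies between the two; an "accuracy" outside that range usually means a different measure, such as the area under the ROC curve, is being reported.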
Distance learning training in genetics and genomics testing for Italian health professionals: results of a pre and post-test evaluation

Directory of Open Access Journals (Sweden)

Maria Benedetta Michelazzo

2015-09-01

Background: Progressive advances in DNA sequencing technologies and decreasing costs are allowing an easier diffusion of genetic and genomic tests. Physicians’ knowledge and confidence on the topic are often low and not adequate for managing this challenge. Tailored educational programs are required to achieve a more appropriate use of genetic technologies. Methods: A distance learning course was created by experts from different Italian medical associations with the support of the Italian Ministry of Health. The course was directed at professional figures involved in the prescription and interpretation of genetic tests. A pretest-post-test study design was used to assess knowledge improvement. We analyzed the proportion of correct answers for each question pre and post-test, as well as the mean score difference stratified by gender, age, professional status and medical specialty. Results: We reported an improvement in the proportion of correct answers for 12 of the 15 questions of the test. The overall mean score significantly increased in the post-test, from 9.44 to 12.49 (p-value < 0.0001). In the stratified analysis we reported an improvement in the knowledge of all the groups except for geneticists; the pre-course mean score of this group was already very high and did not improve significantly. Conclusion: Distance learning is effective in improving the level of genetic knowledge. In the future, it will be useful to analyze which specialists benefit most from genetic education, in order to plan more tailored education for medical professionals.
A high COPD assessment test score may predict anxiety in COPD

Directory of Open Access Journals (Sweden)

Harryanto H

2018-03-01

Hilman Harryanto (1), Sally Burrows (2), Yuben Moodley (1,2). (1) Department of Respiratory Medicine, Fiona Stanley Hospital, Perth, WA, Australia; (2) Faculty of Health and Medical Sciences, Medical School, University of Western Australia, Perth, WA, Australia. The prevalence of anxiety is 55% in patients with COPD [1], and it is associated with worse disease control. Therefore, early recognition and institution of treatment of this comorbidity significantly improve patients’ quality of life. Recently, a questionnaire called the COPD assessment test (CAT) has been incorporated into the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines for the management of COPD, and a higher score is associated with increased COPD symptoms [2]. Considering the regular use of CAT, it was evaluated whether this tool can also be used to identify anxiety. The CAT score was correlated with the Hospital Anxiety and Depression Scale (HADS) to determine the level at which CAT may predict anxiety.
The conformity of BPP and vibroacoustic stimulation results in fetal non reactive non stress test

Directory of Open Access Journals (Sweden)

M. Modarres

2006-08-01

Background: The most frequently used test for evaluation of fetal health is the non-stress test (NST). Unfortunately, it has a high incidence of false positive results. The combination of vibroacoustic stimulation (VAS) with the NST has been shown to reduce non-reactive results. Methods: A test assessment method was chosen with simple randomized sampling. Forty pregnant women with a non-reactive NST in the first 20 minutes who received VAS in one of Tehran University's hospitals had their results compared with biophysical profile (BPP) scores. Vibroacoustic stimulation was applied for 3 seconds on the maternal abdomen and followed within 10 minutes. Data collection tools were NST and sonography instruments, NST result paper, a toothbrush, a watch, a demographic questionnaire and a checklist. Data were analyzed using descriptive statistics and Fisher's exact test (with the level of significance at p<0.05). All statistical analyses were performed using SPSS for Windows. Results: After VAS, 70% of non-reactive tracings became reactive. All cases with a fetal reactivity response after VAS had a subsequent BPP score of 8 (negative predictive value of 100%). The false positive rate of VAS was lower than that of the NST. Conclusion: VAS offers benefits by decreasing the incidence of non-reactive tests and reducing test time. VAS lowers the rate of false positive NST results. VAS is safe and allows more efficient delivery of prenatal services. This test could be used as a rapid antepartum test to predict fetal well-being.
Comprehensive School Reform and Standardized Test Scores in Illinois Elementary and Middle Schools

Science.gov (United States)

McEnroe, James D.

2010-01-01

The study examined the effects of the federally funded Comprehensive School Reform (CSR) program on student performance on mandated standardized tests. The study focused on the mathematics and reading scores of Illinois public elementary and middle and junior high school students. The federal CSR program provided Illinois schools with an annual…

Associations between MMPI-2-RF validity scale scores and extra-test measures of personality and psychopathology.

Science.gov (United States)

Forbey, Johnathan D; Lee, Tayla T C; Ben-Porath, Yossef S; Arbisi, Paul A; Gartland, Diane

2013-08-01

The current study explored associations between two potentially invalidating self-report styles detected by the Validity scales of the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF), over-reporting and under-reporting, and scores on the MMPI-2-RF substantive, as well as eight collateral self-report measures administered either at the same time or within 1 to 10 days of MMPI-2-RF administration. Analyses were conducted with data provided by college students, male prisoners, and male psychiatric outpatients from a Veterans Administration facility. Results indicated that if either an over- or under-reporting response style was suggested by the MMPI-2-RF Validity scales, scores on the majority of the MMPI-2-RF substantive scales, as well as a number of collateral measures, were significantly affected in all three groups in the expected directions. Test takers who were identified as potentially engaging in an over- or under-reporting response style by the MMPI-2-RF Validity scales appeared to approach extra-test measures similarly regardless of when these measures were administered in relation to the MMPI-2-RF. Limitations and suggestions for future study are discussed.
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Science.gov (United States)

Uno, Yota; Mizukami, Hitomi; Ando, Masahiko; Yukihiro, Ryoji; Iwasaki, Yoko; Ozaki, Norio

2014-01-01

The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQIntelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
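The step from a stratum-specific likelihood ratio to a post-test probability can be sketched with the standard odds form of Bayes' rule. The likelihood ratios 13.8 and 0.1 are the ones quoted in the abstract; the 20% pre-test probability is a hypothetical value chosen only for illustration.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability.

    Uses post-test odds = pre-test odds * likelihood ratio.
    """
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


ASSUMED_PRE_TEST = 0.20  # hypothetical prevalence, not from the study

p_low_biq = post_test_probability(ASSUMED_PRE_TEST, 13.8)   # BIQ <= 65 stratum
p_high_biq = post_test_probability(ASSUMED_PRE_TEST, 0.1)   # BIQ >= 76 stratum
print(f"BIQ<=65: {p_low_biq:.1%}  BIQ>=76: {p_high_biq:.1%}")
```

With this assumed starting point, a score in the low stratum raises the probability to roughly three in four, while a score in the high stratum lowers it to a few percent, which is the sense in which the abstract says the condition can be ruled in or out.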
A Novel Scoring System Approach to Assess Patients with Lyme Disease (Nutech Functional Score)

OpenAIRE

Geeta Shroff; Petra Hopf-Seidel

2018-01-01

Introduction: A bacterial infection by Borrelia burgdorferi referred to as Lyme disease (LD) or borreliosis is transmitted mostly by a bite of the tick Ixodes scapularis in the USA and Ixodes ricinus in Europe. Various tests are used for the diagnosis of LD, but their results are often unreliable. We compiled a list of clinically visible and patient-reported symptoms that are associated with LD. Based on this list, we developed a novel scoring system. Methodology: Nutech functional Score (NF...
Using Automated Essay Scores as an Anchor When Equating Constructed Response Writing Tests

Science.gov (United States)

Almond, Russell G.

2014-01-01

Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…
Implications of Deployed and Nondeployed Fathers on Seventh Graders' California Achievement Test Scores during a Military Crisis.

Science.gov (United States)

Pisano, Mark C.

The differences in California Achievement Test (CAT) scores from 1990 to 1991 in seventh graders, currently enrolled in Albritton Junior High School in the Fort Bragg Schools, of deployed and nondeployed fathers were analyzed. CAT percentile scores from 1990 and 1991 (1991 being the year of "Desert Storm") were obtained in reading, math…
How Well Does the Sum Score Summarize the Test? Summability as a Measure of Internal Consistency

NARCIS (Netherlands)

Goeman, J.J.; De Jong, N.H.

2018-01-01

Many researchers use Cronbach's alpha to demonstrate internal consistency, even though it has been shown numerous times that Cronbach's alpha is not suitable for this. Because the intention of questionnaire and test constructors is to summarize the test by its overall sum score, we advocate
Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease

Directory of Open Access Journals (Sweden)

Elaheh Moradi

2017-01-01

Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory, which is widely used for cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that an impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), thus making RAVLT an effective early marker to detect AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to comprehensively study to what extent the RAVLT scores are predictable based on structural magnetic resonance imaging (MRI) data using machine learning approaches as well as to find the most important brain regions for the estimation of RAVLT scores. For this, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) in a dataset consisting of 806 AD, mild cognitive impairment (MCI) or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for the estimation of RAVLT based on MRI data. The top predictors were medial temporal lobe structures and amygdala for the estimation of RAVLT Immediate and angular gyrus, hippocampus and amygdala for the estimation of RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted based on either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.
Clinical score and rapid antigen detection test to guide antibiotic use for sore throats: randomised controlled trial of PRISM (primary care streptococcal management).

Science.gov (United States)

Little, Paul; Hobbs, F D Richard; Moore, Michael; Mant, David; Williamson, Ian; McNulty, Cliodna; Cheng, Ying Edith; Leydon, Geraldine; McManus, Richard; Kelly, Joanne; Barnett, Jane; Glasziou, Paul; Mullee, Mark

2013-10-10

Objective: To determine the effect of clinical scores that predict streptococcal infection or rapid streptococcal antigen detection tests compared with delayed antibiotic prescribing. Design: Open adaptive pragmatic parallel group randomised controlled trial. Setting: Primary care in United Kingdom. Patients: Patients aged ≥ 3 with acute sore throat. Intervention: An internet programme randomised patients to targeted antibiotic use according to: delayed antibiotics (the comparator group for analyses), clinical score, or antigen test used according to clinical score. During the trial a preliminary streptococcal score (score 1, n=1129) was replaced by a more consistent score (score 2, n=631; features: fever during previous 24 hours; purulence; attends rapidly (within three days after onset of symptoms); inflamed tonsils; no cough/coryza (acronym FeverPAIN)). Outcomes: Symptom severity reported by patients on a 7 point Likert scale (mean severity of sore throat/difficulty swallowing for days two to four after the consultation (primary outcome)), duration of symptoms, use of antibiotics. Results: For score 1 there were no significant differences between groups. For score 2, symptom severity was documented in 80% (168/207 (81%) in delayed antibiotics group; 168/211 (80%) in clinical score group; 166/213 (78%) in antigen test group). Reported severity of symptoms was lower in the clinical score group (-0.33, 95% confidence interval -0.64 to -0.02; P=0.04), equivalent to one in three rating sore throat a slight versus moderate problem, with a similar reduction for the antigen test group (-0.30, -0.61 to -0.00; P=0.05). Symptoms rated moderately bad or worse resolved significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63) but not the antigen test group (1.11, 0.88 to 1.40). In the delayed antibiotics group, 75/164 (46%) used antibiotics. Use of antibiotics in the clinical score group (60/161) was 29% lower (adjusted risk ratio 0.71, 95% confidence interval 0.50 to 0.95; P=0.02) and in the
Polytrauma Defined by the New Berlin Definition: A Validation Test Based on Propensity-Score Matching Approach.

Science.gov (United States)

Rau, Cheng-Shyuan; Wu, Shao-Chun; Kuo, Pao-Jen; Chen, Yi-Chun; Chien, Peng-Chen; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua

2017-09-11

Background: Polytrauma patients are expected to have a higher risk of mortality than that obtained by the summation of expected mortality owing to their individual injuries. This study was designed to investigate the outcome of patients with polytrauma, which was defined using the new Berlin definition, as cases with an Abbreviated Injury Scale (AIS) ≥ 3 for two or more different body regions and one or more additional variables from five physiologic parameters (hypotension [systolic blood pressure ≤ 90 mmHg], unconsciousness [Glasgow Coma Scale score ≤ 8], acidosis [base excess ≤ -6.0], coagulopathy [partial thromboplastin time ≥ 40 s or international normalized ratio ≥ 1.4], and age [≥70 years]). Methods: We retrieved detailed data on 369 polytrauma patients and 1260 non-polytrauma patients with an overall Injury Severity Score (ISS) ≥ 18 who were hospitalized between 1 January 2009 and 31 December 2015 for the treatment of all traumatic injuries, from the Trauma Registry System at a level I trauma center. Patients with burn injury or incomplete registered data were excluded. Categorical data were compared with two-sided Fisher exact or Pearson chi-square tests. The unpaired Student t-test and the Mann-Whitney U-test were used to analyze normally distributed continuous data and non-normally distributed data, respectively. A propensity-score matched cohort in a 1:1 ratio was created using the NCSS software with logistic regression to evaluate the effect of polytrauma on patient outcomes. Results: The polytrauma patients had a significantly higher ISS than non-polytrauma patients (median (interquartile range Q1-Q3), 29 (22-36) vs. 24 (20-25), respectively; p Polytrauma patients had 1.9-fold higher odds of mortality than non-polytrauma patients (95% CI 1.38-2.49; p polytrauma patients, polytrauma patients had a substantially longer hospital length of stay (LOS). In addition, a higher proportion of polytrauma patients were admitted to the intensive
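The Berlin definition quoted in the abstract above is a rule that can be written down directly. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, thresholds are copied from the abstract, and a missing measurement is treated as "criterion not met", which is an illustrative choice rather than something the abstract specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TraumaPatient:
    # Highest AIS severity per body region, e.g. {"head": 4, "thorax": 3}.
    ais_by_region: Dict[str, int] = field(default_factory=dict)
    systolic_bp_mmhg: Optional[float] = None
    glasgow_coma_scale: Optional[int] = None
    base_excess: Optional[float] = None
    ptt_seconds: Optional[float] = None
    inr: Optional[float] = None
    age_years: Optional[float] = None


def is_polytrauma_berlin(p: TraumaPatient) -> bool:
    """Berlin definition as stated in the abstract (illustrative sketch)."""
    severe_regions = sum(1 for ais in p.ais_by_region.values() if ais >= 3)

    def met(value, test):
        # Assumption: an unmeasured parameter does not count as met.
        return value is not None and test(value)

    physiologic = [
        met(p.systolic_bp_mmhg, lambda v: v <= 90),        # hypotension
        met(p.glasgow_coma_scale, lambda v: v <= 8),       # unconsciousness
        met(p.base_excess, lambda v: v <= -6.0),           # acidosis
        met(p.ptt_seconds, lambda v: v >= 40)
        or met(p.inr, lambda v: v >= 1.4),                 # coagulopathy
        met(p.age_years, lambda v: v >= 70),               # age
    ]
    return severe_regions >= 2 and any(physiologic)


example = TraumaPatient({"head": 4, "thorax": 3}, systolic_bp_mmhg=85)
print(is_polytrauma_berlin(example))
```

Both conditions are required: two or more body regions with AIS ≥ 3, and at least one of the five physiologic criteria.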
Persian competing word test: Development and preliminary results in normal children

Directory of Open Access Journals (Sweden)

Mohammad Ebrahim Mahdavi

2008-12-01

Background and Aim: Assessment of central auditory processing skills requires various behavioral tests in the format of a test battery. There are few Persian speech tests for documenting central auditory processing disorders. The purpose of this study was to develop a dichotic test composed of monosyllabic words suitable for evaluation of central auditory processing in Persian-speaking children and to report its preliminary results in a group of normal children. Materials and Methods: The Persian words in competing manner test was developed using the most frequent monosyllabic words in children's storybooks reported in previous research. The test was performed at MCL on forty-five normal children (39 right-handed and 6 left-handed) aged 5-11 years. The children did not show any obvious problems in hearing, speech, language or learning. Free (n=28) and directed listening (n=17) tasks were investigated. Results: The results show that in the directed listening task, there is a significant advantage for performance of the pre-cued ear relative to the opposite side. A right ear advantage is evident in the free recall condition. Average performance of the children in directed recall is significantly better than in free recall. The average raw score of the test increases with the children's age. Conclusion: The Persian words in competing manner test, as a dichotic test, can show the major characteristics of dichotic listening and the effect of maturation of the central auditory system on it in normal children.
Unexplained Graft Dysfunction after Heart Transplantation—Role of Novel Molecular Expression Test Score and QTc-Interval: A Case Report

Directory of Open Access Journals (Sweden)

Khurram Shahzad

2010-01-01

In the current era of immunosuppressive medications, there is an increased observed incidence of graft dysfunction in the absence of known histological criteria of rejection after heart transplantation. A noninvasive molecular expression diagnostic test was developed and validated to rule out histological acute cellular rejection. In this paper we present, for the first time, the longitudinal pattern of changes in this novel diagnostic test score along with the QTc-interval in a patient who was admitted with unexplained graft dysfunction. The patient presented with graft failure with negative findings on all known criteria of rejection, including acute cellular rejection, antibody-mediated rejection and cardiac allograft vasculopathy. The molecular expression test score showed a gradual increase and the QTc-interval showed gradual prolongation with the gradual decline in graft function. This paper exemplifies that in patients presenting with unexplained graft dysfunction, the GEP test score and QTc-interval correlate with changes in graft function.
Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer.

Science.gov (United States)

Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Snell, Kym; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Menon, Usha; Deeks, Jon

2016-08-09

Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests are evaluated and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios. Published by the BMJ Publishing Group Limited.
A Novel Scoring System Approach to Assess Patients with Lyme Disease (Nutech Functional Score)

Directory of Open Access Journals (Sweden)

Geeta Shroff

2018-01-01

Introduction: A bacterial infection by Borrelia burgdorferi, referred to as Lyme disease (LD) or borreliosis, is transmitted mostly by a bite of the tick Ixodes scapularis in the USA and Ixodes ricinus in Europe. Various tests are used for the diagnosis of LD, but their results are often unreliable. We compiled a list of clinically visible and patient-reported symptoms that are associated with LD. Based on this list, we developed a novel scoring system. Methodology: The Nutech Functional Score (NFS) is a 43-point positional (every symptom is subgraded and each alternative gets some points according to its position) and directional (moves in direction bad to good) scoring system that assesses the patient's condition. Results: The grades of the scoring system have been converted into numeric values for conducting probability-based studies. Each symptom is graded from 1 to 5, running in the direction BAD → GOOD. Conclusion: NFS is a unique tool that can be used universally to assess the condition of patients with LD.
Test-retest reliability and minimal detectable change scores for sit-to-stand-to-sit tests, the six-minute walk test, the one-leg heel-rise test, and handgrip strength in people undergoing hemodialysis.

Science.gov (United States)

Segura-Ortí, Eva; Martínez-Olmos, Francisco José

2011-08-01

Determining the relative and absolute reliability of outcomes of physical performance tests for people undergoing hemodialysis is necessary to discriminate between the true effects of exercise interventions and the inherent variability of this cohort. The aims of this study were to assess the relative reliability of sit-to-stand-to-sit tests (the STS-10, which measures the time [in seconds] required to complete 10 full stands from a sitting position, and the STS-60, which measures the number of repetitions achieved in 60 seconds), the Six-Minute Walk Test (6MWT), the one-leg heel-rise test, and the handgrip strength test and to calculate minimal detectable change (MDC) scores in people undergoing hemodialysis. This study was a prospective, nonexperimental investigation. Thirty-nine people undergoing hemodialysis at 2 clinics in Spain were contacted. Study participants performed the STS-10 (n=37), the STS-60 (n=37), and the 6MWT (n=36). At one of the settings, the participants also performed the one-leg heel-rise test (n=21) and the handgrip strength test (n=12) on both the right and the left sides. Participants attended 2 testing sessions 1 to 2 weeks apart. High intraclass correlation coefficients (≥.88) were found for all tests, suggesting good relative reliability. The MDC scores at 90% confidence intervals were as follows: 8.4 seconds for the STS-10, 4 repetitions for the STS-60, 66.3 m for the 6MWT, 3.4 kg for handgrip strength (force-generating capacity), 3.7 repetitions for the one-leg heel-rise test with the right leg, and 5.2 repetitions for the one-leg heel-rise test with the left leg. Limitations: A limited sample of patients was used in this study. The STS-10, STS-60, 6MWT, one-leg heel-rise test, and handgrip strength test are reliable outcome measures. The MDC scores at 90% confidence intervals for these tests will help to determine whether a change is due to error or to an intervention.
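The abstract reports MDC scores but not the formula behind them. A minimal sketch, assuming the commonly used definition in which the standard error of measurement is SEM = SD × √(1 − ICC) and MDC90 = 1.645 × √2 × SEM, is shown below. The standard deviation and ICC values are hypothetical and are not taken from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and the intraclass correlation coefficient."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.645):
    """MDC at the confidence level implied by z (1.645 corresponds to 90%)."""
    # sqrt(2) accounts for measurement error in both the test and the retest.
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs: SD of 10 units and ICC of 0.91.
example_sem = standard_error_of_measurement(sd=10.0, icc=0.91)
example_mdc = minimal_detectable_change(sd=10.0, icc=0.91)
print(f"SEM = {example_sem:.2f}, MDC90 = {example_mdc:.2f}")
```

Under this definition, a retest change smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.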
Rugby versus Soccer in South Africa: Content Familiarity Contributes to Cross-Cultural Differences in Cognitive Test Scores

Science.gov (United States)

Malda, Maike; van de Vijver, Fons J. R.; Temane, Q. Michael

2010-01-01

In this study, cross-cultural differences in cognitive test scores are hypothesized to depend on a test's cultural complexity (Cultural Complexity Hypothesis: CCH), here conceptualized as its content familiarity, rather than on its cognitive complexity (Spearman's Hypothesis: SH). The content familiarity of tests assessing short-term memory,…
Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

Energy Technology Data Exchange (ETDEWEB)

Zain, Zakiyah, E-mail: zac@uum.edu.my; Ahmad, Yuhaniz, E-mail: yuhaniz@uum.edu.my [School of Quantitative Sciences, Universiti Utara Malaysia, UUM Sintok 06010, Kedah (Malaysia)]; Azwan, Zairul, E-mail: zairulazwan@gmail.com; Raduan, Farhana, E-mail: farhanaraduan@gmail.com; Sagap, Ismail, E-mail: drisagap@yahoo.com [Surgery Department, Universiti Kebangsaan Malaysia Medical Centre, Jalan Yaacob Latif, 56000 Bandar Tun Razak, Kuala Lumpur (Malaysia)]; Aziz, Nazrina, E-mail: nazrina@uum.edu.my

2014-12-04

Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated with the overall survival. In this study, global score test methodology is used to combine the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data on tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.
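The abstract does not give the form of the global statistic. One common way to combine correlated standardized statistics into a single test, shown here purely as an assumption and not as the authors' method, is to sum the per-endpoint z statistics and divide by the standard deviation of that sum, which depends on their correlation matrix. All numbers below are hypothetical.

```python
import math


def global_statistic(z_scores, corr):
    """Combine standardized endpoint statistics into one approximately N(0, 1) statistic.

    z_scores: one standardized score statistic per endpoint.
    corr: correlation matrix of those statistics under the null hypothesis.
    """
    k = len(z_scores)
    # Variance of the sum of the z statistics equals the sum of all correlation entries.
    variance_of_sum = sum(corr[i][j] for i in range(k) for j in range(k))
    return sum(z_scores) / math.sqrt(variance_of_sum)


def two_sided_p_value(statistic):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(statistic) / math.sqrt(2.0))


# Hypothetical statistics for four endpoints (three recurrences and death),
# with an assumed pairwise correlation of 0.5 between endpoints.
z = [1.2, 0.8, 1.5, 1.0]
corr = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]

t_global = global_statistic(z, corr)
p_global = two_sided_p_value(t_global)
print(f"global statistic = {t_global:.3f}, p = {p_global:.3f}")
```

Ignoring the correlation between endpoints would understate the variance of the sum and overstate the evidence, which is why the correlation matrix enters the denominator.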
Comparison of multianalyte proficiency test results by sum of ranking differences, principal component analysis, and hierarchical cluster analysis.

Science.gov (United States)

Škrbić, Biljana; Héberger, Károly; Durišić-Mladenović, Nataša

2013-10-01

Sum of ranking differences (SRD) was applied for comparing multianalyte results obtained by several analytical methods used in one or in different laboratories, i.e., for ranking the overall performances of the methods (or laboratories) in simultaneous determination of the same set of analytes. The data sets for testing of the SRD applicability contained the results reported during one of the proficiency tests (PTs) organized by EU Reference Laboratory for Polycyclic Aromatic Hydrocarbons (EU-RL-PAH). In this way, the SRD was also tested as a discriminant method alternative to existing average performance scores used to compare multianalyte PT results. SRD should be used along with the z scores, the most commonly used PT performance statistics. SRD was further developed to handle the same rankings (ties) among laboratories. Two benchmark concentration series were selected as reference: (a) the assigned PAH concentrations (determined precisely beforehand by the EU-RL-PAH) and (b) the averages of all individual PAH concentrations determined by each laboratory. Ranking relative to the assigned values and also to the average (or median) values pointed to the laboratories with the most extreme results, as well as revealed groups of laboratories with similar overall performances. SRD reveals differences between methods or laboratories even if classical test(s) cannot. The ranking was validated using comparison of ranks by random numbers (a randomization test) and using sevenfold cross-validation, which highlighted the similarities among the (methods used in) laboratories. Principal component analysis and hierarchical cluster analysis justified the findings based on SRD ranking/grouping. If the PAH concentrations are row-scaled (i.e., z scores are analyzed as input for ranking), SRD can still be used for checking the normality of errors. Moreover, cross-validation of SRD on z scores groups the laboratories similarly. The SRD technique is general in nature, i.e., it can
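The core SRD calculation can be sketched as follows: rank the analytes within each laboratory's results and within the reference series, then sum the absolute rank differences. This is a minimal illustration with hypothetical concentrations; tie handling by average ranks is an assumption here, and the validation steps described in the abstract (randomization test, cross-validation) are not included.

```python
def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def sum_of_ranking_differences(results, reference):
    """SRD of one laboratory's analyte results against a reference series."""
    lab_ranks = average_ranks(results)
    ref_ranks = average_ranks(reference)
    return sum(abs(a - b) for a, b in zip(lab_ranks, ref_ranks))


# Hypothetical concentrations for four analytes.
reference = [1.0, 2.0, 3.0, 4.0]
lab_a = [1.1, 2.1, 2.9, 4.2]   # same ordering as the reference
lab_b = [4.0, 3.0, 2.0, 1.0]   # reversed ordering

srd_a = sum_of_ranking_differences(lab_a, reference)
srd_b = sum_of_ranking_differences(lab_b, reference)
print(srd_a, srd_b)
```

A smaller SRD means the laboratory orders the analytes more like the reference does, so laboratories can be compared by sorting on this value.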
Gaze Stabilization Test Asymmetry Score as an Indicator of Previous Concussion in a Cohort of Collegiate Football Players.

Science.gov (United States)

Honaker, Julie A; Criter, Robin E; Patterson, Jessie N; Jones, Sherri M

2015-07-01

Vestibular dysfunction may lead to decreased visual acuity with head movements, which may impede athletic performance and result in injury. The purpose of this study was to test the hypothesis that athletes with history of concussion would have differences in gaze stabilization test (GST) as compared with those without a history of concussion. Cross-sectional, descriptive. University Athletic Medicine Facility. Fifteen collegiate football players with a history of concussion, 25 collegiate football players without a history of concussion. Participants completed the dizziness handicap inventory (DHI), static visual acuity, perception time test, active yaw plane GST, stability evaluation test (SET), and a bedside oculomotor examination. Independent samples t test was used to compare GST, SET, and DHI scores per group, with Bonferroni-adjusted alpha at P history of concussion. The results support further research on the use of GST for sport-related concussion evaluation and monitoring. Inclusion of objective vestibular tests in the concussion protocol may reveal the presence of peripheral vestibular or visual-vestibular deficits. Therefore, the GST may add an important perspective on the effects of concussion.
Differences in distribution of T-scores and Z-scores among bone densitometry tests in postmenopausal women (a comparative study)

International Nuclear Information System (INIS)

Wendlova, J.

2002-01-01

To determine the character of T-score and Z-score value distribution in individually selected methods of bone densitometry and to compare them using statistical analysis. We examined 56 postmenopausal women aged 43 to 68 years with osteopenia or osteoporosis according to the WHO classification. The following measurements were made in each patient: T-score and Z-score for: 1) Stiffness index (S) of the left heel bone, USM (index). 2) Bone mineral density of the left heel bone (BMDh), DEXA (g of Ca hydroxyapatite per cm²). 3) Bone mineral density of trabecular bone of the L1 vertebra (BMDL1), QCT (mg of Ca hydroxyapatite per cm³). The densitometers used in the study were: ultrasonometer to measure heel bone, Achilles Plus, LUNAR, USA; DEXA to measure heel bone, PIXI, LUNAR, USA; QCT to measure the L1 vertebra, CT, SOMATOM Plus, Siemens, Germany. Statistical analysis: differences between measured values of T-scores (Z-scores) were evaluated by parametric or non-parametric methods of determining the 95% confidence intervals (C.I.). Differences between Z-score and T-score values for compared measurements were statistically significant; however, these differences were lower for Z-scores. The largest differences in 95% C.I., characterizing individual measurements of T-score values (in comparison with Z-scores), were found for those densitometers whose age range of the reference groups of young adults differed the most, and conversely, the smallest differences in T-score values were found when the differences between the age ranges of reference groups were smallest. The higher variation in T-score values in comparison to Z-scores is also caused by a non-standard selection of the reference groups of young adults for the QCT, PIXI and Achilles Plus densitometers used in the study. Age characteristics of the reference group for T-scores should be standardized for all types of densitometers. (author)
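The distinction between the two scores can be made concrete with a short sketch. Both are standard scores; the T-score uses a young-adult reference group and the Z-score an age-matched one. The reference means and standard deviations below are hypothetical, and the classification uses the usual WHO T-score cut-offs (−1 and −2.5).

```python
def standard_score(value, reference_mean, reference_sd):
    """Number of reference standard deviations the value lies from the reference mean."""
    return (value - reference_mean) / reference_sd


def who_category(t_score):
    """WHO classification based on the T-score."""
    if t_score <= -2.5:
        return "osteoporosis"
    if t_score < -1.0:
        return "osteopenia"
    return "normal"


# Hypothetical patient BMD and reference data (g/cm²).
patient_bmd = 0.80
t_score = standard_score(patient_bmd, reference_mean=1.00, reference_sd=0.10)  # young adults
z_score = standard_score(patient_bmd, reference_mean=0.90, reference_sd=0.10)  # age-matched

print(f"T-score = {t_score:.1f} ({who_category(t_score)}), Z-score = {z_score:.1f}")
```

Because the T-score depends entirely on the young-adult reference group, two devices with differently selected reference groups can give different T-scores for the same patient, which is the standardization issue the abstract raises.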
A Study on Variables that Affect Class Scores of Primary Education Students in Placement Test

OpenAIRE

Yavuz, Mustafa

2010-01-01

This study aims to determine the variables that predict class scores which are obtained by adding 70 % of the Placement Test (PT) scores of the primary education sixth and seventh grade students who took it for the first time in the 2007-2008 academic year within the framework of the system of passing to secondary education reorganized by the MNE, 25 % of their end-of-the-year passing grades. The study is of general survey model. The study group consists of students who took the PT in the 200...

Cognitive Learning Strategy as a Partial Effect on Major Field Test in Business Results

Science.gov (United States)

Strang, Kenneth David

2014-01-01

An experiment was developed to determine if cognitive learning strategies improved standardized university business exam results. Previous studies revealed that factors such as prior ability, age, gender, and culture predicted a student's Major Field Test in Business (MFTB) score better than course content. The experiment control consisted of…
Linkage between company scores and stock returns

Directory of Open Access Journals (Sweden)

Saban Celik

2017-12-01

Previous studies on company scores conducted at firm level generally concluded that there exists a positive relation between company scores and stock returns. Motivated by these studies, this study examines the relationship between company scores (Corporate Governance Score, Economic Score, Environmental Score, and Social Score) and stock returns, using both portfolio-level analysis and firm-level cross-sectional regressions. In portfolio-level analysis, stocks are sorted based on each company score and quintile portfolios are formed with different levels of company scores. Then, the existence and significance of differences in raw returns and risk-adjusted returns between portfolios with the extreme company scores (portfolio 10 and portfolio 1) are tested. In addition, firm-level cross-sectional regression is performed to examine the significance of company score effects with control variables. While portfolio-level analysis results indicate that there is no significant relation between company scores and stock returns, firm-level analysis indicates that economic, environmental, and social scores have an effect on stock returns; however, the significance and direction of these effects change depending on the control variables included in the cross-sectional regression.
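The portfolio-level sort described here can be sketched in a few lines: order stocks by score, split them into equally sized groups, and compare mean returns of the two extreme groups. The abstract mentions both quintiles and a "portfolio 10", so the number of groups is left as a parameter. The data are hypothetical, and the significance test and risk adjustment are deliberately omitted.

```python
def portfolio_mean_returns(scores, returns, n_groups):
    """Mean return of each score-sorted portfolio, from lowest to highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    size = len(order) // n_groups
    means = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder when the split is uneven.
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        members = order[start:end]
        means.append(sum(returns[i] for i in members) / len(members))
    return means


# Hypothetical scores and returns for ten stocks.
scores = list(range(1, 11))
returns = [0.01 * s for s in scores]

means = portfolio_mean_returns(scores, returns, n_groups=5)
spread = means[-1] - means[0]  # high-score minus low-score portfolio
print([round(m, 3) for m in means], round(spread, 3))
```

In an actual study the spread would be computed per period and tested against zero over time; this sketch shows only the sorting step.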
An analysis of aviation test scores to characterize Student Naval Aviator disqualification

OpenAIRE

Wahl, Erich J.

1998-01-01

Approved for public release; distribution is unlimited. The U.S. Navy uses the Aviation Selection Test Battery (ASTB) to identify those Student Naval Aviator (SNA) applicants most likely to succeed in flight training. Using classification and regression trees, this thesis concludes that individual answers to an ASTB subtest, the Biographical Inventory, are not good predictors of SNA primary flight grades. It also concludes that those SNA who score less than a 6 on the Pilot Biographical Inv...
Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score.

Science.gov (United States)

Hung, Man; Hon, Shirley D; Cheng, Christine; Franklin, Jeremy D; Aoki, Stephen K; Anderson, Mike B; Kapron, Ashley L; Peters, Christopher L; Pelt, Christopher E

2014-12-01

The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Cohort study (diagnosis); Level of evidence, 2. Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidence was examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (i.e., ceiling and floor effects), and reliability evidence was examined through Cronbach alpha and person separation index. All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior in all psychometric aspects examined in this study. Future
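Two of the quantities reported here, Cronbach alpha and the ceiling effect, have simple definitions that a short sketch can make concrete. The item responses and total scores below are hypothetical, and the function names are illustrative only.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha; `items` is a list of item columns, one list of responses per item."""
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]  # total score per person
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


def ceiling_effect(total_scores, max_score):
    """Percentage of respondents who obtained the maximum possible score."""
    at_ceiling = sum(1 for score in total_scores if score == max_score)
    return 100.0 * at_ceiling / len(total_scores)


# Hypothetical responses of five people to three perfectly consistent items.
items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5],
]
alpha = cronbach_alpha(items)

# Hypothetical total scores on a 0-100 instrument.
ceiling = ceiling_effect([100, 90, 100, 80], max_score=100)
print(f"alpha = {alpha:.2f}, ceiling effect = {ceiling:.1f}%")
```

A high ceiling effect means many respondents cannot be distinguished at the top of the scale, which matters for an active, high-functioning population such as the one studied.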
Validating Score Interpretations and Uses: Messick Lecture, Language Testing Research Colloquium, Cambridge, April 2010

Science.gov (United States)

Kane, Michael

2012-01-01

The argument-based approach to validation involves two steps: specification of the proposed interpretations and uses of the test scores as an interpretive argument, and evaluation of the plausibility of the proposed interpretive argument. More ambitious interpretations and uses tend to involve an extended network of inferences and assumptions…
GalaxyDock BP2 score: a hybrid scoring function for accurate protein-ligand docking

Science.gov (United States)

Baek, Minkyung; Shin, Woong-Hee; Chung, Hwan Won; Seok, Chaok

2017-07-01

Protein-ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein-ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein-ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.
The Changes of Students’ Toefl Score After One Year Learning

Directory of Open Access Journals (Sweden)

Ienneke Indra Dewi

2015-10-01

BINUS students are expected to increase their English competence, as indicated by their TOEFL scores. This paper aims to observe the differences between students' TOEFL scores obtained when they entered BINUS and their scores after they had taken TOEFL courses at BINUS for one year. The participants were 121 students. The data for the entrance test were taken from the BINUS data center, and the final test data were taken from their final test in English class. The data were analysed using descriptive statistics, comparison of means, and correlation. To support the quantitative data, a set of questionnaires was distributed to those 121 students. The results show that the students' TOEFL scores increased significantly in the final test compared to those in the entrance test. The low-achieving students showed greater improvement than the higher-achieving ones. Students' motivation and background support their English study. Students proved to have the most problems with listening. The results of the research are expected to provide input for English lecturers to improve their teaching, especially regarding the existence of SALLC (Self Access Language Learning Center).
Utilizing the Six Realms of Meaning in Improving Campus Standardized Test Scores through Team Teaching and Strategic Planning

Science.gov (United States)

Stevenson, Rosnisha D.; Kritsonis, William Allan

2009-01-01

This article will seek to utilize Dr. William Allan Kritsonis' book "Ways of Knowing Through the Realms of Meaning" (2007) as a framework to improve a campus's standardized test scores, more specifically, their TAKS (Texas Assessment of Knowledge and Skills) scores. Many campuses have an improvement plan, also known as a Campus…
Integrating GIS in the Middle School Curriculum: Impacts on Diverse Students' Standardized Test Scores

Science.gov (United States)

Goldstein, Donna; Alibrandi, Marsha

2013-01-01

This case study conducted with 1,425 middle school students in Palm Beach County, Florida, included a treatment group receiving GIS instruction (256) and a control group without GIS instruction (1,169). Quantitative analyses on standardized test scores indicated that inclusion of GIS in middle school curriculum had a significant effect on student…
Virginia tech freshman class becoming more competitive; Rise in grades and test scores noted

OpenAIRE

Virginia Tech News

2004-01-01

Admission to Virginia Tech continues to become more competitive as applicants report higher grade point averages and test scores than in previous years. The incoming class of 4,975 students has an average grade point average (GPA) of 3.68 and an average SAT score of 1203, up from a 3.60 GPA and 1197 SAT in 2003.
Science Teacher Efficacy and Outcome Expectancy as Predictors of Students' End-of-Instruction (EOI) Biology I Test Scores

Science.gov (United States)

Angle, Julie; Moseley, Christine

2009-01-01

The purpose of this study was to compare teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the statewide End-of-Instruction (EOI) Biology I test met or exceeded the state academic proficiency level (Proficient Group) to teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the…
A Novel Scoring System Approach to Assess Patients with Lyme Disease (Nutech Functional Score).

Science.gov (United States)

Shroff, Geeta; Hopf-Seidel, Petra

2018-01-01

A bacterial infection by Borrelia burgdorferi, referred to as Lyme disease (LD) or borreliosis, is transmitted mostly by a bite of the tick Ixodes scapularis in the USA and Ixodes ricinus in Europe. Various tests are used for the diagnosis of LD, but their results are often unreliable. We compiled a list of clinically visible and patient-reported symptoms that are associated with LD. Based on this list, we developed a novel scoring system, the Nutech Functional Score (NFS), which is a 43-point positional (every symptom is subgraded and each alternative gets some points according to its position) and directional (moves in direction bad to good) scoring system that assesses the patient's condition. The grades of the scoring system have been converted into numeric values for conducting probability-based studies. Each symptom is graded from 1 to 5, running in the direction BAD → GOOD. NFS is a unique tool that can be used universally to assess the condition of patients with LD.
Similar predictions of etravirine sensitivity regardless of genotypic testing method used: comparison of available scoring systems.

Science.gov (United States)

Vingerhoets, Johan; Nijs, Steven; Tambuyzer, Lotke; Hoogstoel, Annemie; Anderson, David; Picchio, Gaston

2012-01-01

The aims of this study were to compare various genotypic scoring systems commonly used to predict virological outcome to etravirine, and examine their concordance with etravirine phenotypic susceptibility. Six etravirine genotypic scoring systems were assessed: Tibotec 2010 (based on 20 mutations; TBT 20), Monogram, Stanford HIVdb, ANRS, Rega (based on 37, 30, 27 and 49 mutations, respectively) and virco(®)TYPE HIV-1 (predicted fold change based on genotype). Samples from treatment-experienced patients who participated in the DUET trials and with both genotypic and phenotypic data (n=403) were assessed using each scoring system. Results were retrospectively correlated with virological response in DUET. κ coefficients were calculated to estimate the degree of correlation between the different scoring systems. Correlation between the five scoring systems and the TBT 20 system was approximately 90%. Virological response by etravirine susceptibility was comparable regardless of which scoring system was utilized, with 70-74% of DUET patients determined as susceptible to etravirine by the different scoring systems achieving plasma viral load <50 HIV-1 RNA copies/ml. In samples classed as phenotypically susceptible to etravirine (fold change in 50% effective concentration ≤3), correlations with genotypic score were consistently high across scoring systems (≥70%). In general, the etravirine genotypic scoring systems produced similar results, and genotype-phenotype concordance was high. As such, phenotypic interpretations, and in their absence all genotypic scoring systems investigated, may be used to reliably predict the activity of etravirine.
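The abstract does not say which κ variant was used. A minimal sketch assuming Cohen's κ for two scoring systems that each classify the same samples is given below; the classifications ("S" for susceptible, "R" for resistant) are hypothetical and are not DUET data.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Chance-corrected agreement between two classifications of the same samples."""
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    counts_a, counts_b = Counter(calls_a), Counter(calls_b)
    categories = set(calls_a) | set(calls_b)
    # Agreement expected if the two systems classified independently.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls from two scoring systems on ten samples.
system_a = ["S", "S", "S", "R", "R", "S", "R", "S", "S", "R"]
system_b = ["S", "S", "R", "R", "R", "S", "R", "S", "S", "S"]

kappa = cohens_kappa(system_a, system_b)
print(f"kappa = {kappa:.3f}")
```

Here the raw agreement is 80%, but κ is lower because part of that agreement would be expected by chance alone, which is why percent agreement and κ should not be read interchangeably.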
Item response theory scoring and the detection of curvilinear relationships.

Science.gov (United States)

Carter, Nathan T; Dalal, Dev K; Guan, Li; LoPilato, Alexander C; Withrow, Scott A

2017-03-01

Psychologists are increasingly positing theories of behavior that suggest psychological constructs are curvilinearly related to outcomes. However, results from empirical tests for such curvilinear relations have been mixed. We propose that correctly identifying the response process underlying responses to measures is important for the accuracy of these tests. Indeed, past research has indicated that item responses to many self-report measures follow an ideal point response process (wherein respondents agree only to items that reflect their own standing on the measured variable) as opposed to a dominance process, wherein stronger agreement, regardless of item content, is always indicative of higher standing on the construct. We test whether item response theory (IRT) scoring appropriate for the underlying response process to self-report measures results in more accurate tests for curvilinearity. In 2 simulation studies, we show that, regardless of the underlying response process used to generate the data, using the traditional sum-score generally results in high Type 1 error rates or low power for detecting curvilinearity, depending on the distribution of item locations. With few exceptions, appropriate power and Type 1 error rates are achieved when dominance-based and ideal point-based IRT scoring are correctly used to score dominance and ideal point response data, respectively. We conclude that (a) researchers should be theory-guided when hypothesizing and testing for curvilinear relations; (b) correctly identifying whether responses follow an ideal point versus dominance process, particularly when items are not extreme, is critical; and (c) IRT model-based scoring is crucial for accurate tests of curvilinearity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Do medical students’ scores using different assessment instruments predict their scores in clinical reasoning using a computer-based simulation?

Directory of Open Access Journals (Sweden)

Fida M

2015-02-01

Mariam Fida (Department of Molecular Medicine, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Bahrain); Salah Eldin Kassab (Department of Medical Education, Faculty of Medicine, Suez Canal University, Ismailia, Egypt). Purpose: The development of clinical problem-solving skills evolves over time and requires structured training and background knowledge. Computer-based case simulations (CCS) have been used for teaching and assessment of clinical reasoning skills. However, previous studies examining the psychometric properties of CCS as an assessment tool have been controversial. Furthermore, studies reporting the integration of CCS into problem-based medical curricula have been limited. Methods: This study examined the psychometric properties of using CCS software (DxR Clinician) for assessment of medical students (n=130) studying in a problem-based, integrated multisystem module (Unit IX) during the academic year 2011–2012. Internal consistency reliability of CCS scores was calculated using Cronbach's alpha statistics. The relationships between students' scores in CCS components (clinical reasoning, diagnostic performance, and patient management) and their scores in other examination tools at the end of the unit, including multiple-choice questions, short-answer questions, objective structured clinical examination (OSCE), and real patient encounters, were analyzed using stepwise hierarchical linear regression. Results: Internal consistency reliability of CCS scores was high (α=0.862). Inter-item correlations between students' scores in different CCS components and their scores in CCS and other test items were statistically significant. Regression analysis indicated that OSCE scores predicted 32.7% and 35.1% of the variance in clinical reasoning and patient management scores, respectively (P<0.01). Multiple-choice question scores, however, predicted only 15.4% of the variance in diagnostic performance scores (P<0.01), while
The Effect of Computer-Based Self-Access Learning on Weekly Vocabulary Test Scores

Directory of Open Access Journals (Sweden)

Jordan Dreyer

2014-09-01

This study sets out to clarify the effectiveness of using an online vocabulary study tool, Quizlet, in an urban high school language arts class. Previous similar studies have mostly dealt with English Language Learners in college settings (Chui, 2013), and were therefore not directed at the issue of self-efficacy that is at the heart of the problem of urban high school students in America entering remedial writing programs (Rose, 1989). The study involves 95 students over the course of 14 weeks. Students were tested weekly and were asked to use the Quizlet program in their own free time. The result of this optional involvement was that many students did not participate in the treatment and therefore acted as an elective control group. The resultant data show a strong correlation between the use of an online vocabulary review program and short-term vocabulary retention. The study also showed that students who paced themselves and spread out their study sessions outperformed those students who used the program only for last-minute “cram sessions.” The implications of the study are that students who take advantage of tools outside of the classroom are able to outperform their peers. The results are also in line with the call to include technology in the Basic Writing classroom not simply as a tool, but as a “form of discourse” (Jonaitis, 2012). Weekly vocabulary tests, combined with the daily online activity as reported by Quizlet, show that: (1) utilizing the review software improved the scores of most students, (2) those students who used Quizlet to review more than a single time (i.e., several days before the test) outperformed those who only used the product once, and (3) students who professed proficiency with the “notebook” system of vocabulary learning appeared not to need the treatment.
CCTF CORE I test results

International Nuclear Information System (INIS)

Murao, Yoshio; Sudoh, Takashi; Akimoto, Hajime; Iguchi, Tadashi; Sugimoto, Jun; Fujiki, Kazuo; Hirano, Kenmei

1982-07-01

This report presents the results of the following CCTF CORE I tests conducted in FY 1980: (1) multi-dimensional effect test, (2) evaluation model test, and (3) FLECHT coupling test. For the first test, one-dimensional treatment of the core thermohydrodynamics was discussed. For the second and third tests, the test results were compared with the results calculated by the evaluation model codes and with the results of the corresponding FLECHT-SET test (Run 2714B), respectively. The work was performed under contracts with the Atomic Energy Bureau of Science and Technology Agency of Japan. (author)
The Validity of Graduate Management Admission Test Scores: A Summary of Studies Conducted from 1997 to 2004

Science.gov (United States)

Talento-Miller, Eileen; Rudner, Lawrence M.

2008-01-01

The validity of Graduate Management Admission Test (GMAT) scores is examined by summarizing 273 studies conducted between 1997 and 2004. Each of the studies was conducted through the Validity Study Service of the test sponsor and contained identical variables and statistical methods. Validity coefficients from each of the studies were corrected…
What does my patient's coronary artery calcium score mean? Combining information from the coronary artery calcium score with information from conventional risk factors to estimate coronary heart disease risk

Directory of Open Access Journals (Sweden)

Pletcher Mark J

2004-08-01

Background: The coronary artery calcium (CAC) score is an independent predictor of coronary heart disease. We sought to combine information from the CAC score with information from conventional cardiac risk factors to produce post-test risk estimates, and to determine whether the score may add clinically useful information. Methods: We measured the independent cross-sectional associations between conventional cardiac risk factors and the CAC score among asymptomatic persons referred for non-contrast electron beam computed tomography. Using the resulting multivariable models and published CAC score-specific relative risk estimates, we estimated post-test coronary heart disease risk in a number of different scenarios. Results: Among 9341 asymptomatic study participants (age 35–88 years, 40% female), we found that conventional coronary heart disease risk factors including age, male sex, self-reported hypertension, diabetes and high cholesterol were independent predictors of the CAC score, and we used the resulting multivariable models for predicting post-test risk in a variety of scenarios. Our models predicted, for example, that a 60-year-old non-smoking non-diabetic woman with hypertension and high cholesterol would have a 47% chance of having a CAC score of zero, reducing her 10-year risk estimate from 15% (per Framingham) to 6–9%; if her score were over 100, however (a 17% chance), her risk estimate would be markedly higher (25–51% in 10 years). In low risk scenarios, the CAC score is very likely to be zero or low, and unlikely to change management. Conclusion: Combining information from the CAC score with information from conventional risk factors can change assessment of coronary heart disease risk to an extent that may be clinically important, especially when the pre-test 10-year risk estimate is intermediate. The attached spreadsheet makes these calculations easy.
The Dysexecutive Questionnaire advanced: item and test score characteristics, 4-factor solution, and severity classification.

Science.gov (United States)

Bodenburg, Sebastian; Dopslaff, Nina

2008-01-01

The Dysexecutive Questionnaire (DEX; Behavioral assessment of the dysexecutive syndrome, 1996) is a standardized instrument to measure possible behavioral changes as a result of the dysexecutive syndrome. Although initially intended only as a qualitative instrument, the DEX has also been used increasingly to address quantitative problems. Until now there have not been more fundamental statistical analyses of the questionnaire's testing quality. The present study is based on an unselected sample of 191 patients with acquired brain injury and reports on the data relating to the quality of the items, the reliability and the factorial structure of the DEX. Item 3 displayed too great an item difficulty, whereas item 11 was not sufficiently discriminating. The DEX's reliability in self-rating is r = 0.85. In addition to presenting the statistical values of the tests, a clinical severity classification of the overall scores of the 4 found factors and of the questionnaire as a whole is carried out on the basis of quartile standards.

Hematoma Shape, Hematoma Size, Glasgow Coma Scale Score and ICH Score: Which Predicts the 30-Day Mortality Better for Intracerebral Hematoma?

Science.gov (United States)

Wang, Chih-Wei; Liu, Yi-Jui; Lee, Yi-Hsiung; Hueng, Dueng-Yuan; Fan, Hueng-Chuen; Yang, Fu-Chi; Hsueh, Chun-Jen; Kao, Hung-Wen; Juan, Chun-Jung; Hsu, Hsian-He

2014-01-01

Purpose: To investigate the performance of hematoma shape, hematoma size, Glasgow coma scale (GCS) score, and intracerebral hematoma (ICH) score in predicting the 30-day mortality for ICH patients. To examine the influence of the estimation error of hematoma size on the prediction of 30-day mortality. Materials and Methods: This retrospective study, approved by a local institutional review board with written informed consent waived, recruited 106 patients diagnosed with ICH by non-enhanced computed tomography study. The hemorrhagic shape, hematoma size measured by computer-assisted volumetric analysis (CAVA) and estimated by ABC/2 formula, ICH score and GCS score were examined. The performance of the aforementioned variables in predicting 30-day mortality was evaluated. Statistical analysis was performed using Kolmogorov-Smirnov tests, paired t test, nonparametric test, linear regression analysis, and binary logistic regression. The receiver operating characteristics curves were plotted and areas under curve (AUC) were calculated for 30-day mortality. A P value less than 0.05 was considered as statistically significant. Results: The overall 30-day mortality rate among ICH patients was 15.1%. The hematoma shape, hematoma size, ICH score, and GCS score all significantly predict the 30-day mortality for ICH patients, with an AUC of 0.692 (P = 0.0018), 0.715 (P = 0.0008) (by ABC/2) to 0.738 (P = 0.0002) (by CAVA), 0.877 (Phematoma shape, hematoma size, ICH scores and GCS score all significantly predict the 30-day mortality in an increasing order of AUC. The effect of overestimation of hematoma size by ABC/2 formula in predicting the 30-day mortality could be remedied by using ICH score. PMID:25029592
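As an illustration of the ABC/2 estimate named in the abstract above, here is a minimal sketch. It assumes the commonly cited form of the formula (A = largest hematoma diameter, B = the diameter perpendicular to A, C = vertical extent, all in centimetres); the abstract does not describe the exact measurement protocol, so the function names and parameters below are hypothetical and only meant to show the arithmetic.

```python
def abc2_volume_ml(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Approximate hematoma volume in mL as A * B * C / 2.

    Assumption: A, B and C are three mutually perpendicular extents in cm,
    which treats the hematoma roughly as an ellipsoid.
    """
    if min(a_cm, b_cm, c_cm) < 0:
        raise ValueError("diameters must be non-negative")
    return a_cm * b_cm * c_cm / 2.0


def relative_error(estimate_ml: float, reference_ml: float) -> float:
    """Signed relative error of an estimate against a reference volume
    (for example, a volumetric measurement). Positive means overestimation."""
    if reference_ml <= 0:
        raise ValueError("reference volume must be positive")
    return (estimate_ml - reference_ml) / reference_ml


if __name__ == "__main__":
    # Made-up measurements, not data from the study.
    estimate = abc2_volume_ml(4.0, 3.0, 2.0)
    print(estimate, relative_error(estimate, 10.0))
```

A positive relative error corresponds to the kind of overestimation by ABC/2 that the abstract mentions.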
An ultrasound score for knee osteoarthritis

DEFF Research Database (Denmark)

Riecke, B F; Christensen, R.; Torp-Pedersen, S

2014-01-01

OBJECTIVE: To develop standardized musculoskeletal ultrasound (MUS) procedures and scoring for detecting knee osteoarthritis (OA) and test the MUS score's ability to discern various degrees of knee OA, in comparison with plain radiography and the 'Knee injury and Osteoarthritis Outcome Score' (KOOS......) domains as comparators. METHOD: A cross-sectional study of MUS examinations in 45 patients with knee OA. Validity, reliability, and reproducibility were evaluated. RESULTS: MUS examination for knee OA consists of five separate domains assessing (1) predominantly morphological changes in the medial...... coefficients ranging from 0.75 to 0.97 for the five domains. Construct validity was confirmed with statistically significant correlation coefficients (0.47-0.81, P knee OA. In comparison with standing radiographs...
The Sinonasal Outcome Test 22 score in persons without chronic rhinosinusitis

DEFF Research Database (Denmark)

Lange, Bibi; Thilsing, T; Baelum, J

2016-01-01

-67 with a mean score of 10.5 (CI: 9.1 - 11.9) and the median score was 7. Persons with allergic rhinitis and blue collar workers had a significantly higher score. CONCLUSION: The median value of 7 is taken as the normal SNOT 22 score in persons without CRS and can be used as a reference in clinical settings...... and research. Allergic rhinitis and occupation affect SNOT 22 in persons without CRS.
Classroom Organizational Structure in Fifth Grade Math Classrooms and the Effect on Standardized Test Scores

Science.gov (United States)

Lane, Dallas Marie

2017-01-01

The purpose of this study was to determine if there is a relationship between the classroom organizational structure and MCT2 test scores of fifth-grade math students. The researcher gained insight regarding which structure teachers believe is most beneficial to them and students, and whether or not their belief of classroom organizational…
A risk score for predicting coronary artery disease in women with angina pectoris and abnormal stress test finding.

Science.gov (United States)

Lo, Monica Y; Bonthala, Nirupama; Holper, Elizabeth M; Banks, Kamakki; Murphy, Sabina A; McGuire, Darren K; de Lemos, James A; Khera, Amit

2013-03-15

Women with angina pectoris and abnormal stress test findings commonly have no epicardial coronary artery disease (CAD) at catheterization. The aim of the present study was to develop a risk score to predict obstructive CAD in such patients. Data were analyzed from 337 consecutive women with angina pectoris and abnormal stress test findings who underwent cardiac catheterization at our center from 2003 to 2007. Forward selection multivariate logistic regression analysis was used to identify the independent predictors of CAD, defined by ≥50% diameter stenosis in ≥1 epicardial coronary artery. The independent predictors included age ≥55 years (odds ratio 2.3, 95% confidence interval 1.3 to 4.0), body mass index stress imaging (odds ratio 2.8, 95% confidence interval 1.5 to 5.5), and exercise capacity statistic of 0.745 (95% confidence interval 0.70 to 0.79), and an optimized cutpoint of a score of ≤2 included 62% of the subjects and had a negative predictive value of 80%. In conclusion, a simple clinical risk score of 7 characteristics can help differentiate those more or less likely to have CAD among women with angina pectoris and abnormal stress test findings. This tool, if validated, could help to guide testing strategies in women with angina pectoris. Copyright © 2013 Elsevier Inc. All rights reserved.
Analysis of Baseline Computerized Neurocognitive Testing Results among 5–11-Year-Old Male and Female Children Playing Sports in Recreational Leagues in Florida

Directory of Open Access Journals (Sweden)

Karen D. Liller

2017-09-01

There is a paucity of data related to sports injuries, concussions, and computerized neurocognitive testing (CNT) among very young athletes playing sports in recreational settings. The purpose of this study was to report baseline CNT results among male and female children, ages 5–11, playing sports in Hillsborough County, Florida using ImPACT Pediatric, which is specifically designed for this population. Data were collected from 2016 to 2017. A total of 657 baseline tests were conducted, and t-tests and linear regression were used to assess significant mean differences in composite scores by sex and age. Results showed that females scored better on visual memory and that, in general, baseline scores improved as age increased. The results can be used to build further studies on the use of CNT in recreational settings and their role in concussion treatment, management, and interventions.
Testing measurement invariance of the schizotypal personality questionnaire-brief scores across Spanish and Swiss adolescents.

Directory of Open Access Journals (Sweden)

Javier Ortuño-Sierra

BACKGROUND: Schizotypy is a complex construct intimately related to psychosis. Empirical evidence indicates that participants with high scores on schizotypal self-report are at a heightened risk for the later development of psychotic disorders. Schizotypal experiences represent the behavioural expression of liability for psychotic disorders. Previous factorial studies have shown that schizotypy is a multidimensional construct similar to that found in patients with schizophrenia. Specifically, using the Schizotypal Personality Questionnaire-Brief (SPQ-B), the three-dimensional model has been widely replicated. However, there has been no in-depth investigation of whether the dimensional structure underlying the SPQ-B scores is invariant across countries. METHODS: The main goal of this study was to examine the measurement invariance of the SPQ-B scores across Spanish and Swiss adolescents. The final sample was made up of 261 Spanish participants (51.7% men; M = 16.04 years) and 241 Swiss participants (52.3% men; M = 15.94 years). RESULTS: The results indicated that Raine et al.'s three-factor model presented adequate goodness-of-fit indices. Moreover, the results supported the measurement invariance (configural and partial strong invariance) of the SPQ-B scores across the two samples. Spanish participants scored higher on the Interpersonal dimension than Swiss participants when latent means were compared. DISCUSSION: The study of measurement equivalence across countries provides preliminary evidence for Raine et al.'s three-factor model and for the cross-cultural validity of the SPQ-B scores in adolescent populations. Future studies should continue to examine the measurement invariance of the schizotypy and psychosis-risk syndromes across cultures.
Association of fall history with the Timed Up and Go test score and the dual task cost: A cross-sectional study among independent community-dwelling older adults.

Science.gov (United States)

Asai, Tsuyoshi; Oshima, Kensuke; Fukumoto, Yoshihiro; Yonezawa, Yuri; Matsuo, Asuka; Misu, Shogo

2018-05-21

To investigate the associations between fall history and the Timed Up and Go (TUG) test (single-TUG test), TUG test while counting aloud backwards from 100 (dual-TUG test) and the dual-task cost (DTC) among independent community-dwelling older adults. This cross-sectional study included 537 older adults who lived independently in the community. Data on fall history in the previous year were obtained by self-administrated questionnaire. The single- and dual-TUG tests were carried out, and the DTC value was computed from these results. Associations between fall history and these TUG-related values were analyzed using multivariate logistic regression models. The participants were divided into fall risk groups using the cut-off values of those significantly associated with falling, and the odds ratios (OR) were computed. Slower single-TUG test scores and lower DTC values were significantly associated with fall history after adjusting for potential confounders (single-TUG test score: OR 1.133, 95% CI 1.029-1.249; DTC value: OR 0.984, 95% CI 0.968-0.998). Older adults with slower single-TUG test scores and lower DTC values reported a fall history more often than those in other categories (OR compared with the lower-risk single-TUG and lower-risk DTC groups: 3.474, 95% CI 1.881-6.570). Slower single-TUG test scores and lower DTC values are associated with fall history among independent community-dwelling older adults. To some extent, dual task performance might provide added value for fall assessment, compared with administering the TUG test alone. Geriatr Gerontol Int 2018; ••: ••-••. © 2018 Japan Geriatrics Society.
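The dual-task cost (DTC) in the abstract above is computed from the single- and dual-TUG times, but the abstract does not give the formula. The sketch below assumes one widely used definition, the percentage increase in time under the dual-task condition relative to the single-task condition; this is an assumption and may differ from the authors' definition. The function name is hypothetical.

```python
def dual_task_cost(single_tug_s: float, dual_tug_s: float) -> float:
    """Dual-task cost as a percentage.

    Assumed definition (not confirmed by the abstract):
        DTC = (dual - single) / single * 100
    Both times are in seconds; a larger value means the counting task
    slowed the participant more.
    """
    if single_tug_s <= 0:
        raise ValueError("single-task time must be positive")
    return (dual_tug_s - single_tug_s) / single_tug_s * 100.0


if __name__ == "__main__":
    # Made-up times, not data from the study.
    print(dual_task_cost(10.0, 12.0))
```

Under this definition, a participant who barely slows down during the dual task has a DTC near zero, which is the "lower DTC" pattern the abstract associates with fall history.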
The Effects of Teacher and Teacher-librarian High-end Collaboration on Inquiry-based Project Reports and School Monthly Test Scores of Fifth-grade Students

Directory of Open Access Journals (Sweden)

Hai-Hon Chen

2015-07-01

The purpose of this study was twofold. The first purpose was to establish a high-level collaboration model of integrated instruction between a social studies teacher and a teacher-librarian. The second purpose was to investigate the effects of high-end collaboration on individual and group inquiry-based project reports, as well as on the monthly test scores of fifth-grade students. A quasi-experimental method was adopted; two classes of elementary school fifth graders in Tainan Municipal City, Taiwan, were used as samples. Students were randomly assigned to experimental conditions by class. Twenty-eight students in the experimental group were taught through the collaboration of the social studies teacher and the teacher-librarian, while 27 students in the control group were taught separately by the teacher using a didactic teaching method. Inquiry-Based Project Record, Inquiry-Based Project Rubrics, and school monthly test scores were used as instruments for collecting data. A t-test and correlation were used to analyze the data. The results indicate that: (1) The high-end collaboration model between the social studies teacher and the teacher-librarian was established and implemented well in the classroom. (2) There was a significant difference between the experimental group and the control group in individual and group inquiry-based project reports. Students taught by the collaborating teachers earned higher scores on both individual and group inquiry-based project reports than those taught separately by the teachers. Students in the experimental group also earned higher school monthly test scores than those in the control group. Suggestions for teachers' high-end collaboration and for future research are provided in this paper.
Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

Science.gov (United States)

Haberman, Shelby J.

2011-01-01

Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…
Modeling Floor Effects in Standardized Vocabulary Test Scores in a Sample of Low SES Hispanic Preschool Children under the Multilevel Structural Equation Modeling Framework

Directory of Open Access Journals (Sweden)

Leina Zhu

2017-12-01

Researchers and practitioners often use standardized vocabulary tests such as the Peabody Picture Vocabulary Test-4 (PPVT-4; Dunn and Dunn, 2007) and its companion, the Expressive Vocabulary Test-2 (EVT-2; Williams, 2007), to assess English vocabulary skills as an indicator of children's school readiness. Despite their psychometric excellence in the norm sample, issues arise when standardized vocabulary tests are used to assess children from culturally, linguistically and ethnically diverse backgrounds (e.g., Spanish-speaking English language learners) or children who are delayed in some manner. One of the biggest challenges is establishing the appropriateness of these measures with non-English or non-standard English speaking children, as they often score one to two standard deviations below expected levels (e.g., Lonigan et al., 2013). This study re-examines the issues in analyzing the PPVT-4 and EVT-2 scores in a sample of 4-to-5-year-old low SES Hispanic preschool children who were part of a larger randomized clinical trial on the effects of a supplemental English shared-reading vocabulary curriculum (Pollard-Durodola et al., 2016). It was found that the data exhibited strong floor effects and that the presence of floor effects made it difficult to differentiate the intervention group and the control group on their vocabulary growth in the intervention. A simulation study is then presented under the multilevel structural equation modeling (MSEM) framework, and results revealed that in regular multilevel data analysis, ignoring floor effects in the outcome variables led to biased results in parameter estimates, standard error estimates, and significance tests. Our findings suggest caution in analyzing and interpreting scores of ethnically and culturally diverse children on standardized vocabulary tests (e.g., floor effects). It is recommended that appropriate analytical methods that take into account floor effects in outcome variables be considered.
Test Score Gaps between Private and Government Sector Students at School Entry Age in India

Science.gov (United States)

Singh, Abhijeet

2014-01-01

Various studies have noted that students enrolled in private schools in India perform better on average than students in government schools. In this paper, I show that large gaps in the test scores of children in private and public sector education are evident even at the point of initial enrollment in formal schooling and are associated with…
Test anxiety and performance-avoidance goals explain gender differences in SAT-V, SAT-M, and overall SAT scores.

Science.gov (United States)

Hannon, Brenda

2012-11-01

This study uses analysis of co-variance in order to determine which cognitive/learning (working memory, knowledge integration, epistemic belief of learning) or social/personality factors (test anxiety, performance-avoidance goals) might account for gender differences in SAT-V, SAT-M, and overall SAT scores. The results revealed that none of the cognitive/learning factors accounted for gender differences in SAT performance. However, the social/personality factors of test anxiety and performance-avoidance goals each separately accounted for all of the significant gender differences in SAT-V, SAT-M, and overall SAT performance. Furthermore, when the influences of both of these factors were statistically removed simultaneously, all non-significant gender differences reduced further to become trivial by Cohen's (1988) standards. Taken as a whole, these results suggest that gender differences in SAT-V, SAT-M, and overall SAT performance are a consequence of social/learning factors.
The Effects of Video Game Experience on Computer-Based Air Traffic Controller Specialist, Air Traffic Scenario Test Scores.

Science.gov (United States)

1997-02-01

application with a strong resemblance to a video game, concern has been raised that prior video game experience might have a moderating effect on scores. Much...such as spatial ability. The effects of computer or video game experience on work sample scores have not been systematically investigated. The purpose...of this study was to evaluate the incremental validity of prior video game experience over that of general aptitude as a predictor of work sample test
Percentiles of the null distribution of 2 maximum lod score tests.

Science.gov (United States)

Ulgen, Ayse; Yoo, Yun Joo; Gordon, Derek; Finch, Stephen J; Mendell, Nancy R

2004-01-01

We here consider the null distribution of the maximum lod score (LOD-M) obtained upon maximizing over transmission model parameters (penetrance values, dominance, and allele frequency) as well as the recombination fraction. Also considered is the lod score maximized over a fixed choice of genetic model parameters and recombination-fraction values set prior to the analysis (MMLS) as proposed by Hodge et al. The objective is to fit parametric distributions to MMLS and LOD-M. Our results are based on 3,600 simulations of samples of n = 100 nuclear families ascertained for having one affected member and at least one other sibling available for linkage analysis. Each null distribution is approximately a mixture $p\,\chi^2_{(0)} + (1 - p)\,\chi^2_{(v)}$. The values of MMLS appear to fit the mixture $0.20\,\chi^2_{(0)} + 0.80\,\chi^2_{(1.6)}$. The mixture distribution $0.13\,\chi^2_{(0)} + 0.87\,\chi^2_{(2.8)}$ appears to describe the null distribution of LOD-M. From these results we derive a simple method for obtaining critical values of LOD-M and MMLS. Copyright 2004 S. Karger AG, Basel
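To make the idea of critical values from such a mixture concrete, the following sketch computes the threshold c at which a mixture of a point mass at zero (weight p) and a chi-square with v degrees of freedom (weight 1 - p) has upper-tail probability alpha, that is, (1 - p) * P(chi-square_v > c) = alpha. This is an illustrative calculation under that reading of the abstract, not the authors' published method. The function names are hypothetical, the chi-square tail is obtained by simple numerical integration so that fractional degrees of freedom work with the standard library alone, and the result is on the chi-square scale (converting to lod units is assumed to require division by 2 ln 10).

```python
import math


def chi2_upper_tail(x: float, df: float, span: float = 100.0, steps: int = 4000) -> float:
    """P(X > x) for a chi-square variable with (possibly fractional) df, x > 0.

    Uses composite Simpson integration of the density on [x, x + span];
    the mass beyond x + span is negligible for the df values of interest.
    """
    k = df / 2.0
    norm = (2.0 ** k) * math.gamma(k)

    def pdf(t: float) -> float:
        return t ** (k - 1.0) * math.exp(-t / 2.0) / norm

    h = span / steps  # steps must be even for Simpson's rule
    total = pdf(x) + pdf(x + span)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(x + i * h)
    return total * h / 3.0


def mixture_critical_value(alpha: float, p_zero: float, df: float) -> float:
    """Critical value c with (1 - p_zero) * P(chi2_df > c) = alpha, found by bisection."""
    if not (0.0 < alpha < 1.0) or not (0.0 <= p_zero < 1.0) or df <= 0:
        raise ValueError("need 0 < alpha < 1, 0 <= p_zero < 1, df > 0")
    target = alpha / (1.0 - p_zero)
    if target >= 1.0:
        return 0.0  # the point mass at zero already accounts for 1 - alpha
    lo, hi = 1e-6, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Weights and degrees of freedom as quoted in the abstract for MMLS.
    print(mixture_critical_value(0.05, 0.20, 1.6))
```

A sanity check for the integration is the case df = 2 with no point mass, where the tail probability is exactly exp(-c / 2).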
Performance of machine-learning scoring functions in structure-based virtual screening.

Science.gov (United States)

Wójcikowski, Maciej; Ballester, Pedro J; Siedlecki, Pawel

2017-04-25

Classical scoring functions have reached a plateau in their performance in virtual screening and binding affinity prediction. Recently, machine-learning scoring functions trained on protein-ligand complexes have shown great promise in small tailored studies. They have also raised controversy, specifically concerning model overfitting and applicability to novel targets. Here we provide a new ready-to-use scoring function (RF-Score-VS) trained on 15 426 active and 893 897 inactive molecules docked to a set of 102 targets. We use the full DUD-E data sets along with three docking tools, five classical and three machine-learning scoring functions for model building and performance assessment. Our results show RF-Score-VS can substantially improve virtual screening performance: RF-Score-VS top 1% provides a 55.6% hit rate, whereas that of Vina is only 16.2% (for smaller percentages the difference is even more encouraging: RF-Score-VS top 0.1% achieves an 88.6% hit rate, compared with 27.5% using Vina). In addition, RF-Score-VS provides much better prediction of measured binding affinity than Vina (Pearson correlation of 0.56 and -0.18, respectively). Lastly, we test RF-Score-VS on an independent test set from the DEKOIS benchmark and observe comparable results. We provide full data sets to facilitate further research in this area (http://github.com/oddt/rfscorevs) as well as ready-to-use RF-Score-VS (http://github.com/oddt/rfscorevs_binary).
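The "top x% hit rate" metric quoted in the abstract above can be sketched as follows. This is a generic illustration, assuming that a higher score means a molecule is predicted to be more likely active and that the hit rate is the fraction of true actives among the top-ranked molecules; it does not use or reproduce the RF-Score-VS code, and the function name and toy data are hypothetical.

```python
from typing import Sequence


def hit_rate_at_top(scores: Sequence[float], is_active: Sequence[bool],
                    top_fraction: float) -> float:
    """Fraction of actives among the top `top_fraction` of molecules by score.

    Assumption: higher score = predicted better binder. At least one
    molecule is always kept so the ratio is defined.
    """
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return hits / n_top


if __name__ == "__main__":
    # Toy data, not from the DUD-E or DEKOIS sets.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    toy_labels = [True, False, True, False, False, False, False, False, False, False]
    print(hit_rate_at_top(toy_scores, toy_labels, 0.2))
```

Comparing this value for two scoring functions on the same ranked library is the kind of comparison the abstract reports for RF-Score-VS and Vina.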
IMPACT OF SHOTS ON FINAL SCORE OF A FOOTBALL MATCH

Directory of Open Access Journals (Sweden)

Miroslav Radoman

2008-08-01

The research was carried out on a sample of 64 games played at the FIFA World Cup Germany 2006, yielding 128 game results divided into three groups according to the outcome (win, defeat, and draw). The analysis was based on the total number of shots during the game. The data were analyzed using multivariate analysis of variance (MANOVA) and discriminant analysis, and the results showed significant differences in shot elements among the game outcomes (win, defeat, or draw). Even though the observed differences are not equally pronounced, the results indicate that there are significant differences in the observed elements of the football game. The additional analyses (Roy's test and t-test) confirmed that in every analyzed shot element there are statistically significant differences by game outcome (win, defeat, draw), and that the differences in shot elements are a consequence of different choices of tactics and techniques, as well as of the ability to carry them out in the phases of attack and defense.
The Impact of Scholastic Instrumental Music and Scholastic Chess Study on the Standardized Test Scores of Students in Grades Three, Four, and Five

Science.gov (United States)

Martinez, Edwin E.

2012-01-01

This study examines the impact of instrumental music study and group chess lessons on the standardized test scores of suburban elementary public school students (grades three through five) in Levittown, New York. The study divides the students into the following groups and compares the standardized test scores of each: a) instrumental music…
Comparing long-term results of PASAT and SDMT scores in relation to neuropsychological testing in multiple sclerosis

NARCIS (Netherlands)

Sonder, J.M.; Burggraaff, J.; Knol, D.L.; Polman, C.H.; Uitdehaag, B.M.J.

2014-01-01

Background and objectives: The Symbol Digit Modalities Test (SDMT) shows advantages over the Paced Auditory Serial Addition Test (PASAT) as a cognitive test in patients with multiple sclerosis (MS). To determine which of these tests is most valid and reliable over time as an indicator of the
Equating error in observed-score equating

NARCIS (Netherlands)

van der Linden, Willem J.

2006-01-01

Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and population of test takers. But it is argued that if the goal of equating is to adjust the scores of

Use of Automated Scoring in Spoken Language Assessments for Test Takers with Speech Impairments. Research Report. ETS RR-17-42

Science.gov (United States)

Loukina, Anastassia; Buzick, Heather

2017-01-01

This study is an evaluation of the performance of automated speech scoring for speakers with documented or suspected speech impairments. Given that the use of automated scoring of open-ended spoken responses is relatively nascent and there is little research to date that includes test takers with disabilities, this small exploratory study focuses…
How Well Do Customers of Direct-to-Consumer Personal Genomic Testing Services Comprehend Genetic Test Results? Findings from the Impact of Personal Genomics Study.

Science.gov (United States)

Ostergren, Jenny E; Gornick, Michele C; Carere, Deanna Alexis; Kalia, Sarah S; Uhlmann, Wendy R; Ruffin, Mack T; Mountain, Joanna L; Green, Robert C; Roberts, J Scott

2015-01-01

To assess customer comprehension of health-related personal genomic testing (PGT) results. We presented sample reports of genetic results and examined responses to comprehension questions in 1,030 PGT customers (mean age: 46.7 years; 59.9% female; 79.0% college graduates; 14.9% non-White; 4.7% of Hispanic/Latino ethnicity). Sample reports presented a genetic risk for Alzheimer's disease and type 2 diabetes, carrier screening summary results for >30 conditions, results for phenylketonuria and cystic fibrosis, and drug response results for a statin drug. Logistic regression was used to identify correlates of participant comprehension. Participants exhibited high overall comprehension (mean score: 79.1% correct). The highest comprehension (range: 81.1-97.4% correct) was observed in the statin drug response and carrier screening summary results, and lower comprehension (range: 63.6-74.8% correct) on specific carrier screening results. Higher levels of numeracy, genetic knowledge, and education were significantly associated with greater comprehension. Older age (≥ 60 years) was associated with lower comprehension scores. Most customers accurately interpreted the health implications of PGT results; however, comprehension varied by demographic characteristics, numeracy and genetic knowledge, and types and format of the genetic information presented. Results suggest a need to tailor the presentation of PGT results by test type and customer characteristics. © 2015 S. Karger AG, Basel.
REPRODUCIBILITY OF THE MODIFIED STAR EXCURSION BALANCE TEST COMPOSITE AND SPECIFIC REACH DIRECTION SCORES.

Science.gov (United States)

van Lieshout, Remko; Reijneveld, Elja A E; van den Berg, Sandra M; Haerkens, Gijs M; Koenders, Niek H; de Leeuw, Arina J; van Oorsouw, Roel G; Paap, Davy; Scheffer, Else; Weterings, Stijn; Stukstette, Mirelle J

2016-06-01

The mSEBT is a screening tool used to evaluate dynamic balance. Most research investigating measurement properties focused on intrarater reliability and was done in small samples. To know whether the mSEBT is useful to discriminate dynamic balance between persons and to evaluate changes in dynamic balance, more research into intra- and interrater reliability and smallest detectable change (synonymous with minimal detectable change) is needed. To estimate intra- and interrater reliability and smallest detectable change of the mSEBT in adults at risk for ankle sprain. Cross-sectional, test-retest design. Fifty-five healthy young adults participating in sports at risk for ankle sprain participated (mean ± SD age, 24.0 ± 2.9 years). Each participant performed three test sessions within one hour and was rated by two physical therapists (session 1, rater 1; session 2, rater 2; session 3, rater 1). Participants and raters were blinded for previous measurements. Normalized composite and reach direction scores for the right and left leg were collected. Analysis of variance was used to calculate intraclass correlation coefficient values for intra- and interrater reliability. Smallest detectable change values were calculated based on the standard error of measurement. Intra- and interrater reliability for both legs was good to excellent (intraclass correlation coefficient ranging from 0.87 to 0.94). The intrarater smallest detectable change for the composite score of the right leg was 7.2% and for the left 6.2%. The interrater smallest detectable change for the composite score of the right leg was 6.9% and for the left 5.0%. The mSEBT is a reliable measurement instrument to discriminate dynamic balance between persons. Most smallest detectable change values of the mSEBT appear to be large. More research is needed to investigate if the mSEBT is usable for evaluative purposes. Level 2.
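The abstract above states that smallest detectable change (SDC) values were derived from the standard error of measurement (SEM) but does not give the formulas. The sketch below assumes the conventional expressions SEM = SD * sqrt(1 - ICC) and SDC = 1.96 * sqrt(2) * SEM (95% confidence, two measurements); the authors may have estimated SEM differently, for example from analysis-of-variance components. Function names and inputs are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from the between-subject SD and an ICC.

    Assumed formula: SEM = SD * sqrt(1 - ICC).
    """
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("need sd >= 0 and 0 <= icc <= 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """SDC for an individual: z * sqrt(2) * SEM.

    The sqrt(2) reflects that a change score combines the error of two
    measurements; z = 1.96 gives a 95% confidence level.
    """
    if sem < 0:
        raise ValueError("sem must be non-negative")
    return z * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Made-up SD and ICC, not values from the study.
    sem = sem_from_icc(sd=10.0, icc=0.91)
    print(sem, smallest_detectable_change(sem))
```

A change in an individual's score smaller than the SDC cannot be distinguished from measurement error at the chosen confidence level, which is why large SDC values limit evaluative use.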
Reliability Generalization: Exploring Variation of Reliability Coefficients of MMPI Clinical Scales Scores.

Science.gov (United States)

Vacha-Haase, Tammi; Kogan, Lori R.; Tani, Crystal R.; Woodall, Renee A.

2001-01-01

Used reliability generalization to explore the variance of scores on 10 Minnesota Multiphasic Personality Inventory (MMPI) clinical scales drawing on 1,972 articles in the literature on the MMPI. Results highlight the premise that scores, not tests, are reliable or unreliable, and they show that study characteristics do influence scores on the…
Refinement of immunohistologic parameters for Her2/neu scoring validation by FISH and CISH.

Science.gov (United States)

Leong, Anthony S-Y; Formby, Mark; Haffajee, Zenobia; Clarke, Megan; Morey, Adrienne

2006-12-01

The conventional method of scoring Her2/neu immunostaining is recognized to result in a high false-positive rate among 2+ cases when compared with results obtained with fluorescence in situ hybridization (FISH); however, costs and convenience dictate that immunohistochemistry remains the screening test for Her2/neu status in patients with breast cancer. We describe refined criteria for scoring of Her2/neu on the basis of anatomic localization rather than the subjective assessment of intensity. The presence of a circumferential tram track pattern that results from the staining of apposing cell membranes in >25% of the tumor cells was necessary for a 3+ score (Her2/neu overexpressed) and the presence of the tram track pattern in CISH testing in selected cases from the other categories validated the revised scoring method. These criteria reduced the numbers of equivocal staining cases that required FISH testing.
Relations between VSRAD-based parahippocampal atrophy and results of neuropsychological tests in patients with Alzheimer's disease and in those with mild cognitive impairment

International Nuclear Information System (INIS)

Shimizu, Satoru

2008-01-01

To clarify the utility of VSRAD (Voxel-based Specific Regional analysis system for Alzheimer's Disease) for the diagnosis of Alzheimer's disease (AD) and for AD-related studies, this study investigated correlations between VSRAD-based parahippocampal atrophy and results of neuropsychological tests. Subjects comprised 18 patients with probable AD and 12 patients with mild cognitive impairment (MCI) due to a cerebral degeneration near AD. Neuropsychological tests consisted of the Alzheimer's Disease Assessment Scale (ADAS)-J cog., Hasegawa Dementia Scale-Revised (HDS-R), Wechsler Adult Intelligence Scale-Revised or -III (WAIS-R/-III), and Wechsler Memory Scale-Revised (WMS-R). Subjects received these tests within one month before or after cranial MRI scans, and correlations between the VSRAD Z-scores reflecting parahippocampal atrophy and the results of these neuropsychological tests were statistically examined. The Z-scores had a significant positive correlation with scores of ADAS (p=.0129) and an inverse correlation with scores of "Information" as a subtest of WAIS-R/-III (p=.0294). Further, the Z-scores showed a tendency toward weak, inverse correlations with the scores of HDS-R, "Similarity" as a subtest of WAIS-R/-III, and "Visual Reproduction II" as a subtest of WMS-R (p=.0532, .0635, and .0609, respectively). Usefulness of VSRAD for the diagnosis of AD was indicated by the significant correlations noted with ADAS and HDS-R, and it was further suggested that parahippocampal atrophy was related to semantic and visual memory impairments of AD, judging from the correlations with subtests of WAIS-R/-III and WMS-R. (author)
Dose Uniformity of Scored and Unscored Tablets: Application of the FDA Tablet Scoring Guidance for Industry.

Science.gov (United States)

Ciavarella, Anthony B; Khan, Mansoor A; Gupta, Abhay; Faustino, Patrick J

This U.S. Food and Drug Administration (FDA) laboratory study examines the impact of tablet splitting, the effect of tablet splitters, and the presence of a tablet score on the dose uniformity of two model drugs. Whole tablets were purchased from five manufacturers for amlodipine and six for gabapentin. Two splitters were used for each drug product, and the gabapentin tablets were also split by hand. Whole and split amlodipine tablets were tested for content uniformity following the general chapter of the United States Pharmacopeia (USP) Uniformity of Dosage Units, which is a requirement of the new FDA Guidance for Industry on tablet scoring. The USP weight variation method was used for gabapentin split tablets based on the recommendation of the guidance. All whole tablets met the USP acceptance criteria for the Uniformity of Dosage Units. Variation in whole tablet content ranged from 0.5 to 2.1 standard deviation (SD) of the percent label claim. Splitting the unscored amlodipine tablets resulted in a significant increase in dose variability of 6.5-25.4 SD when compared to whole tablets. Split tablets from all amlodipine drug products did not meet the USP acceptance criteria for content uniformity. Variation in the weight for gabapentin split tablets was greater than the whole tablets, ranging from 1.3 to 9.3 SD. All fully scored gabapentin products met the USP acceptance criteria for weight variation. Size, shape, and the presence or absence of a tablet score can affect the content uniformity and weight variation of amlodipine and gabapentin tablets. Tablet splitting produced higher variability. Differences in dose variability and fragmentation were observed between tablet splitters and hand splitting. These results are consistent with the FDA's concerns that tablet splitting can have an effect on the amount of drug present in a split tablet and available for absorption. Tablet splitting has become a very common practice in the United States and throughout the
The quantitative LOD score: test statistic and sample size for exclusion and linkage of quantitative traits in human sibships.

Science.gov (United States)

Page, G P; Amos, C I; Boerwinkle, E

1998-04-01

We present a test statistic, the quantitative LOD (QLOD) score, for the testing of both linkage and exclusion of quantitative-trait loci in randomly selected human sibships. As with the traditional LOD score, the boundary values of 3, for linkage, and -2, for exclusion, can be used for the QLOD score. We investigated the sample sizes required for inferring exclusion and linkage, for various combinations of linked genetic variance, total heritability, recombination distance, and sibship size, using fixed-size sampling. The sample sizes required for both linkage and exclusion were not qualitatively different and depended on the percentage of variance being linked or excluded and on the total genetic variance. Information regarding linkage and exclusion in sibships larger than size 2 increased as approximately all possible pairs n(n-1)/2 up to sibships of size 6. Increasing the recombination (theta) distance between the marker and the trait loci reduced empirically the power for both linkage and exclusion, as a function of approximately (1-2θ)^4.
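The two approximate scaling rules quoted in the abstract, n(n-1)/2 sib pairs per sibship and a power factor of (1-2θ)^4, can be tabulated directly. A minimal sketch; the function names are illustrative and the code only evaluates those two expressions, not the QLOD statistic itself.

```python
def sib_pairs(sibship_size: int) -> int:
    """Number of distinct sib pairs, n(n-1)/2, in a sibship of size n."""
    if sibship_size < 2:
        raise ValueError("a sibship needs at least two members to form a pair")
    return sibship_size * (sibship_size - 1) // 2

def recombination_factor(theta: float) -> float:
    """Approximate attenuation factor (1 - 2*theta)**4 for 0 <= theta <= 0.5."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    return (1.0 - 2.0 * theta) ** 4

for n in range(2, 7):
    print(f"sibship size {n}: {sib_pairs(n)} pairs")

for theta in (0.0, 0.05, 0.1, 0.2):
    print(f"theta = {theta:.2f}: factor = {recombination_factor(theta):.4f}")
```

The fourth power means that even modest recombination between marker and trait locus shrinks the factor quickly, which is consistent with the loss of power the abstract reports.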
Predicting Freshman Grade Point Average From College Admissions Test Scores and State High School Test Scores

Directory of Open Access Journals (Sweden)

Daniel Koretz

2016-09-01

The current focus on assessing “college and career readiness” raises an empirical question: How do high school tests compare with college admissions tests in predicting performance in college? We explored this using data from the City University of New York and public colleges in Kentucky. These two systems differ in the choice of college admissions test, the stakes for students on the high school test, and demographics. We predicted freshman grade point average (FGPA) from high school GPA and both college admissions and high school tests in mathematics and English. In both systems, the choice of tests had only trivial effects on the aggregate prediction of FGPA. Adding either test to an equation that included the other had only trivial effects on prediction. Although the findings suggest that the choice of test might advantage or disadvantage different students, it had no substantial effect on the over- and underprediction of FGPA for students classified by race-ethnicity or poverty.
Clinical use of the ABO-Scoring Index: reliability and subtraction frequency.

Science.gov (United States)

Lieber, William S; Carlson, Sean K; Baumrind, Sheldon; Poulton, Donald R

2003-10-01

This study tested the reliability and subtraction frequency of the study model-scoring system of the American Board of Orthodontists (ABO). We used a sample of 36 posttreatment study models that were selected randomly from six different orthodontic offices. Intrajudge and interjudge reliability was calculated using nonparametric statistics (Spearman rank coefficient, Wilcoxon, Kruskal-Wallis, and Mann-Whitney tests). We found differences ranging from 3 to 6 subtraction points (total score) for intrajudge scoring between two sessions. For overall total ABO score, the average correlation was .77. Intrajudge correlation was greatest for occlusal relationships and least for interproximal contacts. Interjudge correlation for ABO score averaged r = .85. Correlation was greatest for buccolingual inclination and least for overjet. The data show that some judges, on average, were much more lenient than others and that this resulted in a range of total scores between 19.7 and 27.5. Most of the deductions were found in the buccal segments and most were related to the second molars. We present these findings in the context of clinicians preparing for the ABO phase III examination and for orthodontists in their ongoing evaluation of clinical results.
The Score Reliability of Draw-a-Person Intellectual Ability Test (DAP: IQ) for Rural Malawi Students

Science.gov (United States)

Khasu, Denis S.; Williams, Thomas O., Jr.

2016-01-01

In this brief article, the reliability of scores for the Draw-A-Person Intellectual Ability Test for Children, Adolescents, and Adults (DAP: IQ; Reynolds & Hickman, 2004) was examined through several analyses with a sample of 147 children from rural Malawi, Africa using a Chichewa translation of instructions. Cronbach alpha coefficients for…
Hippocampal dose volume histogram predicts Hopkins Verbal Learning Test scores after brain irradiation

Directory of Open Access Journals (Sweden)

Catherine Okoukoni, PhD

2017-10-01

Purpose: Radiation-induced cognitive decline is relatively common after treatment for primary and metastatic brain tumors; however, identifying dosimetric parameters that are predictive of radiation-induced cognitive decline is difficult due to the heterogeneity of patient characteristics. The memory function is especially susceptible to radiation effects after treatment. The objective of this study is to correlate volumetric radiation doses received by critical neuroanatomic structures to post–radiation therapy (RT) memory impairment. Methods and materials: Between 2008 and 2011, 53 patients with primary brain malignancies were treated with conventionally fractionated RT in prospectively accrued clinical trials performed at our institution. Dose-volume histogram analysis was performed for the hippocampus, parahippocampus, amygdala, and fusiform gyrus. Hopkins Verbal Learning Test-Revised scores were obtained at least 6 months after RT. Impairment was defined as an immediate recall score ≤15. For each anatomic region, serial regression was performed to correlate volume receiving a given dose (VD(Gy)) with memory impairment. Results: Hippocampal V53.4Gy to V60.9Gy significantly predicted post-RT memory impairment (P < .05). Within this range, the hippocampal V55Gy was the most significant predictor (P = .004). Hippocampal V55Gy of 0%, 25%, and 50% was associated with tumor-induced impairment rates of 14.9% (95% confidence interval [CI], 7.2%-28.7%), 45.9% (95% CI, 24.7%-68.6%), and 80.6% (95% CI, 39.2%-96.4%), respectively. Conclusions: The hippocampal V55Gy is a significant predictor for impairment, and a limiting dose below 55 Gy may minimize radiation-induced cognitive impairment.
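The dose-volume quantity VD(Gy) used above is the percentage of a structure's volume that receives at least D Gy. A minimal sketch of how such a value could be read off a list of voxel doses, assuming equal voxel volumes; the dose values are invented for illustration and do not come from the study.

```python
def volume_receiving_dose(voxel_doses_gy: list[float], threshold_gy: float) -> float:
    """Percent of voxels (assumed equal in volume) with dose >= threshold_gy."""
    if not voxel_doses_gy:
        raise ValueError("at least one voxel dose is required")
    n_at_or_above = sum(1 for dose in voxel_doses_gy if dose >= threshold_gy)
    return 100.0 * n_at_or_above / len(voxel_doses_gy)

# Hypothetical hippocampal voxel doses in Gy.
hippocampus_doses = [10.0, 54.0, 55.0, 56.0, 60.0, 20.0, 30.0, 58.0]

v55 = volume_receiving_dose(hippocampus_doses, 55.0)
print(f"V55Gy = {v55:.1f}%")
```

Clinical planning systems weight each voxel by its actual volume and interpolate the dose grid, so this equal-volume count is only a simplification of the definition.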
Timed up & go test score in patients with hip fracture is related to the type of walking aid

DEFF Research Database (Denmark)

Kristensen, Morten T; Bandholm, Thomas; Holm, Bente

2009-01-01

Kristensen MT, Bandholm T, Holm B, Ekdahl C, Kehlet H. Timed Up & Go test score in patients with hip fracture is related to the type of walking aid. OBJECTIVE: To determine the relationship between Timed Up & Go (TUG) test scores and type of walking aid used during the test, and to determine...... the feasibility of using the rollator as a standardized walking aid during the TUG in patients with hip fracture who were allowed full weight-bearing (FWB). DESIGN: Prospective methodological study. SETTING: An acute orthopedic hip fracture unit at a university hospital. PARTICIPANTS: Patients (N=126; 90 women......, 36 men) with hip fracture with a mean age +/- SD of 74.8+/-12.7 years performed the TUG the day before discharge from the orthopedic ward. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: The TUG was performed with the walking aid the patient was to be discharged with: a walker (n=88) or elbow...
Zero Calcium Score as a Filter for Further Testing in Patients Admitted to the Coronary Care Unit with Chest Pain.

Science.gov (United States)

Correia, Luis Cláudio Lemos; Esteves, Fábio P; Carvalhal, Manuela; Souza, Thiago Menezes Barbosa de; Sá, Nicole de; Correia, Vitor Calixto de Almeida; Alexandre, Felipe Kalil Beirão; Lopes, Fernanda; Ferreira, Felipe; Noya-Rabelo, Márcia

2017-06-12

The accuracy of zero coronary calcium score as a filter in patients with chest pain has been demonstrated at the emergency room and outpatient clinics, populations with low prevalence of coronary artery disease (CAD). To test the gatekeeping role of zero calcium score in patients with chest pain admitted to the coronary care unit (CCU), where the pretest probability of CAD is higher than that of other populations. Patients underwent computed tomography for calcium scoring, and obstructive CAD was defined by a minimum 70% stenosis on invasive angiography. In 146 patients studied, the prevalence of CAD was 41%. A zero calcium score was present in 35% of the patients. The sensitivity and specificity of zero calcium score yielded a negative likelihood ratio of 0.16. After logistic regression adjustment for pretest probability, zero calcium score was independently associated with lower odds of CAD (OR = 0.12, 95%CI = 0.04-0.36), increasing the area under the ROC curve of the clinical model from 0.76 to 0.82 (p = 0.006). Zero calcium score provided a net reclassification improvement of 0.20 (p = 0.0018) over the clinical model when using a pretest probability threshold of 10% for discharging without further testing. In patients with pretest probability zero calcium score had a negative predictive value of 95% (95%CI = 83%-99%), with a number needed to test of 2.1 for obtaining one additional discharge. Zero calcium score substantially reduces the pretest probability of obstructive CAD in patients admitted to the CCU with acute chest pain. (Arq Bras Cardiol. 2017; [online].ahead print, PP.0-0).
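To make the role of the negative likelihood ratio concrete, the sketch below applies the standard odds form of Bayes' rule to the two figures given in the abstract: a CAD prevalence of 41% used as the pretest probability and a negative likelihood ratio of 0.16. The function names are illustrative, and treating the sample prevalence as every patient's pretest probability is a simplifying assumption.

```python
def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity

def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)

# Values taken from the abstract: prevalence 41%, LR- of 0.16.
probability_after_zero_score = post_test_probability(0.41, 0.16)
print(f"Probability of CAD after a zero calcium score: {probability_after_zero_score:.3f}")
```

With these inputs the result is close to 0.10, which matches the 10% discharge threshold mentioned in the abstract.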
Extension of the lod score: the mod score.

Science.gov (United States)

Clerget-Darpoux, F

2001-01-01

In 1955 Morton proposed the lod score method both for testing linkage between loci and for estimating the recombination fraction between them. If a disease is controlled by a gene at one of these loci, the lod score computation requires the prior specification of an underlying model that assigns the probabilities of genotypes from the observed phenotypes. To address the case of linkage studies for diseases with unknown mode of inheritance, we suggested (Clerget-Darpoux et al., 1986) extending the lod score function to a so-called mod score function. In this function, the variables are both the recombination fraction and the disease model parameters. Maximizing the mod score function over all these parameters amounts to maximizing the probability of marker data conditional on the disease status. Under the absence of linkage, the mod score conforms to a chi-square distribution, with extra degrees of freedom in comparison to the lod score function (MacLean et al., 1993). The mod score is asymptotically maximum for the true disease model (Clerget-Darpoux and Bonaïti-Pellié, 1992; Hodge and Elston, 1994). Consequently, the power to detect linkage through mod score will be highest when the space of models where the maximization is performed includes the true model. On the other hand, one must avoid overparametrization of the model space. For example, when the approach is applied to affected sibpairs, only two constrained disease model parameters should be used (Knapp et al., 1994) for the mod score maximization. It is also important to emphasize the existence of a strong correlation between the disease gene location and the disease model. Consequently, there is poor resolution of the location of the susceptibility locus when the disease model at this locus is unknown. Of course, this is true regardless of the statistics used. The mod score may also be applied in a candidate gene strategy to model the potential effect of this gene in the disease. Since, however, it
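As background for the lod score that the mod score extends, here is a minimal sketch for the simplest textbook case: phase-known, fully informative meioses in which recombinants can be counted directly. That setting is an assumption made for illustration; the abstract deals with disease phenotypes, where the likelihood also depends on the disease model. The sketch maximizes over the recombination fraction only, whereas the mod score would additionally maximize over the disease model parameters.

```python
import math

def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for counted recombinants (phase-known case)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null

def max_lod(recombinants: int, meioses: int, grid_size: int = 50) -> tuple[float, float]:
    """Grid search over theta in (0, 0.5]; returns (best theta, lod at best theta)."""
    thetas = [0.5 * i / grid_size for i in range(1, grid_size + 1)]
    best_theta = max(thetas, key=lambda t: lod_score(recombinants, meioses, t))
    return best_theta, lod_score(recombinants, meioses, best_theta)

# Hypothetical data: 2 recombinants observed in 10 meioses.
theta_hat, z_max = max_lod(2, 10)
print(f"theta_hat = {theta_hat:.2f}, maximum lod = {z_max:.3f}")
```

Extending the search grid to include penetrances and allele frequency would turn this into a mod-score-style maximization, at the cost of the extra degrees of freedom the abstract mentions.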
Evaluation of Two Methods for Modeling Measurement Errors When Testing Interaction Effects with Observed Composite Scores

Science.gov (United States)

Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C.

2018-01-01

Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…
Efficiency of unenhanced MRI in the diagnosis of acute appendicitis: Comparison with Alvarado scoring system and histopathological results

Energy Technology Data Exchange (ETDEWEB)

Inci, Ercan, E-mail: ercan_inci@mynet.com [Department of Radiology, Istanbul Bakirkoy Dr. Sadi Konuk Training and Research Hospital, Incirli-Bakirkoy, Istanbul (Turkey); Hocaoglu, Elif; Aydin, Sibel; Palabiyik, Figen; Cimilli, Tan [Department of Radiology, Istanbul Bakirkoy Dr. Sadi Konuk Training and Research Hospital, Incirli-Bakirkoy, Istanbul (Turkey); Turhan, Ahmet Nuray; Ayguen, Ersan [Department of Surgery, Istanbul Bakirkoy Dr. Sadi Konuk Training and Research Hospital, Istanbul (Turkey)

2011-11-15

Purpose: The purpose of this study was to assess the diagnostic value of unenhanced magnetic resonance imaging (MRI) in the diagnosis of acute appendicitis and to compare it with Alvarado scores and histopathological results. Materials and methods: The study included 85 consecutive patients (mean age, 26.5 ± 11.3 years) who were clinically suspected of having acute appendicitis. Each patient's Alvarado score was recorded and unenhanced MRI was performed, consisting of T1-weighted, T2-weighted and fat-suppressed T2-weighted fast spin-echo sequences. The MR images were prospectively reviewed in consensus for the presence of acute appendicitis by two radiologists who were blinded to the results of the Alvarado scores. The study population was divided into three subgroups based on the MRI findings: Group I: definitely not appendicitis, Group II: probably appendicitis, Group III: definitely appendicitis. All patients were divided into two subgroups according to Alvarado scores as Group A (low: 1-6), and Group B (high: 7-10). MR findings were compared with Alvarado scores and histopathological findings. Results: Sixty-six (77.6%) of the 85 patients with clinically suspected acute appendicitis had undergone surgery. The diagnosis of appendicitis could be correctly achieved with MRI in 55 (83.3%) of 57 (86.4%) patients with histopathologically proven acute appendicitis. The sensitivity, specificity, positive predictive value and negative predictive value of MRI examination and Alvarado scoring system in the diagnosis of acute appendicitis were 96.49%, 66.67%, 94.83%, 75.0% and 84.21%, 66.67%, 94.12%, 40.0%, respectively. Conclusions: MRI is a valuable technique for detecting acute appendicitis even in cases with low Alvarado scores. To increase the diagnostic accuracy and prevent unnecessary laparotomies for suspected appendicitis, shorter and cheaper unenhanced basic MRI may be performed.
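The four accuracy figures reported for each method follow from a 2x2 table of test result against histopathology. The abstract states only that 55 of 57 proven cases were detected by MRI among 66 operated patients; the remaining cell counts below are inferred from the reported percentages and are therefore an assumption, as is the function name.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Sensitivity, specificity, PPV and NPV (as fractions) from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Cell counts inferred from the reported percentages (66 operated patients,
# 57 with proven appendicitis, 9 without); not stated directly in the abstract.
mri = diagnostic_metrics(tp=55, fp=3, fn=2, tn=6)
alvarado = diagnostic_metrics(tp=48, fp=3, fn=9, tn=6)

for name, metrics in (("MRI", mri), ("Alvarado", alvarado)):
    summary = ", ".join(f"{key} = {100 * value:.2f}%" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

With only 9 patients without appendicitis in the operated group, the specificity and negative predictive value rest on very few cases, which is worth keeping in mind when comparing the two methods.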
Outlier removal, sum scores, and the inflation of the Type I error rate in independent samples t tests: the power of alternatives and recommendations.

Science.gov (United States)

Bakker, Marjan; Wicherts, Jelte M

2014-09-01

In psychology, outliers are often excluded before running an independent samples t test, and data are often nonnormal because of the use of sum scores based on tests and questionnaires. This article concerns the handling of outliers in the context of independent samples t tests applied to nonnormal sum scores. After reviewing common practice, we present results of simulations of artificial and actual psychological data, which show that the removal of outliers based on commonly used Z value thresholds severely increases the Type I error rate. We found Type I error rates of above 20% after removing outliers with a threshold value of Z = 2 in a short and difficult test. Inflations of Type I error rates are particularly severe when researchers are given the freedom to alter threshold values of Z after having seen the effects thereof on outcomes. We recommend the use of nonparametric Mann-Whitney-Wilcoxon tests or robust Yuen-Welch tests without removing outliers. These alternatives to independent samples t tests are found to have nominal Type I error rates with a minimal loss of power when no outliers are present in the data and to have nominal Type I error rates and good power when outliers are present. PsycINFO Database Record (c) 2014 APA, all rights reserved.
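The mechanism described above can be explored with a small Monte Carlo simulation. The sketch below is not the authors' simulation design: it assumes a short, difficult test (10 items, 20% chance of a correct answer per item), 50 people per group, no true group difference, and a Welch-type statistic evaluated against the normal distribution to stay within the standard library. All of these settings are illustrative assumptions.

```python
import math
import random
import statistics
from statistics import NormalDist

def remove_outliers(values: list[float], z_threshold: float = 2.0) -> list[float]:
    """Drop values whose absolute Z value within their own group exceeds the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return list(values)
    return [v for v in values if abs(v - mean) / sd <= z_threshold]

def two_sample_p_value(a: list[float], b: list[float]) -> float:
    """Two-sided p value for a Welch-type statistic, using a normal approximation."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0:
        return 1.0
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def sum_score(rng: random.Random, n_items: int = 10, p_correct: float = 0.2) -> int:
    """Skewed sum score: number of correct answers on a short, difficult test."""
    return sum(1 for _ in range(n_items) if rng.random() < p_correct)

def type_one_error_rates(n_sims: int = 1000, n_per_group: int = 50,
                         alpha: float = 0.05, seed: int = 1) -> tuple[float, float]:
    """Rejection rates under the null without and with per-group outlier removal."""
    rng = random.Random(seed)
    raw_hits = 0
    trimmed_hits = 0
    for _ in range(n_sims):
        a = [sum_score(rng) for _ in range(n_per_group)]
        b = [sum_score(rng) for _ in range(n_per_group)]
        if two_sample_p_value(a, b) < alpha:
            raw_hits += 1
        if two_sample_p_value(remove_outliers(a), remove_outliers(b)) < alpha:
            trimmed_hits += 1
    return raw_hits / n_sims, trimmed_hits / n_sims

raw_rate, trimmed_rate = type_one_error_rates()
print(f"Rejection rate without removal: {raw_rate:.3f}")
print(f"Rejection rate after Z > 2 removal: {trimmed_rate:.3f}")
```

Because both groups are drawn from the same distribution, any rejection is a false positive, so comparing the two printed rates shows how much removal at the chosen threshold changes the error rate under these assumptions.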
Determination of the smoking gun of intent: significance testing of forced choice results in social security claimants.

Science.gov (United States)

Binder, Laurence M; Chafetz, Michael D

2018-01-01

Significantly below-chance findings on forced choice tests have been described as revealing "the smoking gun of intent" that proved malingering. The issues of probability levels, one-tailed vs. two-tailed tests, and the combining of PVT scores on significantly below-chance findings were addressed in a previous study, with a recommendation of a probability level of .20 to test the significance of below-chance results. The purpose of the present study was to determine the rate of below-chance findings in a Social Security Disability claimant sample using the previous recommendations. We compared the frequency of below-chance results on forced choice performance validity tests (PVTs) at two levels of significance, .05 and .20, and when using significance testing on individual subtests of the PVTs compared with total scores in claimants for Social Security Disability in order to determine the rate of the expected increase. The frequency of significant results increased with the higher level of significance for each subtest of the PVT and when combining individual test sections to increase the number of test items, with up to 20% of claimants showing significantly below-chance results at the higher p-value. These findings are discussed in light of Social Security Administration policy, showing an impact on policy issues concerning child abuse and neglect, and the importance of using these techniques in evaluations for Social Security Disability.
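A below-chance result on a forced choice test is usually assessed with a one-tailed binomial test against the chance rate. The sketch below assumes a two-alternative format (chance = 0.5) and uses a hypothetical score; it shows how the same score can fail the .05 criterion yet pass the .20 criterion discussed above.

```python
from math import comb

def below_chance_p_value(correct: int, items: int, chance: float = 0.5) -> float:
    """One-tailed binomial probability of scoring `correct` or fewer by guessing."""
    if not 0 <= correct <= items:
        raise ValueError("correct must be between 0 and items")
    return sum(
        comb(items, k) * chance ** k * (1.0 - chance) ** (items - k)
        for k in range(correct + 1)
    )

# Hypothetical claimant: 2 correct answers out of 10 two-choice items.
p_value = below_chance_p_value(correct=2, items=10)
print(f"p = {p_value:.4f}")
print("below chance at .05:", p_value < 0.05)
print("below chance at .20:", p_value < 0.20)
```

Pooling subtests increases the number of items, which narrows the binomial distribution around chance and so makes a given proportion of errors easier to flag, in line with the abstract's finding for combined test sections.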

Validity and Reliability of Nintendo Wii Fit Balance Scores

Science.gov (United States)

Wikstrom, Erik A.

2012-01-01

Context: Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. Objective: To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Design: Descriptive laboratory study. Setting: Sports medicine research laboratory. Patients or Other Participants: Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Intervention(s): Participants completed a single-limb–stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Main Outcome Measure(s): Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. Results: All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with
Dichotomous scoring of Trails B in patients referred for a dementia evaluation.

Science.gov (United States)

Schmitt, Andrew L; Livingston, Ronald B; Smernoff, Eric N; Waits, Bethany L; Harris, James B; Davis, Kent M

2010-04-01

The Trail Making Test is a popular neuropsychological test and its interpretation has traditionally used time-based scores. This study examined an alternative approach to scoring that is simply based on the examinees' ability to complete the test. If an examinee is able to complete Trails B successfully, they are coded as "completers"; if not, they are coded as "noncompleters." To assess this approach to scoring Trails B, the performance of 97 diagnostically heterogeneous individuals referred for a dementia evaluation was examined. In this sample, 55 individuals successfully completed Trails B and 42 individuals were unable to complete it. Point-biserial correlations indicated a moderate-to-strong association (r(pb)=.73) between the Trails B completion variable and the Total Scale score of the Repeatable Battery for the Assessment of Neurological Status (RBANS), which was larger than the correlation between the Trails B time-based score and the RBANS Total Scale score (r(pb)=.60). As a screen for dementia status, Trails B completion showed a sensitivity of 69% and a specificity of 100% in this sample. These results suggest that dichotomous scoring of Trails B might provide a brief and clinically useful measure of dementia status.
Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance.

Science.gov (United States)

McCarthy, Julie M; Van Iddekinge, Chad H; Lievens, Filip; Kung, Mei-Chuan; Sinar, Evan F; Campion, Michael A

2013-09-01

Considerable evidence suggests that how candidates react to selection procedures can affect their test performance and their attitudes toward the hiring organization (e.g., recommending the firm to others). However, very few studies of candidate reactions have examined one of the outcomes organizations care most about: job performance. We attempt to address this gap by developing and testing a conceptual framework that delineates whether and how candidate reactions might influence job performance. We accomplish this objective using data from 4 studies (total N = 6,480), 6 selection procedures (personality tests, job knowledge tests, cognitive ability tests, work samples, situational judgment tests, and a selection inventory), 5 key candidate reactions (anxiety, motivation, belief in tests, self-efficacy, and procedural justice), 2 contexts (industry and education), 3 continents (North America, South America, and Europe), 2 study designs (predictive and concurrent), and 4 occupational areas (medical, sales, customer service, and technological). Consistent with previous research, candidate reactions were related to test scores, and test scores were related to job performance. Further, there was some evidence that reactions affected performance indirectly through their influence on test scores. Finally, in no cases did candidate reactions affect the prediction of job performance by increasing or decreasing the criterion-related validity of test scores. Implications of these findings and avenues for future research are discussed. PsycINFO Database Record (c) 2013 APA, all rights reserved
Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing.

Science.gov (United States)

Cai, Li

2015-06-01

Lord and Wingersky's (Appl Psychol Meas 8:453-461, 1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined on a grid formed by direct products of quadrature points. However, the increase in computational burden remains exponential in the number of dimensions, making the implementation of the recursive algorithm cumbersome for truly high-dimensional models. In this paper, a dimension reduction method that is specific to the Lord-Wingersky recursions is developed. This method can take advantage of the restrictions implied by hierarchical item factor models, e.g., the bifactor model, the testlet model, or the two-tier model, such that a version of the Lord-Wingersky recursive algorithm can operate on a dramatically reduced set of quadrature points. For instance, in a bifactor model, the dimension of integration is always equal to 2, regardless of the number of factors. The new algorithm not only provides an effective mechanism to produce summed score to IRT scaled score translation tables properly adjusted for residual dependence, but leads to new applications in test scoring, linking, and model fit checking as well. Simulated and empirical examples are used to illustrate the new applications.
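The unidimensional recursion that this abstract builds on can be sketched in a few lines. The following is a minimal illustration of the basic Lord-Wingersky recursion for dichotomous items on a fixed quadrature grid, not the dimension-reduced version developed in the paper; the 2PL response function, the item parameters, the grid, and all function names are assumptions chosen for illustration.

```python
import math


def lord_wingersky(probs):
    """Likelihood of each summed score 0..n at a single ability level.

    probs[i] is the probability of a correct response to item i at that level.
    """
    like = [1.0]  # with zero items, a summed score of 0 is certain
    for p in probs:
        new = [0.0] * (len(like) + 1)
        for s, value in enumerate(like):
            new[s] += value * (1.0 - p)  # item incorrect: score stays at s
            new[s + 1] += value * p      # item correct: score moves to s + 1
        like = new
    return like


def two_pl(theta, a, b):
    """Two-parameter logistic response probability (assumed item model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def summed_score_eap(items, nodes, weights):
    """Posterior mean ability (EAP) for every summed score on a fixed grid."""
    n = len(items)
    num = [0.0] * (n + 1)
    den = [0.0] * (n + 1)
    for theta, w in zip(nodes, weights):
        like = lord_wingersky([two_pl(theta, a, b) for a, b in items])
        for s in range(n + 1):
            num[s] += theta * like[s] * w
            den[s] += like[s] * w
    return [num[s] / den[s] for s in range(n + 1)]


# Hypothetical item parameters (a, b) and a standard normal prior on a grid.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
nodes = [-4.0 + 0.5 * k for k in range(17)]
weights = [math.exp(-0.5 * t * t) for t in nodes]
total = sum(weights)
weights = [w / total for w in weights]

table = summed_score_eap(items, nodes, weights)
for score, eap in enumerate(table):
    print(score, round(eap, 3))
```

The loop over `nodes` is where the cost grows with dimensionality: a direct-product grid in d dimensions has a number of nodes that is exponential in d, which is the burden the abstract's dimension reduction is meant to avoid.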
Comparison of physical therapy anatomy performance and anxiety scores in timed and untimed practical tests.

Science.gov (United States)

Schwartz, Sarah M; Evans, Cathy; Agur, Anne M R

2015-01-01

Students in health care professional programs face many stressful tests that determine successful completion of their program. Test anxiety during these high stakes examinations can affect working memory and lead to poor outcomes. Methods of decreasing test anxiety include lengthening the time available to complete examinations or evaluating students using untimed examinations. There is currently no consensus in the literature regarding whether untimed examinations provide a benefit to test performance in clinical anatomy. This study aimed to determine the impact of timed versus untimed practical tests on Master of Physical Therapy student anatomy performance and test anxiety. Test anxiety was measured using the State-Trait Anxiety Inventory (STAI). Differences in performance, anxiety scores, and time taken were compared using paired sample Student's t-tests. Eighty-one of the 84 students completed the study and provided feedback. Students performed significantly higher on the untimed test (P = 0.005), with a significant reduction in test anxiety (P anxiety. If the intended goal of evaluating health care professional students is to determine fundamental competencies, these factors should be considered when designing future curricula. © 2014 American Association of Anatomists.
Predicting Freshman Grade Point Average From College Admissions Test Scores and State High School Test Scores

OpenAIRE

Koretz, Daniel; Yu, C; Mbekeani, Preeya Pandya; Langi, M.; Dhaliwal, Tasminda Kaur; Braslow, David Arthur

2016-01-01

The current focus on assessing “college and career readiness” raises an empirical question: How do high school tests compare with college admissions tests in predicting performance in college? We explored this using data from the City University of New York and public colleges in Kentucky. These two systems differ in the choice of college admissions test, the stakes for students on the high school test, and demographics. We predicted freshman grade point average (FGPA) from high school GPA an...
Evaluation of the validity of osteoporosis and fracture risk assessment tools (IOF One Minute Test, SCORE, and FRAX) in postmenopausal Palestinian women.

Science.gov (United States)

Kharroubi, Akram; Saba, Elias; Ghannam, Ibrahim; Darwish, Hisham

2017-12-01

Simple self-assessment tools are needed to identify women at high risk of developing osteoporosis. In this study, tools like the IOF One Minute Test, Fracture Risk Assessment Tool (FRAX), and Simple Calculated Osteoporosis Risk Estimation (SCORE) were found to be valid for Palestinian women. The threshold for predicting women at risk for each tool was estimated. The purpose of this study is to evaluate the validity of the updated IOF (International Osteoporosis Foundation) One Minute Osteoporosis Risk Assessment Test, FRAX, and SCORE, as well as age alone, to detect the risk of developing osteoporosis in postmenopausal Palestinian women. Three hundred eighty-two women 45 years and older were recruited, including 131 women with osteoporosis and 251 controls, following bone mineral density (BMD) measurement; 287 completed the questionnaires of the different risk assessment tools. Receiver operating characteristic (ROC) curves were evaluated for each tool using BMD as the gold standard for osteoporosis. The area under the ROC curve (AUC) was the highest for FRAX calculated with BMD for predicting hip fractures (0.897), followed by FRAX for major fractures (0.826), with cut-off values >1.5 and >7.8%, respectively. The IOF One Minute Test AUC (0.629) was the lowest compared to other tested tools but with sufficient accuracy for predicting the risk of developing osteoporosis, with a cut-off value >4 total yes questions out of 18. The SCORE test and age alone were also good predictors of risk for developing osteoporosis. According to the ROC curve for age, women ≥64 years had a higher risk of developing osteoporosis. A higher percentage of women with low BMD (T-score ≤-1.5) or osteoporosis (T-score ≤-2.5) was found among women who were not exposed to the sun, who had menopause before the age of 45 years, or had lower body mass index (BMI) compared to controls. Women who often fall had lower BMI and approximately 27% of the recruited postmenopausal
Pre-test probability risk scores and their use in contemporary management of patients with chest pain: One year stress echo cohort study

Science.gov (United States)

Demarco, Daniela Cassar; Papachristidis, Alexandros; Roper, Damian; Tsironis, Ioannis; Byrne, Jonathan; Monaghan, Mark

2015-01-01

Objectives: To compare how patients with chest pain would be investigated under the two guidelines available to UK cardiologists for the management of patients with stable chest pain: the UK National Institute of Clinical Excellence (NICE) guideline, published in 2010, and the European Society of Cardiology (ESC) guideline, published in 2013. Both guidelines use pre-test probability risk scores to guide the choice of investigation. Design: We undertook a large retrospective study to investigate the outcomes of stress echocardiography. Setting: A large tertiary centre in the UK in contemporary clinical practice. Participants: Two thirds of the patients in the cohort were referred from our rapid access chest pain clinics. Results: We found that the NICE risk score overestimates risk by 20% compared to the ESC risk score. We also found that, based on the NICE guidelines, 44% of the patients presenting with chest pain in this cohort would have been investigated invasively with diagnostic coronary angiography. Using the ESC guidelines, only 0.3% of the patients would be investigated invasively. Conclusion: The large discrepancy between the two guidelines could be easily reduced if NICE adopted the ESC risk score. PMID:26673458
Correlation of Head Impacts to Change in Balance Error Scoring System Scores in Division I Men's Lacrosse Players.

Science.gov (United States)

Miyashita, Theresa L; Diakogeorgiou, Eleni; Marrie, Kaitlyn

Investigation into the effect of cumulative subconcussive head impacts has yielded various results in the literature, with many supporting a link to neurological deficits. Little research has been conducted on men's lacrosse and associated balance deficits from head impacts. (1) Athletes will commit more errors on the postseason Balance Error Scoring System (BESS) test. (2) There will be a positive correlation to change in BESS scores and head impact exposure data. Prospective longitudinal study. Level 3. Thirty-four Division I men's lacrosse players (age, 19.59 ± 1.42 years) wore helmets instrumented with a sensor to collect head impact exposure data over the course of a competitive season. Players completed a BESS test at the start and end of the competitive season. The number of errors from pre- to postseason increased during the double-leg stance on foam ( P impacts sustained over the course of 1 lacrosse season, as measured by average linear acceleration, head injury criteria, and Gadd Severity Index scores. If there is microtrauma to the vestibular system due to repetitive subconcussive impacts, only an assessment that highly stresses the vestibular system may be able to detect these changes. Cumulative subconcussive impacts may result in neurocognitive dysfunction, including balance deficits, which are associated with an increased risk for injury. The development of a strategy to reduce total number of head impacts may curb the associated sequelae. Incorporation of a modified BESS test, firm surface only, may not be recommended as it may not detect changes due to repetitive impacts over the course of a competitive season.
Use of Verbal Descriptors, Thermal Scores and Electrical Pulp Testing Scores as Predictors of Tooth Pain Before and After Application of Benzocaine Gels into Cavities of Teeth with Pulpitis

Science.gov (United States)

Gangarosa, Louis P.; Ciarlone, Alfred E.; Neaverth, Elmer J.; Johnston, Carey A.; Snowden, J. Douglas; Thompson, William O.

1989-01-01

A double-blind pilot study was conducted on 27 consenting human volunteers who had irreversible pulpitis associated with persistent toothache pain from open carious lesions. Formulations tested contained either 0, 10%, or 20% benzocaine and were identified only by a numbered code. Before the experiment started, a small amount of a known 5% benzocaine gel was placed for 1 minute on the tongue of each patient to assure a sensation of numbness within the oral cavity. Then the test tooth was washed with a gentle stream of warm water and dried with gauze. A randomly selected test medication was placed into the open cavity and around the gingival margins for 5 minutes. Pre- and posttreatment tests were conducted at the following timed intervals: 0, 5, 15, 30, 45, 60, 75 and 90 minutes. The tests included degree of pain (rated: 0 = none, 1 = mild, 2 = moderate, 3 = severe); electrical pulp testing (EPT) by a modified, voltage-ramping instrument; and ice water testing (0.5 mL directed quickly onto sound enamel of the tooth and rated: 0 to 4, with 4 being intolerable). After testing, or when pain returned to baseline, endodontic procedures were performed. There was a significant increase (p pulpitis and control teeth, 3) there were no correlations between direction of EPT scores and pain relief, 4) cold water testing was a good predictor of whether or not a tooth had pulpitis, and 5) changes in cold water testing scores after treatment could not be correlated to relief of pain according to verbal descriptors. The effectiveness of benzocaine in relieving toothache pain verifies previous studies; however, a difference between 10% and 20% benzocaine could not be demonstrated probably because of two factors: 1) the present experiment had a small sample size, and 2) there was no direct measurement of duration of local anesthesia. PMID:2490060
Standardized computer-based organized reporting of EEG SCORE - Second version

DEFF Research Database (Denmark)

Beniczky, Sándor; Aurlien, Harald; Brøgger, Jan C

2017-01-01

Standardized terminology for computer-based assessment and reporting of EEG has been previously developed in Europe. The International Federation of Clinical Neurophysiology established a taskforce in 2013 to develop this further, and to reach international consensus. This work resulted in the second, revised version of SCORE (Standardized Computer-based Organized Reporting of EEG), which is presented in this paper. The revised terminology was implemented in a software package (SCORE EEG), which was tested in clinical practice on 12,160 EEG recordings. Standardized terms implemented in SCORE....... In the end, the diagnostic significance is scored, using a standardized list of terms. SCORE has specific modules for scoring seizures (including seizure semiology and ictal EEG patterns), neonatal recordings (including features specific for this age group), and for Critical Care EEG Terminology. SCORE...
Re-Scoring the Game’s Score

DEFF Research Database (Denmark)

Gasselseder, Hans-Peter

2014-01-01

This study explores immersive presence as well as emotional valence and arousal in the context of dynamic and non-dynamic music scores in the 3rd person action-adventure video game genre while also considering relevant personality traits of the player. 60 subjects answered self-report questionnai......-temporal alignment in the resulting emotional congruency of nondiegetic music. Whereas imaginary aspects of immersive presence are systemically affected by the presentation of dynamic music, sensory spatial aspects show higher sensitivity towards the arousal potential of the music score. It is argued...
Investigating the Value of Section Scores for the "TOEFL iBT"® Test. "TOEFL iBT"® Research Report. TOEFL iBT-21. ETS Research Report RR-13-35

Science.gov (United States)

Sawaki, Yasuyo; Sinharay, Sandip

2013-01-01

This study investigates the value of reporting the reading, listening, speaking, and writing section scores for the "TOEFL iBT"® test, focusing on 4 related aspects of the psychometric quality of the TOEFL iBT section scores: reliability of the section scores, dimensionality of the test, presence of distinct score profiles, and the…
Predicting Pre-Service Classroom Teachers' Civil Servant Recruitment Examination's Educational Sciences Test Scores Using Artificial Neural Networks

Science.gov (United States)

Demir, Metin

2015-01-01

This study predicts the number of correct answers given by pre-service classroom teachers in Civil Servant Recruitment Examination's (CSRE) educational sciences test based on their high school grade point averages, university entrance scores, and grades (mid-term and final exams) from their undergraduate educational courses. This study was…
[Prognostic scores for pulmonary embolism].

Science.gov (United States)

Junod, Alain

2016-03-23

Nine prognostic scores for pulmonary embolism (PE), based on retrospective and prospective studies, published between 2000 and 2014, have been analyzed and compared. Most of them aim at identifying PE cases with a low risk to validate their ambulatory care. Important differences in the considered outcomes: global mortality, PE-specific mortality, other complications, sizes of low risk groups, exist between these scores. The most popular score appears to be the PESI and its simplified version. Few good quality studies have tested the applicability of these scores to PE outpatient care, although this approach tends to already generalize in the medical practice.
Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children

Science.gov (United States)

Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.

2018-01-01

Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…
A semiquantitative MRI-Score can predict loss of lung function in patients with cystic fibrosis: Preliminary results

Energy Technology Data Exchange (ETDEWEB)

Schaefer, Juergen F.; Schmidt, Katharina; Teufel, Matthias; Fleischer, Sabrina; Gatidis, Sergios; Schaefer, Susanne; Nikolaou, Konstantin; Tsiflikas, Ilias [University Hospital of Tuebingen, Department of Diagnostic and Interventional Radiology, Tuebingen (Germany); Hector, Andreas; Graepler-Mainka, Ute; Riethmueller, Joachim; Hartl, Dominik [University Children's Hospital of Tuebingen, Department of Paediatrics I, Tuebingen (Germany)

2018-01-15

To evaluate the applicability of a semiquantitative MRI scoring system (MR-CF-S) as a prognostic marker for clinical course of cystic fibrosis (CF) lung disease. This observational study of a single-centre CF cohort included a group of 61 patients (mean age 12.9 ± 4.7 years) receiving morphological and functional pulmonary MRI, pulmonary function testing (PFT) and follow-up of 2 years. MRI was analysed by three raters using MR-CF-S. The inter-rater agreement, correlation of score categories with forced expiratory volume in 1 s (FEV₁) at baseline, and the predictive value of clinical parameters, and score categories was assessed for the whole cohort and a subgroup of 40 patients with moderately impaired lung function. The inter-rater agreement of MR-CF-S was sufficient (mean intraclass correlation coefficient 0.92). MR-CF-S (-0.62; p < 0.05) and most of the categories significantly correlated with FEV₁. Differences between patients with relevant loss of FEV₁ (>3%/year) and normal course were only significant for MR-CF-S (p < 0.05) but not for clinical parameters. Centrilobular opacity (CO) was the most promising score category for prediction of a decline of FEV₁ (area under curve: whole cohort 0.69; subgroup 0.86). MR-CF-S is promising to predict a loss of lung function. CO seems to be a particular finding in CF patients with an abnormal course. (orig.)
Differences in physical-fitness test scores between actively and passively recruited older adults : Consequences for norm-based classification

NARCIS (Netherlands)

van Heuvelen, M.J.G.; Stevens, M.; Kempen, G.I.J.M.

This study investigated differences in physical-fitness test scores between actively and passively recruited older adults and the consequences thereof for norm-based classification of individuals. Walking endurance, grip strength, hip flexibility, balance, manual dexterity, and reaction time were
Treatment for Schistosoma japonicum, reduction of intestinal parasite load, and cognitive test score improvements in school-aged children.

Directory of Open Access Journals (Sweden)

Amara E Ezeamama

To determine whether treatment of intestinal parasitic infections improves cognitive function in school-aged children, we examined changes in cognitive test scores over 18 months in relation to: (i) treatment-related Schistosoma japonicum intensity decline, (ii) spontaneous reduction of single soil-transmitted helminth (STH) species, and (iii) ≥2 STH infections among 253 S. japonicum-infected children. Helminth infections were assessed at baseline and quarterly by the Kato-Katz method. S. japonicum infection was treated at baseline using praziquantel. An intensity-based indicator of lower vs. no change/higher infection was defined separately for each helminth species and joint intensity declines of ≥2 STH species. In addition, S. japonicum infection-free duration was defined in four categories based on time of schistosome re-infection: >18 (i.e. cured), >12 to ≤18, >6 to ≤12 and ≤6 (persistently infected) months. There was no baseline treatment for STHs but their intensity varied possibly due to spontaneous infection clearance/acquisition. Four cognitive tests were administered at baseline, 6, 12, and 18 months following S. japonicum treatment: learning and memory domains of Wide Range Assessment of Memory and Learning (WRAML), verbal fluency (VF), and Philippine nonverbal intelligence test (PNIT). Linear regression models were used to relate changes in respective infections to test performance with adjustment for sociodemographic confounders and coincident helminth infections. Children cured (β = 5.8; P = 0.02) and those schistosome-free for >12 months (β = 1.5; P = 0.03) scored higher in WRAML memory and VF tests compared to persistently infected children independent of STH infections. A decline vs. no change/increase of any individual STH species (β:11.5-14.5; all P12 months post-treatment and those who experienced declines of ≥2 STH species scored higher in three of four cognitive tests. Our result suggests that sustained
Anthropometric and Athletic Performance Combine Test Results Among Positions Within Grade Levels of High School-Aged American Football Players.

Science.gov (United States)

Leutzinger, Todd J; Gillen, Zachary M; Miramonti, Amelia M; McKay, Brianna D; Mendez, Alegra I; Cramer, Joel T

2018-05-01

Leutzinger, TJ, Gillen, ZM, Miramonti, AM, McKay, BD, Mendez, AI, and Cramer, JT. Anthropometric and athletic performance combine test results among positions within grade levels of high school-aged American football players. J Strength Cond Res 32(5): 1288-1296, 2018-The purpose of this study was to investigate differences among player positions at 3 grade levels in elite, collegiate-prospective American football players. Participants' data (n = 7,160) were analyzed for this study (mean height [Ht] ± SD = 178 ± 7 cm, mass [Bm] = 86 ± 19 kg). Data were obtained from 12 different high school American football recruiting combines hosted by Zybek Sports (Boulder, Colorado). Eight 2-way (9 × 3) mixed factorial analysis of variances {position (defensive back [DB], defensive end, defensive lineman, linebacker, offensive lineman [OL], quarterback, running back, tight end, and wide receiver [WR]) × grade (freshmen, sophomores, and juniors)} were used to test for differences among the mean test scores for each combine measure (Ht, Bm, 40-yard [40 yd] dash, proagility [PA] drill, L-cone [LC] drill, vertical jump [VJ], and broad jump [BJ]). There were position-related differences (p ≤ 0.05) for Ht, 40 yd dash, and BJ, within each grade level and for Bm, PA, LC, and VJ independent of grade level. Generally, the results showed that OL were the tallest, weighed the most, and exhibited the lowest performance scores among positions. Running backs were the shortest, whereas DBs and WRs weighed the least and exhibited the highest performance scores among positions. These results demonstrate the value of classifying high school-aged American football players according to their specific position rather than categorical groupings such as "line" vs. "skill" vs. "big skill" when evaluating anthropometric and athletic performance combine test results.

Addressing criticisms of existing predictive bias research: cognitive ability test scores still overpredict African Americans' job performance.

Science.gov (United States)

Berry, Christopher M; Zhao, Peng

2015-01-01

Predictive bias studies have generally suggested that cognitive ability test scores overpredict job performance of African Americans, meaning these tests are not predictively biased against African Americans. However, at least 2 issues call into question existing over-/underprediction evidence: (a) a bias identified by Aguinis, Culpepper, and Pierce (2010) in the intercept test typically used to assess over-/underprediction and (b) a focus on the level of observed validity instead of operational validity. The present study developed and utilized a method of assessing over-/underprediction that draws on the math of subgroup regression intercept differences, does not rely on the biased intercept test, allows for analysis at the level of operational validity, and can use meta-analytic estimates as input values. Therefore, existing meta-analytic estimates of key parameters, corrected for relevant statistical artifacts, were used to determine whether African American job performance remains overpredicted at the level of operational validity. African American job performance was typically overpredicted by cognitive ability tests across levels of job complexity and across conditions wherein African American and White regression slopes did and did not differ. Because the present study does not rely on the biased intercept test and because appropriate statistical artifact corrections were carried out, the present study's results are not affected by the 2 issues mentioned above. The present study represents strong evidence that cognitive ability tests generally overpredict job performance of African Americans. (c) 2015 APA, all rights reserved.
Relationship Between Broiler Body Weights, Eimeria maxima Gross Lesion Scores, and Microscores in Three Anticoccidial Sensitivity Tests.

Science.gov (United States)

Barrios, Miguel A; Da Costa, Manuel; Kimminau, Emily; Fuller, Lorraine; Clark, Steven; Pesti, Gene; Beckstead, Robert

2017-06-01

Anticoccidial sensitivity tests (ASTs) serve to determine the efficacy of anticoccidial drugs against Eimeria field isolates in a controlled laboratory setting. The most commonly measured parameters are body weight gain, feed conversion ratio, gross intestinal lesion scores, and mortality. Due to the difficulty in reliably scoring gross lesion scores of Eimeria maxima, microscopic analysis of intestinal scrapings (microscores) can be used in the field to indicate the presence of this particular Eimeria. The goal of this study was to determine the relationship between E. maxima microscores and broiler body weights and gross E. maxima lesion scores in three ASTs. Day-old broiler chicks were raised for 12 days on a standard corn-soy diet. On Day 12, chicks were placed in Petersime batteries and treatment diets were provided. There were six birds per pen, four pens per treatment, and 12 treatments, for a total of 288 chicks per AST. The treatments were as follows: 1) nonmedicated, noninfected; 2) nonmedicated, infected; 3) lasalocid, infected; 4) salinomycin, infected; 5) diclazuril, infected; 6) monensin, infected; 7) decoquinate, infected; 8) narasin + nicarbazin, infected; 9) narasin, infected; 10) nicarbazin, infected; 11) robenidine, infected; and 12) zoalene, infected. On Day 14, chicks were challenged with an Eimeria field isolate by oral gavage. On Day 20, broilers were weighed, and gross lesion scores and microscores were classified from 0 to 4 depending on the severity of the gross lesion scores and E. maxima microscores. Data from three trials using different field isolates were statistically analyzed using a logarithmic regression model. There was no relationship (P = 0.1224) between microscores and body weight gain. There was a positive relationship between microscores and gross lesion scores (P = 0.004). However, there was also an interaction between isolate and treatment (P Eimeria or the amount of E. maxima in the inoculum.
Validity and reliability of Nintendo Wii Fit balance scores.

Science.gov (United States)

Wikstrom, Erik A

2012-01-01

Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Descriptive laboratory study. Sports medicine research laboratory. Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Participants completed a single-limb-stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT
New clinical score to diagnose nonalcoholic steatohepatitis in obese patients

Directory of Open Access Journals (Sweden)

Pulzi Fernanda BU

2011-02-01

Background: Nonalcoholic fatty liver disease (NAFLD) is the most frequent disease associated with abnormal liver tests and is characterized by a wide spectrum of liver damage, ranging from simple macrovesicular steatosis to steatohepatitis (NASH), cirrhosis or liver carcinoma. Liver biopsy is the most precise test to differentiate NASH from other stages of NAFLD, but it is an invasive and expensive method. This study aimed to create a clinical laboratory score capable of identifying individuals with NASH among severely obese patients submitted to bariatric surgery. Methods: The medical records from 66 patients submitted to gastroplasty were reviewed. Their chemistry profile, abdominal ultrasound (US) and liver biopsy done during the surgical procedure were analyzed. Patients were classified into 2 groups according to liver biopsy: Non-NASH group - those patients without NAFLD or with grade I, II or III steatosis; and NASH group - those with steatohepatitis or fibrosis. The t-test was used to compare each variable with normal distribution between NASH and Non-NASH groups. When comparing proportions of categorical variables, we used chi-square or z-test, where appropriate. A p-value Results: 83% of patients with obesity grades II or III showed NAFLD, and the majority were asymptomatic. Total cholesterol (TC) ≥200 mg/dL, alanine aminotransferase (ALT) ≥30, AST/ALT ratio (AAR) ≤1, gamma-glutamyl transferase (γGT) ≥30 U/L and abdominal US compatible with steatosis were associated with the NASH group. We proposed 2 scores: the complete score (TC, ALT, AAR, γGT and US) and the simplified score, in which US was not included. The combination of biochemical and imaging results improved the accuracy of recognizing NASH to 84.4% (sensitivity 70%, specificity 88.6%, NPV 91.2%, PPV 63.6%). Conclusion: Alterations in TC, ALT, AAR, γGT and US are related to the highest risk for NASH. The combination of biochemical and imaging results improved accuracy to 84.4% the
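The accuracy measures quoted in this abstract all derive from a 2×2 table of score result against biopsy result. The sketch below shows the standard definitions; the function name is illustrative, and the counts are hypothetical, chosen only because they give percentages close to those quoted. The abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table (test result vs. reference)."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly flagged
        "specificity": tn / (tn + fp),  # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who are truly diseased
        "npv": tn / (tn + fn),          # cleared patients who are truly non-diseased
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only; not taken from the study.
metrics = diagnostic_metrics(tp=7, fp=4, fn=3, tn=31)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Sensitivity and specificity describe the score itself, whereas PPV and NPV also depend on how common NASH is in the sample, so they would not carry over unchanged to a population with a different prevalence.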
Continuous equilibrium scores: factoring in the time before a fall.

Science.gov (United States)

Wood, Scott J; Reschke, Millard F; Owen Black, F

2012-07-01

The equilibrium (EQ) score commonly used in computerized dynamic posturography is normalized between 0 and 100, with falls assigned a score of 0. The resulting mixed discrete-continuous distribution limits certain statistical analyses and treats all trials with falls equally. We propose a simple modification of the formula in which peak-to-peak sway data from trials with falls is scaled according to the percent of the trial completed to derive a continuous equilibrium (cEQ) score. The cEQ scores for trials without falls remain unchanged from the original methodology. The cEQ factors in the time before a fall and results in a continuous variable retaining the central tendencies of the original EQ distribution. A random set of 5315 Sensory Organization Test trials were pooled that included 81 falls. A comparison of the original and cEQ distributions and their rank ordering demonstrated that trials with falls continue to constitute the lower range of scores with the cEQ methodology. The area under the receiver operating characteristic curve (0.997) demonstrates that the cEQ retained near-perfect discrimination between trials with and without falls. We conclude that the cEQ score provides the ability to discriminate ballistic falls from falls that occur later in the trial. This approach of incorporating time and sway magnitude can be easily extended to enhance other balance tests that include fall data or incomplete trials. Copyright © 2012 Elsevier B.V. All rights reserved.
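The area under the ROC curve mentioned above has a simple rank interpretation: it is the probability that a randomly chosen trial without a fall receives a higher score than a randomly chosen trial with a fall. A minimal sketch of that calculation follows; the function name and the scores are hypothetical, and this is not the authors' analysis code.

```python
def roc_auc(higher_group, lower_group):
    """Probability that a score from higher_group exceeds one from lower_group.

    Ties count as half a win, which matches the Mann-Whitney form of the AUC.
    """
    wins = 0.0
    for a in higher_group:
        for b in lower_group:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(higher_group) * len(lower_group))


# Hypothetical scores for illustration only.
no_fall_scores = [82.0, 75.5, 68.0, 90.1]
fall_scores = [12.0, 30.5, 70.0]

auc = roc_auc(no_fall_scores, fall_scores)
print(round(auc, 3))
```

An AUC of 1.0 would mean every non-fall trial outscored every fall trial, and 0.5 would mean the score carries no information about falls.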
Some new results on correlation-preserving factor scores prediction methods

NARCIS (Netherlands)

Ten Berge, J.M.F.; Krijnen, W.P.; Wansbeek, T.J.; Shapiro, A.

1999-01-01

Anderson and Rubin and McDonald have proposed a correlation-preserving method of factor scores prediction which minimizes the trace of a residual covariance matrix for variables. Green has proposed a correlation-preserving method which minimizes the trace of a residual covariance matrix for factors.
Target-specific support vector machine scoring in structure-based virtual screening: computational validation, in vitro testing in kinases, and effects on lung cancer cell proliferation.

Science.gov (United States)

Li, Liwei; Khanna, May; Jo, Inha; Wang, Fang; Ashpole, Nicole M; Hudmon, Andy; Meroueh, Samy O

2011-04-25

We assess the performance of our previously reported structure-based support vector machine target-specific scoring function across 41 targets, 40 among them from the Directory of Useful Decoys (DUD). The area under the curve of receiver operating characteristic plots (ROC-AUC) revealed that scoring with SVM-SP resulted in consistently better enrichment over all target families, outperforming Glide and other scoring functions, most notably among kinases. In addition, SVM-SP performance showed little variation among protein classes, exhibited excellent performance in a test case using a homology model, and in some cases showed high enrichment even with few structures used to train a model. We put SVM-SP to the test by virtual screening 1125 compounds against two kinases, EGFR and CaMKII. Among the top 25 EGFR compounds, three compounds (1-3) inhibited kinase activity in vitro with IC₅₀ of 58, 2, and 10 μM. In cell cultures, compounds 1-3 inhibited nonsmall cell lung carcinoma (H1299) cancer cell proliferation with similar IC₅₀ values for compound 3. For CaMKII, one compound inhibited kinase activity in a dose-dependent manner among 20 tested with an IC₅₀ of 48 μM. These results are encouraging given that our in-house library consists of compounds that emerged from virtual screening of other targets with pockets that are different from typical ATP binding sites found in kinases. In light of the importance of kinases in chemical biology, these findings could have implications in future efforts to identify chemical probes of kinases within the human kinome.
Factors Affecting Result in Chinese Proficiency Test (HSK) Level 6: Reading Section and Preparation Strategies

Directory of Open Access Journals (Sweden)

Sri Haryanti

2013-11-01

Full Text Available Chinese Proficiency Test (HSK) is an internationally standardized exam which tests and rates Chinese language proficiency. The highest level in this test is level 6. The written test consists of three parts: (1) listening, (2) reading, and (3) writing. Furthermore, the reading part is made up of 4 components. Level 6 of this test implies a high degree of difficulty. This paper specifically examined how participants can prepare effectively for the reading part in order to achieve the best result. This article used the methods of literature review and observational study as well as field research, and also incorporated the author’s personal experience in taking the test into recommending strategies for doing the reading part in a level 6 HSK test. Finally, the research suggested several techniques and tips that might assist participants in achieving maximum scores in the reading part of the level 6 HSK test.
Differential Predictive Validity of High School GPA and College Entrance Test Scores for University Students in Yemen

Science.gov (United States)

Al-Hattami, Abdulghani Ali Dawod

2012-01-01

High school grade point average and college entrance test scores are two admission criteria that are currently used by most colleges in Yemen to select their prospective students. Given their widespread use, it is important to investigate their predictive validity to ensure the accuracy of the admission decisions in these institutions. This study…
Chronic obstructive pulmonary disease (COPD) assessment test scores corresponding to modified Medical Research Council grades among COPD patients.

Science.gov (United States)

Lee, Chang-Hoon; Lee, Jinwoo; Park, Young Sik; Lee, Sang-Min; Yim, Jae-Joon; Kim, Young Whan; Han, Sung Koo; Yoo, Chul-Gyu

2015-09-01

In assigning patients with chronic obstructive pulmonary disease (COPD) to subgroups according to the updated guidelines of the Global Initiative for Chronic Obstructive Lung Disease, discrepancies have been noted between the COPD assessment test (CAT) criteria and modified Medical Research Council (mMRC) criteria. We investigated the determinants of symptom and risk groups and sought to identify a better CAT criterion. This retrospective study included COPD patients seen between June 20, 2012, and December 5, 2012. The CAT score that can accurately predict an mMRC grade ≥ 2 versus COPD patients, the percentages of patients classified into subgroups A, B, C, and D were 24.5%, 47.2%, 4.2%, and 24.1% based on CAT criteria and 49.3%, 22.4%, 8.9%, and 19.4% based on mMRC criteria, respectively. More than 90% of the patients who met the mMRC criteria for the 'more symptoms group' also met the CAT criteria. AUROC and CART analyses suggested that a CAT score ≥ 15 predicted an mMRC grade ≥ 2 more accurately than the current CAT score criterion. During follow-up, patients with CAT scores of 10 to 14 did not have a different risk of exacerbation versus those with CAT scores COPD patients.
In Vitro Testing of Scaffolds for Mesenchymal Stem Cell-Based Meniscus Tissue Engineering—Introducing a New Biocompatibility Scoring System

Directory of Open Access Journals (Sweden)

Felix P. Achatz

2016-04-01

Full Text Available A combination of mesenchymal stem cells (MSCs) and scaffolds seems to be a promising approach for meniscus repair. To facilitate the search for an appropriate scaffold material a reliable and objective in vitro testing system is essential. This paper introduces a new scoring for this purpose and analyzes a hyaluronic acid (HA) gelatin composite scaffold and a polyurethane scaffold in combination with MSCs for tissue engineering of meniscus. The pore quality and interconnectivity of pores of a HA gelatin composite scaffold and a polyurethane scaffold were analyzed by surface photography and Berliner-Blau-BSA-solution vacuum filling. Further the two scaffold materials were vacuum-filled with human MSCs and analyzed by histology and immunohistochemistry after 21 days in chondrogenic media to determine cell distribution and cell survival as well as proteoglycan production, collagen type I and II content. The polyurethane scaffold showed better results than the hyaluronic acid gelatin composite scaffold, with signs of central necrosis in the HA gelatin composite scaffolds. The polyurethane scaffold showed good porosity, excellent pore interconnectivity, good cell distribution and cell survival, as well as an extensive content of proteoglycans and collagen type II. The polyurethane scaffold seems to be a promising biomaterial for a mesenchymal stem cell-based tissue engineering approach for meniscal repair. The new score could be applied as a new standard for in vitro scaffold testing.
Cognitive disparities, lead plumbing, and water chemistry: prior exposure to water-borne lead and intelligence test scores among World War Two U.S. Army enlistees.

Science.gov (United States)

Ferrie, Joseph P; Rolf, Karen; Troesken, Werner

2012-01-01

Higher prior exposure to water-borne lead among male World War Two U.S. Army enlistees was associated with lower intelligence test scores. Exposure was proxied by urban residence and the water pH levels of the cities where enlistees lived in 1930. Army General Classification Test scores were six points lower (nearly 1/3 standard deviation) where pH was 6 (so the water lead concentration for a given amount of lead piping was higher) than where pH was 7 (so the concentration was lower). This difference rose with time exposed. At this time, the dangers of exposure to lead in water were not widely known and lead was ubiquitous in water systems, so these results are not likely the effect of individuals selecting into locations with different levels of exposure. Copyright © 2011 Elsevier B.V. All rights reserved.
An Argument against Using Standardized Test Scores for Placement of International Undergraduate Students in English as a Second Language (ESL) Courses

Science.gov (United States)

Kokhan, Kateryna

2013-01-01

Development and administration of institutional ESL placement tests require a great deal of financial and human resources. Due to a steady increase in the number of international students studying in the United States, some US universities have started to consider using standardized test scores for ESL placement. The English Placement Test (EPT)…
Patient-reported speech in noise difficulties and hyperacusis symptoms and correlation with test results.

Science.gov (United States)

Spyridakou, Chrysa; Luxon, Linda M; Bamiou, Doris E

2012-07-01

To compare self-reported symptoms of difficulty hearing speech in noise and hyperacusis in adults with auditory processing disorders (APDs) and normal controls; and to compare self-reported symptoms to objective test results (speech in babble test, transient evoked otoacoustic emission [TEOAE] suppression test using contralateral noise). A prospective case-control pilot study. Twenty-two participants were recruited in the study: 10 patients with reported hearing difficulty, normal audiometry, and a clinical diagnosis of APD; and 12 normal age-matched controls with no reported hearing difficulty. All participants completed the validated Amsterdam Inventory for Auditory Disability questionnaire, a hyperacusis questionnaire, a speech in babble test, and a TEOAE suppression test using contralateral noise. Patients had significantly worse scores than controls in all domains of the Amsterdam Inventory questionnaire (with the exception of sound detection) and the hyperacusis questionnaire (P reported symptoms of difficulty hearing speech in noise and speech in babble test results in the right ear (ρ = 0.624, P = .002), and between self-reported symptoms of hyperacusis and TEOAE suppression test results in the right ear (ρ = -0.597 P = .003). There was no significant correlation between the two tests. A strong correlation was observed between right ear speech in babble and patient-reported intelligibility of speech in noise, and right ear TEOAE suppression by contralateral noise and hyperacusis questionnaire. Copyright © 2012 The American Laryngological, Rhinological, and Otological Society, Inc.
Prediction of antigenic epitopes on protein surfaces by consensus scoring

Directory of Open Access Journals (Sweden)

Zhang Chi

2009-09-01

Full Text Available Abstract Background Prediction of antigenic epitopes on protein surfaces is important for vaccine design. Most existing epitope prediction methods focus on protein sequences to predict continuous epitopes linear in sequence. Only a few structure-based epitope prediction algorithms are available and they have not yet shown satisfying performance. Results We present a new antigen Epitope Prediction method, which uses ConsEnsus Scoring (EPCES) from six different scoring functions - residue epitope propensity, conservation score, side-chain energy score, contact number, surface planarity score, and secondary structure composition. Applied to unbound antigen structures from an independent test set, EPCES was able to predict antigenic epitopes with 47.8% sensitivity, 69.5% specificity and an AUC value of 0.632. The performance of the method is statistically similar to other published methods. The AUC value of EPCES is slightly higher than the best results of existing algorithms, by about 0.034. Conclusion Our work shows that consensus scoring of multiple features performs better than any single term. The successful prediction is also due to the new score of residue epitope propensity based on atomic solvent accessibility.
External Validation of the Simple Clinical Score and the HOTEL Score, Two Scores for Predicting Short-Term Mortality after Admission to an Acute Medical Unit

DEFF Research Database (Denmark)

Stræde, Mia; Brabrand, Mikkel

2014-01-01

with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. METHODS: Pre-planned prospective observational cohort study. SETTING: Danish 460.......932 to 0.988) for 24-hours mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ2 = 2.68 (10 degrees of freedom), P = 0.998 and χ2 = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95......% CI, 0.901-0.962) for 24-hours mortality and goodness-of-fit test, χ2 = 5.56 (10 degrees of freedom), P = 0.234. CONCLUSION: We find that both the SCS and HOTEL scores showed an excellent to outstanding ability in identifying patients at high risk of dying with good or acceptable precision....
Irradiation effects test Series Scoping Test 1: test results report

International Nuclear Information System (INIS)

Quapp, W.J.; Allison, C.M.; Farrar, L.C.

1977-09-01

The report describes the results of the first scoping test in the Irradiation Effects Test Series conducted by the Thermal Fuels Behavior Program, which is part of the Water Reactor Research Program of EG and G Idaho, Inc. The research is sponsored by the United States Nuclear Regulatory Commission. This test used an unirradiated, three-foot-long, PWR-type fuel rod. The objective of this test was to thoroughly evaluate the remote fabrication procedures to be used for irradiated rods in future tests, handling plans, and reactor operations. Additionally, selected fuel behavior data were obtained. The fuel rod was subjected to a series of preconditioning power cycles followed by a power increase which brought the fuel rod power to about 20.4 kW/ft peak linear heat rating at a coolant mass flux of 1.83 × 10⁶ lb/hr-ft². Film boiling occurred for a period of 4.8 minutes following flow reductions to 9.6 × 10⁵ and 7.5 × 10⁵ lb/hr-ft². The test fuel rod failed following reactor shutdown as a result of heavy internal and external cladding oxidation and embrittlement which occurred during the film boiling operation.
A Case for Adjusting Subjectively Rated Scores in the Advanced Placement Tests. Program Statistics Research. Technical Report No. 94-5.

Science.gov (United States)

Longford, Nicholas T.

A case is presented for adjusting the scores for free response items in the Advanced Placement (AP) tests. Using information about the rating process from the reliability studies, administrations of the AP test for three subject areas, psychology, computer science, and English language and composition, are analyzed. In the reliability studies, 299…
Advantages of micronuclei analysis through images autocapturing and screen scoring

International Nuclear Information System (INIS)

González, J.E.; Martínez-López, W.

2015-01-01

The cytokinesis-block micronucleus (CBMN) test is a quantitative assay for genetic toxicity assessment. One of the advantages of the MN assay is its amenability to automation. Different types of cells have been used to evaluate genetic damage through the MN assay, such as human lymphocytes and rodent cell lines (i.e. CHO, V79, CHL and L5178Y). MN quantification is a time-consuming process and several efforts have been made to automate it. Some of them include an operator checking step, like the PathFinder CellScan System, or are fully automated, such as MNScore from MetaSystems. Usually, fully automated systems detect two or three times fewer MN than visual scoring. In some cases, the impact of false positive detection is reduced with a visual detection step. In the present work we have tested a combination of image autocapturing of CHOK1 cells previously treated with bleomycin (0, 2.5, 5.0 and 10.0 μg/ml) or UVC (0, 4, 8 and 16 J/m²) with screen scoring. Capturing images using the AutoCapture option of Metafer 4 from MetaSystems (GmbH, Germany) plus screen scoring renders results similar to microscopic live scoring in terms of MN cell frequency. The resultant bias from the Bland–Altman analysis was -1.1% with confidence intervals between -2.2% and -0.1%, indicating an acceptable agreement between both MN scoring methods. However, the mean time devoted to live microscope scoring per sample was 159 minutes compared to 39 minutes for microscope image autocapturing and screen scoring. Therefore, it becomes advantageous to combine autocapturing of microscope images with screen scoring when many samples have to be analyzed for radiological biodosimetry purposes. (authors)
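A Bland–Altman comparison of two scoring methods reduces to the mean of the paired differences (the bias) and an interval around it. The Python sketch below computes the bias and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The paired values are hypothetical, the function name is illustrative, and the interval quoted in the abstract may have been derived differently (for example as a confidence interval of the bias itself).

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (a - b); the limits are the usual 95% limits of
    agreement, bias +/- 1.96 * sample SD of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical MN cell frequencies (%) from screen scoring and live scoring.
screen = [10.0, 12.0, 14.0, 16.0]
live = [11.0, 12.0, 16.0, 17.0]
bias, lower, upper = bland_altman(screen, live)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits is what supports treating the faster method as interchangeable with the slower one.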
dBBQs: dataBase of Bacterial Quality scores.

Science.gov (United States)

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-12-28

It is well-known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth in sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA genes, number of tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, and the results can be used for further analysis. In addition, the search results are shown in an interactive JavaScript chart framework using DC.js. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs ) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data from the four measurements that were combined to make the quality score for each genome are provided and can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.

Pre-season adductor squeeze test and HAGOS function sport and recreation subscale scores predict groin injury in Gaelic football players.

Science.gov (United States)

Delahunt, Eamonn; Fitzpatrick, Helen; Blake, Catherine

2017-01-01

To determine if pre-season adductor squeeze test and HAGOS function, sport and recreation subscale scores can identify Gaelic football players at risk of developing groin injury. Prospective study. Senior inter-county Gaelic football team. Fifty-five male elite Gaelic football players (age = 24.0 ± 2.8 years, body mass = 84.48 ± 7.67 kg, height = 1.85 ± 0.06 m, BMI = 24.70 ± 1.77 kg/m²) from a single senior inter-county Gaelic football team. Occurrence of groin injury during the season. Ten time-loss groin injuries were registered representing 13% of all injuries. The odds ratio for sustaining a groin injury if pre-season adductor squeeze test score was below 225 mmHg, was 7.78. The odds ratio for sustaining a groin injury if pre-season HAGOS function, sport and recreation subscale score was football players at risk of developing groin injury. Copyright © 2016 Elsevier Ltd. All rights reserved.
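An odds ratio such as the one reported for the squeeze-test cut-off comes from a 2×2 table of exposure (score below the cut-off or not) against outcome (injured or not). The Python sketch below computes the odds ratio with an approximate 95% confidence interval on the log scale. The counts are hypothetical and are not the study's data, and the function name is illustrative.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio and approximate 95% CI (log method) from a 2x2 table.

    All four cells must be non-zero for this simple formula to apply.
    """
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(odds ratio)
    lower = math.exp(math.log(ratio) - 1.96 * se_log)
    upper = math.exp(math.log(ratio) + 1.96 * se_log)
    return ratio, lower, upper


# Hypothetical counts: players below / above the cut-off, injured / not injured.
ratio, lower, upper = odds_ratio(10, 5, 2, 8)
print(f"OR = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With only ten injuries in a squad of this size, an interval computed this way would be expected to be wide, which is worth keeping in mind when reading a single point estimate.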
Performance on large-scale science tests: Item attributes that may impact achievement scores

Science.gov (United States)

Gordon, Janet Victoria

Significant differences in achievement among ethnic groups persist on the eighth-grade science Washington Assessment of Student Learning (WASL). The WASL measures academic performance in science using both scenario and stand-alone question types. Previous research suggests that presenting target items connected to an authentic context, like scenario question types, can increase science achievement scores especially in underrepresented groups and thus help to close the achievement gap. The purpose of this study was to identify significant differences in performance between gender and ethnic subgroups by question type on the 2005 eighth-grade science WASL. MANOVA and ANOVA were used to examine relationships between gender and ethnic subgroups as independent variables with achievement scores on scenario and stand-alone question types as dependent variables. MANOVA revealed no significant effects for gender, suggesting that the 2005 eighth-grade science WASL was gender neutral. However, there were significant effects for ethnicity. ANOVA revealed significant effects for ethnicity and ethnicity by gender interaction in both question types. Effect sizes were negligible for the ethnicity by gender interaction. Large effect sizes between ethnicities on scenario question types became moderate to small effect sizes on stand-alone question types. This indicates the score advantage the higher performing subgroups had over the lower performing subgroups was not as large on stand-alone question types compared to scenario question types. A further comparison examined performance on multiple-choice items only within both question types. Similar achievement patterns between ethnicities emerged; however, achievement patterns between genders changed in boys' favor. Scenario question types appeared to register differences between ethnic groups to a greater degree than stand-alone question types. These differences may be attributable to individual differences in cognition
Prediction of mortality using on-line, self-reported health data: empirical test of the RealAge score.

Directory of Open Access Journals (Sweden)

William R Hobbs

Full Text Available OBJECTIVE: We validate an online, personalized mortality risk measure called "RealAge" assigned to 30 million individuals over the past 10 years. METHODS: 188,698 RealAge survey respondents were linked to California Department of Public Health death records using a one-way cryptographic hash of first name, last name, and date of birth. 1,046 were identified as deceased. We used Cox proportional hazards models and receiver operating characteristic (ROC) curves to estimate the relative scales and predictive accuracies of chronological age, the RealAge score, and the Framingham ATP-III score for hard coronary heart disease (HCHD) in these data. To address concerns about selection and to examine possible heterogeneity, we compared the results by time to death at registration, underlying cause of death, and relative health among users. RESULTS: The RealAge score is accurately scaled (hazard ratios: age 1.076; RealAge-age 1.084) and more accurate than chronological age (age c-statistic: 0.748; RealAge c-statistic: 0.847) in predicting mortality from hard coronary heart disease following survey completion. The score is more accurate than the Framingham ATP-III score for hard coronary heart disease (c-statistic: 0.814), perhaps because self-reported cholesterol levels are relatively uninformative in the RealAge user sample. RealAge predicts deaths from malignant neoplasms, heart disease, and external causes. The score does not predict malignant neoplasm deaths when restricted to users with no smoking history, no prior cancer diagnosis, and no indicated health interest in cancer (p-value 0.820). CONCLUSION: The RealAge score is a valid measure of mortality risk in its user population.
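Record linkage through a one-way hash works by deriving the same opaque key from the identifying fields in both data sets and then matching on the key rather than on the names. The Python sketch below illustrates the general technique only. The abstract does not say which hash function or normalization was used, so SHA-256, the lower-casing and trimming, the field separator, and the function name `linkage_key` are all assumptions.

```python
import hashlib


def linkage_key(first_name: str, last_name: str, date_of_birth: str) -> str:
    """Derive a one-way key from identifying fields (assumed scheme).

    Fields are trimmed and lower-cased so that trivial formatting
    differences between the two data sets do not prevent a match.
    """
    parts = (first_name, last_name, date_of_birth)
    normalized = "|".join(part.strip().lower() for part in parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two differently formatted entries for the same (fictional) person match.
survey_key = linkage_key(" Ada", "LOVELACE", "1815-12-10")
registry_key = linkage_key("ada", "lovelace ", "1815-12-10")
print(survey_key == registry_key)
```

A plain hash of names and birth dates can be reversed by enumerating likely inputs, so practical linkage schemes commonly use a keyed hash with a secret shared by the two parties; whether that was done here is not stated.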
Irradiation effects test series, test IE-5. Test results report

International Nuclear Information System (INIS)

Croucher, D.W.; Yackle, T.R.; Allison, C.M.; Ploger, S.A.

1978-01-01

Test IE-5, conducted in the Power Burst Facility at the Idaho National Engineering Laboratory, employed three 0.97-m long pressurized water reactor type fuel rods, fabricated from previously irradiated zircaloy-4 cladding and one similar rod fabricated from unirradiated cladding. The objectives of the test were to evaluate the influence of simulated fission products, cladding irradiation damage, and fuel rod internal pressure on pellet-cladding interaction during a power ramp and on fuel rod behavior during film boiling operation. The four rods were subjected to a preconditioning period, a power ramp to an average fuel rod peak power of 65 kW/m, and steady state operation for one hour at a coolant mass flux of 4880 kg/s-m² for each rod. After a flow reduction to 1800 kg/s-m², film boiling occurred on one rod. Additional flow reductions to 970 kg/s-m² produced film boiling on the three remaining fuel rods. Maximum time in film boiling was 80 s. The rod having the highest initial internal pressure (8.3 MPa) failed 10 s after the onset of film boiling. A second rod failed about 90 s after reactor shutdown. The report contains a description of the experiment, the test conduct, test results, and results from the preliminary postirradiation examination. Calculations using a transient fuel rod behavior code are compared with the test results.
The specificity of the Stroop interference score of errors to ADHD in boys

DEFF Research Database (Denmark)

Sørensen, L; Plessen, K J; Adolfsdottir, S

2014-01-01

scores on the Inhibit scale from the Behavior Rating Inventory of Executive Function. These findings support that a Stroop interference score of errors is sensitive for inhibition problems in children with ADHD and encourages the use of Stroop versions including error recordings independent of response......The Stroop Interference Test is widely used to assess the inhibition function; however, divergent results have emerged from meta-analyses in children with ADHD. This has led to conflicting results as to whether the Stroop test detects the level of inhibition in these children. We hypothesized...... that the general approach to include interference scores depending on response time causes conflicting results, whereas recordings of errors may prove a superior measure of the inhibition function in children with ADHD. In the present study, 39 children with an ADHD diagnosis, two subgroups with and without...
Best waveform score for diagnosing keratoconus

Directory of Open Access Journals (Sweden)

Allan Luz

2013-12-01

Full Text Available PURPOSE: To test whether corneal hysteresis (CH) and corneal resistance factor (CRF) can discriminate between keratoconus and normal eyes and to evaluate whether the averages of two consecutive measurements perform differently from the one with the best waveform score (WS) for diagnosing keratoconus. METHODS: ORA measurements for one eye per individual were selected randomly from 53 normal patients and from 27 patients with keratoconus. Two groups were considered: the average (CH-Avg, CRF-Avg) and best waveform score (CH-WS, CRF-WS) groups. The Mann-Whitney U-test was used to evaluate whether the variables had similar distributions in the Normal and Keratoconus groups. Receiver operating characteristic (ROC) curves were calculated for each parameter to assess the efficacy for diagnosing keratoconus, and the curves obtained for each variable were compared pairwise using the Hanley-McNeil test. RESULTS: The CH-Avg, CRF-Avg, CH-WS and CRF-WS differed significantly between the normal and keratoconus groups (p<0.001). The areas under the ROC curve (AUROC) for CH-Avg, CRF-Avg, CH-WS, and CRF-WS were 0.824, 0.873, 0.891, and 0.931, respectively. CH-WS and CRF-WS had significantly better AUROCs than CH-Avg and CRF-Avg, respectively (p=0.001 and 0.002). CONCLUSION: The analysis of the biomechanical properties of the cornea through the ORA method has proved to be an important aid in the diagnosis of keratoconus, regardless of the method used. The best waveform score (WS) measurements were superior to the average of consecutive ORA measurements for diagnosing keratoconus.
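The Mann-Whitney U statistic and the AUROC used in this record are closely related: the AUROC equals the proportion of (diseased, normal) pairs in which the diseased eye has the more abnormal value, counting ties as one half. The Python sketch below computes the AUROC directly from that definition. The measurement values are hypothetical and the function name is illustrative.

```python
def auroc(positive_scores, negative_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 if the positive score is higher, 0.5 if tied, 0 otherwise.
    This equals the Mann-Whitney U statistic divided by the number of pairs.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scores where higher means "more likely diseased".
area = auroc([3, 4, 5], [1, 2, 3])
print(f"AUROC = {area:.3f}")
```

For a parameter on which lower values indicate disease, the values can be negated (or the two groups swapped) before calling the function so that the area is reported on the usual 0.5 to 1.0 scale.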
The Veterans Affairs Cardiac Risk Score: Recalibrating the Atherosclerotic Cardiovascular Disease Score for Applied Use.

Science.gov (United States)

Sussman, Jeremy B; Wiitala, Wyndy L; Zawistowski, Matthew; Hofer, Timothy P; Bentley, Douglas; Hayward, Rodney A

2017-09-01

Accurately estimating cardiovascular risk is fundamental to good decision-making in cardiovascular disease (CVD) prevention, but risk scores developed in one population often perform poorly in dissimilar populations. We sought to examine whether a large integrated health system can use its electronic health data to better predict individual patients' risk of developing CVD. We created a cohort using all patients ages 45-80 who used Department of Veterans Affairs (VA) ambulatory care services in 2006 with no history of CVD, heart failure, or loop diuretics. Our outcome variable was new-onset CVD in 2007-2011. We then developed a series of recalibrated scores, including a fully refit "VA Risk Score-CVD (VARS-CVD)." We tested the different scores using standard measures of prediction quality. For the 1,512,092 patients in the study, the Atherosclerotic cardiovascular disease risk score had discrimination similar to that of the VARS-CVD (c-statistic of 0.66 in men and 0.73 in women), but the Atherosclerotic cardiovascular disease model had poor calibration, predicting 63% more events than observed. Calibration was excellent in the fully recalibrated VARS-CVD tool, but simpler techniques tested proved less reliable. We found that local electronic health record data can be used to estimate CVD better than an established risk score based on research populations. Recalibration improved estimates dramatically, and the type of recalibration was important. Such tools can also easily be integrated into a health system's electronic health record and can be more readily updated.
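Calibration of the kind described here is often summarized by the ratio of expected to observed events: the sum of the predicted risks divided by the number of events that actually occurred. A ratio of 1.63 would correspond to predicting 63% more events than observed. The Python sketch below computes that ratio; the risks and outcomes are hypothetical, the function name is illustrative, and this simple ratio is only one of several calibration measures the study may have used.

```python
import math


def expected_observed_ratio(predicted_risks, outcomes):
    """Ratio of expected events (sum of predicted risks) to observed events.

    predicted_risks: probabilities in [0, 1], one per patient.
    outcomes: 1 if the event occurred for that patient, else 0.
    """
    if len(predicted_risks) != len(outcomes):
        raise ValueError("one outcome is required per predicted risk")
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("ratio is undefined when no events were observed")
    return math.fsum(predicted_risks) / observed


# Hypothetical cohort: the model expects 2 events but only 1 occurred.
eo_ratio = expected_observed_ratio([0.2, 0.4, 0.6, 0.8], [0, 0, 1, 0])
print(f"expected/observed = {eo_ratio:.2f}")
```

Discrimination and calibration are separate properties, which is why a score can rank patients about as well as a refit model (similar c-statistic) while still overstating absolute risk.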
The Machine Scoring of Writing

Science.gov (United States)

McCurry, Doug

2010-01-01

This article provides an introduction to the kind of computer software that is used to score student writing in some high stakes testing programs, and that is being promoted as a teaching and learning tool to schools. It sketches the state of play with machines for the scoring of writing, and describes how these machines work and what they do.…
Some Results on Mean Square Error for Factor Score Prediction

Science.gov (United States)

Krijnen, Wim P.

2006-01-01

For the confirmatory factor model a series of inequalities is given with respect to the mean square error (MSE) of three main factor score predictors. The eigenvalues of these MSE matrices are a monotonic function of the eigenvalues of the matrix gamma[subscript rho] = theta[superscript 1/2] lambda[subscript rho] 'psi[subscript rho] [superscript…
Development and validation of a composite scoring system for robot-assisted surgical training--the Robotic Skills Assessment Score.

Science.gov (United States)

Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A

2013-12-01

A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on the previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop the Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in operative field, critical error, economy, bimanual dexterity, and time. Following this, the RSA-Score was further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
Using Minimum Acceptable GRE Scores for Graduate Admissions Suppresses Diversity

Science.gov (United States)

Miller, Casey

2014-01-01

I will present data showing that significant performance disparities on the GRE general test exist based on the test taker's race and gender [1]. Because of the belief that high GRE scores qualify one for graduate studies, the diversity issues faced by STEM fields may originate, at least in part, in misuse of the GRE scores by graduate admissions committees. I will quantitatively demonstrate this by showing that the combination of a hard cut-off and the different score distributions leads to the systematic underrepresentation of certain groups. I will present data from USF’s PhD program that shows a lack of correlation between GRE scores and research ability; similar null results are emerging from numerous other programs. I will then discuss how assessing non-cognitive competencies in the selection process may lead to a more enlightened search for the next generation of scientists. [1] C. W. Miller, "Admissions Criteria and Diversity in Graduate School", APS News Vol 22, Issue 2, The Back Page (2013) http://www.aps.org/publications/apsnews/201302/backpage.cfm
Scoring Strategies for the TOEFL iBT A Complete Guide

CERN Document Server

Stirling, Bruce

2012-01-01

TOEFL students all ask: How can I get a high TOEFL iBT score? Answer: Learn argument scoring strategies. Why? Because the TOEFL iBT recycles opinion-based and fact-based arguments for testing purposes from start to finish. In other words, the TOEFL iBT is all arguments. That's right, all arguments. If you want a high score, you need essential argument scoring strategies. That is what Scoring Strategies for the TOEFL iBT gives you, and more! TEST-PROVEN STRATEGIES. Learn essential TOEFL iBT scoring strategies developed in American university classrooms and proven successful on the TOEFL iBT. R
How to calculate an MMSE score from a MODA score (and vice versa) in patients with Alzheimer's disease.

Science.gov (United States)

Cazzaniga, R; Francescani, A; Saetti, C; Spinnler, H

2003-11-01

The aim of the present study was to provide a statistically sound way of reciprocally converting scores of the mini-mental state examination (MMSE) and the Milan overall dementia assessment (MODA). A consecutive series of 182 patients with "probable" Alzheimer's disease was examined with both tests. MODA and MMSE scores proved to be highly correlated. A formula for converting MODA and MMSE scores was generated.
Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model

Science.gov (United States)

Kim, Kyung Yong; Lee, Won-Chan

2018-01-01

Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Temperamental traits and results of psychoaptitude tests in applicants to become a cadet officer in the Italian Navy.

Science.gov (United States)

Maremmani, Icro; Maremmani, Angelo Giovanni Icro; Leonardi, Annalisa; Rovai, Luca; Bacciardi, Silvia; Rugani, Fabio; Dell'Osso, Liliana; Akiskal, Kareen; Akiskal, S Hagop

2013-09-05

Consistent with the involvement of affective temperaments in professional choices, our research team is aiming to outline the temperamental profile of subjects who are applying to enter a military career in the Italian Armed Forces. In this study we aim to verify the importance of temperamental traits not only in choosing the military career as a profession, but also in passing or failing the entrance examinations. We compared the affective temperaments (evaluated by TEMPS-A[P]) of those applying to become a cadet officer in the Italian Navy, divided into various subgroups depending on whether they passed or failed the entrance examination at various levels (high school final test, medical (physical and psychiatric), mathematical examination and aptitude test). We also tested for correlations between grades received and temperamental scores. Higher scores for those with a hyperthymic and lower scores for those with a depressive, cyclothymic or irritable temperament characterized applicants taking medical exams and aptitude tests. Higher scores on the high school final test correlated with lower hyperthymic, cyclothymic and irritable temperament scores. No correlations were found between temperamental traits and mathematical examinations. Multivariate analysis stressed the negative impact of a cyclothymic temperament and the poor discriminant power of temperaments regarding medical and mathematical examinations, and aptitude tests. Conversely, temperaments showed good discriminant power as far as psychiatric examinations are concerned. Hyperthymic temperamental traits appear to be important not only in choosing a profession, but also in passing entrance examinations. Even so, affective temperaments (strong hyperthymic and weak cyclothymic, depressive and irritable traits) are the only successful predictors of the outcome of psychiatric examinations and, to a lesser extent, medical examinations and aptitude tests. Achieving high school graduation and passing
A diagnostic scoring system for myxedema coma.

Science.gov (United States)

Popoveniuc, Geanina; Chandra, Tanu; Sud, Anchal; Sharma, Meeta; Blackman, Marc R; Burman, Kenneth D; Mete, Mihriye; Desale, Sameer; Wartofsky, Leonard

2014-08-01

To develop diagnostic criteria for myxedema coma (MC), a decompensated state of extreme hypothyroidism with a high mortality rate if untreated, in order to facilitate its early recognition and treatment. The frequencies of characteristics associated with MC were assessed retrospectively in patients from our institutions in order to derive a semiquantitative diagnostic point scale that was further applied on selected patients whose data were retrieved from the literature. Logistic regression analysis was used to test the predictive power of the score. Receiver operating characteristic (ROC) curve analysis was performed to test the discriminative power of the score. Of the 21 patients examined, 7 were reclassified as not having MC (non-MC), and they were used as controls. The scoring system included a composite of alterations of thermoregulatory, central nervous, cardiovascular, gastrointestinal, and metabolic systems, and presence or absence of a precipitating event. All 14 of our MC patients had a score of ≥60, whereas 6 of 7 non-MC patients had scores of 25 to 50. A total of 16 of 22 MC patients whose data were retrieved from the literature had a score ≥60, and 6 of 22 of these patients scored between 45 and 55. The odds ratio per each score unit increase as a continuum was 1.09 (95% confidence interval [CI], 1.01 to 1.16; P = .019); a score of 60 identified coma, with an odds ratio of 1.22. The area under the ROC curve was 0.88 (95% CI, 0.65 to 1.00), and the score of 60 had 100% sensitivity and 85.71% specificity. A score ≥60 in the proposed scoring system is potentially diagnostic for MC, whereas scores between 45 and 59 could classify patients at risk for MC.
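The reported sensitivity and specificity at the cutoff of 60 can be reproduced from the counts given in the abstract. The following is a minimal sketch, not the authors' analysis code; the function name is hypothetical, and it assumes the one non-MC patient who did not score between 25 and 50 fell at or above the cutoff, which is what the reported 85.71% specificity implies.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true cases flagged by the cutoff
    specificity = tn / (tn + fp)  # share of controls correctly left unflagged
    return sensitivity, specificity


# Counts from the abstract at the cutoff "score >= 60":
# all 14 MC patients scored >= 60; 6 of the 7 non-MC controls scored below it.
# Assumption: the seventh control scored >= 60 (a false positive).
sens, spec = sensitivity_specificity(tp=14, fn=0, tn=6, fp=1)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
# prints: sensitivity = 100.00%, specificity = 85.71%
```

With only 7 controls, a single patient moves specificity by about 14 percentage points, which is consistent with the wide confidence interval reported for the area under the ROC curve.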
The power to detect linkage in complex disease by means of simple LOD-score analyses.

Science.gov (United States)

Greenberg, D A; Abreu, P; Hodge, S E

1998-09-01

Maximum-likelihood analysis (via LOD score) provides the most powerful method for finding linkage when the mode of inheritance (MOI) is known. However, because one must assume an MOI, the application of LOD-score analysis to complex disease has been questioned. Although it is known that one can legitimately maximize the maximum LOD score with respect to genetic parameters, this approach raises three concerns: (1) multiple testing, (2) effect on power to detect linkage, and (3) adequacy of the approximate MOI for the true MOI. We evaluated the power of LOD scores to detect linkage when the true MOI was complex but a LOD score analysis assumed simple models. We simulated data from 14 different genetic models, including dominant and recessive at high (80%) and low (20%) penetrances, intermediate models, and several additive two-locus models. We calculated LOD scores by assuming two simple models, dominant and recessive, each with 50% penetrance, then took the higher of the two LOD scores as the raw test statistic and corrected for multiple tests. We call this test statistic "MMLS-C." We found that the ELODs for MMLS-C are >=80% of the ELOD under the true model when the ELOD for the true model is >=3. Similarly, the power to reach a given LOD score was usually >=80% that of the true model, when the power under the true model was >=60%. These results underscore that a critical factor in LOD-score analysis is the MOI at the linked locus, not that of the disease or trait per se. Thus, a limited set of simple genetic models in LOD-score analysis can work well in testing for linkage.
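For readers unfamiliar with the statistic, a LOD score is the base-10 logarithm of the likelihood ratio between linkage at recombination fraction θ and no linkage (θ = 0.5). The sketch below covers only the simplest textbook case of phase-known, fully informative meioses; it does not model penetrance or the MMLS-C procedure described in the abstract, and the function name and counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD = log10(L(theta) / L(0.5)) for phase-known, fully informative meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical data: 2 recombinants observed in 20 meioses.
grid = [i / 100 for i in range(1, 51)]  # theta = 0.01 ... 0.50
best_theta = max(grid, key=lambda t: lod_score(2, 20, t))
best_lod = lod_score(2, 20, best_theta)
print(f"max LOD = {best_lod:.2f} at theta = {best_theta:.2f}")
```

Maximizing over θ in this way is the single-model analogue of the step in the abstract where the higher of the two model-specific LOD scores is taken as the raw test statistic.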
External validation of the simple clinical score and the HOTEL score, two scores for predicting short-term mortality after admission to an acute medical unit.

Science.gov (United States)

Stræde, Mia; Brabrand, Mikkel

2014-01-01

Clinical scores can be of aid to predict early mortality after admission to a medical admission unit. A developed scoring system needs to be externally validated to minimise the risk of the discriminatory power and calibration being falsely elevated. We performed the present study with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. Pre-planned prospective observational cohort study. Danish 460-bed regional teaching hospital. We included 3046 consecutive patients from 2 October 2008 until 19 February 2009. 26 (0.9%) died within one calendar day and 196 (6.4%) died within 30 days. We calculated SCS for 1080 patients. We found an AUROC of 0.960 (95% confidence interval [CI], 0.932 to 0.988) for 24-hour mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ² = 2.68 (10 degrees of freedom), P = 0.998 and χ² = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95% CI, 0.901-0.962) for 24-hour mortality and goodness-of-fit test, χ² = 5.56 (10 degrees of freedom), P = 0.234. We find that both the SCS and HOTEL scores showed an excellent to outstanding ability to identify patients at high risk of dying, with good or acceptable precision.
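Discriminatory power (AUROC) has a direct interpretation: it is the probability that a randomly chosen patient who died received a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below computes it by pairwise comparison; the function name and the six-patient data set are hypothetical and are not taken from the study.

```python
def auroc(scores, died):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # non-survivor scored higher: concordant pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores and 30-day outcomes (1 = died, 0 = survived).
example_scores = [2, 3, 5, 7, 8, 11]
example_died = [0, 0, 0, 1, 0, 1]
example_auc = auroc(example_scores, example_died)
print(example_auc)  # prints: 0.875
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; rank-based implementations give the same value more efficiently on cohorts of several thousand patients.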
Investigating the Effect of Sympathetic Skin Response Parameters on the Psychological Test Scores in Patients with Fibromyalgia Syndrome by Using ANNs

Directory of Open Access Journals (Sweden)

Murat Yıldız

2013-01-01

In this study, psychological tests such as the Visual Analogue Pain Scale, Verbal Pain Scale, Beck Depression Inventory, Beck Anxiety Inventory, Hamilton Depression Rating Scale and Hamilton Anxiety Scale were applied to selected healthy subjects and patients with Fibromyalgia Syndrome (FMS) in Suleyman Demirel University, Faculty of Medicine, Department of Physical Medicine and Rehabilitation, and the scores were recorded. A measurement system was established in the same department of the university to measure the sympathetic skin response (SSR) from the subjects. The SSR was measured and recorded. Parameters such as latency time, maximum amplitude and elapsed time were calculated from the recorded SSR data using Matlab software. The SSR parameters were added to the scores, and the diagnostic accuracy percentages for FMS were calculated using artificial neural networks (ANNs). The simulation results showed that the specified SSR parameters and FMS are related and that these parameters can be used as a diagnostic method in FMS.
An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

Science.gov (United States)

Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

2014-05-01

Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.

MITG test procedure and results

International Nuclear Information System (INIS)

Eck, M.E.; Mukunda, M.

1983-01-01

Elements and modules for Radioisotope Thermoelectric Generator have been performance tested since the inception of the RTG program. These test articles seldom resembled flight hardware and often lacked adequate diagnostic instrumentation. Because of this, performance problems were not identified in the early stage of program development. The lack of test data in an unexpected area often hampered the development of a problem solution. A procedure for conducting the MITG Test was developed in an effort to obtain data in a systematic, unambiguous manner. This procedure required the development of extensive data acquisition software and test automation. The development of a facility to implement the test procedure, the facility hardware and software requirements, and the results of the MITG testing are the subject of this paper.
Does the COPD assessment test (CAT(TM)) questionnaire produce similar results when self- or interviewer administered?

Science.gov (United States)

Agusti, A; Soler-Cataluña, J J; Molina, J; Morejon, E; Garcia-Losa, M; Roset, M; Badia, X

2015-10-01

The COPD assessment test (CAT) is a questionnaire that assesses the impact of chronic obstructive pulmonary disease (COPD) on health status, but some patients have difficulties filling it out by themselves. We examined whether the mode of administration of the Spanish version of CAT (self vs. interviewer) influences its scores and/or psychometric properties. Observational, prospective study in 49 Spanish centers that includes clinically stable COPD patients (n = 153) and patients hospitalized because of an exacerbation (ECOPD; n = 224). The CAT was self-administered (CAT-SA) or administered by an interviewer (CAT-IA) based on the investigator's judgment of the patient's capacity. To assess convergent validity, the Saint George's Respiratory Disease Questionnaire (SGRQ) and the London Chest Activity of Daily Living (LCADL) instrument were also administered. Psychometric properties were compared across modes of administration. A total of 118 patients (31%) completed the CAT-SA and 259 (69%) the CAT-IA. Multiple regression analysis showed that mode of administration did not affect CAT scores. The CAT showed excellent psychometric properties in both modes of administration. Internal consistency coefficients (Cronbach's alpha) were high (0.86 for CAT-SA and 0.85 for CAT-IA) as was test-retest reliability (intraclass correlation coefficients of 0.83 for CAT-SA and CAT-IA). Correlations with SGRQ and LCADL were moderate to strong both in CAT-SA and CAT-IA, indicating good convergent validity. Similar results were observed when testing longitudinal validity. The mode of administration does not influence CAT scores or its psychometric properties. Hence, both modes of administration can be used in clinical practice depending on the physician's judgment of the patient's capacity.
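Cronbach's alpha, the internal consistency coefficient reported above, compares the sum of the individual item variances with the variance of the total score. The following is a minimal sketch of that standard formula; the function name and the small response matrix are hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                      # number of questionnaire items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Hypothetical answers from five respondents to a four-item questionnaire.
toy_responses = [
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [5, 4, 5, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
]
toy_alpha = cronbach_alpha(toy_responses)
print(f"alpha = {toy_alpha:.3f}")
```

Values approach 1 when items rise and fall together across respondents, so comparing the coefficient between the self-administered and interviewer-administered groups is one way to check that the mode of administration did not change how coherently the items behave.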
Micronucleus test for radiation biodosimetry in mass casualty events: Evaluation of visual and automated scoring

Energy Technology Data Exchange (ETDEWEB)

Bolognesi, Claudia [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy)]; Balia, Cristina; Roggieri, Paola [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy)]; Cardinale, Francesco [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Department of Health Sciences, University of Genoa, Genoa (Italy)]; Bruzzi, Paolo [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy)]; Sorcinelli, Francesca [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy)]; Lista, Florigio [Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy)]; D'Amelio, Raffaele [Sapienza, Universita di Roma II Facolta di Medicina e Chirurgia and Ministero della Difesa, Direzione Generale Sanita Militare (Italy)]; Righi, Enzo [Frascati National Laboratories, National Institute of Nuclear Physics, Via Enrico Fermi 40, 00044 Frascati, Rome (Italy)]

2011-02-15

In the case of a large-scale nuclear or radiological incident, a reliable estimate of dose is an essential tool for providing timely assessment of radiation exposure and for making life-saving medical decisions. Cytogenetics is considered the 'gold standard' for biodosimetry. The dicentric analysis (DA) represents the most specific cytogenetic bioassay. The micronucleus test (MN) applied in interphase in peripheral lymphocytes is an alternative and simpler approach. A dose-effect calibration curve for the MN frequency in peripheral lymphocytes from 27 adult donors was established after in vitro irradiation at a dose range 0.15-8 Gy of ¹³⁷Cs gamma rays (dose rate 6 Gy min⁻¹). Dose prediction by visual scoring in a dose-blinded study (0.15-4.0 Gy) revealed a high level of accuracy (R = 0.89). The scoring of MN is time consuming and requires adequate skills and expertise. Automated image analysis is a feasible approach that can reduce the time and increase the accuracy of the dose estimation by decreasing the variability due to subjective evaluation. A good correlation (R = 0.705) between visual and automated scoring with visual correction was observed over the dose range 0-2 Gy. Almost perfect discrimination power for exposure to 1-2 Gy, and a satisfactory power for 0.6 Gy were detected. This threshold level can be considered sufficient for identification of sublethally exposed individuals by automated CBMN assay.
Irradiation Effects Test Series: Test IE-3. Test results report

International Nuclear Information System (INIS)

Farrar, L.C.; Allison, C.M.; Croucher, D.W.; Ploger, S.A.

1977-10-01

The objectives of the test reported were to: (a) determine the behavior of irradiated fuel rods subjected to a rapid power increase during which the possibility of a pellet-cladding mechanical interaction failure is enhanced and (b) determine the behavior of these fuel rods during film boiling following this rapid power increase. Test IE-3 used four 0.97-m long pressurized water reactor type fuel rods fabricated from previously irradiated fuel. The fuel rods were subjected to a preconditioning period, followed by a power ramp to 69 kW/m at a coolant mass flux of 4920 kg/s-m². After a flow reduction to 2120 kg/s-m², film boiling occurred on the fuel rods. One rod failed approximately 45 seconds after the reactor was shut down as a result of cladding embrittlement due to extensive cladding oxidation. Data are presented on the behavior of these irradiated fuel rods during steady-state operation, the power ramp, and film boiling operation. The effects of a power ramp and power ramp rates on pellet-cladding interaction are discussed. Test data are compared with FRAP-T3 computer model calculations and data from a previous Irradiation Effects test in which four irradiated fuel rods of a similar design were tested. Test IE-3 results indicate that the irradiated state of the fuel rods did not significantly affect fuel rod behavior during normal, abnormal (power ramp of 20 kW/m per minute), and accident (film boiling) conditions.
Poor performances of EuroSCORE and CARE score for prediction of perioperative mortality in octogenarians undergoing aortic valve replacement for aortic stenosis.

Science.gov (United States)

Chhor, Vibol; Merceron, Sybille; Ricome, Sylvie; Baron, Gabriel; Daoud, Omar; Dilly, Marie-Pierre; Aubier, Benjamin; Provenchere, Sophie; Philip, Ivan

2010-08-01

Although results of cardiac surgery are improving, octogenarians have a higher procedure-related mortality and more complications with increased length of stay in ICU. Consequently, careful evaluation of perioperative risk seems necessary. The aims of our study were to assess and compare the performances of EuroSCORE and CARE score in the prediction of perioperative mortality among octogenarians undergoing aortic valve replacement for aortic stenosis and to compare these predictive performances with those obtained in younger patients. This retrospective study included all consecutive patients undergoing cardiac surgery in our institution between November 2005 and December 2007. For each patient, risk assessment for mortality was performed using logistic EuroSCORE, additive EuroSCORE and CARE score. The main outcome measure was early postoperative mortality. Predictive performances of these scores were assessed by calibration and discrimination using goodness-of-fit test and area under the receiver operating characteristic curve, respectively. During this 2-year period, we studied 2117 patients, among whom 134/211 octogenarians and 335/1906 nonoctogenarians underwent an aortic valve replacement for aortic stenosis. When considering patients with aortic stenosis, discrimination was poor in octogenarians and the difference from nonoctogenarians was significant for each score (0.58, 0.59 and 0.56 vs. 0.82, 0.81 and 0.77 for additive EuroSCORE, logistic EuroSCORE and CARE score in octogenarians and nonoctogenarians, respectively, P […]). […] performances of these scores are poor in octogenarians undergoing cardiac surgery, especially aortic valve replacement. Risk assessment and therapeutic decisions in octogenarians should not be made with these scoring systems alone.
ISSUE PAPER: What Do Test Scores in Texas Tell Us?

National Research Council Canada - National Science Library

Klein, Stephen

2000-01-01

...) about possible unintended consequences of these programs. We conducted several analyses to examine the issue of whether TAAS scores can be trusted to provide an accurate index of student skills and abilities...
WebScore: An Effective Page Scoring Approach for Uncertain Web Social Networks

Directory of Open Access Journals (Sweden)

Shaojie Qiao

2011-10-01

To effectively score pages with uncertainty in web social networks, we first proposed a new concept called transition probability matrix and formally defined the uncertainty in web social networks. Second, we proposed a hybrid page scoring algorithm, called WebScore, based on the PageRank algorithm and three centrality measures: degree, betweenness, and closeness. In particular, WebScore takes full account of the uncertainty of web social networks by computing the transition probability from one page to another. The basic idea of WebScore is to (1) integrate uncertainty into PageRank in order to accurately rank pages, and (2) apply the centrality measures to calculate the importance of pages in web social networks. In order to verify the performance of WebScore, we developed a web social network analysis system which can partition web pages into distinct groups and score them in an effective fashion. Finally, we conducted extensive experiments on real data and the results show that WebScore is effective at scoring uncertain pages at a lower time cost than PageRank and centrality-measure-based page scoring algorithms.
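The abstract does not specify how WebScore combines its components, so the sketch below shows only the well-known building block it names: PageRank computed by power iteration over a row-stochastic transition probability matrix. It is not the WebScore algorithm; the function name, the damping factor of 0.85, and the three-page matrix are assumptions chosen for illustration.

```python
def pagerank(transition, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank.

    transition[i][j] is the probability of moving from page i to page j;
    every row is assumed to sum to 1.
    """
    n = len(transition)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical three-page network with uncertain (probabilistic) links.
toy_matrix = [
    [0.0, 0.7, 0.3],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]
toy_ranks = pagerank(toy_matrix)
print([round(r, 3) for r in toy_ranks])
```

Replacing the usual uniform "1 / out-degree" link weights with estimated transition probabilities is one plausible reading of how uncertainty enters the ranking; a hybrid score would then combine these ranks with the degree, betweenness, and closeness measures.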
Simplified Therapeutic Intervention Scoring System : The TISS-28 items - Results from a multicenter study

NARCIS (Netherlands)

Miranda, DR; deRijk, A; Schaufeli, W

Objectives: To validate a simplified version of the Therapeutic Intervention Scoring System, the TISS-28, and to determine the association of TISS-28 with the time spent on scored and nonscored nursing activities. Design: Prospective, multicenter study. Setting: Twenty-two adult medical, surgical,
Low bone mineral density in COPD patients with osteoporosis is related to low daily physical activity and high COPD assessment test scores

Directory of Open Access Journals (Sweden)

Liu WT

2015-09-01

, all P<0.05) and T-score (r=0.471, 0.531, 0.459, respectively, all P<0.05), whereas CAT scores were significantly negatively correlated with (total hip and femoral neck) BMD (r=-0.412, -0.552, respectively, P<0.05) and (lumbar spine, total hip, and femoral neck) T-score (r=-0.389, -0.429, -0.543, respectively, P<0.05). Low femoral neck BMD in COPD patients was related to high CAT scores. Our results show no significant difference in desaturation index, low SpO2, and inflammatory markers (IL-6, TNF-α, IL-8/CXCL8, CRP, and 8-isoprostane) between the two groups. Chest physicians should be aware that COPD patients with OP have low DPA and high CAT scores. Keywords: chronic obstructive pulmonary disease, osteoporosis, daily physical activity, COPD assessment test, bone mineral density
Alternative filtration testing program: Pre-evaluation of test results

International Nuclear Information System (INIS)

Georgeton, G.K.; Poirier, M.R.

1990-01-01

Based on results of testing eight solids removal technologies and one pretreatment option, it is recommended that a centrifugal ultrafilter and polymeric ultrafilter undergo further testing as possible alternatives to the Norton Ceramic filters. Deep bed filtration should be considered as a third alternative, if a backwashable cartridge filter is shown to be inefficient in separate testing.
Alternative filtration testing program: Pre-evaluation of test results

Energy Technology Data Exchange (ETDEWEB)

Georgeton, G.K.; Poirier, M.R.

1990-09-28

Based on results of testing eight solids removal technologies and one pretreatment option, it is recommended that a centrifugal ultrafilter and polymeric ultrafilter undergo further testing as possible alternatives to the Norton Ceramic filters. Deep bed filtration should be considered as a third alternative, if a backwashable cartridge filter is shown to be inefficient in separate testing.
A Categorical Instrument for Scoring Second Language Writing Skills.

Science.gov (United States)

Brown, James Dean; Bailey, Kathleen M.

1984-01-01

Discusses a study of the reliability of a categorical instrument for evaluating compositions written by upper intermediate university English as a second language students. The instrument tests organization, logical development of ideas, grammar, mechanics, and style. Results indicate that the scoring instrument is moderately reliable. (SED)
A Comparative Study between the Conventional MCQ Scores and MCQ with the CBA Scores at the Standardized Clinical Knowledge Exam for Clinical Medical Students

Directory of Open Access Journals (Sweden)

Mahmood Ghadermarzi

2015-06-01

Background and purpose: Partial knowledge is one of the main factors to be considered when dealing with the improvement of the administration of Multiple Choice Questions (MCQ) in testing. Various strategies have been proposed for this factor in the traditional testing environment. Therefore, this study proposed a Confidence Based Assessment (CBA) as a pertinent solution and aims at comparing the effect of the CBA scoring system with that of the conventional scoring systems (with and without negative score estimation as penalty) on the students' scores and estimating their partial knowledge on clinical studies. Methods: This comparative study was conducted using a standardized clinical knowledge exam for 117 clinical students. After two-step training, both the conventional MCQ and CBA examinations were given simultaneously in a single session. The exam included 100 questions, and after the exam the volunteers were requested to complete a questionnaire regarding their attitude to and satisfaction with their first experience of the CBA. A new confidence-based marking system was selected for the scoring, which was a hybrid of the UCL and MUK2010 systems. The MCQ-Assistant, SPSS and Microsoft Office Excel software were used for scoring and data analysis. Results: The mean age of the volunteers was 27.3±5.47, of whom 43.6% were men and 69.2% were senior medical students. Exam reliability was 0.977. The fit line of the MCQ scores without penalty estimation was R2=0.9816 and Intercept=18.125 or approximately 0.2 deviation in the low scores. The MCQ scoring with penalty had a fit line approximately parallel to the 45-degree line but on or above it and the CBA scoring fit line was nearer to the 45-degree line, parallel to it and a little below it. These two sets of scores had a significant p value (0.037). The response percentage to the CBA is higher (p value=0.0001). The discrimination power of the MCQ and the CBA for the upper and lower 1/3 of the students was not
Comparison of formula and number-right scoring in undergraduate medical training: a Rasch model analysis.

Science.gov (United States)

Cecilio-Fernandes, Dario; Medema, Harro; Collares, Carlos Fernando; Schuwirth, Lambert; Cohen-Schotanus, Janke; Tio, René A

2017-11-09

Progress testing is an assessment tool used to periodically assess all students at the end-of-curriculum level. Because students cannot know everything, it is important that they recognize their lack of knowledge. For that reason, the formula-scoring method has usually been used. However, where partial knowledge needs to be taken into account, the number-right scoring method is used. Research comparing both methods has yielded conflicting results. As far as we know, in all these studies, Classical Test Theory or Generalizability Theory was used to analyze the data. In contrast to these studies, we will explore the use of the Rasch model to compare both methods. A 2 × 2 crossover design was used in a study where 298 students from four medical schools participated. A sample of 200 previously used questions from the progress tests was selected. The data were analyzed using the Rasch model, which provides fit parameters, reliability coefficients, and response option analysis. The fit parameters were in the optimal interval ranging from 0.50 to 1.50, and the means were around 1.00. The person and item reliability coefficients were higher in the number-right condition than in the formula-scoring condition. The response option analysis showed that the majority of dysfunctional items emerged in the formula-scoring condition. The findings of this study support the use of number-right scoring over formula scoring. Rasch model analyses showed that tests with number-right scoring have better psychometric properties than formula scoring. However, choosing the appropriate scoring method should depend not only on psychometric properties but also on self-directed test-taking strategies and metacognitive skills.
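The two scoring rules being compared differ only in how wrong answers are treated. The sketch below uses the conventional correction-for-guessing form of formula scoring, in which each wrong answer costs 1/(k - 1) points for k answer options and omitted items score zero. The abstract does not state the exact penalty used in the study, so the four-option assumption, the function names, and the student's counts are all hypothetical.

```python
def number_right_score(right, wrong, omitted):
    """Number-right scoring: only correct answers count."""
    return float(right)


def formula_score(right, wrong, omitted, options=4):
    """Formula scoring: correct answers minus a guessing penalty.

    Assumes the conventional penalty of 1 / (options - 1) per wrong answer;
    omitted ("don't know") items neither add nor subtract points.
    """
    return right - wrong / (options - 1)


# Hypothetical student on a 200-question test with four options per item.
nr = number_right_score(right=120, wrong=60, omitted=20)
fs = formula_score(right=120, wrong=60, omitted=20, options=4)
print(nr, fs)  # prints: 120.0 100.0
```

Under number-right scoring the same student could only gain by guessing on the 20 omitted items, whereas under formula scoring blind guessing has an expected value of zero, which is why the choice of rule interacts with test-taking strategy as the abstract notes.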
Performance of Surgical Risk Scores to Predict Mortality after Transcatheter Aortic Valve Implantation

Directory of Open Access Journals (Sweden)

Leonardo Sinnott Silva

2015-01-01

Abstract Background: Predicting mortality in patients undergoing transcatheter aortic valve implantation (TAVI) remains a challenge. Objectives: To evaluate the performance of 5 risk scores for cardiac surgery in predicting the 30-day mortality among patients of the Brazilian Registry of TAVI. Methods: The Brazilian Multicenter Registry prospectively enrolled 418 patients undergoing TAVI in 18 centers between 2008 and 2013. The 30-day mortality risk was calculated using the following surgical scores: the logistic EuroSCORE I (ESI), EuroSCORE II (ESII), Society of Thoracic Surgeons (STS) score, Ambler score (AS) and Guaragna score (GS). The performance of the risk scores was evaluated in terms of their calibration (Hosmer–Lemeshow test) and discrimination [area under the receiver–operating characteristic curve (AUC)]. Results: The mean age was 81.5 ± 7.7 years. The CoreValve (Medtronic) was used in 86.1% of the cohort, and the transfemoral approach was used in 96.2%. The observed 30-day mortality was 9.1%. The 30-day mortality predicted by the scores was as follows: ESI, 20.2 ± 13.8%; ESII, 6.5 ± 13.8%; STS score, 14.7 ± 4.4%; AS, 7.0 ± 3.8%; GS, 17.3 ± 10.8%. Using AUC, none of the tested scores could accurately predict the 30-day mortality. AUC for the scores was as follows: 0.58 [95% confidence interval (CI): 0.49 to 0.68, p = 0.09] for ESI; 0.54 (95% CI: 0.44 to 0.64, p = 0.42) for ESII; 0.57 (95% CI: 0.47 to 0.67, p = 0.16) for AS; 0.48 (95% CI: 0.38 to 0.57, p = 0.68) for STS score; and 0.52 (95% CI: 0.42 to 0.62, p = 0.64) for GS. The Hosmer–Lemeshow test indicated acceptable calibration for all scores (p > 0.05). Conclusions: In this real world Brazilian registry, the surgical risk scores were inaccurate in predicting mortality after TAVI. Risk models specifically developed for TAVI are required.
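The discrimination measure used in this record, the AUC, can be read as the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor. The sketch below computes it directly from that definition; the function name and the six-patient data set are invented for illustration and are not taken from the registry.

```python
def auc(scores, events):
    """AUC as a pairwise comparison of event and non-event scores.

    Counts the share of (event, non-event) pairs in which the event
    case has the higher score; ties count as half a win.
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted 30-day mortality risks and observed outcomes.
predicted = [0.05, 0.10, 0.20, 0.30, 0.15, 0.25]
died = [0, 0, 1, 0, 0, 1]

print(auc(predicted, died))  # 0.75
```

An AUC near 0.5, as reported for all five scores above, means the score ranks deaths above survivors about as often as a coin flip would.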
Progress Testing for Medical Students at the University of Auckland: Results from the First Year of Assessments

Directory of Open Access Journals (Sweden)

Steven Lillis

2014-01-01

Background: Progress testing is a method of assessing longitudinal progress of students using a single best answer format pitched at the standard of a newly graduated doctor. Aim: To evaluate the results of the first year of summative progress testing at the University of Auckland for Years 2 and 4 in 2013. Subjects: Two cohorts of medical students from Years 2 and 4 of the Medical Program. Methods: A survey was administered to all involved students. Open text feedback was also sought. Psychometric data were collected on test performance, and indices of reliability and validity were calculated. Results: The three tests showed increased mean scores over time. Reliability of the assessments was uniformly high. There was good concurrent validity. Students believe that progress testing assists in integrating science with clinical knowledge and improves learning. Year 4 students reported improved knowledge retention and deeper understanding. Conclusion: Progress testing has been successfully introduced into the Faculty for two separate year cohorts and results have met expectations. Other year cohorts will be added incrementally. Recommendation: Key success factors for introducing progress testing are partnership with an experienced university, multiple and iterative briefings with staff and students, as well as demonstrating the usefulness of progress testing by providing students with detailed feedback on performance.
More Issues in Observed-Score Equating

Science.gov (United States)

van der Linden, Wim J.

2013-01-01

This article is a response to the commentaries on the position paper on observed-score equating by van der Linden (this issue). The response focuses on the more general issues in these commentaries, such as the nature of the observed scores that are equated, the importance of test-theory assumptions in equating, the necessity to use multiple…
The Effects of Teaching Descriptive Geometry in General Engineering 103 on Spatial Relations Tests Scores.

Science.gov (United States)

Stallings, William M.

It was hypothesized that instruction in descriptive geometry produces an increase in SRT scores. The resultant data do not firmly support this hypothesis. It is suggested that this study be replicated with the use of randomly selected control groups. (MS)
Irradiation effects test series test IE-1 test results report

International Nuclear Information System (INIS)

Quapp, W.J.; Allison, C.M.; Farrar, L.C.; Mehner, A.S.

1977-03-01

The report describes the results of the first programmatic test in the Nuclear Regulatory Commission Irradiation Effects Test Series. This test (IE-1) used four 0.97m long PWR-type fuel rods fabricated from previously irradiated Saxton fuel. The objectives of this test were to evaluate the effect of fuel pellet density on pellet-cladding interaction during a power ramp and to evaluate the influence of the irradiated state of the fuel and cladding on rod behavior during film boiling operation. Data are presented on the behavior of irradiated fuel rods during steady-state operation, a power ramp, and film boiling operation. The effects of as-fabricated gap size, as-fabricated fuel density, rod power, and power ramp rate on pellet-cladding interaction are discussed. Test data are compared with FRAP-T2 computer model predictions, and comments on the consequences of sustained film boiling operation on irradiated fuel rod behavior are provided
An Integrated Model of Academic Self-Concept Development: Academic Self-Concept, Grades, Test Scores, and Tracking over 6 Years

Science.gov (United States)

Marsh, Herbert W.; Pekrun, Reinhard; Murayama, Kou; Arens, A. Katrin; Parker, Philip D.; Guo, Jiesi; Dicke, Theresa

2018-01-01

Our newly proposed integrated academic self-concept model integrates 3 major theories of academic self-concept formation and developmental perspectives into a unified conceptual and methodological framework. Relations among math self-concept (MSC), school grades, test scores, and school-level contextual effects over 6 years, from the end of…

¿Exito en California? A Validity Critique of Language Program Evaluations and Analysis of English Learner Test Scores

Directory of Open Access Journals (Sweden)

Marilyn S. Thompson

2002-01-01

Several states have recently faced ballot initiatives that propose to functionally eliminate bilingual education in favor of English-only approaches. Proponents of these initiatives have argued that an overall rise in standardized achievement scores of California's limited English proficient (LEP) students is largely due to the implementation of English immersion programs mandated by Proposition 227 in 1998; hence, they claim Exito en California (Success in California). However, many such arguments presented in the media were based on flawed summaries of these data. We first discuss the background, media coverage, and previous research associated with California's Proposition 227. We then present a series of validity concerns regarding use of Stanford-9 achievement data to address policy for educating LEP students; these concerns include the language of the test, alternative explanations, sample selection, and data analysis decisions. Finally, we present a comprehensive summary of scaled-score achievement means and trajectories for California's LEP and non-LEP students for 1998-2000. Our analyses indicate that although scores have risen overall, the achievement gap between LEP and EP students does not appear to be narrowing.
The Bayesian Score Statistic

NARCIS (Netherlands)

Kleibergen, F.R.; Kleijn, R.; Paap, R.

2000-01-01

We propose a novel Bayesian test under a (noninformative) Jeffreys' prior specification. We check whether the fixed scalar value of the so-called Bayesian Score Statistic (BSS) under the null hypothesis is a plausible realization from its known and standardized distribution under the alternative. Unlike
Irradiation effects test series, test IE-5. Test results report. [PWR

Energy Technology Data Exchange (ETDEWEB)

Croucher, D. W.; Yackle, T. R.; Allison, C. M.; Ploger, S. A.

1978-01-01

Test IE-5, conducted in the Power Burst Facility at the Idaho National Engineering Laboratory, employed three 0.97-m long pressurized water reactor type fuel rods, fabricated from previously irradiated zircaloy-4 cladding and one similar rod fabricated from unirradiated cladding. The objectives of the test were to evaluate the influence of simulated fission products, cladding irradiation damage, and fuel rod internal pressure on pellet-cladding interaction during a power ramp and on fuel rod behavior during film boiling operation. The four rods were subjected to a preconditioning period, a power ramp to an average fuel rod peak power of 65 kW/m, and steady state operation for one hour at a coolant mass flux of 4880 kg/s-m² for each rod. After a flow reduction to 1800 kg/s-m², film boiling occurred on one rod. Additional flow reductions to 970 kg/s-m² produced film boiling on the three remaining fuel rods. Maximum time in film boiling was 80 s. The rod having the highest initial internal pressure (8.3 MPa) failed 10 s after the onset of film boiling. A second rod failed about 90 s after reactor shutdown. The report contains a description of the experiment, the test conduct, test results, and results from the preliminary postirradiation examination. Calculations using a transient fuel rod behavior code are compared with the test results.
Revision Vodcast Influence on Assessment Scores and Study Processes in Secondary Physics

Science.gov (United States)

Marencik, Joseph J.

A quasi-experimental switching replications design with matched participants was employed to determine the influence of revision vodcasts, or video podcasts, on students' assessment scores and study processes in secondary physics. This study satisfied a need for quantitative results in the area of vodcast influence on students' learning processes. Thirty-eight physics students in an urban Ohio public high school participated in the study. The students in one Physics class were paired with students in another Physics class through the matching characteristics of current student cumulative test score mean and baseline study process as measured by the Study Process Questionnaire (SPQ). Students in both classes were given identical pedagogic treatment and access to traditional revision tools except for the supplemental revision vodcasts given to the experimental group. After students in the experimental group viewed the revision vodcast for a particular topic, the assessment scores of the students in the experimental group were compared to the assessment scores of the control group through the direct-difference, D, test to determine any difference between the assessment score means of each group. The SPQ was given at the beginning of the experiment and after each physics assessment. The direct-difference method was again used to determine any difference between the SPQ deep approach scores of each group. The SPQ was also used to determine any correlative effects between study process and revision vodcast use on students' assessment scores through descriptive statistics and an analysis of variance (ANOVA) test. Analysis indicated that revision vodcast use significantly increased students' assessment scores (p.05). There were no significant correlative effects of revision vodcast use and study processes on students' assessment scores (p>.05). This study offers educators the empirical support to devote the necessary effort, time, and resources into developing successful
Construction of Chained True Score Equipercentile Equatings under the Kernel Equating (KE) Framework and Their Relationship to Levine True Score Equating. Research Report. ETS RR-09-24

Science.gov (United States)

Chen, Haiwen; Holland, Paul

2009-01-01

In this paper, we develop a new chained equipercentile equating procedure for the nonequivalent groups with anchor test (NEAT) design under the assumptions of the classical test theory model. This new equating is named chained true score equipercentile equating. We also apply the kernel equating framework to this equating design, resulting in a…
The use of test scores from large-scale assessment surveys: psychometric and statistical considerations

Directory of Open Access Journals (Sweden)

Henry Braun

2017-11-01

Abstract Background: Economists are making increasing use of measures of student achievement obtained through large-scale survey assessments such as NAEP, TIMSS, and PISA. The construction of these measures, employing plausible value (PV) methodology, is quite different from that of the more familiar test scores associated with assessments such as the SAT or ACT. These differences have important implications both for utilization and interpretation. Although much has been written about PVs, it appears that there are still misconceptions about whether and how to employ them in secondary analyses. Methods: We address a range of technical issues, including those raised in a recent article that was written to inform economists using these databases. First, an extensive review of the relevant literature was conducted, with particular attention to key publications that describe the derivation and psychometric characteristics of such achievement measures. Second, a simulation study was carried out to compare the statistical properties of estimates based on the use of PVs with those based on other, commonly used methods. Results: It is shown, through both theoretical analysis and simulation, that under fairly general conditions appropriate use of PV yields approximately unbiased estimates of model parameters in regression analyses of large scale survey data. The superiority of the PV methodology is particularly evident when measures of student achievement are employed as explanatory variables. Conclusions: The PV methodology used to report student test performance in large scale surveys remains the state-of-the-art for secondary analyses of these databases.
Might the Rorschach be a projective test after all? Social projection of an undesired trait alters Rorschach Oral Dependency scores.

Science.gov (United States)

Bornstein, Robert F

2007-06-01

The degree to which projection plays a role in Rorschach (Rorschach, 1921/1942) responding remains controversial, in part because extant data have yielded inconclusive results. In this investigation, I examined the impact of social projection on Rorschach Oral Dependency (ROD) scores using methods adapted from social cognition research. In Study 1, I prescreened 85 college students (40 women and 45 men) with the ROD scale and a widely used self-report measure of dependency, the Interpersonal Dependency Inventory (IDI; Hirschfeld et al., 1977). Results show that informing participants who scored low on the IDI that they were in fact highly dependent led to significant increases in ROD scores; I did not obtain parallel ROD increases for participants who scored high on the IDI or for participants who received low-dependent feedback. In Study 2, I examined a separate sample of 80 prescreened college students (40 women and 40 men) and showed that providing low self-report participants an opportunity to attribute dependency to a fictional target person prior to Rorschach responding attenuated the impact of high-dependent feedback on ROD scores. These results suggest that projection played a role in at least one domain of Rorschach responding. I discuss theoretical, clinical, and empirical implications of these results.
Test results of HTTR control system

International Nuclear Information System (INIS)

Motegi, Toshihiro; Iigaki, Kazuhiko; Saito, Kenji; Sawahata, Hiroaki; Hirato, Yoji; Kondo, Makoto; Shibutani, Hideki; Ogawa, Satoru; Shinozaki, Masayuki; Mizushima, Toshihiko; Kawasaki, Kozo

2006-06-01

The plant control performance of the IHX helium flow rate control system, the PPWC helium flow rate control system, the secondary helium flow rate control system, the inlet temperature control system, the reactor power control system and the outlet temperature control system of the HTTR was obtained through function tests and power-up tests. The test results show that the control systems give a stable control response under transient conditions. Both the inlet temperature control system and the reactor power control system show stable operation from 30% to 100%. This report describes the outline of the control systems and the test results. (author)
A Summary Score for the Framingham Heart Study Neuropsychological Battery.

Science.gov (United States)

Downer, Brian; Fardo, David W; Schmitt, Frederick A

2015-10-01

To calculate three summary scores of the Framingham Heart Study neuropsychological battery and determine which score best differentiates between subjects classified as having normal cognition, test-based impaired learning and memory, test-based multidomain impairment, and dementia. The final sample included 2,503 participants. Three summary scores were assessed: (a) composite score that provided equal weight to each subtest, (b) composite score that provided equal weight to each cognitive domain assessed by the neuropsychological battery, and (c) abbreviated score comprised of subtests for learning and memory. Receiver operating characteristic analysis was used to determine which summary score best differentiated between the four cognitive states. The summary score that provided equal weight to each subtest best differentiated between the four cognitive states. A summary score that provides equal weight to each subtest is an efficient way to utilize all of the cognitive data collected by a neuropsychological battery. © The Author(s) 2015.
Including osteoprotegerin and collagen IV in a score-based blood test for liver fibrosis increases diagnostic accuracy.

Science.gov (United States)

Bosselut, Nelly; Taibi, Ludmia; Guéchot, Jérôme; Zarski, Jean-Pierre; Sturm, Nathalie; Gelineau, Marie-Christine; Poggi, Bernard; Thoret, Sophie; Lasnier, Elisabeth; Baudin, Bruno; Housset, Chantal; Vaubourdolle, Michel

2013-01-16

Noninvasive methods for liver fibrosis evaluation in chronic liver diseases have been recently developed, i.e. transient elastography (Fibroscan™) and blood tests (Fibrometer®, Fibrotest®, and Hepascore®). In this study, we aimed to design a new score in chronic hepatitis C (CHC) by selecting blood markers in a large panel and we compared its diagnostic performance with those of other noninvasive methods. Sixteen blood tests were performed in 306 untreated CHC patients included in a multicenter prospective study (ANRS HC EP 23 Fibrostar) using METAVIR histological fibrosis stage as reference. The new score was constructed by non linear regression using the most accurate biomarkers. Five markers (alpha-2-macroglobulin, apolipoprotein-A1, AST, collagen IV and osteoprotegerin) were included in the new function called Coopscore©. Using the Obuchowski Index, Coopscore© shows higher diagnostic performances than for Fibrometer®, Fibrotest®, Hepascore® and Fibroscan™ in CHC. Association between Fibroscan™ and Coopscore© might avoid 68% of liver biopsies for the diagnosis of significant fibrosis. Coopscore© provides higher accuracy than other noninvasive methods for the diagnosis of liver fibrosis in CHC. The association of Coopscore© with Fibroscan™ increases its predictive value. Copyright © 2012 Elsevier B.V. All rights reserved.
Predictive value of grade point average (GPA), Medical College Admission Test (MCAT), internal examinations (Block) and National Board of Medical Examiners (NBME) scores on Medical Council of Canada qualifying examination part I (MCCQE-1) scores.

Science.gov (United States)

Roy, Banibrata; Ripstein, Ira; Perry, Kyle; Cohen, Barry

2016-01-01

To determine whether the pre-medical Grade Point Average (GPA), Medical College Admission Test (MCAT), Internal examinations (Block) and National Board of Medical Examiners (NBME) scores are correlated with and predict the Medical Council of Canada Qualifying Examination Part I (MCCQE-1) scores. Data from 392 admitted students in the graduating classes of 2010-2013 at University of Manitoba (UofM), College of Medicine were considered. Pearson's correlation to assess the strength of the relationship, multiple linear regression to estimate MCCQE-1 score and stepwise linear regression to investigate the amount of variance were employed. Complete data from 367 (94%) students were studied. The MCCQE-1 had a moderate-to-large positive correlation with NBME scores and Block scores but a low correlation with GPA and MCAT scores. The multiple linear regression model gives a good estimate of the MCCQE-1 (R² = 0.604). Stepwise regression analysis demonstrated that 59.2% of the variation in the MCCQE-1 was accounted for by the NBME, but only 1.9% by the Block exams, and negligible variation came from the GPA and the MCAT. Amongst all the examinations used at UofM, the NBME is most closely correlated with MCCQE-1.
Hematoma shape, hematoma size, Glasgow coma scale score and ICH score: which predicts the 30-day mortality better for intracerebral hematoma?

Directory of Open Access Journals (Sweden)

Chih-Wei Wang

To investigate the performance of hematoma shape, hematoma size, Glasgow coma scale (GCS) score, and intracerebral hematoma (ICH) score in predicting the 30-day mortality for ICH patients. To examine the influence of the estimation error of hematoma size on the prediction of 30-day mortality. This retrospective study, approved by a local institutional review board with written informed consent waived, recruited 106 patients diagnosed as ICH by non-enhanced computed tomography study. The hemorrhagic shape, hematoma size measured by computer-assisted volumetric analysis (CAVA) and estimated by ABC/2 formula, ICH score and GCS score were examined. The predicting performance of 30-day mortality of the aforementioned variables was evaluated. Statistical analysis was performed using Kolmogorov-Smirnov tests, paired t test, nonparametric test, linear regression analysis, and binary logistic regression. The receiver operating characteristics curves were plotted and areas under curve (AUC) were calculated for 30-day mortality. A P value less than 0.05 was considered as statistically significant. The overall 30-day mortality rate was 15.1% of ICH patients. The hematoma shape, hematoma size, ICH score, and GCS score all significantly predict the 30-day mortality for ICH patients, with an AUC of 0.692 (P = 0.0018), 0.715 (P = 0.0008) (by ABC/2) to 0.738 (P = 0.0002) (by CAVA), 0.877 (P<0.0001) (by ABC/2) to 0.882 (P<0.0001) (by CAVA), and 0.912 (P<0.0001), respectively. Our study shows that hematoma shape, hematoma size, ICH scores and GCS score all significantly predict the 30-day mortality in an increasing order of AUC. The effect of overestimation of hematoma size by ABC/2 formula in predicting the 30-day mortality could be remedied by using ICH score.
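The ABC/2 estimate named in this record is a simple product of three diameters. The Python sketch below assumes the usual reading of the formula, with A the largest hematoma diameter, B the diameter perpendicular to A on the same slice, and C the vertical extent, all in centimetres; how the study measured each diameter is not stated in the abstract. The ellipsoid volume is shown alongside only to indicate where the factor of one half comes from (pi/6 is roughly 0.52).

```python
import math


def abc2_volume_ml(a_cm, b_cm, c_cm):
    """ABC/2 bedside estimate of hematoma volume (1 cm^3 equals 1 mL)."""
    return a_cm * b_cm * c_cm / 2.0


def ellipsoid_volume_ml(a_cm, b_cm, c_cm):
    """Exact volume of an ellipsoid whose three diameters are A, B and C."""
    return math.pi / 6.0 * a_cm * b_cm * c_cm


# Hypothetical measurements, not taken from the study.
a, b, c = 4.0, 3.0, 2.0
print(abc2_volume_ml(a, b, c))       # 12.0
print(ellipsoid_volume_ml(a, b, c))  # about 12.57
```

Real hematomas are often irregular rather than ellipsoidal, which is one reason a volumetric method such as CAVA can give a different size than ABC/2.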
Visuospatial characteristics of an elderly Chinese population: results from the WAIS-R block design test.

Science.gov (United States)

Yin, Shufei; Zhu, Xinyi; Huang, Xin; Li, Juan

2015-01-01

Visuospatial deficits have long been recognized as a potential predictor of dementia, with visuospatial ability decline having been found to accelerate in later stages of dementia. We, therefore, believe that the visuospatial performance of patients with mild cognitive impairment (MCI) and dementia (Dem) might change with varying visuospatial task difficulties. This study administered the Wechsler Adult Intelligence Scale-Revised (WAIS-R) Block Design Test (BDT) to determine whether visuospatial ability can help discriminate MCI patients from Dem patients and normal controls (NC). Results showed that the BDT could contribute to the discrimination between MCI and Dem. Specifically, simple BDT task scores could best distinguish MCI from Dem patients, while difficult BDT task scores could contribute to discriminating between MCI and NC. Given the potential clinical value of the BDT in the diagnosis of Dem and MCI, normative data stratified by age and education for the Chinese elderly population are presented for use in research and clinical settings.
External quality assessment in the voluntary counseling and testing centers in the Brazilian Amazon using dried tube specimens: results of an effectiveness evaluation

Directory of Open Access Journals (Sweden)

Andréa Mônica Brandão Beber

2015-06-01

INTRODUCTION: In 2011, the Brazilian Ministry of Health rolled out a program for the external quality assessment of rapid human immunodeficiency virus (HIV) tests using the dried tube specimen (DTS) method (EQA-RT/DTS-HIV). Our objective was to evaluate the implementation of this program at 71 voluntary counseling and testing centers (VCTCs) in the Brazilian Legal Amazonian area one year after its introduction. METHODS: A quantitative and qualitative study that analyzed secondary data and interviews with healthcare workers (HCWs) (n=39) and VCTC coordinators (n=32) was performed. The assessment used 18 key indicators to evaluate the three dimensions of the program's logical framework: structure, process, and result. Each indicator was scored from 1-4, and the aggregate results corresponding to the dimensions were expressed as proportions. The results were compared to the perceptions of the HCWs and coordinators regarding the EQA-RT/DTS-HIV program. RESULTS: The aggregate scores for the three dimensions of structure, process, and result were 91.7%, 78.6%, and 95%, respectively. The lowest score in each dimension corresponded to a different indicator: access to Quali-TR online system 39% (structure), registration in Quali-TR online system 38.7% (process), and VCTC completed the full process in the program's first round 63.4% (result). Approximately 36% of the HCWs and 52% of the coordinators reported enhanced trust in the program for its rapid HIV testing performance. CONCLUSIONS: All three program dimensions exhibited satisfactory results (>75%). Nevertheless, the study findings highlight the need to improve certain program components. Additionally, long-term follow-up is needed to provide a more thorough picture of the process for external quality assessment.
Longitudinal analysis of standardized test scores of students in the Science Writing Heuristic approach

Science.gov (United States)

Chanlen, Niphon

The purpose of this study was to examine the longitudinal impacts of the Science Writing Heuristic (SWH) approach on student science achievement measured by the Iowa Test of Basic Skills (ITBS). A number of studies have reported positive impact of an inquiry-based instruction on student achievement, critical thinking skills, reasoning skills, attitude toward science, etc. So far, studies have focused on exploring how an intervention affects student achievement using teacher/researcher-generated measurement. Only a few studies have attempted to explore the long-term impacts of an intervention on student science achievement measured by standardized tests. The students' science and reading ITBS data was collected from 2000 to 2011 from a school district which had adopted the SWH approach as the main approach in science classrooms since 2002. The data consisted of 12,350 data points from 3,039 students. The multilevel model for change with discontinuity in elevation and slope technique was used to analyze changes in student science achievement growth trajectories prior and after adopting the SWH approach. The results showed that the SWH approach positively impacted students by initially raising science achievement scores. The initial impact was maintained and gradually increased when students were continuously exposed to the SWH approach. Disadvantaged students who were at risk of having low science achievement had bigger benefits from experience with the SWH approach. As a result, existing problematic achievement gaps were narrowed down. Moreover, students who started experience with the SWH approach as early as elementary school seemed to have better science achievement growth compared to students who started experiencing with the SWH approach only in high school. The results found in this study not only confirmed the positive impacts of the SWH approach on student achievement, but also demonstrated additive impacts found when students had longitudinal experiences
The correlation between pedestrian injury severity in real-life crashes and Euro NCAP pedestrian test results.

Science.gov (United States)

Strandroth, Johan; Rizzi, Matteo; Sternlund, Simon; Lie, Anders; Tingvall, Claes

2011-12-01

The aim of the present study was to estimate the correlation between Euro NCAP pedestrian rating scores and injury outcome in real-life car-to-pedestrian crashes, with special focus on long-term disability. Another aim was to determine whether brake assist (BA) systems affect the injury outcome in real-life car-to-pedestrian crashes and to estimate the combined effects in injury reduction of a high Euro NCAP ranking score and BA. In the current study, the Euro NCAP pedestrian scoring was compared with the real-life outcome in pedestrian crashes that occurred in Sweden during 2003 to 2010. The real-life crash data were obtained from the data acquisition system Swedish Traffic Accident Data Acquisition (STRADA), which combines police records and hospital admission data. The medical data consisted of International Classification of Diseases (ICD) diagnoses and Abbreviated Injury Scale (AIS) scoring. In all, approximately 500 pedestrians submitted to hospital were included in the study. Each car model was coded according to Euro NCAP pedestrian scores. In addition, the presence or absence of BA was coded for each car involved. Cars were grouped according to their scoring. Injury outcomes were analyzed with AIS and, at the victim level, with permanent medical impairment. This was done by translating the injury scores for each individual to the risk of serious consequences (RSC) at 1, 5, and 10 percent risk of disability level. This indicates the total risk of a medical disability for each victim, given the severity and location of injuries. The mean RSC (mRSC) was then calculated for each car group and t-tests were conducted to falsify the null hypothesis at p ≤ .05 that the mRSC within the groups was equal. The results showed a significant reduction of injury severity for cars with better pedestrian scoring, although cars with a high score could not be studied due to lack of cases. The reduction in RSC for medium-performing cars in comparison with low-performing cars
Engineering model cryocooler test results

International Nuclear Information System (INIS)

Skimko, M.A.; Stacy, W.D.; McCormick, J.A.

1992-01-01

This paper reports that recent testing of diaphragm-defined, Stirling-cycle machines and components has demonstrated cooling performance potential, validated the design code, and confirmed several critical operating characteristics. A breadboard cryocooler was rebuilt and tested from cryogenic to near-ambient cold end temperatures. There was a significant increase in capacity at cryogenic temperatures and the performance results compared well with code predictions at all temperatures. Further testing on a breadboard diaphragm compressor validated the calculated requirement for a minimum axial clearance between diaphragms and mating heads.
Irradiation Effects Test Series: Test IE-2. Test results report

International Nuclear Information System (INIS)

Allison, C.M.; Croucher, D.W.; Ploger, S.A.; Mehner, A.S.

1977-08-01

The report describes the results of a test using four 0.97-m long PWR-type fuel rods with differences in diametral gap and cladding irradiation. The objective of this test was to provide information about the effects of these differences on fuel rod behavior during quasi-equilibrium and film boiling operation. The fuel rods were subjected to a series of preconditioning power cycles of less than 30 kW/m. Rod powers were then increased to 68 kW/m at a coolant mass flux of 4900 kg/s-m². After one hour at 68 kW/m, a power-cooling-mismatch sequence was initiated by a flow reduction at constant power. At a flow of 2550 kg/s-m², the onset of film boiling occurred on one rod, Rod IE-011. An additional flow reduction to 2245 kg/s-m² caused the onset of film boiling on the remaining three rods. Data are presented on the behavior of fuel rods during quasi-equilibrium and during film boiling operation. The effects of initial gap size, cladding irradiation, rod power cycling, a rapid power increase, and sustained film boiling are discussed. These discussions are based on measured test data, preliminary postirradiation examination results, and comparisons of results with FRAP-T3 computer model calculations.
Multimodal Personal Verification Using Likelihood Ratio for the Match Score Fusion

Directory of Open Access Journals (Sweden)

Long Binh Tran

2017-01-01

In this paper, the authors present a novel personal verification system based on the likelihood ratio test for fusion of match scores from multiple biometric matchers (face, fingerprint, hand shape, and palm print). In the proposed system, multimodal features are extracted by Zernike Moment (ZM). After matching, the match scores from multiple biometric matchers are fused based on the likelihood ratio test. A finite Gaussian mixture model (GMM) is used for estimating the genuine and impostor densities of match scores for personal verification. The approach is also compared with other well-known approaches such as the support vector machine and the sum rule with min-max. The experimental results confirm that the proposed system can achieve excellent identification performance, with higher accuracy than these approaches, and thus can be used for further applications related to person verification.
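
The fusion rule described in this record can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the matchers are statistically independent, so each matcher gets its own one-dimensional Gaussian mixture, and every mixture parameter below is a hypothetical value, not one taken from the paper.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    """Density of a 1-D Gaussian mixture given (weight, mean, std) triples."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def log_likelihood_ratio(scores, genuine, impostor):
    """Sum of per-matcher log density ratios (independence is assumed)."""
    return sum(
        math.log(mixture_pdf(x, g) / mixture_pdf(x, i))
        for x, g, i in zip(scores, genuine, impostor)
    )

# Hypothetical densities for two matchers; real ones would be fitted
# to training match scores.
GENUINE = [[(1.0, 0.80, 0.10)], [(0.6, 0.75, 0.10), (0.4, 0.90, 0.05)]]
IMPOSTOR = [[(1.0, 0.30, 0.10)], [(1.0, 0.35, 0.15)]]

def accept(scores, threshold=0.0):
    """Accept the claimed identity when the fused log ratio exceeds threshold."""
    return log_likelihood_ratio(scores, GENUINE, IMPOSTOR) > threshold
```

Raising the threshold trades false accepts for false rejects, which is how an operating point would be chosen in a sketch like this.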
Summary of CCTF test results

International Nuclear Information System (INIS)

Iguchi, T.; Murao, Y.; Sugimoto, J.; Akimoto, H.; Okubo, T.; Hojo, T.

1987-01-01

Conservatism of current safety analysis was assessed by comparing the predicted result with cylindrical core test facility (CCTF) test result performed at Japan Atomic Energy Research Institute. WREM code was selected for the assessment. The overall conservatism of the WREM code on the peak clad temperature prediction was confirmed against CCTF evaluation model (EM) test which simulated the typical initial and boundary conditions in the safety evaluation analysis. WREM code predicted the reasonable core boundary conditions and the conservatism of the code came mainly from core calculation. The conservatism of the WREM code against CCTF data could be attributed to the following three points: (1) no horizontal mixing assumption between subchannels at each elevation; (2) no modeling on heat transfer enhancement caused by the radial core power profile; and (3) conservative heat transfer correlations in the code

Raising test scores vs. teaching higher order thinking (HOT): senior science teachers' views on how several concurrent policies affect classroom practices

Science.gov (United States)

Zohar, Anat; Alboher Agmon, Vered

2018-04-01

This study investigates how senior science teachers viewed the effects of a Raising Test Scores policy and its implementation on instruction of higher order thinking (HOT), and on teaching thinking to students with low academic achievements.
A Comparison between Linear IRT Observed-Score Equating and Levine Observed-Score Equating under the Generalized Kernel Equating Framework

Science.gov (United States)

Chen, Haiwen

2012-01-01

In this article, linear item response theory (IRT) observed-score equating is compared under a generalized kernel equating framework with Levine observed-score equating for nonequivalent groups with anchor test design. Interestingly, these two equating methods are closely related despite being based on different methodologies. Specifically, when…
[The Collage Impression Scoring Scale (CISS) may help predict sobriety for alcoholics].

Science.gov (United States)

Itoh, Mitsuru; Ishii, Takayoshi

2009-08-01

The Collage Impression Scoring Scale (CISS; Imamura, 2004) was used by 54 raters to score collages made by 24 alcoholics on admission to the hospital and at discharge. The CISS contains three factors: stability, expression and creativity. Comparisons using paired t-tests showed that the collages made at discharge had lower scores on the three CISS factors than the collages made on admission. The results for 11 alcoholics, who were followed for six months after discharge, showed that the scores for CISS factors for the abstinent group were lower than those for the relapsed drinking group. The abstinent group showed more anxiety than the relapsed drinking group. This result suggests that the abstinent alcoholics' anxieties were projected onto the collages because they were facing their internal problems more seriously. Thus the CISS was effective as a predictive index for alcoholics who maintain sobriety.
[The diagnostic scores for deep venous thrombosis].

Science.gov (United States)

Junod, A

2015-08-26

Seven diagnostic scores for the deep venous thrombosis (DVT) of lower limbs are analyzed and compared. Two features make this exercise difficult: the problem of distal DVT and of their proximal extension and the status of patients, whether out- or in-patients. The most popular score is the Wells score (1997), modified in 2003. It includes one subjective element based on clinical judgment. The Primary Care score (2005), less known, has similar properties, but uses only objective data. The present trend is to associate clinical scores with the measurement of D-dimers to rule out with a good sensitivity the probability of DVT. For the upper limb DVT, the Constans score (2008) is available, which can also be coupled with D-dimer testing (Kleinjan).
Automated essay scoring and the future of educational assessment in medical education.

Science.gov (United States)

Gierl, Mark J; Latifi, Syed; Lai, Hollis; Boulais, André-Philippe; De Champlain, André

2014-10-01

Constructed-response tasks, which range from short-answer tests to essay questions, are included in assessments of medical knowledge because they allow educators to measure students' ability to think, reason, solve complex problems, communicate and collaborate through their use of writing. However, constructed-response tasks are also costly to administer and challenging to score because they rely on human raters. One alternative to the manual scoring process is to integrate computer technology with writing assessment. The process of scoring written responses using computer programs is known as 'automated essay scoring' (AES). An AES system uses a computer program that builds a scoring model by extracting linguistic features from a constructed-response prompt that has been pre-scored by human raters and then, using machine learning algorithms, maps the linguistic features to the human scores so that the computer can be used to classify (i.e. score or grade) the responses of a new group of students. The accuracy of the score classification can be evaluated using different measures of agreement. Automated essay scoring provides a method for scoring constructed-response tests that complements the current use of selected-response testing in medical education. The method can serve medical educators by providing the summative scores required for high-stakes testing. It can also serve medical students by providing them with detailed feedback as part of a formative assessment process. Automated essay scoring systems yield scores that consistently agree with those of human raters at a level as high, if not higher, as the level of agreement among human raters themselves. The system offers medical educators many benefits for scoring constructed-response tasks, such as improving the consistency of scoring, reducing the time required for scoring and reporting, minimising the costs of scoring, and providing students with immediate feedback on constructed-response tasks. © 2014
Reproducibility of scoring emphysema by HRCT

International Nuclear Information System (INIS)

Malinen, A.; Partanen, K.; Rytkoenen, H.; Vanninen, R.; Erkinjuntti-Pekkanen, R.

2002-01-01

Purpose: We evaluated the reproducibility of three visual scoring methods of emphysema and compared these methods with pulmonary function tests (VC, DLCO, FEV1 and FEV%) among farmer's lung patients and farmers. Material and Methods: Three radiologists examined high-resolution CT images of farmer's lung patients and their matched controls (n=70) for chronic interstitial lung diseases. Intraobserver reproducibility and interobserver variability were assessed for three methods: severity, Sanders' (extent) and Sakai. Pulmonary function tests such as spirometry and diffusing capacity were measured. Results: Intraobserver κ-values for all three methods were good (0.51-0.74). Interobserver κ-values varied from 0.35 to 0.72. The Sanders' and the severity methods correlated strongly with pulmonary function tests, especially DLCO and FEV1. Conclusion: The Sanders' method proved to be reliable in evaluating emphysema, in terms of good consistency of interpretation and good correlation with pulmonary function tests
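
As a rough illustration of how observer agreement of this kind is quantified, the sketch below computes an unweighted Cohen's kappa for two sets of categorical ratings. That the study used this exact variant is an assumption (the record does not name the statistic), and the ratings in the usage note are made up.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two readings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings used a single identical category
    return (observed - expected) / (1.0 - expected)
```

For example, `cohen_kappa([0, 0, 1, 1, 2, 2, 0, 1], [0, 0, 1, 2, 2, 2, 0, 0])` compares two hypothetical readings of eight scans on a three-level scale.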
Psychometric properties of a Swedish translation of the VISA-P outcome score for patellar tendinopathy

Directory of Open Access Journals (Sweden)

Edman Gunnar

2004-12-01

Background: Self-administered patient outcome scores are increasingly recommended for evaluation of primary outcome in clinical studies. The VISA-P score, developed at the Victorian Institute of Sport Assessment in Melbourne, Australia, is a questionnaire for patients with patellar tendinopathy in which patients assess severity of symptoms, function and ability to participate in sport. The aim of this study was to translate the questionnaire into Swedish and to study the reliability and validity of the translated questionnaire and resultant scores. Methods: The questionnaire was translated into Swedish according to internationally recommended guidelines for cross-cultural adaptation of self-report measures. The reliability and validity were tested in three different populations: healthy students (n = 17), members of the Swedish male national basketball team (n = 17), considered as a population at risk, and a group of non-surgically treated patients (n = 17) with clinically diagnosed patellar tendinopathy. The questionnaire was completed by 51 subjects altogether. Results: The translated VISA-P questionnaire showed very good test-retest reliability (ICC = 0.97). The mean (± SD) VISA-P score at both the first and second test occasions was highest in the healthy student group, 83 (± 13) and 81 (± 15), respectively. The score of the basketball players was 79 (± 24) and 80 (± 23), while the patient group scored significantly (p … Conclusions: The translated version of the VISA-P questionnaire was linguistically and culturally equivalent to the original version. The translated score showed good reliability.
The Motivated Strategies for Learning Questionnaire: score validity among medicine residents.

Science.gov (United States)

Cook, David A; Thompson, Warren G; Thomas, Kris G

2011-12-01

The Motivated Strategies for Learning Questionnaire (MSLQ) purports to measure motivation using the expectancy-value model. Although it is widely used in other fields, this instrument has received little study in health professions education. The purpose of this study was to evaluate the validity of MSLQ scores. We conducted a validity study evaluating the relationships of MSLQ scores to other variables and their internal structure (reliability and factor analysis). Participants included 210 internal medicine and family medicine residents participating in a web-based course on ambulatory medicine at an academic medical centre. Measurements included pre-course MSLQ scores, pre- and post-module motivation surveys, post-module knowledge test and post-module Instructional Materials Motivation Survey (IMMS) scores. Internal consistency was universally high for all MSLQ items together (Cronbach's α = 0.93) and for each domain (α ≥ 0.67). Total MSLQ scores showed statistically significant positive associations with post-test knowledge scores. For example, a 1-point rise in total MSLQ score was associated with a 4.4% increase in post-test scores (β = 4.4; p motivation and satisfaction. Scores on MSLQ domains demonstrated associations that generally aligned with our hypotheses. Self-efficacy and control of learning belief scores demonstrated the strongest domain-specific relationships with knowledge scores (β = 2.9 for both). Confirmatory factor analysis showed a borderline model fit. Follow-up exploratory factor analysis revealed the scores of five factors (self-efficacy, intrinsic interest, test anxiety, extrinsic goals, attribution) demonstrated psychometric and predictive properties similar to those of the original scales. Scores on the MSLQ are reliable and predict meaningful outcomes. However, the factor structure suggests a simplified model might better fit the empiric data. Future research might consider how assessing and responding to motivation could enhance
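
The internal-consistency figures in this record are Cronbach's α values. A minimal sketch of the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores), follows; the response matrix in the usage note is hypothetical and unrelated to the MSLQ data.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)
```

With four hypothetical respondents answering two items, `cronbach_alpha([[2, 3], [4, 4], [3, 5], [5, 6]])` evaluates to 8/9, about 0.89.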
Risk score for first-screening of prevalent undiagnosed chronic kidney disease in Peru: the CRONICAS-CKD risk score.

Science.gov (United States)

Carrillo-Larco, Rodrigo M; Miranda, J Jaime; Gilman, Robert H; Medina-Lezama, Josefina; Chirinos-Pacheco, Julio A; Muñoz-Retamozo, Paola V; Smeeth, Liam; Checkley, William; Bernabe-Ortiz, Antonio

2017-11-29

Chronic Kidney Disease (CKD) represents a great burden for the patient and the health system, particularly if diagnosed at late stages. Consequently, tools to identify patients at high risk of having CKD are needed, particularly in limited-resources settings where laboratory facilities are scarce. This study aimed to develop a risk score for prevalent undiagnosed CKD using data from four settings in Peru: a complete risk score including all associated risk factors and another excluding laboratory-based variables. Cross-sectional study. We used two population-based studies: one for developing and internal validation (CRONICAS), and another (PREVENCION) for external validation. Risk factors included clinical- and laboratory-based variables, among others: sex, age, hypertension and obesity; and lipid profile, anemia and glucose metabolism. The outcome was undiagnosed CKD: eGFR anemia were strongly associated with undiagnosed CKD. In the external validation, at a cut-off point of 2, the complete and laboratory-free risk scores performed similarly well with a ROC area of 76.2% and 76.0%, respectively (P = 0.784). The best assessment parameter of these risk scores was their negative predictive value: 99.1% and 99.0% for the complete and laboratory-free, respectively. The developed risk scores showed a moderate performance as a screening test. People with a score of ≥ 2 points should undergo further testing to rule out CKD. Using the laboratory-free risk score is a practical approach in developing countries where laboratories are not readily available and undiagnosed CKD has significant morbidity and mortality.
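
The screening metrics quoted in this record all derive from a 2×2 table of score result against true disease status. The sketch below shows those definitions; the counts in the usage note are invented for illustration and are not the CRONICAS or PREVENCION data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased people flagged by the score
        "specificity": tn / (tn + fp),  # healthy people correctly cleared
        "ppv": tp / (tp + fp),          # flagged people who are diseased
        "npv": tn / (tn + fn),          # cleared people who are healthy
    }
```

With hypothetical counts `screening_metrics(tp=30, fp=200, fn=10, tn=760)`, the negative predictive value is 760/770, roughly 0.987, even though the positive predictive value is low. This pattern is typical when the condition is uncommon, and it is why such a score is better suited to ruling disease out than to confirming it.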
Simple shoulder test and Oxford Shoulder Score: Persian translation and cross-cultural validation.

Science.gov (United States)

Naghdi, Soofia; Nakhostin Ansari, Noureddin; Rustaie, Nilufar; Akbari, Mohammad; Ebadi, Safoora; Senobari, Maryam; Hasson, Scott

2015-12-01

To translate, culturally adapt, and validate the simple shoulder test (SST) and Oxford Shoulder Score (OSS) into Persian language using a cross-sectional and prospective cohort design. A standard forward and backward translation was followed to culturally adapt the SST and the OSS into Persian language. Psychometric properties of floor and ceiling effects, construct convergent validity, discriminant validity, internal consistency reliability, test-retest reliability, standard error of the measurement (SEM), smallest detectable change (SDC), and factor structure were determined. One hundred patients with shoulder disorders and 50 healthy subjects participated in the study. The PSST and the POSS showed no missing responses. No floor or ceiling effects were observed. Both the PSST and POSS detected differences between patients and healthy subjects supporting their discriminant validity. Construct convergent validity was confirmed by a very good correlation between the PSST and POSS (r = 0.68). There was high internal consistency for both the PSST (α = 0.73) and the POSS (α = 0.91 and 0.92). Test-retest reliability with 1-week interval was excellent (ICCagreement = 0.94 for PSST and 0.90 for POSS). Factor analyses demonstrated a three-factor solution for the PSST (49.7 % of variance) and a two-factor solution for the POSS (61.6 % of variance). The SEM/SDC was satisfactory for PSST (5.5/15.3) and POSS (6.8/18.8). The PSST and POSS are valid and reliable outcome measures for assessing functional limitations in Persian-speaking patients with shoulder disorders.
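
The SEM/SDC pairs reported above are consistent, within rounding, with the conventional relations SEM = SD · √(1 − ICC) and SDC = 1.96 · √2 · SEM. That the authors used exactly these formulas is an assumption; the sketch below simply encodes them.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (ICC)."""
    return sd * math.sqrt(1.0 - icc)

def smallest_detectable_change(sem, z=1.96):
    """SDC at the 95% level for the difference of two measurements."""
    return z * math.sqrt(2.0) * sem
```

Calling `smallest_detectable_change(5.5)` and `smallest_detectable_change(6.8)` returns values within 0.1 of the reported 15.3 and 18.8.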
Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions.

Science.gov (United States)

Liu, Zhihai; Su, Minyi; Han, Li; Liu, Jie; Yang, Qifan; Li, Yan; Wang, Renxiao

2017-02-21

In structure-based drug design, scoring functions are widely used for fast evaluation of protein-ligand interactions. They are often applied in combination with molecular docking and de novo design methods. Since the early 1990s, a whole spectrum of protein-ligand interaction scoring functions have been developed. Regardless of their technical difference, scoring functions all need data sets combining protein-ligand complex structures and binding affinity data for parametrization and validation. However, data sets of this kind used to be rather limited in terms of size and quality. On the other hand, standard metrics for evaluating scoring function used to be ambiguous. Scoring functions are often tested in molecular docking or even virtual screening trials, which do not directly reflect the genuine quality of scoring functions. Collectively, these underlying obstacles have impeded the invention of more advanced scoring functions. In this Account, we describe our long-lasting efforts to overcome these obstacles, which involve two related projects. On the first project, we have created the PDBbind database. It is the first database that systematically annotates the protein-ligand complexes in the Protein Data Bank (PDB) with experimental binding data. This database has been updated annually since its first public release in 2004. The latest release (version 2016) provides binding data for 16 179 biomolecular complexes in PDB. Data sets provided by PDBbind have been applied to many computational and statistical studies on protein-ligand interaction and various subjects. In particular, it has become a major data resource for scoring function development. On the second project, we have established the Comparative Assessment of Scoring Functions (CASF) benchmark for scoring function evaluation. Our key idea is to decouple the "scoring" process from the "sampling" process, so scoring functions can be tested in a relatively pure context to reflect their quality. In our
Evaluation and scoring of radiotherapy treatment plans using an artificial neural network

International Nuclear Information System (INIS)

Willoughby, Twyla R.; Starkschall, George; Janjan, Nora A.; Rosen, Isaac I.

1996-01-01

Purpose: The objective of this work was to demonstrate the feasibility of using an artificial neural network to predict the clinical evaluation of radiotherapy treatment plans. Methods and Materials: Approximately 150 treatment plans were developed for 16 patients who received external-beam radiotherapy for soft-tissue sarcomas of the lower extremity. Plans were assigned a figure of merit by a radiation oncologist using a five-point rating scale. Plan scoring was performed by a single physician to ensure consistency in rating. Dose-volume information extracted from a training set of 511 treatment plans on 14 patients was correlated to the physician-generated figure of merit using an artificial neural network. The neural network was tested with a test set of 19 treatment plans on two patients whose plans were not used in the training of the neural net. Results: Physician scoring of treatment plans was consistent to within one point on the rating scale 88% of the time. The neural net reproduced the physician scores in the training set to within one point approximately 90% of the time. It reproduced the physician scores in the test set to within one point approximately 83% of the time. Conclusions: An artificial neural network can be trained to generate a score for a treatment plan that can be correlated to a clinically-based figure of merit. The accuracy of the neural net in scoring plans compares well with the reproducibility of the clinical scoring. The system of radiotherapy treatment plan evaluation using an artificial neural network demonstrates promise as a method for generating a clinically relevant figure of merit
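
The "within one point" figures in this record are simple tolerance-agreement rates. A minimal sketch of that metric is given below; the function name and the example scores are hypothetical.

```python
def within_tolerance_rate(reference, predicted, tolerance=1):
    """Fraction of paired scores whose absolute difference is <= tolerance."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("score lists must be non-empty and of equal length")
    hits = sum(abs(r - p) <= tolerance for r, p in zip(reference, predicted))
    return hits / len(reference)
```

For instance, `within_tolerance_rate([1, 3, 5, 2, 4], [2, 3, 3, 2, 5])` gives 0.8 because four of the five pairs differ by at most one point on the rating scale.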
Climax granite test results

Energy Technology Data Exchange (ETDEWEB)

Ramspott, L.D.

1980-01-15

The Lawrence Livermore Laboratory (LLL), as part of the Nevada Nuclear Waste Storage Investigations (NNWSI) program, is carrying out in situ rock mechanics testing in the Climax granitic stock at the Nevada Test Site (NTS). This summary addresses only those field data taken to date that address thermomechanical modeling for a hard-rock repository. The results to be discussed include thermal measurements in a heater test that was conducted from October 1977 through July 1978, and stress and displacement measurements made during and after excavation of the canister storage drift for the Spent Fuel Test (SFT) in the Climax granite. Associated laboratory and field measurements are summarized. The rock temperature for a given applied heat load at a point in time and space can be adequately modeled with simple analytic calculations involving superposition and integration of numerous point source solutions. The input, for locations beyond about a meter from the source, can be a constant thermal conductivity and diffusivity. The value of thermal conductivity required to match the field data is as much as 25% different from laboratory-measured values. Therefore, unless we come to understand the mechanisms for this difference, a simple in situ test will be required to obtain a value for final repository design. Some sensitivity calculations have shown that the temperature field is about ten times more sensitive to conductivity than to diffusivity under the test conditions. The orthogonal array was designed to detect anisotropy. After considering all error sources, anisotropic effects in the thermal field were less than 5 to 10%.
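
The superposition approach mentioned in this record can be sketched with the textbook solution for a continuous point source in an infinite homogeneous medium, ΔT(r, t) = q / (4πkr) · erfc(r / (2√(αt))). The code below is an illustration only: the heater geometry, power, and rock properties are hypothetical values, not the Climax test parameters.

```python
import math

def point_source_rise(q_watts, r_m, t_s, k, alpha):
    """Temperature rise (K) at distance r from a continuous point source."""
    if t_s <= 0.0:
        return 0.0
    return q_watts / (4.0 * math.pi * k * r_m) * math.erfc(
        r_m / (2.0 * math.sqrt(alpha * t_s))
    )

def line_heater(length_m, total_power_w, n):
    """Approximate a vertical line heater by n equal point sources."""
    dz = length_m / n
    return [((0.0, 0.0, -length_m / 2 + (i + 0.5) * dz), total_power_w / n)
            for i in range(n)]

def temperature_rise(point, t_s, sources, k, alpha):
    """Superpose the contributions of all point sources at one location."""
    return sum(
        point_source_rise(q, math.dist(point, pos), t_s, k, alpha)
        for pos, q in sources
    )

# Hypothetical inputs: 3 m heater, 1500 W, granite-like properties.
SOURCES = line_heater(3.0, 1500.0, 30)
K_ROCK = 3.0         # thermal conductivity, W/(m K), illustrative
ALPHA_ROCK = 1.4e-6  # thermal diffusivity, m^2/s, illustrative
```

Evaluating `temperature_rise((1.0, 0.0, 0.0), 30 * 86400, SOURCES, K_ROCK, ALPHA_ROCK)` gives the rise one meter from the heater after 30 days under these assumed inputs. In this model the rise scales as 1/k when diffusivity is held fixed, which gives some intuition for why the computed field responds so strongly to conductivity.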
Mathematics Admission Test Remarks

Directory of Open Access Journals (Sweden)

Ideon Erge

2016-12-01

Since 2014, there have been admission tests in mathematics for applicants to the Estonian University of Life Sciences for the Geodesy, Land Management and Real Estate Planning; Civil Engineering; Hydraulic Engineering and Water Pollution Control; and Engineering and Technetronics curricula. According to the admission criteria, the test must be taken by students who have not passed the specific mathematics course state exam or whose score was less than 20 points. The admission test may also be taken by those who wish to improve their state exam score. In 2016, there were 126 such applicants, of whom 63 took the test. In 2015, the numbers were 129 and 89, and in 2014, 150 and 47, respectively. The test was scored on a scale of 100. The arithmetic average of the score was 30.6 points in 2016, 29.03 in 2015 and 18.84 in 2014. The test was considered to be passed with 1 point in 2014 and 20 points in 2015 and 2016. We analyzed test results and gave examples of problems which were solved exceptionally well or not at all.
Automated Scoring of Constructed-Response Science Items: Prospects and Obstacles

Science.gov (United States)

Liu, Ou Lydia; Brew, Chris; Blackmore, John; Gerard, Libby; Madhok, Jacquie; Linn, Marcia C.

2014-01-01

Content-based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept-based scoring tool for content-based scoring, c-rater™, for four science items with rubrics…
Translation and transcultural adaptation of the modified Harris Hip Score

Directory of Open Access Journals (Sweden)

Rodrigo Pereira Guimarães

2010-01-01

OBJECTIVE: Hip arthroscopy has been used for diagnostic as well as therapeutic purposes, and it is part of the routine arsenal of hip surgeons. Because arthroscopic results need to be evaluated, Byrd proposed a modification of the Harris Hip Score that assesses pain and function. This study aimed to translate and cross-culturally adapt the evaluation protocol of the Harris Hip Score as modified by Byrd, which is used in hip arthroscopies. METHOD: The method consisted of (1) an initial translation, (2) a back translation, (3) a pre-test and (4) a final test. RESULTS: The Portuguese version was applied to 30 patients with hip disorders to determine the level of comprehension of the protocol. Terms and expressions that patients did not understand during the pre-test were changed or replaced, and a final version was produced by consensus. The final version of the questionnaire was then applied again, with 100% comprehension by patients. CONCLUSION: The final Portuguese version of the Harris Hip Score as modified by Byrd is thus made available. Validation of this version is already under way.
Abnormal Cervical Cancer Screening Test Results

Science.gov (United States)

... Frequently Asked Questions FAQ187, Gynecologic Problems: Abnormal Cervical Cancer Screening Test Results • What is cervical cancer screening? • What causes abnormal cervical cancer screening test ...
Association between the Osteoporosis Self-Assessment Tool for Asians Score and Mortality in Patients with Isolated Moderate and Severe Traumatic Brain Injury: A Propensity Score-Matched Analysis.

Science.gov (United States)

Rau, Cheng-Shyuan; Kuo, Pao-Jen; Wu, Shao-Chun; Chen, Yi-Chun; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua

2016-12-03

Background: The purpose of this study was to use a propensity score-matched analysis to investigate the association between the Osteoporosis Self-Assessment Tool for Asians (OSTA) scores and clinical outcomes of patients with isolated moderate and severe traumatic brain injury (TBI). Methods: The study population comprised 7855 patients aged ≥40 years who were hospitalized for treatment of isolated moderate and severe TBI (an Abbreviated Injury Scale (AIS) ≥3 points only in the head and not in other regions of the body) between 1 January 2009 and 31 December 2014. Patients were categorized as high-risk (OSTA score -1; n = 5359). Two-sided Pearson's chi-squared or Fisher's exact tests were used to compare categorical data. Unpaired Student's t-test and Mann-Whitney U test were performed to analyze normally and non-normally distributed continuous data, respectively. Propensity score-matching in a 1:1 ratio was performed using NCSS software, with adjustment for covariates. Results: Compared to low-risk patients, high- and medium-risk patients were significantly older and injured more severely. The high- and medium-risk patients had significantly higher mortality rates, longer hospital length of stay, and a higher proportion of admission to the intensive care unit than low-risk patients. Analysis of propensity score-matched patients with adjusted covariates, including gender, co-morbidity, blood alcohol concentration level, Glasgow Coma Scale score, and Injury Severity Score revealed that high- and medium-risk patients still had a 2.4-fold (odds ratio (OR), 2.4; 95% confidence interval (CI), 1.39-4.15; p = 0.001) and 1.8-fold (OR, 1.8; 95% CI, 1.19-2.86; p = 0.005) higher mortality, respectively, than low-risk patients. However, further addition of age as a covariate for the propensity score-matching demonstrated that there was no significant difference between high-risk and low-risk patients or between medium-risk and low-risk patients, implying that older age
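
To show how an odds ratio and its 95% confidence interval are read, the sketch below uses the standard unmatched 2×2 calculation with a Woolf (log-based) interval. It does not reproduce the study's propensity-matched analysis, and the counts in the usage note are hypothetical.

```python
import math

def odds_ratio_ci(exposed_events, exposed_non, control_events, control_non, z=1.96):
    """Odds ratio and Woolf confidence interval from 2x2 table counts."""
    a, b, c, d = exposed_events, exposed_non, control_events, control_non
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

With made-up counts, `odds_ratio_ci(24, 76, 12, 88)` returns an odds ratio of about 2.3 with an interval of roughly 1.1 to 4.9. Because the interval excludes 1, the association would be called significant at the 5% level.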
Irradiation Effects Test Series: Test IE-3. Test results report. [PWR

Energy Technology Data Exchange (ETDEWEB)

Farrar, L. C.; Allison, C. M.; Croucher, D. W.; Ploger, S. A.

1977-10-01

The objectives of the test reported were to: (a) determine the behavior of irradiated fuel rods subjected to a rapid power increase during which the possibility of a pellet-cladding mechanical interaction failure is enhanced and (b) determine the behavior of these fuel rods during film boiling following this rapid power increase. Test IE-3 used four 0.97-m long pressurized water reactor type fuel rods fabricated from previously irradiated fuel. The fuel rods were subjected to a preconditioning period, followed by a power ramp to 69 kW/m at a coolant mass flux of 4920 kg/s-m². After a flow reduction to 2120 kg/s-m², film boiling occurred on the fuel rods. One rod failed approximately 45 seconds after the reactor was shut down as a result of cladding embrittlement due to extensive cladding oxidation. Data are presented on the behavior of these irradiated fuel rods during steady-state operation, the power ramp, and film boiling operation. The effects of a power ramp and power ramp rates on pellet-cladding interaction are discussed. Test data are compared with FRAP-T3 computer model calculations and data from a previous Irradiation Effects test in which four irradiated fuel rods of a similar design were tested. Test IE-3 results indicate that the irradiated state of the fuel rods did not significantly affect fuel rod behavior during normal, abnormal (power ramp of 20 kW/m per minute), and accident (film boiling) conditions.
Novel, customizable scoring functions, parameterized using N-PLS, for structure-based drug discovery.

Science.gov (United States)

Catana, Cornel; Stouten, Pieter F W

2007-01-01

The ability to accurately predict biological affinity on the basis of in silico docking to a protein target remains a challenging goal in the CADD arena. Typically, "standard" scoring functions have been employed that use the calculated docking result and a set of empirical parameters to calculate a predicted binding affinity. To improve on this, we are exploring novel strategies for rapidly developing and tuning "customized" scoring functions tailored to a specific need. In the present work, three such customized scoring functions were developed using a set of 129 high-resolution protein-ligand crystal structures with measured Ki values. The functions were parametrized using N-PLS (N-way partial least squares), a multivariate technique well-known in the 3D quantitative structure-activity relationship field. A modest correlation between observed and calculated pKi values using a standard scoring function (r2 = 0.5) could be improved to 0.8 when a customized scoring function was applied. To mimic a more realistic scenario, a second scoring function was developed, not based on crystal structures but exclusively on several binding poses generated with the Flo+ docking program. Finally, a validation study was conducted by generating a third scoring function with 99 randomly selected complexes from the 129 as a training set and predicting pKi values for a test set that comprised the remaining 30 complexes. Training and test set r2 values were 0.77 and 0.78, respectively. These results indicate that, even without direct structural information, predictive customized scoring functions can be developed using N-PLS, and this approach holds significant potential as a general procedure for predicting binding affinity on the basis of in silico docking.
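
The validation protocol in this record (a random 99/30 split of 129 complexes, scored by r² between observed and calculated pKi) can be sketched as follows. The N-PLS fitting step itself is not shown; the function names and the seed are hypothetical, and r² is taken here to be the squared Pearson correlation, which is an assumption about the authors' definition.

```python
import random

def r_squared(observed, calculated):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    return cov * cov / (var_o * var_c)

def split_indices(n_total, n_train, seed=0):
    """Randomly partition range(n_total) into training and test index lists."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])
```

A call such as `train, test = split_indices(129, 99)` yields the two index sets. A model would be fitted on the training complexes only, and `r_squared` would then be evaluated separately on each set, so that similar training and test values suggest the model is not overfitted.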

Comparative evaluation of chest radiography, low-field MRI, the Shwachman-Kulczycki score and pulmonary function tests in patients with cystic fibrosis

International Nuclear Information System (INIS)

Anjorin, Angela; Vogl, Thomas J.; Schmidt, Helga; Posselt, Hans-Georg; Smaczny, Christina; Ackermann, Hanns; Deimling, Michael; Abolmaali, Nasreddin

2008-01-01

The aim of this study was to investigate whether the parenchymal lung damage in patients suffering from cystic fibrosis (CF) can be equivalently quantified by the Chrispin-Norman (CN) scores determined with low-field magnetic resonance imaging (MRI) and conventional chest radiography (CXR). Both scores were correlated with pulmonary function tests (PFT) and the Shwachman-Kulczycki method (SKM). To evaluate the comparability of MRI and CXR for different states of the disease, all scores were applied to patients divided into three age groups. Seventy-three CF patients (mean SKM score: 62 ± 8) with a median age (range) of 14 years (7-32) were included. The mean CN scores determined with both imaging methods were comparable (CXR: 12.1 ± 4.7; MRI: 12.0 ± 4.5) and showed high correlation (P < 0.05, R = 0.97). Only weak correlations were found between imaging, PFT, and SKM. Both imaging modalities revealed significantly more severe disease expression with age, while PFT and SKM failed to detect early signs of disease. We conclude that imaging of the lung in CF patients is capable of detecting subtle and early parenchymal destruction before lung function or clinical scoring is affected. Furthermore, low-field MRI revealed high consistency with chest radiography and may be used for a thorough follow-up while avoiding radiation exposure. (orig.)
Comparison of Shock Response Spectrum for Different Gun Tests

Directory of Open Access Journals (Sweden)

J.A. Cordes

2013-01-01

The Soft Catch Gun at Picatinny Arsenal is regularly used for component testing. Most shots contain accelerometers which record accelerations as a function of time. Statistics of accelerometer data indicate that the muzzle exit accelerations are, on average, higher than those of tactical firings. For that reason, Soft Catch Gun tests with unusually high accelerations may not be scored for Lot Acceptance Tests (LAT) by some customers. The 95/50 Normal Tolerance Limit (NTL) is proposed as a means of determining which test results should be scored. This paper presents comparisons of Shock Response Spectra (SRS) used for the 95/50 scoring criteria. The paper also provides a Discussion Section outlining some concerns with scoring LAT results based on test results outside of the proposed 95/50 criteria.
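A rough sketch of how a 95/50 limit might be applied is given below. The abstract does not give the formula, so several things are assumed: a one-sided upper limit of the form mean + k·s, a large-sample approximation in which the tolerance factor k is the 95th-percentile normal quantile (the exact factor for small samples comes from the noncentral t distribution and is slightly larger), and a rule that shots above the limit are not scored. The function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def upper_tolerance_limit_95_50(samples):
    """Approximate one-sided upper 95/50 normal tolerance limit.

    Assumption: at 50% confidence the tolerance factor is approximated by
    the 95th-percentile standard normal quantile (about 1.645). This is a
    large-sample approximation, not the exact small-sample factor.
    """
    k = NormalDist().inv_cdf(0.95)
    return mean(samples) + k * stdev(samples)


def should_score(peak_acceleration, reference_peaks):
    """Hypothetical rule: score a shot only if its peak is within the limit."""
    return peak_acceleration <= upper_tolerance_limit_95_50(reference_peaks)
```

Shock data are often analyzed on a logarithmic scale; whether the paper does so is not stated, and the sketch works on raw values.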
Identifying and Evaluating External Validity Evidence for Passing Scores

Science.gov (United States)

Davis-Becker, Susan L.; Buckendahl, Chad W.

2013-01-01

A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1)…
How different from random are docking predictions when ranked by scoring functions?

DEFF Research Database (Denmark)

Feliu, Elisenda; Oliva, Baldomero

2010-01-01

on the number of near-native structures in the sampling. We studied the effect of filtering out redundant structures and tested the use of pair-potentials derived using ZDock and ZRank. Our results show that for many targets, it is not possible to determine when a successful reranking performed by scoring...... functions results merely from random choice. This analysis reveals that changes should be made in the design of the CAPRI scoring experiment. We propose including the statistical assessment in this experiment either at the preprocessing or the evaluation step....
Dual-energy X-ray absorptiometry diagnostic discordance between Z-scores and T-scores in young adults.

LENUS (Irish Health Repository)

Carey, John J

2009-01-01

Diagnostic criteria for postmenopausal osteoporosis using central dual-energy X-ray absorptiometry (DXA) T-scores have been widely accepted. The validity of these criteria for other populations, including premenopausal women and young men, has not been established. The International Society for Clinical Densitometry (ISCD) recommends using DXA Z-scores, not T-scores, for diagnosis in premenopausal women and men aged 20-49 yr, though studies supporting this position have not been published. We examined diagnostic agreement between DXA-generated T-scores and Z-scores in a cohort of men and women aged 20-49 yr, using 1994 World Health Organization and 2005 ISCD DXA criteria. Four thousand two hundred and seventy-five unique subjects were available for analysis. The agreement between DXA T-scores and Z-scores was moderate (Cohen's kappa: 0.53-0.75). The use of Z-scores resulted in significantly fewer (McNemar's p<0.001) subjects diagnosed with "osteopenia," "low bone mass for age," or "osteoporosis." Thirty-nine percent of Hologic (Hologic, Inc., Bedford, MA) subjects and 30% of Lunar (GE Lunar, GE Madison, WI) subjects diagnosed with "osteoporosis" by T-score were reclassified as either "normal" or "osteopenia" when their Z-score was used. Substitution of DXA Z-scores for T-scores results in significant diagnostic disagreement and significantly fewer persons being diagnosed with low bone mineral density.
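The agreement statistic quoted above, Cohen's kappa, compares observed agreement between two classifications with the agreement expected by chance. The sketch below shows the standard calculation on paired category labels; the function name and the example labels used with it are illustrative and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of category labels."""
    n = len(labels_a)
    # Proportion of subjects given the same category by both methods.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Here `labels_a` could hold the T-score-based diagnosis and `labels_b` the Z-score-based diagnosis for the same subjects. The function is undefined when chance agreement equals 1 (both methods assign every subject to one category).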
The Effects of Balance Training on Stability and Proprioception Scores of the Ankle in College Students

Directory of Open Access Journals (Sweden)

Andrew L. Shim

2015-10-01

Objective: The purpose of this study was to determine if stability and proprioception scores improved in college-aged students using a slack line device. Methods: One group of 20 participants aged 18-23 from a Midwestern university performed a pre-test/post-test on a computerized posturography plate to determine Center of Pressure (CoP) and Limit of Stability (LoS) scores. Participants performed three 20-30 minute sessions per week of balance and proprioceptive training using a Balance Bow for a period of four weeks. Data were analyzed (SPSS 21.0) using a dependent t-test to determine if any changes occurred between pre- and post-test scores after four weeks. Results: The analyses found no significant difference in Center of Pressure (CoP), normal stability eyes open (NSEO), normal stability eyes closed (NSEC), perturbed stability eyes open (PSEO), perturbed stability eyes closed (PSEC), or LoS forward (F), backward (B), or right (R) scores in college-aged participants. A significant difference was found in LoS left (L) and a notable trend towards significance was found in LoS R results. Conclusion: With the exception of LoS L stability scores, it was concluded that 12 sessions of 20-30 minutes, utilizing a slack line device, over a four-week training period did not significantly improve stability and proprioceptive scores of the ankle in college-aged participants. Keywords: Proprioception, Limit of Stability (LoS), Center of Pressure (CoP), slack line device
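The dependent (paired) t-test named in the Methods compares each participant's post-test score with the same participant's pre-test score. A minimal sketch of the test statistic follows; the study used SPSS, so this is only an illustration of the calculation, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired samples."""
    # Work on within-participant differences, post minus pre.
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1
```

The p-value would then be read from a t distribution with n - 1 degrees of freedom (19 for a group of 20 participants).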
Direct power comparisons between simple LOD scores and NPL scores for linkage analysis in complex diseases.

Science.gov (United States)

Abreu, P C; Greenberg, D A; Hodge, S E

1999-09-01

Several methods have been proposed for linkage analysis of complex traits with unknown mode of inheritance. These methods include the LOD score maximized over disease models (MMLS) and the "nonparametric" linkage (NPL) statistic. In previous work, we evaluated the increase of type I error when maximizing over two or more genetic models, and we compared the power of MMLS to detect linkage, in a number of complex modes of inheritance, with analysis assuming the true model. In the present study, we compare MMLS and NPL directly. We simulated 100 data sets with 20 families each, using 26 generating models: (1) 4 intermediate models (penetrance of heterozygote between that of the two homozygotes); (2) 6 two-locus additive models; and (3) 16 two-locus heterogeneity models (admixture alpha = 1.0, .7, .5, and .3; alpha = 1.0 replicates simple Mendelian models). For LOD scores, we assumed dominant and recessive inheritance with 50% penetrance. We took the higher of the two maximum LOD scores and subtracted 0.3 to correct for multiple tests (MMLS-C). We compared expected maximum LOD scores and power, using MMLS-C and NPL as well as the true model. Since NPL uses only the affected family members, we also performed an affecteds-only analysis using MMLS-C. The MMLS-C was both uniformly more powerful than NPL for most cases we examined, except when linkage information was low, and close to the results for the true model under locus heterogeneity. We still found better power for the MMLS-C compared with NPL in affecteds-only analysis. The results show that use of two simple modes of inheritance at a fixed penetrance can have more power than NPL when the trait mode of inheritance is complex and when there is heterogeneity in the data set.
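The MMLS-C correction described above is simple arithmetic: take the larger of the two maximized LOD scores and subtract 0.3. A minimal sketch, assuming the dominant-model and recessive-model maximum LOD scores have already been computed by a linkage program (the function and parameter names are hypothetical):

```python
def mmls_c(max_lod_dominant, max_lod_recessive, correction=0.3):
    """Corrected maximized LOD score (MMLS-C).

    Takes the higher of the two model-specific maximum LOD scores and
    subtracts a fixed penalty for having tested two models.
    """
    return max(max_lod_dominant, max_lod_recessive) - correction
```

For example, maximum LOD scores of 2.1 (dominant) and 3.4 (recessive) give an MMLS-C of about 3.1.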
Validity Assessment of Low-risk SCORE Function and SCORE Function Calibrated to the Spanish Population in the FRESCO Cohorts.

Science.gov (United States)

Baena-Díez, José Miguel; Subirana, Isaac; Ramos, Rafael; Gómez de la Cámara, Agustín; Elosua, Roberto; Vila, Joan; Marín-Ibáñez, Alejandro; Guembe, María Jesús; Rigo, Fernando; Tormo-Díaz, María José; Moreno-Iribas, Conchi; Cabré, Joan Josep; Segura, Antonio; Lapetra, José; Quesada, Miquel; Medrano, María José; González-Diego, Paulino; Frontera, Guillem; Gavrila, Diana; Ardanaz, Eva; Basora, Josep; García, José María; García-Lareo, Manel; Gutiérrez-Fuentes, José Antonio; Mayoral, Eduardo; Sala, Joan; Dégano, Irene R; Francès, Albert; Castell, Conxa; Grau, María; Marrugat, Jaume

2018-04-01

To assess the validity of the original low-risk SCORE function without and with high-density lipoprotein cholesterol and SCORE calibrated to the Spanish population. Pooled analysis with individual data from 12 Spanish population-based cohort studies. We included 30 919 individuals aged 40 to 64 years with no history of cardiovascular disease at baseline, who were followed up for 10 years for the causes of death included in the SCORE project. The validity of the risk functions was analyzed with the area under the ROC curve (discrimination) and the Hosmer-Lemeshow test (calibration), respectively. Follow-up comprised 286 105 persons/y. Ten-year cardiovascular mortality was 0.6%. The ratio of estimated to observed cases was 9.1, 6.5, and 9.1 in men and 3.3, 1.3, and 1.9 in women with the original low-risk SCORE risk function without and with high-density lipoprotein cholesterol and calibrated SCORE, respectively; differences were statistically significant with the Hosmer-Lemeshow test between predicted and observed mortality with SCORE (P cardiovascular mortality observed in the Spanish population. Despite the acceptable discrimination capacity, prediction of the number of fatal cardiovascular events (calibration) was significantly inaccurate. Copyright © 2017 Sociedad Española de Cardiología. Published by Elsevier España, S.L.U. All rights reserved.
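The two validity measures in this abstract answer different questions: discrimination (area under the ROC curve) asks whether people who died were given higher predicted risks than those who did not, while the estimated/observed ratio asks whether the total number of predicted deaths matches the number that occurred. A minimal sketch of both, with hypothetical function names and no data from the study:

```python
def auc(risks_events, risks_nonevents):
    """Area under the ROC curve as the probability that a randomly chosen
    event has a higher predicted risk than a randomly chosen non-event
    (ties count one half)."""
    wins = 0.0
    for event_risk in risks_events:
        for nonevent_risk in risks_nonevents:
            if event_risk > nonevent_risk:
                wins += 1.0
            elif event_risk == nonevent_risk:
                wins += 0.5
    return wins / (len(risks_events) * len(risks_nonevents))


def estimated_observed_ratio(predicted_risks, observed_events):
    """Sum of individual predicted risks divided by observed event count."""
    return sum(predicted_risks) / observed_events
```

A ratio well above 1, as reported here, indicates that the function overestimates the number of events even when the AUC is acceptable.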
Preoptometry and optometry school grade point average and optometry admissions test scores as predictors of performance on the national board of examiners in optometry part I (basic science) examination.

Science.gov (United States)

Bailey, J E; Yackle, K A; Yuen, M T; Voorhees, L I

2000-04-01

To evaluate preoptometry and optometry school grade point averages and Optometry Admission Test (OAT) scores as predictors of performance on the National Board of Examiners in Optometry NBEO Part I (Basic Science) (NBEOPI) examination. Simple and multiple correlation coefficients were computed from data obtained from a sample of three consecutive classes of optometry students (1995-1997; n = 278) at Southern California College of Optometry. The GPA after year two of optometry school was the highest correlation (r = 0.75) among all predictor variables; the average of all scores on the OAT was the highest correlation among preoptometry predictor variables (r = 0.46). Stepwise regression analysis indicated a combination of the optometry GPA, the OAT Academic Average, and the GPA in certain optometry curricular tracks resulted in an improved correlation (multiple r = 0.81). Predicted NBEOPI scores were computed from the regression equation and then analyzed by receiver operating characteristic (roc) and statistic of agreement (kappa) methods. From this analysis, we identified the predicted score that maximized identification of true and false NBEOPI failures (71% and 10%, respectively). Cross validation of this result on a separate class of optometry students resulted in a slightly lower correlation between actual and predicted NBEOPI scores (r = 0.77) but showed the criterion-predicted score to be somewhat lax. The optometry school GPA after 2 years is a reasonably good predictor of performance on the full NBEOPI examination, but the prediction is enhanced by adding the Academic Average OAT score. However, predicting performance in certain subject areas of the NBEOPI examination, for example Psychology and Ocular/Visual Biology, was rather insubstantial. Nevertheless, predicting NBEOPI performance from the best combination of year two optometry GPAs and preoptometry variables is better than has been shown in previous studies predicting optometry GPA from the best
High resolution CT in children with cystic fibrosis: correlation with pulmonary functions and radiographic scores

International Nuclear Information System (INIS)

Demirkazik, Figen Basaran; Ariyuerek, O. Macit; Oezcelik, Ugur; Goecmen, Ayhan; Hassanabad, Hossein K.; Kiper, Nural

2001-01-01

Objective: To compare the high resolution CT (HRCT) scores of the Bhalla system with pulmonary function tests and radiographic and clinical points of the Shwachman-Kulczycki clinical scoring system. Methods: HRCT of the chest was obtained in 40 children to assess the role of HRCT in evaluating bronchopulmonary pathology in children with cystic fibrosis (CF). The HRCT severity scores of the Bhalla system were compared with chest radiographic and clinical points of the Shwachman-Kulczycki scoring system and pulmonary function tests. Only 14 of the patients older than 6 years cooperated with spirometry. Results: HRCT scores correlated well with radiographic points (r=0.80, P 1 (r=0.66, P=0.01). Although radiographic points correlated significantly with FVC (r=0.61, P=0.02) and FEV 1 (r=0.56, P=0.04), HRCT provides a more precise scoring than the chest X-ray. Conclusion: The HRCT scoring system may provide a sensitive method of monitoring pulmonary disease status and may replace the radiographic scoring in the Shwachman-Kulczycki system. It may be helpful especially in follow-up of small children too young to cooperate with spirometry
Regression-Based Norms for a Bi-factor Model for Scoring the Brief Test of Adult Cognition by Telephone (BTACT).

Science.gov (United States)

Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E

2015-05-01

The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
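Regression-based norms are commonly applied by predicting the score expected for a person's demographics and expressing the observed score as a standardized residual. The sketch below assumes that general approach; the abstract does not give the equations, and the function name, coefficient values, and residual standard deviation used in any call are hypothetical placeholders, not the published BTACT norms.

```python
def demographically_adjusted_z(observed, predictors, intercept,
                               coefficients, residual_sd):
    """Standardized residual of an observed score under a linear model.

    predictors and coefficients are dicts keyed by predictor name
    (for example "age" or "education"); all numeric values supplied by
    the caller are illustrative assumptions.
    """
    predicted = intercept + sum(
        coefficients[name] * value for name, value in predictors.items()
    )
    return (observed - predicted) / residual_sd
```

A negative result means the person scored below what the model predicts for someone with the same demographic profile.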
The Relationships between Social Class, Listening Test Anxiety and Test Scores

OpenAIRE

Omid Talebi Rezaabadi

2016-01-01

This study investigated the relationships between the social anxiety, social class and listening-test anxiety of students learning English as a foreign language. The aims of the study were to examine the relationship between listening-test anxiety and listening-test performance. The data were collected using an adapted Foreign Language Listening Anxiety Scale and a newly developed Foreign Language Social Anxiety Scale. The potential correlation between social anxiety and listening-test perfor...
Accuracy of a pediatric early warning score in the recognition of clinical deterioration

Directory of Open Access Journals (Sweden)

Juliana de Oliveira Freitas Miranda

Objective: to evaluate the accuracy of the version of the Brighton Pediatric Early Warning Score translated and adapted for the Brazilian context, in the recognition of clinical deterioration. Method: a diagnostic test study to measure the accuracy of the Brighton Pediatric Early Warning Score for the Brazilian context, in relation to a reference standard. The sample consisted of 271 children, aged 0 to 10 years, blindly evaluated by a nurse and a physician, specialists in pediatrics, with interval of 5 to 10 minutes between the evaluations, for the application of the Brighton Pediatric Early Warning Score for the Brazilian context and of the reference standard. The data were processed and analyzed using the Statistical Package for the Social Sciences and VassarStats.net programs. The performance of the Brighton Pediatric Early Warning Score for the Brazilian context was evaluated through the indicators of sensitivity, specificity, predictive values, area under the ROC curve, likelihood ratios and post-test probability. Results: the Brighton Pediatric Early Warning Score for the Brazilian context showed sensitivity of 73.9%, specificity of 95.5%, positive predictive value of 73.3%, negative predictive value of 94.7%, area under Receiver Operating Characteristic Curve of 91.9% and the positive post-test probability was 80%. Conclusion: the Brighton Pediatric Early Warning Score for the Brazilian context presented good performance and was considered valid for the recognition of clinical deterioration warning signs of the children studied.
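The indicators listed in this abstract all derive from a 2x2 table of index-test result against reference standard. The following sketch shows the standard definitions; the function names are hypothetical, and any counts passed in are for illustration only since the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy indicators from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)
```

The positive likelihood ratio is undefined when specificity is exactly 1, so a real analysis would need to handle that case.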
Reliability, validity and sensitivity to change of neurogenic bowel dysfunction score in patients with spinal cord injury

DEFF Research Database (Denmark)

Erdem, D.; Hava, D.; Keskinoglu, P.

2017-01-01

cord injury (SCI). The reliability of NBD score was assessed by test-retest reliability and internal consistency. Cronbach's alpha coefficient was calculated to determine internal consistency. The construct validity was evaluated by exploring correlations between the NBD score and SF-36 scales, patient...... assessment of impact of NBD on quality of life (QoL) and the physician global assessment (PGA). The Global Rating of Change (GRC) scale was used to assess the change of NBD to investigate the sensitivity of the score to change. Results: Cronbach's alpha coefficient was 0.547. In test-retest reliability...
Empirical validation of the S-Score algorithm in the analysis of gene expression data

Directory of Open Access Journals (Sweden)

Archer Kellie J

2006-03-01

Background: Current methods of analyzing Affymetrix GeneChip® microarray data require the estimation of probe set expression summaries, followed by application of statistical tests to determine which genes are differentially expressed. The S-Score algorithm described by Zhang and colleagues is an alternative method that allows tests of hypotheses directly from probe level data. It is based on an error model in which the detected signal is proportional to the probe pair signal for highly expressed genes, but approaches a background level (rather than 0) for genes with low levels of expression. This model is used to calculate relative change in probe pair intensities that converts probe signals into multiple measurements with equalized errors, which are summed over a probe set to form the S-Score. Assuming no expression differences between chips, the S-Score follows a standard normal distribution, allowing direct tests of hypotheses to be made. Using spike-in and dilution datasets, we validated the S-Score method against comparisons of gene expression utilizing the more recently developed methods RMA, dChip, and MAS5. Results: The S-score showed excellent sensitivity and specificity in detecting low-level gene expression changes. Rank ordering of S-Score values more accurately reflected known fold-change values compared to other algorithms. Conclusion: The S-score method, utilizing probe level data directly, offers significant advantages over comparisons using only probe set expression summaries.
Reproducibility of scoring emphysema by HRCT

Energy Technology Data Exchange (ETDEWEB)

Malinen, A.; Partanen, K.; Rytkoenen, H.; Vanninen, R. [Kuopio Univ. Hospital (Finland). Dept. of Clinical Radiology; Erkinjuntti-Pekkanen, R. [Kuopio Univ. Hospital (Finland). Dept. of Pulmonary Diseases

2002-04-01

Purpose: We evaluated the reproducibility of three visual scoring methods of emphysema and compared these methods with pulmonary function tests (VC, DLCO, FEV1 and FEV%) among farmer's lung patients and farmers. Material and Methods: Three radiologists examined high-resolution CT images of farmer's lung patients and their matched controls (n=70) for chronic interstitial lung diseases. Intraobserver reproducibility and interobserver variability were assessed for three methods: severity, Sanders' (extent) and Sakai. Pulmonary function tests such as spirometry and diffusing capacity were performed. Results: Intraobserver agreement values for all three methods were good (0.51-0.74). Interobserver agreement varied from 0.35 to 0.72. The Sanders' and the severity methods correlated strongly with pulmonary function tests, especially DLCO and FEV1. Conclusion: The Sanders' method proved to be reliable in evaluating emphysema, in terms of good consistency of interpretation and good correlation with pulmonary function tests.
BWR Full Integral Simulation Test (FIST). Phase I test results

International Nuclear Information System (INIS)

Hwang, W.S.; Alamgir, M.; Sutherland, W.A.

1984-09-01

A new full height BWR system simulator has been built under the Full-Integral-Simulation-Test (FIST) program to investigate the system responses to various transients. The test program consists of two test phases. This report provides a summary, discussions, highlights and conclusions of the FIST Phase I tests. Eight matrix tests were conducted in the FIST Phase I. These tests have investigated the large break, small break and steamline break LOCA's, as well as natural circulation and power transients. Results and governing phenomena of each test have been evaluated and discussed in detail in this report. One of the FIST program objectives is to assess the TRAC code by comparisons with test data. Two pretest predictions made with TRACB02 are presented and compared with test data in this report
Modified Augmented Renal Clearance Score Predicts Rapid Piperacillin and Tazobactam Clearance in Critically Ill Surgery and Trauma Patients

Science.gov (United States)

2014-04-24

collision; VAP, ventilator-associated pneumonia. TABLE 2. PK Parameter Estimates for Free Piperacillin and Tazobactam in Patients Stratified by ARC Score...SOFA score are typically generated during routine care of the most severely ill patients. Positive screening test results (high ARC scores) can be...Modified Augmented Renal Clearance score predicts rapid piperacillin and tazobactam clearance in critically ill surgery and trauma patients Kevin S
Automated aortic calcium scoring on low-dose chest computed tomography

International Nuclear Information System (INIS)

Isgum, Ivana; Rutten, Annemarieke; Prokop, Mathias; Staring, Marius; Klein, Stefan; Pluim, Josien P. W.; Viergever, Max A.; Ginneken, Bram van

2010-01-01

Purpose: Thoracic computed tomography (CT) scans provide information about cardiovascular risk status. These scans are non-ECG synchronized, thus precise quantification of coronary calcifications is difficult. Aortic calcium scoring is less sensitive to cardiac motion, so it is an alternative to coronary calcium scoring as an indicator of cardiovascular risk. The authors developed and evaluated a computer-aided system for automatic detection and quantification of aortic calcifications in low-dose noncontrast-enhanced chest CT. Methods: The system was trained and tested on scans from participants of a lung cancer screening trial. A total of 433 low-dose, non-ECG-synchronized, noncontrast-enhanced 16 detector row examinations of the chest was randomly divided into 340 training and 93 test data sets. A first observer manually identified aortic calcifications on training and test scans. A second observer did the same on the test scans only. First, a multiatlas-based segmentation method was developed to delineate the aorta. Segmented volume was thresholded and potential calcifications (candidate objects) were extracted by three-dimensional connected component labeling. Due to image resolution and noise, in rare cases extracted candidate objects were connected to the spine. They were separated into a part outside and parts inside the aorta, and only the latter was further analyzed. All candidate objects were represented by 63 features describing their size, position, and texture. Subsequently, a two-stage classification with a selection of features and k-nearest neighbor classifiers was performed. Based on the detected aortic calcifications, total calcium volume score was determined for each subject. Results: The computer system correctly detected, on the average, 945 mm³ out of 965 mm³ (97.9%) calcified plaque volume in the aorta with an average of 64 mm³ of false positive volume per scan. Spearman rank correlation coefficient was ρ=0.960 between the system and the
Quantification of Emphysema with a Three-Dimensional Chest CT Scan: Correlation with the Visual Emphysema Scoring on Chest CT, Pulmonary Function Tests and Dyspnea Severity

Energy Technology Data Exchange (ETDEWEB)

Park, Hyun Jeong; Hwang, Jung Hwa [Dept. of Radiology, Soonchunhyang University Seoul Hospital, Seoul (Korea, Republic of)

2011-09-15

We wanted to prospectively evaluate the correlation between the quantification of emphysema using 3D CT densitometry with the visual emphysema score, pulmonary function tests (PFT) and the dyspnea score in patients with chronic obstructive pulmonary disease (COPD). Non-enhanced chest CT with 3D reconstruction was performed in 28 men with COPD (age 54-88 years). With histogram analysis, the total lung volume, mean lung density and proportion of low attenuation lung volume below predetermined thresholds were measured. The CT parameters were compared with the visual emphysema score, the PFT and the dyspnea score. A low attenuation lung volume below -950 HU was well correlated with the DLco and FEV1/FVC. A low attenuation lung volume below -950 HU and -930 HU was correlated with the visual emphysema score. A low attenuation lung volume below -950 HU was correlated with the dyspnea score, although the correlations between the other CT parameters and the dyspnea score were not significant. Objective quantification of emphysema using 3D CT densitometry was correlated with the visual emphysema score. A low attenuation lung volume below -950 HU was correlated with the DLco, the FEV1/FVC and the dyspnea score.
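The histogram-based measures named in this abstract (mean lung density and the proportion of lung volume below an attenuation threshold) reduce to simple counts over the segmented lung voxels. A minimal sketch follows, assuming the lung has already been segmented and its voxel values are available as a flat list of Hounsfield units with equal voxel size; the function names are hypothetical.

```python
def low_attenuation_fraction(hu_values, threshold=-950):
    """Fraction of lung voxels with attenuation below the threshold (HU)."""
    below = sum(1 for hu in hu_values if hu < threshold)
    return below / len(hu_values)


def mean_lung_density(hu_values):
    """Mean attenuation of the lung voxels in Hounsfield units."""
    return sum(hu_values) / len(hu_values)
```

Calling `low_attenuation_fraction` with thresholds of -950 and -930 gives the two proportions compared in the study; multiplying by the total lung volume would convert a fraction into a volume.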

Quantification of Emphysema with a Three-Dimensional Chest CT Scan: Correlation with the Visual Emphysema Scoring on Chest CT, Pulmonary Function Tests and Dyspnea Severity

International Nuclear Information System (INIS)

Park, Hyun Jeong; Hwang, Jung Hwa

2011-01-01

We wanted to prospectively evaluate the correlation between the quantification of emphysema using 3D CT densitometry with the visual emphysema score, pulmonary function tests (PFT) and the dyspnea score in patients with chronic obstructive pulmonary disease (COPD). Non-enhanced chest CT with 3D reconstruction was performed in 28 men with COPD (age 54-88 years). With histogram analysis, the total lung volume, mean lung density and proportion of low attenuation lung volume below predetermined thresholds were measured. The CT parameters were compared with the visual emphysema score, the PFT and the dyspnea score. A low attenuation lung volume below -950 HU was well correlated with the DLco and FEV1/FVC. A low attenuation lung volume below -950 HU and -930 HU was correlated with the visual emphysema score. A low attenuation lung volume below -950 HU was correlated with the dyspnea score, although the correlations between the other CT parameters and the dyspnea score were not significant. Objective quantification of emphysema using 3D CT densitometry was correlated with the visual emphysema score. A low attenuation lung volume below -950 HU was correlated with the DLco, the FEV1/FVC and the dyspnea score.
Automated Scoring of Short-Answer Open-Ended GRE® Subject Test Items. ETS GRE® Board Research Report No. 04-02. ETS RR-08-20

Science.gov (United States)

Attali, Yigal; Powers, Don; Freedman, Marshall; Harrison, Marissa; Obetz, Susan

2008-01-01

This report describes the development, administration, and scoring of open-ended variants of GRE® Subject Test items in biology and psychology. These questions were administered in a Web-based experiment to registered examinees of the respective Subject Tests. The questions required a short answer of 1-3 sentences, and responses were automatically…
The Relationships between Social Class, Listening Test Anxiety and Test Scores

Science.gov (United States)

Rezaabadi, Omid Talebi

2016-01-01

This study investigated the relationships between the social anxiety, social class and listening-test anxiety of students learning English as a foreign language. The aims of the study were to examine the relationship between listening-test anxiety and listening-test performance. The data were collected using an adapted Foreign Language Listening…
Results of steel containment vessel model test

International Nuclear Information System (INIS)

Luk, V.K.; Ludwigsen, J.S.; Hessheimer, M.F.; Komine, Kuniaki; Matsumoto, Tomoyuki; Costello, J.F.

1998-05-01

A series of static overpressurization tests of scale models of nuclear containment structures is being conducted by Sandia National Laboratories for the Nuclear Power Engineering Corporation of Japan and the US Nuclear Regulatory Commission. Two tests are being conducted: (1) a test of a model of a steel containment vessel (SCV) and (2) a test of a model of a prestressed concrete containment vessel (PCCV). This paper summarizes the conduct of the high pressure pneumatic test of the SCV model and the results of that test. Results of this test are summarized and are compared with pretest predictions performed by the sponsoring organizations and others who participated in a blind pretest prediction effort. Questions raised by this comparison are identified and plans for posttest analysis are discussed
The Effect of English Language on Multiple Choice Question Scores of Thai Medical Students.

Science.gov (United States)

Phisalprapa, Pochamana; Muangkaew, Wayuda; Assanasen, Jintana; Kunavisarut, Tada; Thongngarm, Torpong; Ruchutrakool, Theera; Kobwanthanakun, Surapon; Dejsomritrutai, Wanchai

2016-04-01

Universities in Thailand are preparing for Thailand's integration into the ASEAN Economic Community (AEC) by increasing the number of tests in English language. English language is not the native language of Thailand. Differences in English language proficiency may affect scores among test-takers, even when subject knowledge among test-takers is comparable, and may falsely represent the knowledge level of the test-taker. To study the impact of English language multiple choice test questions on test scores of medical students. The final examination of fourth-year medical students completing internal medicine rotation contains 120 multiple choice questions (MCQ). The languages used on the test are Thai and English at a ratio of 3:1. Individual scores of tests taken in both languages were collected and the effect of English language on MCQ was analyzed. Individual MCQ scores were then compared with individual student English language proficiency and student grade point average (GPA). Two hundred ninety-five fourth-year medical students were enrolled. The mean percentage of MCQ scores in Thai and English were significantly different (65.0 ± 8.4 and 56.5 ± 12.4, respectively, p English was fair (Spearman's correlation coefficient = 0.41, p English than in Thai language. Students were classified into six grade categories (A, B+, B, C+, C, and D+), which cumulatively measured total internal medicine rotation performance score plus final examination score. MCQ scores from Thai language examination were more closely correlated with total course grades than were the scores from English language examination (Spearman's correlation coefficient = 0.73 (p English proficiency score was very high, at 3.71 ± 0.35 from a total of 4.00. Mean student GPA was 3.40 ± 0.33 from a possible 4.00. English language MCQ examination scores were more highly associated with GPA than with English language proficiency. The use of English language multiple choice question test may decrease scores
Posterior probability of linkage and maximal lod score.

Science.gov (United States)

Génin, E; Martinez, M; Clerget-Darpoux, F

1995-01-01

To detect linkage between a trait and a marker, Morton (1955) proposed to calculate the lod score z(theta 1) at a given value theta 1 of the recombination fraction. If z(theta 1) reaches +3 then linkage is concluded. However, in practice, lod scores are calculated for different values of the recombination fraction between 0 and 0.5 and the test is based on the maximum value of the lod score Zmax. The impact of this deviation of the test on the probability that in fact linkage does not exist, when linkage was concluded, is documented here. This posterior probability of no linkage can be derived by using Bayes' theorem. It is less than 5% when the lod score at a predetermined theta 1 is used for the test. But, for a Zmax of +3, we showed that it can reach 16.4%. Thus, considering a composite alternative hypothesis instead of a single one decreases the reliability of the test. The reliability decreases rapidly when Zmax is less than +3. Given a Zmax of +2.5, there is a 33% chance that linkage does not exist. Moreover, the posterior probability depends not only on the value of Zmax but also jointly on the family structures and on the genetic model. For a given Zmax, the chance that linkage exists may then vary.
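The Bayes' theorem step mentioned in this abstract can be illustrated with a small calculation. The sketch below assumes the simple case of a lod score evaluated at one predetermined recombination fraction, so that 10 raised to the lod score is the likelihood ratio of linkage versus no linkage. The prior probability of linkage (0.02) and the function name are illustrative assumptions, not values taken from the abstract. The sketch does not reproduce the Zmax figures (16.4%, 33%), which depend on maximizing over the recombination fraction, on the family structures, and on the genetic model.

```python
def posterior_no_linkage(lod: float, prior_linkage: float = 0.02) -> float:
    """Posterior probability of no linkage for a lod score at a fixed theta.

    Assumes the lod score is log10 of the likelihood ratio (linkage vs.
    no linkage); prior_linkage is an assumed prior, not a reported value.
    """
    likelihood_ratio = 10.0 ** lod
    weight_linked = prior_linkage * likelihood_ratio   # prior x likelihood, linkage
    weight_unlinked = 1.0 - prior_linkage              # prior x likelihood, no linkage
    return weight_unlinked / (weight_linked + weight_unlinked)


if __name__ == "__main__":
    for z in (2.0, 2.5, 3.0):
        print(f"lod = {z:.1f}: P(no linkage | data) = {posterior_no_linkage(z):.3f}")
```

Under these assumptions a lod score of +3 at a fixed recombination fraction gives a posterior probability of no linkage just under 5%, which is consistent with the statement in the abstract for a predetermined theta 1.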
Ticagrelor versus clopidogrel in real-world patients with ST elevation myocardial infarction: 1-year results by propensity score analysis.

Science.gov (United States)

Vercellino, Matteo; Sànchez, Federico Ariel; Boasi, Valentina; Perri, Dino; Tacchi, Chiara; Secco, Gioel Gabrio; Cattunar, Stefano; Pistis, Gianfranco; Mascelli, Giovanni

2017-04-05

European guidelines recommend the use of ticagrelor versus clopidogrel in patients with ST elevation myocardial infarction (STEMI). This recommendation is based on inconclusive results and subanalyses from clinical trials. Few data are available on the effects of ticagrelor in a real-world population. To compare the effects of ticagrelor and clopidogrel in a real-world STEMI population, we conducted a pre-post case-control study examining all patients with STEMI included in the Cardio-STEMI Sanremo registry between February 2011 and June 2013. Cases and controls were defined according to P2Y12 inhibitors, correcting the bias due to lack of randomization by propensity score analysis. Ticagrelor was introduced in 2012 in both in-hospital and pre-hospital settings independently of this study. Of the 416 patients enrolled in the Cardio-STEMI registry, 401 with a definite diagnosis of STEMI were included in this study. One hundred forty-two patients received ticagrelor and 259 received clopidogrel. Regarding clinical presentation and procedural data, those in the ticagrelor group had lower CRUSADE scores (23 [14-36] vs 27 [18-38]; p = 0.015) but a higher proportion of radial access (33% vs 14%; p word propensity score analysis, ticagrelor did not affect the risk of MACE during the hospital phase, or the incidence of hospital bleeding in patients with STEMI. However, in this mono-centric experience, ticagrelor resulted in improved 1-year survival, even after correction by propensity score.
Superconducting solenoid model magnet test results

International Nuclear Information System (INIS)

Carcagno, R.; Dimarco, J.; Feher, S.; Ginsburg, C.M.; Hess, C.; Kashikhin, V.V.; Orris, D.F.; Pischalnikov, Y.; Sylvester, C.; Tartaglia, M.A.; Terechkine, I.; Tompkins, J.C.; Wokas, T.; Fermilab

2006-01-01

Superconducting solenoid magnets suitable for the room temperature front end of the Fermilab High Intensity Neutrino Source (formerly known as Proton Driver), an 8 GeV superconducting H- linac, have been designed and fabricated at Fermilab, and tested in the Fermilab Magnet Test Facility. We report here results of studies on the first model magnets in this program, including the mechanical properties during fabrication and testing in liquid helium at 4.2 K, quench performance, and magnetic field measurements. We also describe new test facility systems and instrumentation that have been developed to accomplish these tests.
Superconducting solenoid model magnet test results

Energy Technology Data Exchange (ETDEWEB)

Carcagno, R.; Dimarco, J.; Feher, S.; Ginsburg, C.M.; Hess, C.; Kashikhin, V.V.; Orris, D.F.; Pischalnikov, Y.; Sylvester, C.; Tartaglia, M.A.; Terechkine, I.; /Fermilab

2006-08-01

Superconducting solenoid magnets suitable for the room temperature front end of the Fermilab High Intensity Neutrino Source (formerly known as Proton Driver), an 8 GeV superconducting H- linac, have been designed and fabricated at Fermilab, and tested in the Fermilab Magnet Test Facility. We report here results of studies on the first model magnets in this program, including the mechanical properties during fabrication and testing in liquid helium at 4.2 K, quench performance, and magnetic field measurements. We also describe new test facility systems and instrumentation that have been developed to accomplish these tests.
Level of Intrauterine Cocaine Exposure and Neuropsychological Test Scores in Preadolescence: Subtle Effects on Auditory Attention and Narrative Memory

Science.gov (United States)

Beeghly, Marjorie; Rose-Jacobs, Ruth; Martin, Brett M.; Cabral, Howard J.; Heeren, Timothy C.; Frank, Deborah A.

2014-01-01

Neuropsychological processes such as attention and memory contribute to children's higher-level cognitive and language functioning and predict academic achievement. The goal of this analysis was to evaluate whether level of intrauterine cocaine exposure (IUCE) alters multiple aspects of preadolescents' neuropsychological functioning assessed using a single age-referenced instrument, the NEPSY: A Developmental Neuropsychological Assessment (NEPSY) [71], after controlling for relevant covariates. Participants included 137 term 9.5-year-old children from low-income urban backgrounds (51% male, 90% African American/Caribbean) from an ongoing prospective longitudinal study. Level of IUCE was assessed in the newborn period using infant meconium and maternal report. 52% of the children had IUCE (65% with lighter IUCE, and 35% with heavier IUCE), and 48% were unexposed. Infants with Fetal Alcohol Syndrome, HIV seropositivity, or intrauterine exposure to illicit substances other than cocaine and marijuana were excluded. At the 9.5-year follow-up visit, trained examiners masked to IUCE and background variables evaluated children's neuropsychological functioning using the NEPSY. The association between level of IUCE and NEPSY outcomes was evaluated in a series of linear regressions controlling for intrauterine exposure to other substances and relevant child, caregiver, and demographic variables. Results indicated that level of IUCE was associated with lower scores on the Auditory Attention and Narrative Memory tasks, both of which require auditory information processing and sustained attention for successful performance. However, results did not follow the expected ordinal, dose-dependent pattern. Children's neuropsychological test scores were also altered by a variety of other biological and psychosocial factors. PMID:24978115
Test-retest reliability at the item level and total score level of the Norwegian version of the Spinal Cord Injury Falls Concern Scale (SCI-FCS).

Science.gov (United States)

Roaldsen, Kirsti Skavberg; Måøy, Åsa Blad; Jørgensen, Vivien; Stanghelle, Johan Kvalvik

2016-05-01

Translation of the Spinal Cord Injury Falls Concern Scale (SCI-FCS), and investigation of test-retest reliability on item-level and total-score-level. Translation, adaptation and test-retest study. A specialized rehabilitation setting in Norway. Fifty-four wheelchair users with a spinal cord injury. The median age of the cohort was 49 years, and the median number of years after injury was 13. Interventions/measurements: The SCI-FCS was translated and back-translated according to guidelines. Individuals answered the SCI-FCS twice over the course of one week. We investigated item-level test-retest reliability using Svensson's rank-based statistical method for disagreement analysis of paired ordinal data. For relative reliability, we analyzed the total-score-level test-retest reliability with intraclass correlation coefficients (ICC2.1), the standard error of measurement (SEM), and the smallest detectable change (SDC) for absolute reliability/measurement-error assessment and Cronbach's alpha for internal consistency. All items showed satisfactory percentage agreement (≥69%) between test and retest. There were small but non-negligible systematic disagreements among three items; we recovered an 11-13% higher chance for a lower second score. There was no disagreement due to random variance. The test-retest agreement (ICC2.1) was excellent (0.83). The SEM was 2.6 (12%), and the SDC was 7.1 (32%). The Cronbach's alpha was high (0.88). The Norwegian SCI-FCS is highly reliable for wheelchair users with chronic spinal cord injuries.
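The relation between the reliability statistics reported here can be sketched with commonly used formulas: SEM = SD × sqrt(1 − ICC) and SDC = 1.96 × sqrt(2) × SEM. The abstract does not say which formulas the authors applied, so both formulas and the function names below are assumptions for illustration, and the standard deviation in the first check is a made-up value.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and an ICC (assumed formula)."""
    return sd * math.sqrt(1.0 - icc)


def sdc_from_sem(sem: float, z: float = 1.96) -> float:
    """Smallest detectable change at the 95% level from the SEM (assumed formula)."""
    return z * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical SD of 10 with ICC 0.84, purely to show the SEM formula.
    print(f"SEM for SD = 10, ICC = 0.84: {sem_from_icc(10.0, 0.84):.2f}")
    # SEM as reported in the abstract (2.6).
    print(f"SDC for SEM = 2.6: {sdc_from_sem(2.6):.1f}")
```

With the reported SEM of 2.6 this formula gives an SDC of about 7.2, close to the reported 7.1; the small gap would be expected if the authors computed the SDC from an unrounded SEM.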
The predictive value of an adjusted COPD assessment test score on the risk of respiratory-related hospitalizations in severe COPD patients.

Science.gov (United States)

Barton, Christopher A; Bassett, Katherine L; Buckman, Julie; Effing, Tanja W; Frith, Peter A; van der Palen, Job; Sloots, Joanne M

2017-02-01

We evaluated whether a chronic obstructive pulmonary disease (COPD) assessment test (CAT) with adjusted weights for the CAT items could better predict future respiratory-related hospitalizations than the original CAT. Two focus groups (respiratory nurses and physicians) generated two adjusted CAT algorithms. Two multivariate logistic regression models for infrequent (≤1/year) versus frequent (>1/year) future respiratory-related hospitalizations were defined: one with the adjusted CAT score that correlated best with future hospitalizations and one with the original CAT score. Patient characteristics related to future hospitalizations (p ≤ 0.2) were also entered. Eighty-two COPD patients were included. The CAT algorithm derived from the nurse focus group was a borderline significant predictor of hospitalization risk (odds ratio (OR): 1.07; 95% confidence interval (CI): 1.00-1.14; p = 0.050) in a model that also included hospitalization frequency in the previous year (OR: 3.98; 95% CI: 1.30-12.16; p = 0.016) and anticholinergic risk score (OR: 3.08; 95% CI: 0.87-10.89; p = 0.081). Presence of ischemic heart disease and/or heart failure appeared 'protective' (OR: 0.17; 95% CI: 0.05-0.62; p = 0.007). The original CAT score was not significantly associated with hospitalization risk. In conclusion, as a predictor of respiratory-related hospitalizations, an adjusted CAT score was marginally significant (although the original CAT score was not). 'Previous respiratory-related hospitalizations' was the strongest factor in this equation.
Classifying snakebite in South Africa: Validating a scoring system ...

African Journals Online (AJOL)

Factors predictive of ATI and the optimal cut-off score for predicting an ATI were identified. These factors were then used to develop a standard scoring system. The score was then tested prospectively for accuracy in a new validation cohort consisting of 100 patients admitted for snakebite to our unit from 1 December 2014 to ...
The dynamic version of the Bayley-III : Test results and the opinion of practitioners

NARCIS (Netherlands)

Visser, Linda; Ruiter, Selma; van der Meulen, Bieuwe; Ruijssenaars, Wied; Timmerman, Marieke

When problems are suspected with the development of a child, developmental assessment is often carried out as part of the diagnostic process (American Academy of Pediatrics, 2001). The test scores obtained indicate the levels of functioning at that moment in time in the domains investigated. This
Credit concession through credit scoring: Analysis and application proposal

Directory of Open Access Journals (Sweden)

Oriol Amat

2017-01-01

Purpose: The study herein develops and tests a credit scoring model which can help financial institutions in assessing credit requests. Design/methodology/approach: The empirical study has the objective of answering two questions: (1) Which ratios better discriminate the companies based on their being solvent or insolvent? and (2) What is the relative importance of these ratios? To do this, several statistical techniques with a multifactorial focus have been used (Multivariate Analysis of Variance, Linear Discriminant Analysis, Logit and Probit Models). Several samples of companies have been used in order to obtain and to test the model. Findings: Through the application of several statistical techniques, the credit scoring model has been proved to be effective in discriminating between good and bad creditors. Research limitations: This study focuses on manufacturing, commercial and services companies of all sizes in Spain; therefore, the conclusions may differ for other geographical locations. Practical implications: Because credit is one of the main drivers of growth, a solid credit scoring model can help financial institutions assess to whom to grant credit and to whom not to grant credit. Social implications: Because of the growing importance of credit for our society and the fear of granting it due to the latest financial turmoil, a solid credit scoring model can strengthen trust in the financial institutions' assessments. Originality/value: There is already a stream of literature related to credit scoring. However, this paper focuses on Spanish firms and proves the results of our model based on real data. The application of the model to detect the probability of default in loans is original.
Genome scan for linkage to asthma using a linkage disequilibrium-lod score test.

Science.gov (United States)

Jiang, Y; Slager, S L; Huang, J

2001-01-01

We report a genome-wide linkage study of asthma on the German and Collaborative Study on the Genetics of Asthma (CSGA) data. Using a combined linkage and linkage disequilibrium test and the nonparametric linkage score, we identified 13 markers from the German data, 1 marker from the African American (CSGA) data, and 7 markers from the Caucasian (CSGA) data in which the p-values ranged between 0.0001 and 0.0100. From our analysis and taking into account previous published linkage studies of asthma, we suggest that three regions in chromosome 5 (around D5S418, D5S644, and D5S422), one region in chromosome 6 (around three neighboring markers D6S1281, D6S291, and D6S1019), one region in chromosome 11 (around D11S2362), and two regions in chromosome 12 (around D12S351 and D12S324) especially merit further investigation.
Applying cognitive acuity theory to the development and scoring of situational judgment tests.

Science.gov (United States)

Leeds, J Peter

2017-11-09

The theory of cognitive acuity (TCA) treats the response options within items as signals to be detected and uses psychophysical methods to estimate the respondents' sensitivity to these signals. Such a framework offers new methods to construct and score situational judgment tests (SJT). Leeds (2012) defined cognitive acuity as the capacity to discern correctness and distinguish between correctness differences among simultaneously presented situation-specific response options. In this study, SJT response options were paired in order to offer the respondent a two-option choice. The contrast in correctness valence between the two options determined the magnitude of signal emission, with larger signals portending a higher probability of detection. A logarithmic relation was found between correctness valence contrast (signal stimulus) and its detectability (sensation response). Respondent sensitivity to such signals was measured and found to be related to the criterion variables. The linkage between psychophysics and elemental psychometrics may offer new directions for measurement theory.
THE EFFICIENCY OF TENNIS DOUBLES SCORING SYSTEMS

Directory of Open Access Journals (Sweden)

Geoff Pollard

2010-09-01

In this paper, a family of scoring systems for tennis doubles is established for testing the hypothesis that pair A is better than pair B versus the alternative hypothesis that pair B is better than pair A. This family of scoring systems can be used as a benchmark against which the efficiency of any doubles scoring system can be assessed. Thus, the formula for the efficiency of any doubles scoring system is derived. As in tennis singles, one scoring system based on the play-the-loser structure is shown to be more efficient than the benchmark systems. An expression for the relative efficiency of two doubles scoring systems is derived. Thus, the relative efficiency of the various scoring systems presently used in doubles can be assessed. The methods of this paper can be extended to a match between two teams of 2, 4, 8, … doubles pairs, so that it is possible to establish a measure for the relative efficiency of the various systems used for tennis contests between teams of players.
Legal provisions governing the acknowledgment of test results

International Nuclear Information System (INIS)

Strecker, A.

1982-01-01

The legal provisions governing the acknowledgment of test results are most frequently applied by administrative orders (design and qualification approvals or specimen testing and approval) and are thus claimable and voidable in accordance with general administrative law. The acknowledgment of test certificates requires a legal basis. Test results, however, can be acknowledged also by administrative bodies. Recently, the Federal Government began to delegate more of its legal authority in this field to private institutions, allowing test results to be acknowledged and test certificates to be issued by government controlled private institutions. (orig.) [de]
Pre-operative risk scores for the prediction of outcome in elderly people who require emergency surgery

Directory of Open Access Journals (Sweden)

Bates Tom

2007-06-01

Background: The decision on whether to operate on a sick elderly person with an intra-abdominal emergency is one of the most difficult in general surgery. A predictive risk-score would be of great value in this situation. Methods: A Medline search was performed to identify those predictive risk-scores relevant to sick elderly patients in whom emergency surgery might be life-saving. Results: Many of the risk scores for surgical patients include the operative findings or require tests which are not available in the acute situation. Most of the relevant studies include younger patients and elective surgery. The Glasgow Aneurysm Score and Hardman Index are specific to ruptured aortic aneurysm while the Boey Score and the Hacetteppe Score are specific to perforated peptic ulcer. The Reiss Index and Fitness Score can be used pre-operatively if the elements of the score can be completed in time. The ASA score, which includes a significant element of subjective clinical judgement, can be augmented with factors such as age and urgency of surgery but no test has a negative predictive value sufficient to recommend against surgical intervention without clinical input. Conclusion: Risk scores may be helpful in sick elderly patients needing emergency abdominal surgery but an experienced clinical opinion is still essential.

Automated Agatston score computation in non-ECG gated CT scans using deep learning

Science.gov (United States)

Cano-Espinosa, Carlos; González, Germán.; Washko, George R.; Cazorla, Miguel; San José Estépar, Raúl

2018-03-01

Introduction: The Agatston score is a well-established metric of cardiovascular disease related to clinical outcomes. It is computed from CT scans by a) measuring the volume and intensity of the atherosclerotic plaques and b) aggregating such information in an index. Objective: To generate a convolutional neural network that inputs a non-contrast chest CT scan and outputs the Agatston score associated with it directly, without a prior segmentation of Coronary Artery Calcifications (CAC). Materials and methods: We use a database of 5973 non-contrast non-ECG gated chest CT scans where the Agatston score has been manually computed. The heart of each scan is cropped automatically using an object detector. The database is split in 4973 cases for training and 1000 for testing. We train a 3D deep convolutional neural network to regress the Agatston score directly from the extracted hearts. Results: The proposed method yields a Pearson correlation coefficient of r = 0.93; p <= 0.0001 against manual reference standard in the 1000 test cases. It further stratifies correctly 72.6% of the cases with respect to standard risk groups. This compares to more complex state-of-the-art methods based on prior segmentations of the CACs, which achieve r = 0.94 in ECG-gated pulmonary CT. Conclusions: A convolutional neural network can regress the Agatston score from the image of the heart directly, without a prior segmentation of the CACs. This is a new and simpler paradigm in the Agatston score computation that yields similar results to the state-of-the-art literature.
High resolution CT in children with cystic fibrosis: correlation with pulmonary functions and radiographic scores

Energy Technology Data Exchange (ETDEWEB)

Demirkazik, Figen Basaran; Ariyuerek, O. Macit; Oezcelik, Ugur; Goecmen, Ayhan; Hassanabad, Hossein K.; Kiper, Nural

2001-01-01

Objective: To compare the high resolution CT (HRCT) scores of the Bhalla system with pulmonary function tests and radiographic and clinical points of the Shwachman-Kulczycki clinical scoring system. Methods: HRCT of the chest was obtained in 40 children to assess the role of HRCT in evaluating bronchopulmonary pathology in children with cystic fibrosis (CF). The HRCT severity scores of the Bhalla system were compared with chest radiographic and clinical points of the Shwachman-Kulczycki scoring system and pulmonary function tests. Only 14 of the patients older than 6 years cooperated with spirometry. Results: HRCT scores correlated well with radiographic points (r=0.80, P<0.0001) and clinical points (r=0.67, P<0.0001) of the Shwachman-Kulczycki system, FVC (r=0.71, P=0.004) and FEV1 (r=0.66, P=0.01). Although radiographic points correlated significantly with FVC (r=0.61, P=0.02) and FEV1 (r=0.56, P=0.04), HRCT provides a more precise scoring than the chest X-ray. Conclusion: The HRCT scoring system may provide a sensitive method of monitoring pulmonary disease status and may replace the radiographic scoring in the Shwachman-Kulczycki system. It may be helpful especially in follow-up of small children too young to cooperate with spirometry.
Using imputed genotype data in the joint score tests for genetic association and gene-environment interactions in case-control studies.

Science.gov (United States)

Song, Minsun; Wheeler, William; Caporaso, Neil E; Landi, Maria Teresa; Chatterjee, Nilanjan

2018-03-01

Genome-wide association studies (GWAS) are now routinely imputed for untyped single nucleotide polymorphisms (SNPs) based on various powerful statistical algorithms for imputation trained on reference datasets. The use of predicted allele counts for imputed SNPs as the dosage variable is known to produce a valid score test for genetic association. In this paper, we investigate how to best handle imputed SNPs in various modern complex tests for genetic associations incorporating gene-environment interactions. We focus on case-control association studies where inference for an underlying logistic regression model can be performed using alternative methods that rely to varying degrees on an assumption of gene-environment independence in the underlying population. As increasingly large-scale GWAS are being performed through consortia efforts where it is preferable to share only summary-level information across studies, we also describe simple mechanisms for implementing score tests based on standard meta-analysis of "one-step" maximum-likelihood estimates across studies. Applications of the methods in simulation studies and a dataset from a GWAS of lung cancer illustrate the ability of the proposed methods to maintain type-I error rates for the underlying testing procedures. For analysis of imputed SNPs, similar to typed SNPs, the retrospective methods can lead to considerable efficiency gain for modeling of gene-environment interactions under the assumption of gene-environment independence. Methods are made available for public use through the CGEN R software package. © 2017 WILEY PERIODICALS, INC.
SIGI: score-based identification of genomic islands

Directory of Open Access Journals (Sweden)

Merkl Rainer

2004-03-01

Background: Genomic islands can be observed in many microbial genomes. These stretches of DNA have a conspicuous composition with regard to sequence or encoded functions. Genomic islands are assumed to be frequently acquired via horizontal gene transfer. For the analysis of genome structure and the study of horizontal gene transfer, it is necessary to reliably identify and characterize these islands. Results: A scoring scheme on codon frequencies, Score_G1G2(cdn) = log(f_G2(cdn) / f_G1(cdn)), was utilized. To analyse genes of a species G1 and to test their relatedness to species G2, scores were determined by applying the formula to log-odds derived from mean codon frequencies of the two genomes. A non-redundant set of nearly 400 codon usage tables comprising microbial species was derived; its members were used alternatively at position G2. Genes having at least one score value above a species-specific and dynamically determined cut-off value were analysed further. By means of cluster analysis, genes were identified that comprise clusters of statistically significant size. These clusters were predicted as genomic islands. Finally and individually for each of these genes, the taxonomical relation among those species responsible for significant scores was interpreted. The validity of the approach and its limitations were made plausible by an extensive analysis of natural genes and synthetic ones aimed at modelling the process of gene amelioration. Conclusions: The method allows reliable identification of genomic islands and of the likely origin of alien genes.
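The codon log-odds score in this abstract, Score_G1G2(cdn) = log(f_G2(cdn) / f_G1(cdn)), can be sketched in a few lines. The codon frequencies below are made-up toy values, the function names are hypothetical, and summing per-codon scores over a gene is an assumption about how a gene-level score might be formed; the abstract does not spell out that aggregation, the cut-off, or the clustering step.

```python
import math
from typing import Dict, List


def split_codons(sequence: str) -> List[str]:
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def codon_score(codon: str, freq_g1: Dict[str, float], freq_g2: Dict[str, float]) -> float:
    """Log-odds of one codon: log(f_G2(cdn) / f_G1(cdn))."""
    return math.log(freq_g2[codon] / freq_g1[codon])


def gene_score(codons: List[str], freq_g1: Dict[str, float], freq_g2: Dict[str, float]) -> float:
    """Sum of per-codon log-odds over a gene (aggregation by sum is an assumption)."""
    return sum(codon_score(c, freq_g1, freq_g2) for c in codons)


# Toy mean codon frequencies for a host genome G1 and a candidate donor G2.
toy_freq_g1 = {"GCT": 0.02, "GCC": 0.04, "AAA": 0.03}
toy_freq_g2 = {"GCT": 0.04, "GCC": 0.02, "AAA": 0.03}

if __name__ == "__main__":
    codons = split_codons("GCTGCTAAA")
    print(f"gene score = {gene_score(codons, toy_freq_g1, toy_freq_g2):.3f}")
```

A positive score means the gene's codon usage resembles G2 more than G1, which is the signal the abstract describes comparing against a cut-off.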
equate: An R Package for Observed-Score Linking and Equating

Directory of Open Access Journals (Sweden)

Anthony D. Albano

2016-10-01

The R package equate contains functions for observed-score linking and equating under single-group, equivalent-groups, and nonequivalent-groups with anchor test(s) designs. This paper introduces these designs and provides an overview of observed-score equating with details about each of the supported methods. Examples demonstrate the basic functionality of the equate package.
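As a rough illustration of what observed-score equating does, the sketch below applies linear equating under an equivalent-groups design: a score x on form X is mapped to the scale of form Y by matching means and standard deviations, y = mean_Y + (sd_Y / sd_X)(x − mean_X). This is written in Python and does not use or mirror the equate package's API; the abstract does not list which methods the package supports, and the score lists are made-up toy data.

```python
from statistics import mean, pstdev
from typing import Sequence


def linear_equate(x: float, scores_x: Sequence[float], scores_y: Sequence[float]) -> float:
    """Map a form-X score onto the form-Y scale by matching mean and SD."""
    slope = pstdev(scores_y) / pstdev(scores_x)
    return mean(scores_y) + slope * (x - mean(scores_x))


# Toy observed scores from two groups assumed to be equivalent in ability.
form_x = [10, 12, 14, 16, 18]
form_y = [20, 25, 30, 35, 40]

if __name__ == "__main__":
    for raw in (12, 14, 16):
        print(f"form X score {raw} -> form Y scale {linear_equate(raw, form_x, form_y):.1f}")
```

The equivalent-groups assumption is what justifies attributing the difference in means and spreads to the forms rather than to the examinees; the anchor-test designs named in the abstract relax that assumption.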
QUASAR--scoring and ranking of sequence-structure alignments.

Science.gov (United States)

Birzele, Fabian; Gewehr, Jan E; Zimmer, Ralf

2005-12-15

Sequence-structure alignments are a common means for protein structure prediction in the fields of fold recognition and homology modeling, and there is a broad variety of programs that provide such alignments based on sequence similarity, secondary structure or contact potentials. Nevertheless, finding the best sequence-structure alignment in a pool of alignments remains a difficult problem. QUASAR (quality of sequence-structure alignments ranking) provides a unifying framework for scoring sequence-structure alignments that aids finding well-performing combinations of well-known and custom-made scoring schemes. Those scoring functions can be benchmarked against widely accepted quality scores like MaxSub, TMScore, Touch and APDB, thus enabling users to test their own alignment scores against 'standard-of-truth' structure-based scores. Furthermore, individual score combinations can be optimized with respect to benchmark sets based on known structural relationships using QUASAR's in-built optimization routines.
Estimating the Reliability of Aggregated and Within-Person Centered Scores in Ecological Momentary Assessment

Science.gov (United States)

Huang, Po-Hsien; Weng, Li-Jen

2012-01-01

A procedure for estimating the reliability of test scores in the context of ecological momentary assessment (EMA) was proposed to take into account the characteristics of EMA measures. Two commonly used test scores in EMA were considered: the aggregated score (AGGS) and the within-person centered score (WPCS). Conceptually, AGGS and WPCS represent…
The HAT Score: A Simple Risk Stratification Score for Coagulopathic Bleeding During Adult Extracorporeal Membrane Oxygenation.

Science.gov (United States)

Lonergan, Terence; Herr, Daniel; Kon, Zachary; Menaker, Jay; Rector, Raymond; Tanaka, Kenichi; Mazzeffi, Michael

2017-06-01

The study objective was to create an adult extracorporeal membrane oxygenation (ECMO) coagulopathic bleeding risk score. Secondary analysis was performed on an existing retrospective cohort. Pre-ECMO variables were tested for association with coagulopathic bleeding, and those with the strongest association were included in a multivariable model. Using this model, a risk stratification score was created. The score's utility was validated by comparing bleeding and transfusion rates between score levels. Bleeding also was examined after stratifying by nadir platelet count and overanticoagulation. Predictive power of the score was compared against the risk score for major bleeding during anti-coagulation for atrial fibrillation (HAS-BLED). Tertiary care academic medical center. The study comprised patients who received venoarterial or venovenous ECMO over a 3-year period, excluding those with an identified source of surgical bleeding during exploration. None. Fifty-three (47.3%) of 112 patients experienced coagulopathic bleeding. A 3-variable score (hypertension, age greater than 65, and ECMO type; HAT) had fair predictive value (area under the receiver operating characteristic curve [AUC] = 0.66) and was superior to HAS-BLED (AUC = 0.64). As the HAT score increased from 0 to 3, bleeding rates also increased as follows: 30.8%, 48.7%, 63.0%, and 71.4%, respectively. Platelet and fresh frozen plasma transfusion tended to increase with the HAT score, but red blood cell transfusion did not. Nadir platelet count less than 50 × 10³/µL and overanticoagulation during ECMO increased the AUC for the model to 0.73, suggesting additive risk. The HAT score may allow for bleeding risk stratification in adult ECMO patients. Future studies in larger cohorts are necessary to confirm these findings. Copyright © 2017 Elsevier Inc. All rights reserved.
Combination of scoring schemes for protein docking

Directory of Open Access Journals (Sweden)

Schomburg Dietmar

2007-08-01

Full Text Available Background: Docking algorithms are developed to predict in which orientation two proteins are likely to bind under natural conditions. The currently used methods usually consist of a sampling step followed by a scoring step. We developed a weighted geometric correlation based on optimised atom specific weighting factors and combined it with our previously published amino acid specific scoring and with a comprehensive SVM-based scoring function. Results: The scoring with the atom specific weighting factors yields better results than the amino acid specific scoring. In combination with SVM-based scoring functions, the percentage of complexes for which a near-native structure can be predicted within the top 100 ranks increased from 14% with the geometric scoring to 54% with the combination of all scoring functions. The ranking results are especially good for the enzyme-inhibitor complexes. For half of these complexes a near-native structure can be predicted within the first 10 proposed structures, and for more than 86% of all enzyme-inhibitor complexes within the first 50 predicted structures. Conclusion: We were able to develop a combination of different scoring schemes which considers a series of previously described and some new scoring criteria, yielding a remarkable improvement of prediction quality.
Temporal acuity and speech recognition score in noise in patients with multiple sclerosis

Directory of Open Access Journals (Sweden)

Mehri Maleki

2014-04-01

Full Text Available Background and Aim: Multiple sclerosis (MS) is a central nervous system disease that can be associated with a variety of symptoms such as hearing disorders. The main consequence of hearing loss is poor speech perception, and temporal acuity has an important role in speech perception. We evaluated speech perception in silence and in the presence of noise, as well as temporal acuity, in patients with multiple sclerosis. Methods: Eighteen adults with multiple sclerosis with a mean age of 37.28 years and 18 age- and sex-matched controls with a mean age of 38.00 years participated in this study. Temporal acuity and speech perception were evaluated by the random gap detection test (GDT) and word recognition score (WRS) in three different signal-to-noise ratios. Results: Statistical analysis of test results revealed significant differences between the two groups (p<0.05). Analysis of the gap detection test (in 4 sensation levels) and word recognition score in both groups showed significant differences (p<0.001). Conclusion: According to this survey, the ability of patients with multiple sclerosis to process temporal features of a stimulus was impaired. This impairment appears to be an important factor in decreased word recognition score and speech perception.
Personalized Risk Scoring for Critical Care Prognosis Using Mixtures of Gaussian Processes.

Science.gov (United States)

Alaa, Ahmed M; Yoon, Jinsung; Hu, Scott; van der Schaar, Mihaela

2018-01-01

In this paper, we develop a personalized real-time risk scoring algorithm that provides timely and granular assessments for the clinical acuity of ward patients based on their (temporal) lab tests and vital signs; the proposed risk scoring system ensures timely intensive care unit admissions for clinically deteriorating patients. The risk scoring system is based on the idea of sequential hypothesis testing under an uncertain time horizon. The system learns a set of latent patient subtypes from the offline electronic health record data, and trains a mixture of Gaussian Process experts, where each expert models the physiological data streams associated with a specific patient subtype. Transfer learning techniques are used to learn the relationship between a patient's latent subtype and her static admission information (e.g., age, gender, transfer status, ICD-9 codes, etc). Experiments conducted on data from a heterogeneous cohort of 6321 patients admitted to Ronald Reagan UCLA medical center show that our score significantly outperforms the currently deployed risk scores, such as the Rothman index, MEWS, APACHE, and SOFA scores, in terms of timeliness, true positive rate, and positive predictive value. Our results reflect the importance of adopting the concepts of personalized medicine in critical care settings; significant accuracy and timeliness gains can be achieved by accounting for the patients' heterogeneity. The proposed risk scoring methodology can confer huge clinical and social benefits on a massive number of critically ill inpatients who exhibit adverse outcomes including, but not limited to, cardiac arrests, respiratory arrests, and septic shocks.
Relationship between ultrasonic pulse velocity test result and ...

African Journals Online (AJOL)

Ultrasonic Pulse Velocity test result showed an inverse relationship (of -0.935) with the crushed concrete compressive strength. Correlation test, multiple regression analysis, graphs and visual inspection were used to analyze the results. The conclusion drawn is that there exists a relationship between UPV test results and ...
Hospital triage system for adult patients using an influenza-like illness scoring system during the 2009 pandemic--Mexico.

Directory of Open Access Journals (Sweden)

Eduardo Rodriguez-Noriega

2010-05-01

Full Text Available Pandemic influenza A (H1N1) virus emerged during 2009. To help clinicians triage adults with acute respiratory illness, a scoring system for influenza-like illness (ILI) was implemented at Hospital Civil de Guadalajara, Mexico. A medical history and laboratory and radiology results were collected on emergency room (ER) patients with acute respiratory illness to calculate an ILI-score. Patients were evaluated for admission by their ILI-score and clinicians' assessment of risk for developing complications. Nasal and throat swabs were collected from intermediate and high-risk patients for influenza testing by RT-PCR. The disposition and ILI-score of those oseltamivir-treated versus untreated, and the clinical characteristics of 2009 pandemic influenza A (H1N1) patients versus test-negative patients, were compared by Pearson's chi-square, Fisher's exact, and Wilcoxon rank-sum tests. Of 1840 ER patients, 230 were initially hospitalized (mean ILI-score = 15), and the rest were discharged, including 286 ambulatory patients given oseltamivir (median ILI-score = 11), and 1324 untreated (median ILI-score = 5). Fourteen (1%) untreated patients returned, and 3 were hospitalized on oseltamivir (median ILI-score = 19). Of 371 patients tested by RT-PCR, 104 (28%) had pandemic influenza and 42 (11%) had seasonal influenza A detected. Twenty (91%) of 22 imaged hospitalized pandemic influenza patients had bilateral infiltrates compared to 23 (38%) of 61 imaged hospital test-negative patients (p<0.001). One patient with confirmed pandemic influenza presented 6 days after symptom onset, required mechanical ventilation, and died. The triaging system that used an ILI-score complemented clinicians' judgment of who needed oseltamivir and inpatient care and helped hospital staff manage a surge in demand for services.
Comparison of Duke ergo-metric score and of the classification based on scintigraphic data in the stratification of coronaries

International Nuclear Information System (INIS)

Ouhayoun, E.; Coca, F.J.; Payoux, P.; Tafani, J.A.M.; Esquerre, J.P.

1997-01-01

Risk stratification (sudden death and infarction) remains a major problem in the management of coronary patients. In 1987, a score based on exercise test data was proposed by Mark and colleagues of the Duke University team, who sought to demonstrate that this score provides a reliable classification of patients. We compared the results obtained with this score against those from the simultaneous analysis of left ventricular (LV) function and LV perfusion. One hundred patients with coronary artery disease (stenoses > 50%) underwent a coupled study of LV function and perfusion at rest and during exercise by means of MIBI scintigraphy. The exercise test allowed the Duke score to be calculated from a formula that includes an angina index defined as follows: 0 for no angina, 1 for angina, and 2 for angina that forces the test to stop. According to the Duke score, three classes can be defined: patients at low risk, score ≥ 5; patients at intermediate risk, score between 5 and -10; patients at high risk, score ≤ -10. Ejection fraction at peak exercise was measured in every patient, as was the extent of the perfusion defect, evaluated semi-quantitatively at exercise and at rest on bull's-eye maps. Three groups of patients were formed according to the combined perfusion and function results: (A) normal perfusion and function, indicating a good prognosis; (B) patients slightly afflicted (FEV effort > 50% and in-effort defect extension effort 50%). The last criteria have been shown by several studies to indicate a poor prognosis. A table presents the risks according to the Duke score for the three classes. One third of the patients severely afflicted by confirmed ischemia are classified in the low-risk class. Moreover, the majority of patients are ranked as intermediate risk, independently of the scintigraphic results. In conclusion, these results concerning the stratification of coronary patients show the superiority of the criteria based on scintigraphy over
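The abstract names the angina index and the three risk classes but does not reproduce the formula itself. The sketch below assumes the commonly cited form of the Duke treadmill score (exercise duration in minutes, minus 5 times the ST deviation in millimetres, minus 4 times the angina index); that formula is not taken from this record. The function names are hypothetical, and the class boundaries follow the abstract as written.

```python
def duke_score(exercise_minutes, st_deviation_mm, angina_index):
    """Duke score from exercise test data.

    Assumption: the widely cited formula
    minutes - 5 * ST deviation (mm) - 4 * angina index.
    The abstract above does not state the formula.
    """
    if angina_index not in (0, 1, 2):
        raise ValueError("angina_index must be 0, 1, or 2")
    return exercise_minutes - 5.0 * st_deviation_mm - 4.0 * angina_index


def duke_risk_class(score):
    """Risk class using the cut-offs quoted in the abstract."""
    if score >= 5:
        return "low"
    if score <= -10:
        return "high"
    return "intermediate"


# Illustrative (made-up) patient: 6 minutes, 2 mm ST deviation, angina index 1.
example = duke_score(6, 2, 1)
print(example, duke_risk_class(example))
```

Under these assumptions, a patient can land in the low-risk class on exercise data alone, which is the case the scintigraphic comparison is meant to probe.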
Data Exploration and Analysis of Alternative Learning System Accreditation and Equivalency Test Result Using Data Mining

Science.gov (United States)

Talingdan, J. A.; Trinidad, J. T., Jr.; Palaoag, T. D.

2018-03-01

The Alternative Learning System (ALS) is a subsystem of the Department of Education (DepEd) that serves as an option for learners who cannot afford formal education. The research focuses on the data exploration and analysis of ALS accreditation and equivalency (A & E) test results using data mining. The ALS 2014 to 2016 A & E test results at the secondary level were used as data sets in the study. The A & E test results revealed that the passing rate doubled each year. The results were clustered using the k-means clustering algorithm and grouped into good, medium, and low standard learners to identify students who need extra support for enhancement. From the clustered data, it was found that the strand in which learners are weakest is strand 4, Development of Self and a Sense of Community, with a general average of 84.23. It also revealed that the essay type of exam got the lowest score, with a general average of 2.14, compared to the multiple-choice type of exam that covers the five learning strands. Furthermore, decision tree and naive Bayes were also employed in the study to predict the performance of the learners in the A & E test and to determine which is better to use for prediction. It was concluded that naive Bayes performs better because its accuracy rate is higher than that of the decision tree algorithm.
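The grouping into good, medium, and low learners can be pictured with a minimal one-dimensional k-means sketch. It is not the study's implementation: the scores are made up for illustration, the function names are hypothetical, and the evenly spread starting centroids are an assumption chosen to keep the example deterministic.

```python
def assign(values, centroids):
    """Put each value into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for v in values:
        nearest = min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
        clusters[nearest].append(v)
    return clusters


def kmeans_1d(values, k=3, max_iter=100):
    """Plain k-means on single scores (illustrative sketch only)."""
    data = sorted(values)
    if k < 2 or len(data) < k:
        raise ValueError("need k >= 2 and at least k values")
    # Assumed initialisation: centroids spread evenly over the sorted data.
    centroids = [data[i * (len(data) - 1) // (k - 1)] for i in range(k)]
    for _ in range(max_iter):
        clusters = assign(data, centroids)
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:  # assignments have stopped changing
            break
        centroids = updated
    return centroids, assign(data, centroids)


# Made-up general averages, not data from the study.
scores = [60, 62, 65, 78, 80, 82, 93, 95, 97]
centroids, (low, medium, good) = kmeans_1d(scores)
print(low, medium, good)
```

Because the data are sorted and the centroids start in ascending order, the three returned clusters can be read directly as low, medium, and good.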
Transforming Biology Assessment with Machine Learning: Automated Scoring of Written Evolutionary Explanations

Science.gov (United States)

Nehm, Ross H.; Ha, Minsu; Mayfield, Elijah

2012-02-01

This study explored the use of machine learning to automatically evaluate the accuracy of students' written explanations of evolutionary change. Performance of the Summarization Integrated Development Environment (SIDE) program was compared to human expert scoring using a corpus of 2,260 evolutionary explanations written by 565 undergraduate students in response to two different evolution instruments (the EGALT-F and EGALT-P) that contained prompts that differed in various surface features (such as species and traits). We tested human-SIDE scoring correspondence under a series of different training and testing conditions, using Kappa inter-rater agreement values of greater than 0.80 as a performance benchmark. In addition, we examined the effects of response length on scoring success; that is, whether SIDE scoring models functioned with comparable success on short and long responses. We found that SIDE performance was most effective when scoring models were built and tested at the individual item level and that performance degraded when suites of items or entire instruments were used to build and test scoring models. Overall, SIDE was found to be a powerful and cost-effective tool for assessing student knowledge and performance in a complex science domain.
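The 0.80 benchmark refers to a kappa agreement statistic between human and machine scores. As a minimal sketch, assuming Cohen's kappa for two raters (the abstract does not say which kappa variant was used), agreement can be computed as below; the function name and the sample scores are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:  # both raters used one identical category throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up scores: 1 = concept present in the explanation, 0 = absent.
human = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
machine = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(cohens_kappa(human, machine))
```

With these illustrative scores the raters agree on 9 of 10 items, and chance agreement is 0.5, so kappa sits at about 0.8, right on the benchmark rather than above it.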
Cross-cultural adaptation and validation of the Turkish version of Oxford hip score.

Science.gov (United States)

Tuğay, Baki Umut; Tuğay, Nazan; Güney, Hande; Hazar, Zeynep; Yüksel, İnci; Atilla, Bülent

2015-06-01

The purpose of this study was to translate the Oxford hip score (OHS) into Turkish and to evaluate its psychometric properties by testing the internal consistency, reproducibility, construct validity, and responsiveness in patients with hip osteoarthritis (OA). The Oxford hip score was translated and culturally adapted according to the guidelines in the literature. Seventy patients (mean age 61.45 ± 9.29 years) with hip osteoarthritis participated in the study. Patients completed the Turkish Oxford hip score (OHS-TR), the Short-Form 36 (SF-36), and the Western Ontario and McMaster Universities Index (WOMAC). Internal consistency was tested using Cronbach's α coefficient. Patients completed the OHS-TR questionnaire twice in 7 days to determine reproducibility. Correlation between the total results of both tests was determined by the Pearson correlation coefficient and the intraclass correlation coefficient (ICC). Validity was assessed by calculating the Pearson correlation coefficient between the OHS-TR and the WOMAC and SF-36 scores. Floor and ceiling effects were analyzed. The internal consistency was high (Cronbach's α 0.93). The construct validity showed a significant correlation between the OHS-TR and the WOMAC and related SF-36 domains (p < 0.001). The ICCs ranged between 0.80 and 0.99. There was no floor or ceiling effect in the total OHS-TR score. The OHS-TR questionnaire is valid, reliable, and responsive for Turkish-speaking patients with hip OA.
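Internal consistency is reported here as Cronbach's α. A minimal sketch of the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), is given below. The function name and the response data are hypothetical, and sample variances are assumed.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer all items")
    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Made-up answers from three respondents to a two-item questionnaire.
print(cronbach_alpha([[1, 2], [2, 2], [3, 4]]))
```

Values near 1, such as the 0.93 reported above, indicate that the items vary together across respondents.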
Mathematics Placement Test: Typical Results with Unexpected Outcomes

Science.gov (United States)

Ingalls, Victoria

2011-01-01

Based on the results of a prior case-study analysis of mathematics placement at one university, the mathematics department developed and piloted a mathematics placement test. This article describes the implementation process for a mathematics placement test and further analyzes the test results for the pilot group. As an unexpected result, the…
Standardized computer-based organized reporting of EEG:SCORE

DEFF Research Database (Denmark)

Beniczky, Sandor; Aurlien, H; Brøgger, JC

2013-01-01

process, organized by the European Chapter of the International Federation of Clinical Neurophysiology. The Standardised Computer-based Organised Reporting of EEG (SCORE) software was constructed based on the terms and features of the consensus statement and it was tested in the clinical practice...... in free-text format. The purpose of our endeavor was to create a computer-based system for EEG assessment and reporting, where the physicians would construct the reports by choosing from predefined elements for each relevant EEG feature, as well as the clinical phenomena (for video-EEG recordings....... SCORE can potentially improve the quality of EEG assessment and reporting; it will help incorporate the results of computer-assisted analysis into the report, it will make possible the build-up of a multinational database, and it will help in training young neurophysiologists....
The score statistic of the LD-lod analysis: detecting linkage adaptive to linkage disequilibrium.

Science.gov (United States)

Huang, J; Jiang, Y

2001-01-01

We study the properties of a modified lod score method for testing linkage that incorporates linkage disequilibrium (LD-lod). By examination of its score statistic, we show that the LD-lod score method adaptively combines two sources of information: (a) the IBD sharing score which is informative for linkage regardless of the existence of LD and (b) the contrast between allele-specific IBD sharing scores which is informative for linkage only in the presence of LD. We also consider the connection between the LD-lod score method and the transmission-disequilibrium test (TDT) for triad data and the mean test for affected sib pair (ASP) data. We show that, for triad data, the recessive LD-lod test is asymptotically equivalent to the TDT; and for ASP data, it is an adaptive combination of the TDT and the ASP mean test. We demonstrate that the LD-lod score method has relatively good statistical efficiency in comparison with the ASP mean test and the TDT for a broad range of LD and the genetic models considered in this report. Therefore, the LD-lod score method is an interesting approach for detecting linkage when the extent of LD is unknown, such as in a genome-wide screen with a dense set of genetic markers. Copyright 2001 S. Karger AG, Basel

Test Results of PBMR Fuel Spheres

International Nuclear Information System (INIS)

Koshcheev, Konstantin; Diakov, Alexander; Beltyukov, Igor; Barybin, Andrey; Chernetsov, Mikhail

2014-01-01

Results of pre-irradiation testing of fuel spheres (FS) and coated particles (CP) manufactured by PBMR SOC (Republic of South Africa) are described. The stable high quality level of major characteristics (dimensions, CP coating structure, uranium-235 contamination of the FS matrix graphite and the outer PyC layer of the CP coating) are shown. Results of a methodical irradiation test of two FS in helium and neon medium at temperatures of 800 to 1300 °C with simultaneous determination of release-to-birth ratios for major gaseous fission products (GFP) are described. (author)
Protein structural model selection by combining consensus and single scoring methods.

Directory of Open Access Journals (Sweden)

Zhiquan He

Full Text Available Quality assessment (QA) for predicted protein structural models is an important and challenging research problem in protein structure prediction. Consensus Global Distance Test (CGDT) methods assess each decoy (predicted structural model) based on its structural similarity to all others in a decoy set and have been shown to work well when good decoys are in a majority cluster. Scoring functions evaluate each single decoy based on its structural properties. Both methods have their merits and limitations. In this paper, we present a novel method called PWCom, which applies two neural networks sequentially to combine CGDT and single model scoring methods such as RW, DDFire and OPUS-Ca. Specifically, for every pair of decoys, the difference of the corresponding feature vectors is input to the first neural network, which predicts whether the two decoys differ significantly in terms of their GDT scores to the native. If yes, the second neural network is used to decide which one of the two is closer to the native structure. The quality score for each decoy in the pool is based on the number of times it wins during the pairwise comparisons. Test results on three benchmark datasets from different model generation methods showed that PWCom significantly improves over consensus GDT and single scoring methods. The QA server (MUFOLD-Server) applying this method in the CASP 10 QA category was ranked second in terms of Pearson and Spearman correlation performance.
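The pairwise scoring step can be sketched as follows. This is only an outline of the counting logic described in the abstract: the two neural networks are replaced by hypothetical stand-in functions, each decoy is reduced to a single made-up feature value, and none of the names come from PWCom itself.

```python
from itertools import combinations


def pairwise_win_scores(decoys, is_different, first_is_closer):
    """Count, for each decoy, how many pairwise comparisons it wins.

    decoys: dict mapping decoy name -> features.
    is_different: stand-in for the first network (are the two decoys
        significantly different in quality?).
    first_is_closer: stand-in for the second network (is the first decoy
        closer to the native structure?).
    """
    wins = {name: 0 for name in decoys}
    for a, b in combinations(decoys, 2):
        if not is_different(decoys[a], decoys[b]):
            continue  # no significant difference: neither decoy scores
        winner = a if first_is_closer(decoys[a], decoys[b]) else b
        wins[winner] += 1
    return wins


# Toy stand-ins: one made-up quality-like number per decoy.
toy_decoys = {"d1": 0.80, "d2": 0.78, "d3": 0.50}
scores = pairwise_win_scores(
    toy_decoys,
    is_different=lambda x, y: abs(x - y) > 0.05,
    first_is_closer=lambda x, y: x > y,
)
print(scores)
```

In this toy run, d1 and d2 are too similar to be compared, so each earns its single win against d3.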
Results of interlaboratory tests regarding TXRF

International Nuclear Information System (INIS)

Klockenkaemper, R.; Bohlen, A. von

2000-01-01

Interlaboratory or intercomparison tests can be performed for proficiency testing of individual laboratories, for the certification of a special sample material and for the validation of a certain method. We participated in two interlaboratory tests in order to validate total reflection x-ray fluorescence analysis (TXRF). We used our results to evaluate TXRF and to compare it with other competing methods, particularly with respect to precision and accuracy. The first interlaboratory test was organized by IAEA (International Atomic Energy Agency, Vienna, Austria). As a candidate for reference material, a lichen (IAEA-336 Lichen) was distributed among 27 participants. In our laboratory, the powdered biogenic material was digested with nitric acid under high pressure and analyzed by TXRF. The second interlaboratory test was organized by IRMM (Institute for Reference Materials and Measurements, Geel, Belgium). As a certified test sample with undisclosed values, a sediment (IMEP-14) was delivered to 220 laboratories. We digested the geogenic material again with nitric acid and additionally with hydrofluoric acid and analyzed it by TXRF. In the two test samples, six and eight different trace elements, respectively, were determined by TXRF with a content between 2 and 2000 mg/kg. Calibration was carried out by internal standardization. For that purpose, Ga or Se, respectively, was added as standard element. The measurement uncertainty of TXRF was estimated by the method of error propagation. In our paper we will report on the results of the two interlaboratory tests. It will be shown that TXRF is highly reliable for a correct determination of trace elements in biogenic and geogenic samples. It is competitive with the established methods of trace analysis which were involved in these tests and it is even superior to them in certain aspects. (author)
Mobile evaporator corrosion test results

International Nuclear Information System (INIS)

Rozeveld, A.; Chamberlain, D.B.

1997-05-01

Laboratory corrosion tests were conducted on eight candidates to select a durable and cost-effective alloy for use in mobile evaporators to process radioactive waste solutions. Based on an extensive literature survey of corrosion data, three stainless steel alloys (304L, 316L, AL-6XN), four nickel-based alloys (825, 625, 690, G-30), and titanium were selected for testing. The corrosion tests included vapor phase, liquid junction (interface), liquid immersion, and crevice corrosion tests on plain and welded samples of candidate materials. Tests were conducted at 80 °C for 45 days in two different test solutions: a nitric acid solution to simulate evaporator conditions during the processing of the cesium ion-exchange eluant, and a highly alkaline sodium hydroxide solution to simulate the composition of Tank 241-AW-101 during evaporation. All of the alloys exhibited excellent corrosion resistance in the alkaline test solution. Corrosion rates were very low and localized corrosion was not observed. Results from the nitric acid tests showed that only 316L stainless steel did not meet our performance criteria. The 316L welded interface and crevice specimens had rates of 22.2 mpy and 21.8 mpy, respectively, which exceed the maximum corrosion rate of 20 mpy. The other welded samples had about the same corrosion resistance as the plain samples. None of the welded samples showed preferential weld or heat-affected zone (HAZ) attack. Vapor corrosion was negligible for all alloys. All of the alloys except 316L exhibited either "satisfactory" (2-20 mpy) or "excellent" (<2 mpy) corrosion resistance as defined by the National Association of Corrosion Engineers. However, many of the alloys experienced intergranular corrosion in the nitric acid test solution, which could indicate a susceptibility to stress corrosion cracking (SCC) in this environment.
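The acceptance bands quoted in the abstract can be written as a small classifier. This is a sketch only: the function name and the "exceeds limit" label are hypothetical, while the thresholds (below 2 mpy, 2 to 20 mpy, above 20 mpy) are those stated above.

```python
def classify_corrosion_rate(mpy):
    """Classify a corrosion rate in mils per year (mpy) using the quoted bands."""
    if mpy < 0:
        raise ValueError("corrosion rate cannot be negative")
    if mpy < 2:
        return "excellent"
    if mpy <= 20:
        return "satisfactory"
    return "exceeds limit"  # hypothetical label for rates above 20 mpy


# The two 316L welded-specimen rates reported in the abstract.
for rate in (22.2, 21.8):
    print(rate, classify_corrosion_rate(rate))
```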
Personality and Examination Score Correlates of Abnormal Psychology Course Ratings.

Science.gov (United States)

Pauker, Jerome D.

The relationship between the ratings students assigned to an evening undergraduate abnormal psychology class and their scores on objective personality tests and course examinations was investigated. Students (N=70) completed the MMPI and made global ratings of the course; these scores were correlated separately by sex with the T scores of 13 MMPI…
Cognitive Deficits in Healthy Elderly Population With "Normal" Scores on the Mini-Mental State Examination.

Science.gov (United States)

Votruba, Kristen L; Persad, Carol; Giordani, Bruno

2016-05-01

This study investigated whether healthy older adults with Mini-Mental State Examination (MMSE) scores above 23 exhibit cognitive impairment on neuropsychological tests. Participants completed the MMSE and a neuropsychological battery including tests of 10 domains. Results were compared to published normative data. On neuropsychological testing, participants performed well on measures of naming and recall but showed mild to moderate impairment in working memory and processing speed and marked impairment in inhibition, sustained attention, and executive functioning. Almost everyone (91%) scored at least 1 standard deviation (SD) below the mean in at least 1 domain. The median number of domains in which individuals scored below 1 SD was 3.0 of 10.0, whereas over 21% scored below 1 SD in 5 domains or more. With the strictest of definitions for impairment, 20% of this population scored below 2.0 SDs below the norm in at least 2 domains, a necessary condition for a diagnosis of dementia. The finding that cognitive impairment, particularly in attention and executive functioning, is found in healthy older persons who perform well on the MMSE has clinical and research implications in terms of emphasizing normal variability in performance and early identification of possible impairment. © The Author(s) 2016.
Level of intrauterine cocaine exposure and neuropsychological test scores in preadolescence: subtle effects on auditory attention and narrative memory.

Science.gov (United States)

Beeghly, Marjorie; Rose-Jacobs, Ruth; Martin, Brett M; Cabral, Howard J; Heeren, Timothy C; Frank, Deborah A

2014-01-01

Neuropsychological processes such as attention and memory contribute to children's higher-level cognitive and language functioning and predict academic achievement. The goal of this analysis was to evaluate whether level of intrauterine cocaine exposure (IUCE) alters multiple aspects of preadolescents' neuropsychological functioning assessed using a single age-referenced instrument, the NEPSY: A Developmental Neuropsychological Assessment (NEPSY) (Korkman et al., 1998), after controlling for relevant covariates. Participants included 137 term 9.5-year-old children from low-income urban backgrounds (51% male, 90% African American/Caribbean) from an ongoing prospective longitudinal study. Level of IUCE was assessed in the newborn period using infant meconium and maternal report. 52% of the children had IUCE (65% with lighter IUCE, and 35% with heavier IUCE), and 48% were unexposed. Infants with Fetal Alcohol Syndrome, HIV seropositivity, or intrauterine exposure to illicit substances other than cocaine and marijuana were excluded. At the 9.5-year follow-up visit, trained examiners masked to IUCE and background variables evaluated children's neuropsychological functioning using the NEPSY. The association between level of IUCE and NEPSY outcomes was evaluated in a series of linear regressions controlling for intrauterine exposure to other substances and relevant child, caregiver, and demographic variables. Results indicated that level of IUCE was associated with lower scores on the Auditory Attention and Narrative Memory tasks, both of which require auditory information processing and sustained attention for successful performance. However, results did not follow the expected ordinal, dose-dependent pattern. Children's neuropsychological test scores were also altered by a variety of other biological and psychosocial factors. Copyright © 2014 Elsevier Inc. All rights reserved.
Assessment of US score and CT number for diagnosis of fatty liver

International Nuclear Information System (INIS)

Ogasawara, Tetsuo; Tanda, Shigeru; Lim, Insu; Oota, Keisuke; Taima, Tadashi

1987-01-01

US and CT were evaluated for the diagnosis of fatty liver in 70 cases with fatty change of the liver. We scored the US findings of fatty change, i.e., "bright liver pattern", "liver-kidney contrast", "vascular blurring", and "deep attenuation", and examined the usefulness of the scoring. Compared with CT number, the US score was more sensitive, but it had no significant correlation with the amount of fat in the liver or with abnormality of the liver function tests. The results indicate that US should be used as a primary screening examination and that CT should be carried out for further evaluation of fatty change of the liver. (author)
Accounting for Intraligand Interactions in Flexible Ligand Docking with a PMF-Based Scoring Function.

Science.gov (United States)

Lizunov, A Y; Gonchar, A L; Zaitseva, N I; Zosimov, V V

2015-10-26

We analyzed the frequency with which intraligand contacts occurred in a set of 1300 protein-ligand complexes [Plewczynski et al. J. Comput. Chem. 2011, 32, 742-755]. Our analysis showed that flexible ligands often form intraligand hydrophobic contacts, while intraligand hydrogen bonds are rare. The test set was also thoroughly investigated and classified. We suggest a universal method for enhancement of a scoring function based on a potential of mean force (PMF-based score) by adding a term accounting for intraligand interactions. The method was implemented via an in-house developed program, utilizing an Algo_score scoring function [Ramensky et al. Proteins: Struct., Funct., Genet. 2007, 69, 349-357] based on the Tarasov-Muryshev PMF [Muryshev et al. J. Comput.-Aided Mol. Des. 2003, 17, 597-605]. The enhancement of the scoring function was shown to significantly improve the docking and scoring quality for flexible ligands in the test set of 1300 protein-ligand complexes [Plewczynski et al. J. Comput. Chem. 2011, 32, 742-755]. We then investigated the correlation of the docking results with two parameters used to estimate intraligand interactions. These parameters are the weight of intraligand interactions and the minimum number of bonds between the ligand atoms required to take their interaction into account.
Relationships between narrative language samples and norm-referenced test scores in language assessments of school-age children.

Science.gov (United States)

Danahy Ebert, Kerry; Scott, Cheryl M

2014-10-01

Both narrative language samples and norm-referenced language tests can be important components of language assessment for school-age children. The present study explored the relationship between these 2 tools within a group of children referred for language assessment. The study is a retrospective analysis of clinical records from 73 school-age children. Participants had completed an oral narrative language sample and at least one norm-referenced language test. Correlations between microstructural language sample measures and norm-referenced test scores were compared for younger (6- to 8-year-old) and older (9- to 12-year-old) children. Contingency tables were constructed to compare the 2 types of tools, at 2 different cutpoints, in terms of which children were identified as having a language disorder. Correlations between narrative language sample measures and norm-referenced tests were stronger for the younger group than the older group. Within the younger group, the level of language assessed by each measure contributed to associations among measures. Contingency analyses revealed moderate overlap in the children identified by each tool, with agreement affected by the cutpoint used. Narrative language samples may complement norm-referenced tests well, but age combined with narrative task can be expected to influence the nature of the relationship.
Effects of time and recall of patch test results on quality of life (QoL) after testing. Cross-sectional study analyzing QoL in hand eczema patients 1, 5 and 10 years after patch testing.

Science.gov (United States)

Jamil, Wasim N; Lindberg, Magnus

2017-08-01

Patch testing can improve health-related quality of life (HRQOL). To study the impact on HRQOL of elapsed time after patch testing (1-10 years), and how the outcome of testing and patients' recall affects HRQOL. The Dermatology Life Quality Index (DLQI) questionnaire was sent to all patients (aged 18-65 years) who were patch tested for suspected contact allergy in 2009, 2005 and 2000 at the Department of Dermatology in Örebro. The response rate was 51% (n = 256). The DLQI score was significantly lower at 10 years after patch testing (mean DLQI = 5.5) than at 1 year (mean DLQI = 7.7). Work was the most impaired aspect. A binary logistic model showed that only time (10 years after testing) was associated with no effect, a light effect or a moderate effect (DLQI negative or positive test results concerning full recall, partial recall or no recall of diagnosed allergens. Although there was an improvement in HRQOL over time, the work aspect remained a major problem. The improvement was not affected by the outcome of testing and patients' recall of test results. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Distribution of model-based multipoint heterogeneity lod scores.

Science.gov (United States)

Xing, Chao; Morris, Nathan; Xing, Guan

2010-12-01

The distribution of two-point heterogeneity lod scores (HLOD) has been intensively investigated because the conventional χ² approximation to the likelihood ratio test is not directly applicable. However, there was no study investigating the distribution of the multipoint HLOD despite its wide application. Here we want to point out that, compared with the two-point HLOD, the multipoint HLOD essentially tests for homogeneity given linkage and follows a relatively simple limiting distribution ½χ²₀ + ½χ²₁, which can be obtained by established statistical theory. We further examine the theoretical result by simulation studies. © 2010 Wiley-Liss, Inc.
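To illustrate the limiting distribution named in this abstract, the following is a minimal sketch of how a p-value could be read off a ½χ²₀ + ½χ²₁ mixture. It assumes the HLOD is expressed in base-10 lod units, so that the likelihood ratio statistic is 2·ln(10)·HLOD; the function names are illustrative and are not taken from the paper.

```python
import math


def hlod_to_lrt(hlod):
    """Convert a lod score (log10 likelihood ratio) to the LRT statistic 2*ln(LR)."""
    return 2.0 * math.log(10.0) * hlod


def chi2_1_sf(x):
    """Upper tail of a chi-square with 1 degree of freedom: P(X > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))


def mixture_pvalue(hlod):
    """P-value under the 50:50 mixture of chi2_0 (point mass at 0) and chi2_1.

    The chi2_0 component never exceeds a positive threshold, so for a
    positive statistic only half of the chi2_1 tail contributes.
    """
    if hlod <= 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(hlod_to_lrt(hlod))


if __name__ == "__main__":
    for hlod in (0.0, 1.0, 2.0, 3.0):
        print(f"HLOD = {hlod:.1f}  ->  p = {mixture_pvalue(hlod):.6f}")
```

Under these assumptions the p-value is simply half of the usual one-degree-of-freedom tail probability whenever the statistic is positive.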
Managing missing scores on the Roland Morris Disability Questionnaire

DEFF Research Database (Denmark)

Kent, Peter; Lauridsen, Henrik Hein

2011-01-01

Study Design: Analysis of Roland Morris Disability Questionnaire (RMDQ) and Oswestry Disability Index (Oswestry) responses. Objectives: To determine the prevalence of unanswered questions on the RMDQ23 (23-item RMDQ version) and Oswestry questionnaires. To determine if managing RMDQ23 missing data...... fully completed RMDQ23 and matching Oswestry questionnaire sets. Raw sum scores were calculated, and questions systematically dropped. At each stage, sum scores were converted to a score on a 0-100 scale and the error calculated. Wilcoxon Tests were used to compare the magnitude of the error scores...
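The drop-and-rescale procedure described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: items are coded 0 or 1, a missing answer is represented by `None`, and the 0-100 conversion is a simple proportional rescaling over the answered items. The abstract does not specify the exact conversion rule, and the function names are hypothetical.

```python
def prorated_score(responses):
    """Convert answered items (0/1) to a 0-100 score, ignoring None (unanswered)."""
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("at least one item must be answered")
    return 100.0 * sum(answered) / len(answered)


def proration_error(full_responses, dropped_indices):
    """Error introduced by dropping items from a fully completed questionnaire."""
    dropped = set(dropped_indices)
    reduced = [None if i in dropped else r for i, r in enumerate(full_responses)]
    return prorated_score(reduced) - prorated_score(full_responses)


if __name__ == "__main__":
    # Hypothetical fully completed 23-item questionnaire: 10 "yes", 13 "no".
    full = [1] * 10 + [0] * 13
    print(round(prorated_score(full), 2))
    print(round(proration_error(full, [0]), 2))   # drop one "yes" item
    print(round(proration_error(full, [22]), 2))  # drop one "no" item
```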
Reproducibility of the results in ultrasonic testing

International Nuclear Information System (INIS)

Chalaye, M.; Launay, J.P.; Thomas, A.

1980-12-01

This memorandum reports on the conclusions of the tests carried out in order to evaluate the reproducibility of ultrasonic tests made on welded joints. FRAMATOME have started a study to assess the dispersion of results afforded by the test line and to characterize its behaviour. The tests covered sensors and ultrasonic generators said to be identical to each other (same commercial batch) [fr
A Study of the Use of the "e-rater"® Scoring Engine for the Analytical Writing Measure of the "GRE"® revised General Test. Research Report. ETS RR-14-24

Science.gov (United States)

Breyer, F. Jay; Attali, Yigal; Williamson, David M.; Ridolfi-McCulla, Laura; Ramineni, Chaitanya; Duchnowski, Matthew; Harris, April

2014-01-01

In this research, we investigated the feasibility of implementing the "e-rater"® scoring engine as a check score in place of all-human scoring for the "Graduate Record Examinations"® ("GRE"®) revised General Test (rGRE) Analytical Writing measure. This report provides the scientific basis for the use of e-rater as a…
Effects of handcuffs on neuropsychological testing: Implications for criminal forensic evaluations.

Science.gov (United States)

Biddle, Christine M; Fazio, Rachel L; Dyshniku, Fiona; Denney, Robert L

2018-01-01

Neuropsychological evaluations are increasingly performed in forensic contexts, including in criminal settings where security sometimes cannot be compromised to facilitate evaluation according to standardized procedures. Interpretation of nonstandardized assessment results poses significant challenges for the neuropsychologist. Research is limited in regard to the validation of neuropsychological test accommodation and modification practices that deviate from standard test administration; there is no published research regarding the effects of hand restraints upon neuropsychological evaluation results. This study provides preliminary results regarding the impact of restraints on motor functioning and common neuropsychological tests with a motor component. When restrained, performance on nearly all tests utilized was significantly impacted, including Trail Making Test A/B, a coding test, and several tests of motor functioning. Significant performance decline was observed in both raw scores and normative scores. Regression models are also provided in order to help forensic neuropsychologists adjust for the effect of hand restraints on raw scores of these tests, as the hand restraints also resulted in significant differences in normative scores; in the most striking case there was nearly a full standard deviation of discrepancy.
The Reliability and Validity of Weighted Composite Scores.

Science.gov (United States)

Kane, Michael; Case, Susan

The scores on two distinct tests (e.g., essay and objective) are often combined into a composite score, which is used to make decisions. The validity of the observed composite can sometimes be evaluated relative to a separate criterion. In cases where no criterion is available, the observed composite has generally been evaluated in terms of its…
Effect of Antihypertensive Therapy on SCORE-Estimated Total Cardiovascular Risk: Results from an Open-Label, Multinational Investigation—The POWER Survey

Directory of Open Access Journals (Sweden)

Guy De Backer

2013-01-01

Background. High blood pressure is a substantial risk factor for cardiovascular disease. Design & Methods. The Physicians' Observational Work on patient Education according to their vascular Risk (POWER) survey was an open-label investigation of eprosartan-based therapy (EBT) for control of high blood pressure in primary care centers in 16 countries. A prespecified element of this research was appraisal of the impact of EBT on estimated 10-year risk of a fatal cardiovascular event as determined by the Systematic Coronary Risk Evaluation (SCORE) model. Results. SCORE estimates of CVD risk were obtained at baseline from 12,718 patients in 15 countries (6504 men) and from 9577 patients at 6 months. During EBT mean (±SD) systolic/diastolic blood pressures declined from 160.2 ± 13.7/94.1 ± 9.1 mmHg to 134.5 ± 11.2/81.4 ± 7.4 mmHg. This was accompanied by a 38% reduction in mean SCORE-estimated CVD risk and an improvement in SCORE risk classification of one category or more in 3506 patients (36.6%). Conclusion. Experience in POWER affirms that (a) effective pharmacological control of blood pressure is feasible in the primary care setting and is accompanied by a reduction in total CVD risk and (b) the SCORE instrument is effective in this setting for the monitoring of total CVD risk.
Effect of Antihypertensive Therapy on SCORE-Estimated Total Cardiovascular Risk: Results from an Open-Label, Multinational Investigation—The POWER Survey

Science.gov (United States)

De Backer, Guy; Petrella, Robert J.; Goudev, Assen R.; Radaideh, Ghazi Ahmad; Rynkiewicz, Andrzej; Pathak, Atul

2013-01-01

Background. High blood pressure is a substantial risk factor for cardiovascular disease. Design & Methods. The Physicians' Observational Work on patient Education according to their vascular Risk (POWER) survey was an open-label investigation of eprosartan-based therapy (EBT) for control of high blood pressure in primary care centers in 16 countries. A prespecified element of this research was appraisal of the impact of EBT on estimated 10-year risk of a fatal cardiovascular event as determined by the Systematic Coronary Risk Evaluation (SCORE) model. Results. SCORE estimates of CVD risk were obtained at baseline from 12,718 patients in 15 countries (6504 men) and from 9577 patients at 6 months. During EBT mean (±SD) systolic/diastolic blood pressures declined from 160.2 ± 13.7/94.1 ± 9.1 mmHg to 134.5 ± 11.2/81.4 ± 7.4 mmHg. This was accompanied by a 38% reduction in mean SCORE-estimated CVD risk and an improvement in SCORE risk classification of one category or more in 3506 patients (36.6%). Conclusion. Experience in POWER affirms that (a) effective pharmacological control of blood pressure is feasible in the primary care setting and is accompanied by a reduction in total CVD risk and (b) the SCORE instrument is effective in this setting for the monitoring of total CVD risk. PMID:23997946
Multiple Score Comparison: a network meta-analysis approach to comparison and external validation of prognostic scores

Directory of Open Access Journals (Sweden)

Sarah R. Haile

2017-12-01

Background: Prediction models and prognostic scores have been increasingly popular in both clinical practice and clinical research settings, for example to aid in risk-based decision making or control for confounding. In many medical fields, a large number of prognostic scores are available, but practitioners may find it difficult to choose between them due to lack of external validation as well as lack of comparisons between them. Methods: Borrowing methodology from network meta-analysis, we describe an approach to Multiple Score Comparison meta-analysis (MSC) which permits concurrent external validation and comparisons of prognostic scores using individual patient data (IPD) arising from a large-scale international collaboration. We describe the challenges in adapting network meta-analysis to the MSC setting, for instance the need to explicitly include correlations between the scores on a cohort level, and how to deal with many multi-score studies. We propose first using IPD to make cohort-level aggregate discrimination or calibration scores, comparing all to a common comparator. Then, standard network meta-analysis techniques can be applied, taking care to consider correlation structures in cohorts with multiple scores. Transitivity, consistency and heterogeneity are also examined. Results: We provide a clinical application, comparing prognostic scores for 3-year mortality in patients with chronic obstructive pulmonary disease using data from a large-scale collaborative initiative. We focus on the discriminative properties of the prognostic scores. Our results show clear differences in performance, with ADO and eBODE showing higher discrimination with respect to mortality than other considered scores. The assumptions of transitivity and local and global consistency were not violated. Heterogeneity was small. Conclusions: We applied a network meta-analytic methodology to externally validate and concurrently compare the prognostic properties

EuroSCORE II and STS as mortality predictors in patients undergoing TAVI

Directory of Open Access Journals (Sweden)

Vitor Emer Egypto Rosa

2016-02-01

Introduction: the EuroSCORE II and STS are the most used scores for surgical risk stratification and indication of transcatheter aortic valve implantation (TAVI). However, their role as a tool for mortality prediction in patients undergoing TAVI is still unclear. Objective: to evaluate the performance of the EuroSCORE II and STS as predictors of in-hospital and 30-day mortality in patients undergoing TAVI. Methods: we included 59 symptomatic patients with severe aortic stenosis who underwent TAVI between 2010 and 2014. The variables were analyzed using Student's t-test and Fisher's exact test and the discriminative power was evaluated using the receiver operating characteristic (ROC) curve and area under the curve (AUC) with a 95% confidence interval. Results: mean age was 81±7.3 years, 42.3% men. The mean EuroSCORE II was 7.6±7.3% and STS was 20.7±10.3%. Transfemoral procedure was performed in 88.13%, transapical in 3.38% and transaortic in 8.47%. In-hospital mortality was 10.1% and 30-day mortality was 13.5%. Patients who died had higher EuroSCORE II and STS than the survivors (33.7±16.7 vs. 18.6±7.3%, p=0.0001 for STS and 13.9±16.1 vs. 4.8±3.8%, p=0.0007 for EuroSCORE II). The STS showed an AUC of 0.81 and the EuroSCORE II of 0.77 and there were no differences in the discrimination ability using ROC curves (p=0.72). Conclusion: in this cohort, the STS and EuroSCORE II were predictors of in-hospital and 30-day mortality in patients with severe aortic stenosis undergoing TAVI.
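As a rough illustration of what the AUC values in this abstract measure, the sketch below computes the area under the ROC curve as the probability that a randomly chosen patient who died has a higher risk score than a randomly chosen survivor, with ties counted as one half. The scores in the example are invented for illustration and are not data from the study; the function name is hypothetical.

```python
def auc(scores_events, scores_non_events):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    if not scores_events or not scores_non_events:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for e in scores_events:
        for n in scores_non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(scores_events) * len(scores_non_events))


if __name__ == "__main__":
    # Hypothetical predicted risks (%), not taken from the paper.
    died = [35.0, 22.0, 41.0]
    survived = [12.0, 18.0, 25.0, 9.0]
    print(auc(died, survived))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.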
Acknowledging the results of blood tests

DEFF Research Database (Denmark)

Torkilsheyggi, Arnvør Martinsdottir á; Hertzum, Morten

At the studied hospital, physicians from the Medical and Surgical Departments work some of their shifts in the Emergency Department (ED). Though icons showing the blood-test process were introduced on electronic whiteboards in the ED, these icons did not lead to increased attention to test acknow...... acknowledgement. Rather, the physicians transferred work practices from their own departments, which did not have electronic whiteboards, to the ED. This finding suggests a challenge to the cross-disciplinary work and norms for how to follow up on blood-test results in the ED....
Factors contributing to speech perception scores in long-term pediatric cochlear implant users.

Science.gov (United States)

Davidson, Lisa S; Geers, Ann E; Blamey, Peter J; Tobey, Emily A; Brenner, Christine A

2011-02-01

The objectives of this report are to (1) describe the speech perception abilities of long-term pediatric cochlear implant (CI) recipients by comparing scores obtained at elementary school (CI-E, 8 to 9 yrs) with scores obtained at high school (CI-HS, 15 to 18 yrs); (2) evaluate speech perception abilities in demanding listening conditions (i.e., noise and lower intensity levels) at adolescence; and (3) examine the relation of speech perception scores to speech and language development over this longitudinal timeframe. All 112 teenagers were part of a previous nationwide study of 8- and 9-yr-olds (N = 181) who received a CI between 2 and 5 yrs of age. The test battery included (1) the Lexical Neighborhood Test (LNT; hard and easy word lists); (2) the Bamford Kowal Bench sentence test; (3) the Children's Auditory-Visual Enhancement Test; (4) the Test of Auditory Comprehension of Language at CI-E; (5) the Peabody Picture Vocabulary Test at CI-HS; and (6) the McGarr sentences (consonants correct) at CI-E and CI-HS. CI-HS speech perception was measured in both optimal and demanding listening conditions (i.e., background noise and low-intensity level). Speech perception scores were compared based on age at test, lexical difficulty of stimuli, listening environment (optimal and demanding), input mode (visual and auditory-visual), and language age. All group mean scores significantly increased with age across the two test sessions. Scores of adolescents significantly decreased in demanding listening conditions. The effect of lexical difficulty on the LNT scores, as evidenced by the difference in performance between easy versus hard lists, increased with age and decreased for adolescents in challenging listening conditions. Calculated curves for percent correct speech perception scores (LNT and Bamford Kowal Bench) and consonants correct on the McGarr sentences plotted against age-equivalent language scores on the Test of Auditory Comprehension of Language and Peabody
Psychometric properties of a Swedish translation of the VISA-P outcome score for patellar tendinopathy.

Science.gov (United States)

Frohm, Anna; Saartok, Tönu; Edman, Gunnar; Renström, Per

2004-12-18

Self-administered patient outcome scores are increasingly recommended for evaluation of primary outcome in clinical studies. The VISA-P score, developed at the Victorian Institute of Sport Assessment in Melbourne, Australia, is a questionnaire for patients with patellar tendinopathy in which the patients assess severity of symptoms, function and ability to participate in sport. The aim of this study was to translate the questionnaire into Swedish and to study the reliability and validity of the translated questionnaire and resultant scores. The questionnaire was translated into Swedish according to internationally recommended guidelines for cross-cultural adaptation of self-report measures. The reliability and validity were tested in three different populations. The populations used were healthy students (n = 17), members of the Swedish male national basketball team (n = 17), considered as a population at risk, and a group of non-surgically treated patients (n = 17) with clinically diagnosed patellar tendinopathy. The questionnaire was completed by 51 subjects altogether. The translated VISA-P questionnaire showed very good test-retest reliability (ICC = 0.97). The mean (± SD) of the VISA-P score at both the first and second test occasions was highest in the healthy student group, 83 (± 13) and 81 (± 15), respectively. The score of the basketball players was 79 (± 24) and 80 (± 23), while the patient group scored significantly (p < 0.05) lower, 48 (± 20) and 52 (± 19). The translated version of the VISA-P questionnaire was linguistically and culturally equivalent to the original version. The translated score showed good reliability.
Decreasing scoring errors on Wechsler Scale Vocabulary, Comprehension, and Similarities subtests: a preliminary study.

Science.gov (United States)

Linger, Michele L; Ray, Glen E; Zachar, Peter; Underhill, Andrea T; LoBello, Steven G

2007-10-01

Studies of graduate students learning to administer the Wechsler scales have generally shown that training is not associated with the development of scoring proficiency. Many studies report on the reduction of aggregated administration and scoring errors, a strategy that does not highlight the reduction of errors on subtests identified as most prone to error. This study evaluated the development of scoring proficiency specifically on the Wechsler (WISC-IV and WAIS-III) Vocabulary, Comprehension, and Similarities subtests during training by comparing a set of 'early test administrations' to 'later test administrations.' Twelve graduate students enrolled in an intelligence-testing course participated in the study. Scoring errors (e.g., incorrect point assignment) were evaluated on the students' actual practice administration test protocols. Errors on all three subtests declined significantly when scoring errors on 'early' sets of Wechsler scales were compared to those made on 'later' sets. However, correcting these subtest scoring errors did not cause significant changes in subtest scaled scores. Implications for clinical instruction and future research are discussed.
Phase III Simplified Integrated Test (SIT) results - Space Station ECLSS testing

Science.gov (United States)

Roberts, Barry C.; Carrasquillo, Robyn L.; Dubiel, Melissa Y.; Ogle, Kathryn Y.; Perry, Jay L.; Whitley, Ken M.

1990-01-01

During 1989, phase III testing of Space Station Freedom Environmental Control and Life Support Systems (ECLSS) began at Marshall Space Flight Center (MSFC) with the Simplified Integrated Test. This test, conducted at the MSFC Core Module Integration Facility (CMIF), was the first time the four baseline air revitalization subsystems were integrated together. This paper details the results and lessons learned from the phase III SIT. Future plans for testing at the MSFC CMIF are also discussed.
Do efficiency scores depend on input mix?

DEFF Research Database (Denmark)

Asmild, Mette; Hougaard, Jens Leth; Kronborg, Dorte

2013-01-01

In this paper we examine the possibility of using the standard Kruskal-Wallis (KW) rank test in order to evaluate whether the distribution of efficiency scores resulting from Data Envelopment Analysis (DEA) is independent of the input (or output) mix of the observations. Since the DEA frontier...... is estimated, many standard assumptions for evaluating the KW test statistic are violated. Therefore, we propose to explore its statistical properties by the use of simulation studies. The simulations are performed conditional on the observed input mixes. The method, unlike existing approaches...... the assumption of mix independence is rejected the implication is that it is, for example, impossible to determine whether machine-intensive projects are more or less efficient than labor-intensive projects....
'False-positive' and 'false-negative' test results in clinical urine drug testing.

Science.gov (United States)

Reisfield, Gary M; Goldberger, Bruce A; Bertholf, Roger L

2009-08-01

The terms 'false-positive' and 'false-negative' are widely used in discussions of urine drug test (UDT) results. These terms are inadequate because they are used in different ways by physicians and laboratory professionals and they are too narrow to encompass the larger universe of potentially misleading, inappropriate and unexpected drug test results. This larger universe, while not solely comprised of technically 'true' or 'false' positive or negative test results, presents comparable interpretive challenges with corresponding clinical implications. In this review, we propose the terms 'potentially inappropriate' positive or negative test results in reference to UDT results that are ambiguous or unexpected and subject to misinterpretation. Causes of potentially inappropriate positive UDT results include in vivo metabolic conversions of a drug, exposure to nonillicit sources of a drug and laboratory error. Causes of potentially inappropriate negative UDT results include limited assay specificity, absence of drug in the urine, presence of drug in the urine, but below established assay cutoff, specimen manipulation and laboratory error. Clinical UDT interpretation is a complicated task requiring knowledge of recent prescription, over-the-counter and herbal drug administration, drug metabolism and analytical sensitivities and specificities.
Risk stratification in non-ST elevation acute coronary syndromes: Risk scores, biomarkers and clinical judgment

Directory of Open Access Journals (Sweden)

David Corcoran

2015-09-01

Clinical guidelines recommend an early invasive strategy in higher risk NSTE-ACS. The Global Registry of Acute Coronary Events (GRACE) risk score is a validated risk stratification tool which has incremental prognostic value for risk stratification compared with clinical assessment or troponin testing alone. In emergency medicine, there has been a limited adoption of the GRACE score in some countries (e.g. United Kingdom), in part related to a delay in obtaining timely blood biochemistry results. Age makes an exponential contribution to the GRACE score, and on an individual patient basis, the risk of younger patients with a flow-limiting culprit coronary artery lesion may be underestimated. The future incorporation of novel cardiac biomarkers into this diagnostic pathway may allow for earlier treatment stratification. The cost-effectiveness of the new diagnostic pathways based on high-sensitivity troponin and copeptin must also be established. Finally, diagnostic tests and risk scores may optimize patient care but they cannot replace patient-focused good clinical judgment.
Renal dysfunction in liver cirrhosis and its correlation with Child-Pugh score and MELD score

Science.gov (United States)

Siregar, G. A.; Gurning, M.

2018-03-01

Renal dysfunction (RD) is a serious and common complication in patients with liver cirrhosis and carries a poor prognosis. The aim of our study was to evaluate renal function in liver cirrhosis and to determine its correlation with the degree of liver disease as assessed by Child-Pugh Score (CPS) and MELD score. This was a cross-sectional study that included patients with liver cirrhosis admitted to Adam Malik Hospital Medan in June - August 2016. We divided them into two groups as not having renal dysfunction (serum creatinine SPSS 22.0 was used. Statistical methods used: Chi-square, Fisher exact, one way ANOVA, Kruskal Wallis test and Pearson coefficient of correlation. The level of significance was p<0.05. Of 55 patients, 16 (29.1%) presented with renal dysfunction. There was a statistically significant inverse correlation between GFR and CPS (r = -0.308) and between GFR and MELD score (r = -0.278). There was a statistically significant correlation between creatinine and MELD score (r = 0.359) and between creatinine and CPS (r = 0.382). An increase in the degree of liver damage is related to an increase in renal dysfunction.
Test anxiety and academic performance in chiropractic students.

Science.gov (United States)

Zhang, Niu; Henderson, Charles N R

2014-01-01

Objective: We assessed the level of students' test anxiety, and the relationship between test anxiety and academic performance. Methods: We recruited 166 third-quarter students. The Test Anxiety Inventory (TAI) was administered to all participants. Total scores from written examinations and objective structured clinical examinations (OSCEs) were used as response variables. Results: Multiple regression analysis shows that there was a modest, but statistically significant negative correlation between TAI scores and written exam scores, but not OSCE scores. Worry and emotionality were the best predictive models for written exam scores. Mean total anxiety and emotionality scores for females were significantly higher than those for males, but not worry scores. Conclusion: Moderate-to-high test anxiety was observed in 85% of the chiropractic students examined. However, total test anxiety, as measured by the TAI score, was a very weak predictive model for written exam performance. Multiple regression analysis demonstrated that replacing total anxiety (TAI) with worry and emotionality (TAI subscales) produces a much more effective predictive model of written exam performance. Sex, age, highest current academic degree, and ethnicity contributed little additional predictive power in either regression model. Moreover, TAI scores were not found to be statistically significant predictors of physical exam skill performance, as measured by OSCEs.
Evaluation of a prospective scoring system designed for a multicenter breast MR imaging screening study.

Science.gov (United States)

Warren, Ruth M L; Thompson, Deborah; Pointon, Linda J; Hoff, Rebecca; Gilbert, Fiona J; Padhani, Anwar R; Easton, Douglas F; Lakhani, Sunil R; Leach, Martin O

2006-06-01

To evaluate prospectively the accuracy of a lesion classification system designed for use in a magnetic resonance (MR) imaging high-breast-cancer-risk screening study. All participating patients provided written informed consent. Ethics committee approval was obtained. The results of 1541 contrast material-enhanced breast MR imaging examinations were analyzed; 1441 screening examinations were performed in 638 women aged 24-51 years at high risk for breast cancer, and 100 examinations were performed in 100 women aged 23-81 years. Lesion analysis was performed in 991 breasts, which were divided into design (491 breasts) and testing (500 breasts) sets. The reference standard was histologic analysis of biopsy samples, fine-needle aspiration cytology, or minimal follow-up of 24 months. The scoring system involved the use of five features: morphology (MOR), pattern of enhancement (POE), percentage of maximal focal enhancement (PMFE), maximal signal intensity-time ratio (MITR), and pattern of contrast material washout (POCW). The system was evaluated by means of (a) assessment of interreader agreement, as expressed in kappa statistics, for 315 breasts in which both readers analyzed the same lesion, (b) assessment of the diagnostic accuracy of the scored components with receiver operating characteristic curve analysis, and (c) logistic regression analysis to determine which components of the scoring system were critical to the final score. A new simplified scoring system developed with the design set was applied to the testing set. There was moderate reader agreement regarding overall lesion outcome (ie, malignant, suspicious, or benign) (kappa=0.58) and less agreement regarding the scored components. The area under the receiver operating characteristic curve (AUC) for the overall lesion score, 0.88, was higher than the AUC for any one component. The components MOR, POE, and POCW yielded the best overall result. PMFE and MITR did not contribute to diagnostic utility
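Interreader agreement in this abstract is reported as a kappa statistic. A minimal sketch of an unweighted Cohen's kappa for two readers is shown below; whether the study used the unweighted or a weighted form is not stated in the abstract, so the unweighted form is an assumption here, and the category labels and ratings are invented for illustration.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers rating the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Observed proportion of cases on which the two readers agree.
    p_observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance from each reader's marginal frequencies.
    freq_a = Counter(reader_a)
    freq_b = Counter(reader_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both readers used a single identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical lesion outcomes, not data from the study.
    a = ["benign", "benign", "malignant", "malignant"]
    b = ["benign", "malignant", "malignant", "malignant"]
    print(cohen_kappa(a, b))
```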
Conditional standard errors of measurement for composite scores on the Wechsler Preschool and Primary Scale of Intelligence-Third Edition.

Science.gov (United States)

Price, Larry R; Raju, Nambury; Lurie, Anna; Wilkins, Charles; Zhu, Jianjun

2006-02-01

A specific recommendation of the 1999 Standards for Educational and Psychological Testing by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education is that test publishers report estimates of the conditional standard error of measurement (SEM). Procedures for calculating the conditional (score-level) SEM based on raw scores are well documented; however, few procedures have been developed for estimating the conditional SEM of subtest or composite scale scores resulting from a nonlinear transformation. Item response theory provided the psychometric foundation to derive the conditional standard errors of measurement and confidence intervals for composite scores on the Wechsler Preschool and Primary Scale of Intelligence-Third Edition.
Marital status and optimism score among breast cancer survivors.

Science.gov (United States)

Croft, Lindsay; Sorkin, John; Gallicchio, Lisa

2014-11-01

There is an increasing number of breast cancer survivors, but their psychosocial and supportive care needs are not well-understood. Recent work has found marital status, social support, and optimism to be associated with quality of life, but little research has been conducted to understand how these factors relate to one another. Survey data from 722 breast cancer survivors were analyzed to estimate the association between marital status and optimism score, as measured using the Life Orientation Test-Revised. Linear regression was used to estimate the relationship of marital status and optimism, controlling for potential confounding variables and assessing effect modification. The results showed that the association between marital status and optimism was modified by time since breast cancer diagnosis. Specifically, in those most recently diagnosed (within 5 years), married breast cancer survivors had a 1.50 higher mean optimism score than unmarried survivors (95% confidence interval (CI) 0.37, 2.62; p = 0.009). The difference in optimism score by marital status was not present more than 5 years from breast cancer diagnosis. Findings suggest that among breast cancer survivors within 5 years since diagnosis, those who are married have higher optimism scores than their unmarried counterparts; this association was not observed among longer-term breast cancer survivors. Future research should examine whether the difference in optimism score among this subgroup of breast cancer survivors is clinically relevant.
Method of administration of PROMIS scales did not significantly impact score level, reliability, or validity

DEFF Research Database (Denmark)

Bjorner, Jakob B; Rose, Matthias; Gandek, Barbara

2014-01-01

OBJECTIVES: To test the impact of the method of administration (MOA) on score level, reliability, and validity of scales developed in the Patient Reported Outcomes Measurement Information System (PROMIS). STUDY DESIGN AND SETTING: Two nonoverlapping parallel forms each containing eight items from......, no significant mode differences were found and all confidence intervals were within the prespecified minimal important difference of 0.2 standard deviation. Parallel-forms reliabilities were very high (ICC = 0.85-0.93). Only one across-mode ICC was significantly lower than the same-mode ICC. Tests of validity...... questionnaire (PQ), personal digital assistant (PDA), or personal computer (PC) and a second form by PC, in the same administration. Method equivalence was evaluated through analyses of difference scores, intraclass correlations (ICCs), and convergent/discriminant validity. RESULTS: In difference score analyses...
Citizen Science: The Small World Initiative Improved Lecture Grades and California Critical Thinking Skills Test Scores of Nonscience Major Students at Florida Atlantic University.

Science.gov (United States)

Caruso, Joseph P; Israel, Natalie; Rowland, Kimberly; Lovelace, Matthew J; Saunders, Mary Jane

2016-03-01

Course-based undergraduate research is known to improve science, technology, engineering, and mathematics student achievement. We tested "The Small World Initiative, a Citizen-Science Project to Crowdsource Novel Antibiotic Discovery" to see if it also improved student performance and the critical thinking of non-science majors in Introductory Biology at Florida Atlantic University (a large, public, minority-dominant institution) in academic year 2014-15. California Critical Thinking Skills Test pre- and posttests were offered to both Small World Initiative (SWI) and control lab students for formative amounts of extra credit. SWI lab students earned significantly higher lecture grades than control lab students, had significantly fewer lecture grades of D+ or lower, and had significantly higher critical thinking posttest total scores than control students. Lastly, more SWI students were engaged while taking critical thinking tests. These results support the hypothesis that utilizing independent course-based undergraduate science research improves student achievement even in nonscience students.
Ripasa score: a new diagnostic score for diagnosis of acute appendicitis

International Nuclear Information System (INIS)

Butt, M.Q.

2014-01-01

Objective: To determine the usefulness of RIPASA score for the diagnosis of acute appendicitis using histopathology as a gold standard. Study Design: Cross-sectional study. Place and Duration of Study: Department of General Surgery, Combined Military Hospital, Kohat, from September 2011 to March 2012. Methodology: A total of 267 patients were included in this study. RIPASA score was assessed. The diagnosis of appendicitis was made clinically, aided by routine sonography of the abdomen. After appendicectomies, resected appendices were sent for histopathological examination. The 15 parameters and the scores generated were age (less than 40 years = 1 point; greater than 40 years = 0.5 point), gender (male = 1 point; female = 0.5 point), Right Iliac Fossa (RIF) pain (0.5 point), migration of pain to RIF (0.5 point), nausea and vomiting (1 point), anorexia (1 point), duration of symptoms (less than 48 hours = 1 point; more than 48 hours = 0.5 point), RIF tenderness (1 point), guarding (2 points), rebound tenderness (1 point), Rovsing's sign (2 points), fever (1 point), raised white cell count (1 point), negative urinalysis (1 point) and foreign national registration identity card (1 point). The optimal cut-off threshold score from the ROC was 7.5. Sensitivity analysis was done. Results: Out of 267 patients, 156 (58.4%) were male while the remaining 111 patients (41.6%) were female, with a mean age of 23.5 ± 9.1 years. Sensitivity of RIPASA score was 96.7%, specificity 93.0%, diagnostic accuracy was 95.1%, positive predictive value was 94.8% and negative predictive value was 95.54%. Conclusion: RIPASA score at a cut-off total score of 7.5 was a useful tool to diagnose appendicitis in equivocal cases of pain. (author)
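The 15 parameters listed in the abstract can be summed mechanically. The following is a minimal sketch using only the point values stated above; the `Patient` class, its field names, and `ripasa_score` are hypothetical. The abstract does not say how to treat an age of exactly 40 years, a duration of exactly 48 hours, or a score exactly equal to the cut-off, so the handling of those boundaries here is an assumption. It is an illustration, not a clinical tool.

```python
from dataclasses import dataclass

CUTOFF = 7.5  # optimal ROC cut-off reported in the abstract

@dataclass
class Patient:
    # Hypothetical record; field names are illustrative only.
    age_years: float
    male: bool
    rif_pain: bool
    pain_migration_to_rif: bool
    nausea_vomiting: bool
    anorexia: bool
    symptom_hours: float
    rif_tenderness: bool
    guarding: bool
    rebound_tenderness: bool
    rovsing_sign: bool
    fever: bool
    raised_white_cell_count: bool
    negative_urinalysis: bool
    foreign_nric: bool

def ripasa_score(p: Patient) -> float:
    """Sum the 15 RIPASA parameters with the points given in the abstract."""
    score = 1.0 if p.age_years < 40 else 0.5       # assumption: exactly 40 scores 0.5
    score += 1.0 if p.male else 0.5
    score += 1.0 if p.symptom_hours < 48 else 0.5  # assumption: exactly 48 h scores 0.5
    # Present/absent findings and their points.
    findings = [
        (p.rif_pain, 0.5),
        (p.pain_migration_to_rif, 0.5),
        (p.nausea_vomiting, 1.0),
        (p.anorexia, 1.0),
        (p.rif_tenderness, 1.0),
        (p.guarding, 2.0),
        (p.rebound_tenderness, 1.0),
        (p.rovsing_sign, 2.0),
        (p.fever, 1.0),
        (p.raised_white_cell_count, 1.0),
        (p.negative_urinalysis, 1.0),
        (p.foreign_nric, 1.0),
    ]
    score += sum(points for present, points in findings if present)
    return score

def above_cutoff(p: Patient) -> bool:
    # Assumption: a score equal to the cut-off counts as positive.
    return ripasa_score(p) >= CUTOFF

# Made-up example patient, not taken from the study.
example = Patient(25, True, True, True, True, False, 24,
                  True, False, True, False, True, True, True, False)
print(ripasa_score(example), above_cutoff(example))
```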
Assessment of calcium scoring performance in cardiac computed tomography

International Nuclear Information System (INIS)

Ulzheimer, Stefan; Kalender, Willi A.

2003-01-01

Electron beam tomography (EBT) has been used for cardiac diagnosis and the quantitative assessment of coronary calcium since the late 1980s. The introduction of mechanical multi-slice spiral CT (MSCT) scanners with shorter rotation times opened new possibilities of cardiac imaging with conventional CT scanners. The purpose of this work was to qualitatively and quantitatively evaluate the performance of EBT and MSCT for the task of coronary artery calcium imaging as a function of acquisition protocol, heart rate, spiral reconstruction algorithm (where applicable) and calcium scoring method. A cardiac CT semi-anthropomorphic phantom was designed and manufactured for the investigation of all relevant image quality parameters in cardiac CT. This phantom includes various test objects, some of which can be moved within the anthropomorphic phantom in a manner that mimics realistic heart motion. These tools were used to qualitatively and quantitatively demonstrate the accuracy of coronary calcium imaging using typical protocols for an electron beam (Evolution C-150XP, Imatron, South San Francisco, Calif.) and a 0.5-s four-slice spiral CT scanner (Sensation 4, Siemens, Erlangen, Germany). A special focus was put on the method of quantifying coronary calcium, and three scoring systems were evaluated (Agatston, volume, and mass scoring). Good reproducibility in coronary calcium scoring is always the result of a combination of high temporal and spatial resolution; consequently, thin-slice protocols in combination with retrospective gating on MSCT scanners yielded the best results. The Agatston score was found to be the least reproducible scoring method. The hydroxyapatite mass, being more reproducible and comparable across different scanners and being a physical quantitative measure, appears to be the method of choice for future clinical studies. The hydroxyapatite mass is highly correlated with the Agatston score. The introduced phantoms can be used to quantitatively assess the
Forecasting the value of credit scoring

Science.gov (United States)

Saad, Shakila; Ahmad, Noryati; Jaffar, Maheran Mohd

2017-08-01

Credit scoring systems play an important role in the banking sector. Credit scoring assesses the creditworthiness of customers requesting credit from banks or other financial institutions, and it is usually applied when customers submit an application for credit facilities. Based on the resulting score, a bank can separate "good" clients from "bad" clients. However, in most cases the score is valid only at that specific time and cannot be used to forecast the creditworthiness of the same applicant later. Hence, the bank will not know whether "good" clients will remain good or whether "bad" clients may become "good" after a certain time. To fill this gap, this study proposes an equation to forecast the credit score of potential borrowers at a certain time by using the historical score related to the assumption. The Mean Absolute Percentage Error (MAPE) is used to measure the accuracy of the forecast scores. The results show that the forecast scores are highly accurate compared with the actual credit scores.
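For readers unfamiliar with the accuracy measure named in the abstract, here is a minimal sketch of the standard MAPE formula, the mean of |actual - forecast| / |actual| expressed as a percentage. The function name `mape` and the sample scores are hypothetical; the abstract gives neither the forecasting equation nor any data, so nothing below reproduces the study's results.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Standard definition: 100 * mean(|a - f| / |a|). Actual values must be
    non-zero, because each error is scaled by its actual value.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Made-up credit scores for illustration only.
actual_scores = [100.0, 200.0]
forecast_scores = [110.0, 180.0]
print(f"MAPE = {mape(actual_scores, forecast_scores):.1f}%")
```

A lower MAPE means the forecasts track the actual scores more closely; the measure is scale-free, which makes it convenient for comparing forecasts across applicants with very different score levels.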
Explaining Discrepancies Between the Digit Triplet Speech-in-Noise Test Score and Self-Reported Hearing Problems in Older Adults.

Science.gov (United States)

Pronk, Marieke; Deeg, Dorly J H; Kramer, Sophia E

2018-04-17

The purpose of this study is to determine which demographic, health-related, mood, personality, or social factors predict discrepancies between older adults' functional speech-in-noise test result and their self-reported hearing problems. Data of 1,061 respondents from the Longitudinal Aging Study Amsterdam were used (ages ranged from 57 to 95 years). Functional hearing problems were measured using a digit triplet speech-in-noise test. Five questions were used to assess self-reported hearing problems. Scores of both hearing measures were dichotomized. Two discrepancy outcomes were created: (a) being unaware: those with functional but without self-reported problems (reference is aware: those with functional and self-reported problems); (b) reporting false complaints: those without functional but with self-reported problems (reference is well: those without functional and self-reported hearing problems). Two multivariable prediction models (logistic regression) were built with 19 candidate predictors. The speech reception threshold in noise was kept (forced) as a predictor in both models. Persons with higher self-efficacy (to initiate behavior) and higher self-esteem had higher odds of being unaware than persons with lower scores on these measures (odds ratio [OR] = 1.13 and 1.11, respectively). Women had higher odds than men (OR = 1.47). Persons with more chronic diseases and persons with worse (i.e., higher) speech reception thresholds in noise had lower odds of being unaware (OR = 0.85 and 0.91, respectively) than persons with fewer diseases and better thresholds, respectively. Higher odds of reporting false complaints were predicted by more depressive symptoms (OR = 1.06), more chronic diseases (OR = 1.21), and a larger social network (OR = 1.02). Persons with higher self-efficacy (to complete behavior) had lower odds (OR = 0.86), whereas persons with higher self-esteem had higher odds of reporting false complaints (OR = 1.21). The explained variance
