test score interpretation: Topics by WorldWideScience.org

Sample records for test score interpretation

Validating the Interpretations and Uses of Test Scores

Science.gov (United States)

Kane, Michael T.

2013-01-01

To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
Validating Score Interpretations and Uses: Messick Lecture, Language Testing Research Colloquium, Cambridge, April 2010

Science.gov (United States)

Kane, Michael

2012-01-01

The argument-based approach to validation involves two steps: specification of the proposed interpretations and uses of the test scores as an interpretive argument, and the evaluation of the plausibility of the proposed interpretive argument. More ambitious interpretations and uses tend to involve an extended network of inferences and assumptions…
What Do Test Scores Really Mean? A Latent Class Analysis of Danish Test Score Performance

DEFF Research Database (Denmark)

Munk, Martin D.; McIntosh, James

2014-01-01

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55, tested in 1968, and followed until 2011. The procedure takes account of unobservable effects as well as excessive zeros in the data. We show that the test scores...... of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture and possible incentive problems make it more difficult to understand what the tests measure....
Facilitating the Interpretation of English Language Proficiency Scores: Combining Scale Anchoring and Test Score Mapping Methodologies

Science.gov (United States)

Powers, Donald; Schedl, Mary; Papageorgiou, Spiros

2017-01-01

The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…
Summary of Score Changes (in other Tests).

Science.gov (United States)

Cleary, T. Anne; McCandless, Sam A.

Scholastic Aptitude Test (SAT) scores have declined during the last 14 years. Similar score declines have been observed in many different testing programs, many groups, and tested areas. The declines, while not large in any given year, have been consistent over time, area, and group. The period around 1965 is critical for the interpretation of…
Interpreting force concept inventory scores: Normalized gain and SAT scores

Directory of Open Access Journals (Sweden)

Jeffrey J. Steinert; Vincent P. Coletta

2007-05-01

Preinstruction SAT scores and normalized gains (G) on the force concept inventory (FCI) were examined for individual students in interactive engagement (IE) courses in introductory mechanics at one high school (N=335) and one university (N=292), and strong, positive correlations were found for both populations (r=0.57 and r=0.46, respectively). These correlations are likely due to the importance of cognitive skills and abstract reasoning in learning physics. The larger correlation coefficient for the high school population may be a result of the much shorter time interval between taking the SAT and studying mechanics, because the SAT may provide a more current measure of abilities when high school students begin the study of mechanics than it does for college students, who begin mechanics years after the test is taken. In prior research a strong correlation between FCI G and scores on Lawson’s Classroom Test of Scientific Reasoning for students from the same two schools was observed. Our results suggest that, when interpreting class average normalized FCI gains and comparing different classes, it is important to take into account the variation of students’ cognitive skills, as measured either by the SAT or by Lawson’s test. While Lawson’s test is not commonly given to students in most introductory mechanics courses, SAT scores provide a readily available alternative means of taking account of students’ reasoning abilities. Knowing the students’ cognitive level before instruction also allows one to alter instruction or to use an intervention designed to improve students’ cognitive level.
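As a rough illustration of the two quantities this abstract relies on, the sketch below computes a normalized gain and a Pearson correlation in Python. The abstract does not define G; the formula used here is the commonly cited one (gain divided by the maximum possible gain) and is an assumption, as are the function names and the data, which are made up and are not taken from the study.

```python
from math import sqrt


def normalized_gain(pre_pct, post_pct):
    """Normalized gain: achieved improvement divided by the possible improvement.

    Assumes scores are percentages in [0, 100); this definition is an
    assumption, since the abstract does not spell it out.
    """
    if pre_pct >= 100:
        raise ValueError("pre-test score must be below 100%")
    return (post_pct - pre_pct) / (100.0 - pre_pct)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical students (illustrative values only, not study data).
sat = [900, 1000, 1100, 1200, 1300]
pre = [20.0, 30.0, 25.0, 40.0, 35.0]    # FCI pre-test, percent correct
post = [44.0, 58.0, 62.5, 82.0, 87.0]   # FCI post-test, percent correct

gains = [normalized_gain(a, b) for a, b in zip(pre, post)]
r = pearson_r(sat, gains)
print(gains, round(r, 2))
```

Dividing by the possible improvement is what lets students with different starting points be compared on one scale, which is why the abstract correlates G, and not the raw post-test score, with SAT.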
Interval Coded Scoring: a toolbox for interpretable scoring systems

Directory of Open Access Journals (Sweden)

Lieven Billiet

2018-04-01

Over the last decades, clinical decision support systems have been gaining importance. They help clinicians to make effective use of the overload of available information to obtain correct diagnoses and appropriate treatments. However, their power often comes at the cost of a black box model which cannot be interpreted easily. This interpretability is of paramount importance in a medical setting with regard to trust and (legal) responsibility. In contrast, existing medical scoring systems are easy to understand and use, but they are often a simplified rule-of-thumb summary of previous medical experience rather than a well-founded system based on available data. Interval Coded Scoring (ICS) connects these two approaches, exploiting the power of sparse optimization to derive scoring systems from training data. The presented toolbox interface makes this theory easily applicable to both small and large datasets. It contains two possible problem formulations based on linear programming or elastic net. Both allow the construction of a model for a binary classification problem and establish risk profiles that can be used for future diagnosis. All of this requires only a few lines of code. ICS differs from standard machine learning through its model consisting of interpretable main effects and interactions. Furthermore, insertion of expert knowledge is possible because the training can be semi-automatic. This allows end users to make a trade-off between complexity and performance based on cross-validation results and expert knowledge. Additionally, the toolbox offers an accessible way to assess classification performance via accuracy and the ROC curve, whereas the calibration of the risk profile can be evaluated via a calibration curve. Finally, the colour-coded model visualization has particular appeal if one wants to apply ICS manually on new observations, as well as for validation by experts in the specific application domains. The validity and applicability
A Human Capital Model of Educational Test Scores

DEFF Research Database (Denmark)

McIntosh, James; D. Munk, Martin

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelated...... with observable parental attributes and, thus, are environmental rather than genetic in origin. We show that the test scores measure manifest or measured ability as it has evolved over the life of the respondent and is, thus, more a product of the human capital formation process than some latent or fundamental...... measure of pure cognitive ability. We find that variables which are not closely associated with traditional notions of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture, attitudes...
Development of a valid and reliable test to assess trauma radiograph interpretation performance

International Nuclear Information System (INIS)

Neep, M.J.; Steffens, T.; Riley, V.; Eastgate, P.; McPhail, S.M.

2017-01-01

Objectives: The purpose of this investigation was to develop and examine the preliminary validity and reliability among radiographers of a test to assess trauma radiograph interpretation performance suitable for use among health professionals. Methods: Stage 1 examined 14,159 consecutive appendicular and axial examinations from a hospital emergency department over a 12 month period to quantify a typical anatomical region case-mix of trauma radiographs. A sample of radiographic cases representative of affected anatomical regions was then developed into the Image Interpretation Test (IIT). Stage 2 involved prospective investigations of the IIT's reliability (inter-rater, intra-rater, internal consistency) and validity (concurrent) among 41 radiographers. Results: The IIT included 60 cases. The median (interquartile range) clinical experience of participants was 5 (2–10) years. Case scores were internally consistent (Cronbach's alpha = 0.90). Favourable inter-rater reliability (kappa > 0.70 for 58/60 cases, Intra-class correlation coefficient (ICC) > 0.99 for total score) and intra-rater reliability (kappa > 0.90 for 60/60 cases, ICC > 0.99 for total score) was observed. There was a positive association between radiographers' confidence in image interpretation and IIT score (coefficient = 1.52, r-squared = 0.60, p < 0.001). Conclusions: The IIT developed during this investigation included a selection of radiographic cases consistent with anatomical regions represented in an adult trauma case-mix. This study has also provided foundational preliminary evidence to support the reliability and validity of the IIT among radiographers. The findings suggest that it is possible to assess image interpretation performance of adult trauma radiographs with this test. - Highlights: • Development of an Image Interpretation Test (IIT). • Cases consistent with anatomical regions represented in a typical adult trauma case-mix. • Development of a
Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

Science.gov (United States)

Kolen, Michael J.; Lee, Won-Chan

2011-01-01

This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
A scored human protein-protein interaction network to catalyze genomic interpretation

DEFF Research Database (Denmark)

Li, Taibo; Wernersson, Rasmus; Hansen, Rasmus B

2017-01-01

Genome-scale human protein-protein interaction networks are critical to understanding cell biology and interpreting genomic data, but challenging to produce experimentally. Through data integration and quality control, we provide a scored human protein-protein interaction network (InWeb_InBioMap,......
Interpreting Low Personality Psychopathology--Five Aggressiveness Scores on the MMPI-2: Graphical, Robust, and Resistant Data Analysis

Science.gov (United States)

Weisenburger, Susan M.; Harkness, Allan R.; McNulty, John L.; Graham, John R.; Ben-Porath, Yossef S.

2008-01-01

The Minnesota Multiphasic Personality Inventory-2 (MMPI-2)-based Personality Psychopathology-Five (PSY-5) scales provide an overview of personality individual differences. Several textbooks and a test report offer instruction on interpreting MMPI-2 PSY-5 scores. On the basis of an earlier item response theory article (S. V. Rouse, M. S. Finger,…
Volumetric CT-images improve testing of radiological image interpretation skills

Energy Technology Data Exchange (ETDEWEB)

Ravesloot, Cécile J., E-mail: C.J.Ravesloot@umcutrecht.nl [Radiology Department at University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, Room E01.132 (Netherlands); Schaaf, Marieke F. van der, E-mail: M.F.vanderSchaaf@uu.nl [Department of Pedagogical and Educational Sciences at Utrecht University, Heidelberglaan 1, 3584 CS Utrecht (Netherlands); Schaik, Jan P.J. van, E-mail: J.P.J.vanSchaik@umcutrecht.nl [Radiology Department at University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, Room E01.132 (Netherlands); Cate, Olle Th.J. ten, E-mail: T.J.tenCate@umcutrecht.nl [Center for Research and Development of Education at University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht (Netherlands); Gijp, Anouk van der, E-mail: A.vanderGijp-2@umcutrecht.nl [Radiology Department at University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, Room E01.132 (Netherlands); Mol, Christian P., E-mail: C.Mol@umcutrecht.nl [Image Sciences Institute at University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht (Netherlands); Vincken, Koen L., E-mail: K.Vincken@umcutrecht.nl [Image Sciences Institute at University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht (Netherlands)

2015-05-15

Rationale and objectives: Current radiology practice increasingly involves interpretation of volumetric data sets. In contrast, most radiology tests still contain only 2D images. We introduced a new testing tool that allows for stack viewing of volumetric images in our undergraduate radiology program. We hypothesized that tests with volumetric CT-images enhance test quality, in comparison with traditional completely 2D image-based tests, because they might better reflect required skills for clinical practice. Materials and methods: Two groups of medical students (n = 139; n = 143), trained with 2D and volumetric CT-images, took a digital radiology test in two versions (A and B), each containing both 2D and volumetric CT-image questions. In a questionnaire, they were asked to comment on the representativeness for clinical practice, difficulty and user-friendliness of the test questions and testing program. Students’ test scores and reliabilities, measured with Cronbach's alpha, of 2D and volumetric CT-image tests were compared. Results: Estimated reliabilities (Cronbach's alphas) were higher for volumetric CT-image scores (version A: .51 and version B: .54) than for 2D CT-image scores (version A: .24 and version B: .37). Participants found volumetric CT-image tests more representative of clinical practice, and considered them to be less difficult than 2D CT-image questions. However, in one version (A), volumetric CT-image scores (M 80.9, SD 14.8) were significantly lower than 2D CT-image scores (M 88.4, SD 10.4) (p < .001). The volumetric CT-image testing program was considered user-friendly. Conclusion: This study shows that volumetric image questions can be successfully integrated in students’ radiology testing. Results suggest that the inclusion of volumetric CT-images might improve the quality of radiology tests by positively impacting perceived representativeness for clinical practice and increasing reliability of the test.
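The reliabilities compared in this abstract are Cronbach's alphas. A minimal Python sketch of the standard formula follows, assuming a complete examinee-by-item score matrix; the function name and the data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of examinee rows, each holding k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across examinees.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each examinee's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 4 examinees, 3 items (illustrative values only).
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is one way to read the higher values reported for the volumetric questions.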
Do Test Scores Buy Happiness?

Science.gov (United States)

McCluskey, Neal

2017-01-01

Since at least the enactment of No Child Left Behind in 2002, standardized test scores have served as the primary measures of public school effectiveness. Yet, such scores fail to measure the ultimate goal of education: maximizing happiness. This exploratory analysis assesses nation level associations between test scores and happiness, controlling…
Interpreting Quality of Life after Brain Injury Scores: Cross-Walk with the Short Form-36.

Science.gov (United States)

Wilson, Lindsay; Marsden-Loftus, Isaac; Koskinen, Sanna; Bakx, Wilbert; Bullinger, Monika; Formisano, Rita; Maas, Andrew; Neugebauer, Edmund; Powell, Jane; Sarajuuri, Jaana; Sasse, Nadine; von Steinbuechel, Nicole; von Wild, Klaus; Truelle, Jean-Luc

2017-01-01

The Quality of Life after Brain Injury (QOLIBRI) instruments are traumatic brain injury (TBI)-specific assessments of health-related quality of life (HRQoL), with established validity and reliability. The purpose of the study is to help improve the interpretability of the two QOLIBRI summary scores (the QOLIBRI Total score and the QOLIBRI Overall Scale [OS] score). An analysis was conducted of 761 patients with TBI who took part in the QOLIBRI validation studies. A cross-walk between QOLIBRI scores and the SF-36 Mental Component Summary norm-based scoring system was performed using geometric mean regression analysis. The exercise supports a previous suggestion that QOLIBRI Total scores [...] GOSE), as a measure of global function, are presented in the form of means and standard deviations that allow comparison with other studies, and data on age and sex are presented for the QOLIBRI-OS. While bearing in mind the potential imprecision of the comparison, the findings provide a framework for evaluating QOLIBRI summary scores in relation to generic HRQoL that improves their interpretability.
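The cross-walk described here rests on geometric mean regression. The Python sketch below shows the textbook form of that technique, in which the slope is the ratio of the two standard deviations carrying the sign of the covariance. It is offered as an assumption about the general method, not as the authors' exact procedure, and the names and paired scores are hypothetical.

```python
from statistics import mean, stdev


def geometric_mean_regression(xs, ys):
    """Return (slope, intercept) of the geometric mean regression line.

    The slope is sign(cov) * sd(y) / sd(x), i.e. the geometric mean of the
    y-on-x slope and the reciprocal of the x-on-y slope, so neither variable
    is treated as error-free.
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("variables are uncorrelated; slope is undefined")
    sign = 1.0 if sxy > 0 else -1.0
    slope = sign * stdev(ys) / stdev(xs)
    intercept = my - slope * mx
    return slope, intercept


def crosswalk(score_a, slope, intercept):
    """Map a score on instrument A to the equivalent score on instrument B."""
    return slope * score_a + intercept


# Hypothetical paired scores on two instruments (illustrative values only).
instrument_a = [40.0, 55.0, 60.0, 75.0, 90.0]
instrument_b = [30.0, 38.0, 45.0, 50.0, 62.0]
slope, intercept = geometric_mean_regression(instrument_a, instrument_b)
print(crosswalk(60.0, slope, intercept))
```

Because the fitted line is symmetric in the two variables, the same line can be read in either direction, which suits a cross-walk between two scales.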
Predicting occupational personality test scores.

Science.gov (United States)

Furnham, A; Drakeley, R

2000-01-01

The relationship between students' actual test scores and their self-estimated scores on the Hogan Personality Inventory (HPI; R. Hogan & J. Hogan, 1992), an omnibus personality questionnaire, was examined. Despite being given descriptive statistics and explanations of each of the dimensions measured, the students tended to overestimate their scores; yet all correlations between actual and estimated scores were positive and significant. Correlations between self-estimates and actual test scores were highest for sociability, ambition, and adjustment (r = .62 to r = .67). The results are discussed in terms of employers' use and abuse of personality assessment for job recruitment.
Myth of the Master Detective: Reliability of Interpretations for Kaufman's "Intelligent Testing" Approach to the WISC-III.

Science.gov (United States)

Macmann, Gregg M.; Barnett, David W.

1997-01-01

Used computer simulation to examine the reliability of interpretations for Kaufman's "intelligent testing" approach to the Wechsler Intelligence Scale for Children (3rd ed.) (WISC-III). Findings indicate that factor index-score differences and other measures could not be interpreted with confidence. Argues that limitations of IQ testing…
Exploring a Source of Uneven Score Equity across the Test Score Range

Science.gov (United States)

Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D.

2018-01-01

Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
An Online Synchronous Test for Professional Interpreters

Science.gov (United States)

Chen, Nian-Shing; Ko, Leong

2010-01-01

This article is based on an experiment designed to conduct an interpreting test for multiple candidates online, using web-based synchronous cyber classrooms. The test model was based on the accreditation test for Professional Interpreters produced by the National Accreditation Authority of Translators and Interpreters (NAATI) in Australia.…

Infusing Counseling Skills in Test Interpretation.

Science.gov (United States)

Rawlins, Melanie E.; And Others

1991-01-01

Presents an instructional model based on Neurolinguistic Programming that links counseling student course work in measurement and test interpretation with counseling techniques and theory. A process incorporating Neurolinguistic Programming patterns is outlined for teaching graduate students the counseling skills helpful in test interpretation.…
Comparing the use and interpretation of PGMI scoring to assess the technical quality of screening mammograms in the UK and Norway

International Nuclear Information System (INIS)

Boyce, M.; Gullien, R.; Parashar, D.; Taylor, K.

2015-01-01

Objectives: To compare PGMI systems used in the UK and Norway, determine levels of agreement in its interpretation for radiographers within and between centres, informing further research towards developing a more quantitative, uniform system. Methods: Mammograms from 112 women consecutively screened in the UK and Norway were anonymised, numbered and enriched to include all four PGMI categories. Cases were scored by five mammographers from each centre using local PGMI. Sets were exchanged and the process repeated. Distribution of categories was recorded and faults documented for images scored less than perfect. These were compared within and between centres and agreement analysed using non-weighted kappa statistic. Results: Norway uses 38 assessment criteria, the UK uses 15. Best agreement was between Norway raters scoring MLO views from both UK (RMLO k = 0.57, LMLO k = 0.490) and Norway (RMLO k = 0.48, LMLO k = 0.470). Least agreement was between UK raters scoring CC views from both UK (RCC k = 0.007, LCC k = 0.01) and Norway (RCC k = −0.04, LCC k = −0.003). There were no other apparent trends in inter-rater assessment. Most frequent faults in both test sets were on MLO views. Two out of three most common faults were the same for UK and Norway raters. Conclusions: Use of PGMI varied between centres in both number and interpretation of criteria employed. We identified the most common mammographic faults highlighting possible training needs. We suggest further work to provide a consensus list of visual criteria with accurate descriptors for each classification category. A validated way of applying them could help to standardise the process. - Highlights: • No previous published work comparing PGMI use between different countries. • Variation in number of assessment criteria used and their interpretation. • Best agreement was Norway scoring MLO views from both centres (moderate). • Least agreement was UK raters scoring CC views from both
Prediction of true test scores from observed item scores and ancillary data.

Science.gov (United States)

Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

2015-05-01

In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
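The simplest classical-test-theory instance of a best linear predictor of a true score uses a single observed score and shrinks it toward the group mean in proportion to the reliability (Kelley's formula). The Python sketch below covers only that single-predictor case, as an assumption-laden illustration; the abstract's method combines several predictors and estimated error covariances, which are not reproduced here. All names and numbers are hypothetical.

```python
def reliability_from_variances(observed_var, error_var):
    """Reliability as the share of observed-score variance that is not error."""
    if observed_var <= 0 or not 0 <= error_var <= observed_var:
        raise ValueError("need 0 <= error_var <= observed_var, observed_var > 0")
    return 1.0 - error_var / observed_var


def kelley_true_score(observed, group_mean, reliability):
    """Best linear predictor of the true score from one observed score.

    The observed score is pulled toward the group mean; the lower the
    reliability, the stronger the pull.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return group_mean + reliability * (observed - group_mean)


# Hypothetical values on a 0-6 writing scale (illustrative only).
rho = reliability_from_variances(observed_var=1.0, error_var=0.2)
estimate = kelley_true_score(observed=4.5, group_mean=3.5, reliability=rho)
print(rho, estimate)
```

With perfect reliability the predictor returns the observed score unchanged, and with zero reliability it returns the group mean, which is the behaviour the measurement-error variances in the abstract are needed to calibrate.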
Utility of proverb interpretation measures with cardiac transplant candidates.

Science.gov (United States)

Dugbartey, A T

1998-12-01

To assess metaphorical understanding and proverb interpretation in cardiac transplant candidates, the neuropsychological assessment records of 22 adults with end-stage cardiac disease under consideration for transplantation were analyzed. Neuropsychological tests consisted of the Controlled Oral Word Association Test, Halstead Category Test, Rey-Osterrieth Complex Figure Test (Copy), Trail Making Test, and summed scores for the proverb items of the WAIS-R Comprehension subtest. Analysis showed that the group tended to interpret proverbs literally. Proverb scores were significantly associated with scores on the Similarities and Picture Arrangement subtests of the WAIS-R. There was a moderate negative association between number of reported heart attacks and Proverb scores. The need for brief yet robust assessments, including measures of inferential thinking and conceptualization, in transplant candidates is highlighted.
Test/score/report: Simulation techniques for automating the test process

Science.gov (United States)

Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.

1994-01-01

A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, the task of acceptance testing, regression testing, and repeatability of specific test procedures of a ground data system can be a simpler task. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no-go recommendation. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary
Adaptive testing with equated number-correct scoring

NARCIS (Netherlands)

van der Linden, Willem J.

1999-01-01

A constrained CAT algorithm is presented that automatically equates the number-correct scores on adaptive tests. The algorithm can be used to equate number-correct scores across different administrations of the same adaptive test as well as to an external reference test. The constraints are derived
ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores

Science.gov (United States)

Allalouf, Avi

2014-01-01

The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…
Data-driven efficient score tests for deconvolution hypotheses

NARCIS (Netherlands)

Langovoy, M.

2008-01-01

We consider testing statistical hypotheses about densities of signals in deconvolution models. A new approach to this problem is proposed. We constructed score tests for the deconvolution density testing with the known noise density and efficient score tests for the case of unknown density. The
Can linear regression modeling help clinicians in the interpretation of genotypic resistance data? An application to derive a lopinavir-score.

Science.gov (United States)

Cozzi-Lepri, Alessandro; Prosperi, Mattia C F; Kjær, Jesper; Dunn, David; Paredes, Roger; Sabin, Caroline A; Lundgren, Jens D; Phillips, Andrew N; Pillay, Deenan

2011-01-01

The question of whether a score for a specific antiretroviral (e.g. lopinavir/r in this analysis) that improves prediction of viral load response given by existing expert-based interpretation systems (IS) could be derived from analyzing the correlation between genotypic data and virological response using statistical methods remains largely unanswered. We used the data of the patients from the UK Collaborative HIV Cohort (UK CHIC) Study for whom genotypic data were stored in the UK HIV Drug Resistance Database (UK HDRD) to construct a training/validation dataset of treatment change episodes (TCE). We used the average square error (ASE) on a 10-fold cross-validation and on a test dataset (the EuroSIDA TCE database) to compare the performance of a newly derived lopinavir/r score with that of the 3 most widely used expert-based interpretation rules (ANRS, HIVDB and Rega). Our analysis identified mutations V82A, I54V, K20I and I62V, which were associated with reduced viral response, and mutations I15V and V91S, which determined lopinavir/r hypersensitivity. All models performed equally well (ASE on test ranging between 1.1 and 1.3, p = 0.34). We fully explored the potential of linear regression to construct a simple predictive model for lopinavir/r-based TCE. Although the performance of our proposed score was similar to that of already existing IS, previously unrecognized lopinavir/r-associated mutations were identified. The analysis illustrates an approach to validating expert-based IS that could be used in the future for other antiretrovirals and in other settings outside HIV research.
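The evaluation criterion named in this abstract, average square error under 10-fold cross-validation, can be sketched in Python as follows. This is a deliberately reduced illustration: it fits a one-predictor least-squares line, whereas the study's model used several mutations as predictors, and it assigns folds by position without shuffling. The function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validated_ase(xs, ys, k=10):
    """Average square error of held-out predictions over k folds.

    Fold f holds out indices f, f + k, f + 2k, ...; the model is refitted
    on the remaining points each time. Assumes len(xs) >= k.
    """
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        for i in held_out:
            prediction = slope * xs[i] + intercept
            squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: a resistance score versus viral load change (log10),
# with a fixed perturbation so that the fit is not exact.
score = [float(i) for i in range(20)]
response = [-2.0 + 0.1 * s + (0.2 if i % 2 else -0.2)
            for i, s in enumerate(score)]
print(cross_validated_ase(score, response, k=10))
```

Scoring every model on predictions for data it was not fitted to is what allows a newly derived score and the fixed expert-based rules to be compared on equal terms.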
Prognostic validation of a 17-segment score derived from a 20-segment score for myocardial perfusion SPECT interpretation.

Science.gov (United States)

Berman, Daniel S; Abidov, Aiden; Kang, Xingping; Hayes, Sean W; Friedman, John D; Sciammarella, Maria G; Cohen, Ishac; Gerlach, James; Waechter, Parker B; Germano, Guido; Hachamovitch, Rory

2004-01-01

Recently, a 17-segment model of the left ventricle has been recommended as an optimally weighted approach for interpreting myocardial perfusion single photon emission computed tomography (SPECT). Methods to convert databases from previous 20- to new 17-segment data and criteria for abnormality for the 17-segment scores are needed. Initially, for derivation of the conversion algorithm, 65 patients were studied (algorithm population) (pilot group, n = 28; validation group, n = 37). Three conversion algorithms were derived: algorithm 1, which used mid, distal, and apical scores; algorithm 2, which used distal and apical scores alone; and algorithm 3, which used maximal scores of the distal septal, lateral, and apical segments in the 20-segment model for 3 corresponding segments of the 17-segment model. The prognosis population comprised 16,020 consecutive patients (mean age, 65 +/- 12 years; 41% women) who had exercise or vasodilator stress technetium 99m sestamibi myocardial perfusion SPECT and were followed up for 2.1 +/- 0.8 years. In this population, 17-segment scores were derived from 20-segment scores by use of algorithm 2, which demonstrated the best agreement with expert 17-segment reading in the algorithm population. The prognostic value of the 20- and 17-segment scores was compared by converting the respective summed scores into percent myocardium abnormal. Conversion algorithm 2 was found to be highly concordant with expert visual analysis by the 17-segment model (r = 0.982; kappa = 0.866) in the algorithm population. In the prognosis population, 456 cardiac deaths occurred during follow-up. When the conversion algorithm was applied, extent and severity of perfusion defects were nearly identical by 20- and derived 17-segment scores. The receiver operating characteristic curve areas by 20- and 17-segment perfusion scores were identical for predicting cardiac death (both 0.77 +/- 0.02, P = not significant). The optimal prognostic cutoff value for either 20
Reformulation of the Children's Eating Attitudes Test (ChEAT): factor structure and scoring method in a non-clinical population.

Science.gov (United States)

Anton, S D; Han, H; Newton, R L; Martin, C K; York-Crowe, E; Stewart, T M; Williamson, D A

2006-12-01

The primary aims of this study were to empirically test the factor structure of the Children's Eating Attitudes Test (ChEAT) through both exploratory and confirmatory factor analyses and to interpret the factor structure of the ChEAT within the context of a new scoring method. The ChEAT was administered to 728 children in the 2nd through 6th grades (from five schools) at two different time points. Exactly half the students were male and half were female. To the best of our knowledge, this is the first study to empirically test the merits of an alternative 6-point scoring system as compared to the traditionally used 4-point scoring system. With the new scoring procedure, the skewness for all factor scores decreased, which resulted in increased variance in the item scores, as well as the total ChEAT score. Since the internal consistency of two factors in a recently proposed model was not acceptable (ChEAT reported by previous investigations. Intercorrelations among the factors suggested three higher order constructs. These findings indicate that the ChEAT subscales may be sufficiently stable to allow use in non-clinical samples of children.
Electrocardiogram interpretation skills among ambulance nurses.

Science.gov (United States)

Werner, Kristoffer; Kander, Kristofer; Axelsson, Christer

2016-06-01

To describe ambulance nurses' practical electrocardiogram (ECG) interpretation skills and to measure the correlation between these skills and factors that may impact on the level of knowledge. This study was conducted using a prospective quantitative survey with questionnaires and a knowledge test. A convenience sample collection was conducted among ambulance nurses in three different districts in western Sweden. The knowledge test consisted of nine different ECGs. The scores on the ECG test were correlated against the questions in the questionnaire regarding both general ECG interpretation skill and ability to identify acute myocardial infarction using the Mann-Whitney U test, the Kruskal-Wallis test and Spearman's rank correlation. On average, the respondents had 54% correct answers on the test and identified 46% of the ECGs indicating acute myocardial infarction. The median total score was 9 of 16 (interquartile range 7-11) and 1 of 3 (IQR 1-2) in infarction points. No correlation between ECG interpretation skill and factors such as education and professional experience was found, except that coronary care unit experience was associated with better results on the ECG test. Ambulance nurses have deficiencies in their ECG interpretation skills. This also applies to conditions where the ambulance crew has great potential to improve the outcome of the patient's health, such as myocardial infarction and cardiac arrest. Neither education nor extensive experience in ambulance service or in nursing contributed to an improved result. The only factor of importance for higher ECG interpretation knowledge was prior experience of working in a coronary care unit. © The European Society of Cardiology 2014.
Predictors of Knowledge and Image Interpretation Skill Development in Radiology Residents.

Science.gov (United States)

Ravesloot, Cécile J; van der Schaaf, Marieke F; Kruitwagen, Cas L J J; van der Gijp, Anouk; Rutgers, Dirk R; Haaring, Cees; Ten Cate, Olle; van Schaik, Jan P J

2017-09-01

Purpose: To investigate knowledge and image interpretation skill development in residency by studying scores on knowledge and image questions on radiology tests, mediated by the training environment. Materials and Methods: Ethical approval for the study was obtained from the ethical review board of the Netherlands Association for Medical Education. Longitudinal test data of 577 of 2884 radiology residents who took semiannual progress tests during 5 years were retrospectively analyzed by using a nonlinear mixed-effects model taking training length as input variable. Tests included nonimage and image questions that assessed knowledge and image interpretation skill. Hypothesized predictors were hospital type (academic or nonacademic), training hospital, enrollment age, sex, and test date. Results: Scores showed a curvilinear growth during residency. Image scores increased faster during the first 3 years of residency and reached a higher maximum than knowledge scores (55.8% vs 45.1%). The slope of image score development versus knowledge question scores of 1st-year residents was 16.8% versus 12.4%, respectively. Training hospital environment appeared to be an important predictor in both knowledge and image interpretation skill development (maximum score difference between training hospitals was 23.2%; P radiology residency and leveled off in the 3rd and 4th training year. The shape of the curve was mainly influenced by the specific training hospital. © RSNA, 2017
The Truth about Scores Children Achieve on Tests.

Science.gov (United States)

Brown, Jonathan R.

1989-01-01

The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)
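As an illustration of the calculation this record refers to, the following is a minimal sketch of the standard classical test theory formula, SEm = SD × sqrt(1 − reliability), together with an approximate band around an observed score. The function names and the example values (SD of 15, reliability of 0.91, observed score of 100) are assumptions chosen for illustration and are not taken from the article.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEm = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sem, z=1.96):
    """Approximate band around an observed score (z = 1.96 for about 95%)."""
    return observed - z * sem, observed + z * sem


# Hypothetical values, not from the article.
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = score_band(100.0, sem)
print(round(sem, 2), round(low, 2), round(high, 2))
```

The band widens as reliability falls, which is the practical reason for reporting the SEm alongside a single score.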
Deep convolutional neural networks for interpretable analysis of EEG sleep stage scoring

DEFF Research Database (Denmark)

Vilamala, Albert; Madsen, Kristoffer Hougaard; Hansen, Lars K.

2017-01-01

to purse for an automatic stage scoring based on machine learning techniques have been carried out over the last years. In this work, we resort to multitaper spectral analysis to create visually interpretable images of sleep patterns from EEG signals as inputs to a deep convolutional network trained...... to solve visual recognition tasks. As a working example of transfer learning, a system able to accurately classify sleep stages in new unseen patients is presented. Evaluations in a widely-used publicly available dataset favourably compare to state-of-the-art results, while providing a framework for visual...
Scoring in genetically modified organism proficiency tests based on log-transformed results.

Science.gov (United States)

Thompson, Michael; Ellison, Stephen L R; Owen, Linda; Mathieson, Kenneth; Powell, Joanne; Key, Pauline; Wood, Roger; Damant, Andrew P

2006-01-01

The study considers data from 2 UK-based proficiency schemes and includes data from a total of 29 rounds and 43 test materials over a period of 3 years. The results from the 2 schemes are similar and reinforce each other. The amplification process used in quantitative polymerase chain reaction determinations predicts a mixture of normal, binomial, and lognormal distributions dominated by the latter 2. As predicted, the study results consistently follow a positively skewed distribution. Log-transformation prior to calculating z-scores is effective in establishing near-symmetric distributions that are sufficiently close to normal to justify interpretation on the basis of the normal distribution.
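The log-transformation step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the use of base-10 logarithms, and the target standard deviation `sigma_log10` are hypothetical choices, not the schemes' actual parameters.

```python
import math


def log_z_score(result, assigned_value, sigma_log10):
    """z-score computed on the log10 scale.

    sigma_log10 is an assumed target standard deviation expressed in
    log10 units; it is a placeholder, not a value from the study.
    """
    if result <= 0 or assigned_value <= 0:
        raise ValueError("results must be positive to take logarithms")
    return (math.log10(result) - math.log10(assigned_value)) / sigma_log10


# Hypothetical results: half, equal to, and double the assigned value.
results = [0.5, 1.0, 2.0]
scores = [log_z_score(r, assigned_value=1.0, sigma_log10=0.2) for r in results]
print([round(s, 3) for s in scores])
```

On the log scale, a result at half the assigned value and one at double the assigned value receive z-scores of equal magnitude and opposite sign, which reflects the near-symmetry the abstract describes.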
Radiology Residents' Performance in Screening Mammography Interpretation

International Nuclear Information System (INIS)

Lee, Eun Hye; Lyou, Chae Yeon

2013-01-01

To evaluate radiology residents' performance in screening mammography interpretation and to analyze the factors affecting performance. We enrolled 203 residents from 21 institutions and performed mammography interpretation tests. Between the trainee and non-trainee groups, we compared the interpretation score, recall rate, sensitivity, positive predictive value (PPV) and false-positive rate (FPR). We estimated the training effect using the score differences between trainee and non-trainee groups. We analyzed the factors affecting performance between training-effective and non-effective groups. Trainees were superior to non-trainees regarding interpretation score (43.1 vs. 37.1), recall rate (11.0 vs. 15.5%), sensitivity (83.6 vs. 72.0%), PPV (53.0 vs. 32.4%) and FPR (13.5 vs. 25.5). The longer the training period, the better were the interpretation score, recall rate, sensitivity, PPV and FPR (rho = 0.486, -0.375, 0.343, 0.504, -0.446, respectively). The training affected an increase by an average of 6 points; however, 31.6% of institutions showed no effect. A difference was noted in the volume of mammography interpretation during a month (594.0 vs. 476.9) and dedication of breast staff (61.5 vs. 0%) between training-effective and non-effective groups. Trainees showed better performance in mammography interpretation compared to non-trainees. Moreover, performance was correlated with the training period. The factors affecting performance were the volume of mammography interpretation and the dedication of the breast staff.
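The performance measures named in this abstract can be computed from a 2×2 table of reader decisions against ground truth. The sketch below uses the common textbook definitions; whether the study defined each measure in exactly this way is an assumption, and the counts are invented for illustration.

```python
def screening_metrics(tp, fp, fn, tn):
    """Common screening measures from true/false positive/negative counts."""
    total = tp + fp + fn + tn
    return {
        "recall_rate": (tp + fp) / total,   # share of cases recalled
        "sensitivity": tp / (tp + fn),      # cancers correctly recalled
        "ppv": tp / (tp + fp),              # recalled cases that are cancers
        "fpr": fp / (fp + tn),              # normals incorrectly recalled
    }


# Hypothetical counts for one reader on a 100-case test set.
metrics = screening_metrics(tp=8, fp=12, fn=2, tn=78)
print({name: round(value, 3) for name, value in metrics.items()})
```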
The effect of four instructional methods, gender, and time of testing on the achievement of sixth graders learning to interpret graphs

Science.gov (United States)

Young, Jerry Wayne

The purpose of this study was to determine the effects of four instructional methods (direct instruction, computer-aided instruction, video observation, and microcomputer-based lab activities), gender, and time of testing (pretest, immediate posttest for determining the immediate effect of instruction, and a delayed posttest two weeks later to determine the retained effect of the instruction) on the achievement of sixth graders who were learning to interpret graphs of displacement and velocity. The dependent variable of achievement was reflected in the scores earned by students on a testing instrument of established validity and reliability. The 107 students participating in the study were divided by gender and were then randomly assigned to the four treatment groups, each taught by a different teacher. Each group had approximately equal numbers of males and females. The students were pretested and then involved in two class periods of the instructional method which was unique to their group. Immediately following treatment they were posttested and two weeks later they were posttested again. The data in the form of test scores were analyzed with a two-way split-plot analysis of variance to determine if there was significant interaction among technique, gender, and time of testing. When significant interaction was indicated, the Tukey HSD test was used to determine specific mean differences. The results of the analysis indicated no gender effect. Only students in the direct instruction group and the microcomputer-based laboratory group had significantly higher posttest-1 scores than pretest scores. They also had significantly higher posttest-2 scores than pretest scores. This suggests that the learning was retained. The other groups experienced no significant differences among pretest, posttest-1, and posttest-2 scores. Recommendations are that direct instruction and microcomputer-based laboratory activities should be considered as effective stand-alone methods for
Laboratory test result interpretation for primary care doctors in South Africa

Directory of Open Access Journals (Sweden)

Naadira Vanker

2017-03-01

Background: Challenges and uncertainties with test result interpretation can lead to diagnostic errors. Primary care doctors are at a higher risk than specialists of making these errors, due to the range in complexity and severity of conditions that they encounter. Objectives: This study aimed to investigate the challenges that primary care doctors face with test result interpretation, and to identify potential countermeasures to address these. Methods: A survey was sent out to 7800 primary care doctors in South Africa. Questionnaire themes included doctors’ uncertainty with interpreting test results, mechanisms used to overcome this uncertainty, challenges with appropriate result interpretation, and perceived solutions for interpreting results. Results: Based on the 552 responses received, challenges with result interpretation were estimated to occur in an average of 17% of diagnostic encounters. The most commonly reported challenges were not receiving test results in a timely manner (51% of respondents) and previous results not being easily available (37%). When faced with diagnostic uncertainty, 84% of respondents would either follow up and reassess the patient or discuss the case with a specialist, and 67% would contact a laboratory professional. The most useful test utilisation enablers were found to be interpretive comments (78% of respondents), published guidelines (74%), and a dedicated laboratory phone line (72%). Conclusion: Primary care doctors acknowledge uncertainty with test result interpretation. Potential countermeasures include the addition of patient-specific interpretive comments, the availability of guidelines or algorithms, and a dedicated laboratory phone line. Enhanced test result interpretation would help reduce diagnostic error rates.
[Propensity score matching in SPSS].

Science.gov (United States)

Huang, Fuqiang; DU, Chunlin; Sun, Menghui; Ning, Bing; Luo, Ying; An, Shengli

2015-11-01

To realize propensity score matching in PS Matching module of SPSS and interpret the analysis results. The R software and plug-in that could link with the corresponding versions of SPSS and propensity score matching package were installed. A PS matching module was added in the SPSS interface, and its use was demonstrated with test data. Score estimation and nearest neighbor matching was achieved with the PS matching module, and the results of qualitative and quantitative statistical description and evaluation were presented in the form of a graph matching. Propensity score matching can be accomplished conveniently using SPSS software.
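The nearest neighbor matching step mentioned above can be illustrated outside SPSS. The following is a minimal sketch of greedy 1:1 matching without replacement on propensity scores that are assumed to have been estimated already; it is not the SPSS PS Matching module, and the function name, the caliper of 0.1, and the example scores are hypothetical.

```python
def nearest_neighbor_match(treated, control, caliper=0.1):
    """Greedy 1:1 nearest neighbor matching without replacement.

    treated, control: dicts mapping a unit id to its propensity score.
    Returns a list of (treated_id, control_id) pairs; treated units with
    no control within the caliper are left unmatched.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control unit on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.35, "c3": 0.60, "c4": 0.10}
pairs = nearest_neighbor_match(treated, control)
print(pairs)
```

Greedy matching depends on processing order; sorting by score is one simple convention among several.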

Can Linear Regression Modeling Help Clinicians in the Interpretation of Genotypic Resistance Data? An Application to Derive a Lopinavir-Score

DEFF Research Database (Denmark)

Cozzi-Lepri, Alessandro; Prosperi, Mattia C F; Kjær, Jesper

2011-01-01

explored the potential of linear regression to construct a simple predictive model for lopinavir/r-based TCE. Although the performance of our proposed score was similar to that of already existing IS, previously unrecognized lopinavir/r-associated mutations were identified. The analysis illustrates......BACKGROUND: The question of whether a score for a specific antiretroviral (e.g. lopinavir/r in this analysis) that improves prediction of viral load response given by existing expert-based interpretation systems (IS) could be derived from analyzing the correlation between genotypic data......). Our analysis identified mutations V82A, I54V, K20I and I62V, which were associated with reduced viral response and mutations I15V and V91S which determined lopinavir/r hypersensitivity. All models performed equally well (ASE on test ranging between 1.1 and 1.3, p = 0.34). CONCLUSIONS: We fully...
The Art Of Interpretation – Chances And Risks On Interpretation In The Field Of Mobile Testing

Directory of Open Access Journals (Sweden)

Palatini Kerstin

2014-12-01

Carrying out a usability test is a demanding process per se. Mobile tests raise this demand because they are subject to real usage conditions and therefore to unforeseeable factors. On the one hand there are technical factors such as tools, software and laboratory equipment; on the other hand there are the human beings with their knowledge and decision-making. They select the tools, methods and data, and they make decisions at every stage of the testing process. Using a mobile eye-tracking test, the authors explain where the sources of interpretation lie and when misinterpretation becomes an error. Technology-philosophical considerations on interpretation and hermeneutics are intended to support recognition of the potential of interpretation. As a result, misinterpretation can be minimized.
Interpretation of growth hormone provocative tests

DEFF Research Database (Denmark)

Andersson, A M; Orskov, H; Ranke, M B

1995-01-01

To compare interpretations of growth hormone (GH) provocative tests in laboratories using six different GH immunoassays (one enzymeimmunometric assay (EIMA, assay 1), one immunoradiometric assay (IRMA, assay 5), one time-resolved fluorimmunometric assay (TRFIA, assay 3) and three radioimmunoassays...
Improving personality facet scores with multidimensional computer adaptive testing

DEFF Research Database (Denmark)

Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A W

2013-01-01

personality tests contain many highly correlated facets. This article investigates the possibility of increasing the precision of the NEO PI-R facet scores by scoring items with multidimensional item response theory and by efficiently administering and scoring items with multidimensional computer adaptive...
The Effect of Mock Tests on Iranian EFL learners’ Test Scores

Directory of Open Access Journals (Sweden)

Hossein Khodabakhshzadeh

2016-07-01

The effect of using tests in test preparation courses has been subject to debate. While some scholars such as Yang and Badger (2015) believe it is a cause of positive washback effect, others argue that this issue is tentative and context-bound (Green, 2007). Therefore, this study investigated the effect of using Mock tests in International English Language Testing System (IELTS) preparation courses on students’ overall IELTS scores. Fifty-one IELTS students were selected non-randomly through the quota sampling approach out of 76 students at Mahan Language Institute in Birjand, Iran. These participants were distributed into Group 1 (n=25) and Group 2 (n=26). A complete IELTS test was administered to ensure that the groups were homogeneous and to serve as a pretest. After 10 sessions of intervention, a different IELTS test was administered as a posttest. The results of between-subject analysis through an independent samples t-test revealed that using Mock tests in IELTS preparation courses can positively affect the participants’ scores on the IELTS exam. Pedagogical implications are discussed.
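For readers unfamiliar with the independent samples t-test named in this abstract, the following is a minimal sketch of the pooled-variance (Student) form of the statistic. The band scores are invented for illustration and are not the study's data; the abstract does not say whether a pooled or unequal-variance test was used, so the pooled form is an assumption.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's t for two independent samples, assuming equal variances.

    Returns (t, degrees_of_freedom).
    """
    n1, n2 = len(group_a), len(group_b)
    pooled_var = ((n1 - 1) * variance(group_a)
                  + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_err, n1 + n2 - 2


# Hypothetical posttest band scores for two small groups.
mock_group = [6.0, 6.5, 7.0, 6.5]
comparison_group = [5.5, 6.0, 6.0, 5.5]
t_value, dof = pooled_t_statistic(mock_group, comparison_group)
print(round(t_value, 3), dof)
```

The resulting t value would then be compared against the t distribution with the returned degrees of freedom to obtain a p-value.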
Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Science.gov (United States)

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.

2010-01-01

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
Item hierarchy-based analysis of the Rivermead Mobility Index resulted in improved interpretation and enabled faster scoring in patients undergoing rehabilitation after stroke.

Science.gov (United States)

Roorda, Leo D; Green, John R; Houwink, Annemieke; Bagley, Pam J; Smith, Jane; Molenaar, Ivo W; Geurts, Alexander C

2012-06-01

To enable improved interpretation of the total score and faster scoring of the Rivermead Mobility Index (RMI) by studying item ordering or hierarchy and formulating start-and-stop rules in patients after stroke. Cohort study. Rehabilitation center in the Netherlands; stroke rehabilitation units and the community in the United Kingdom. Item hierarchy of the RMI was studied in an initial group of patients (n=620; mean age ± SD, 69.2±12.5y; 297 [48%] men; 304 [49%] left hemisphere lesion, and 269 [43%] right hemisphere lesion), and the adequacy of the item hierarchy-based start-and-stop rules was checked in a second group of patients (n=237; mean age ± SD, 60.0±11.3y; 139 [59%] men; 103 [44%] left hemisphere lesion, and 93 [39%] right hemisphere lesion) undergoing rehabilitation after stroke. Not applicable. Mokken scale analysis was used to investigate the fit of the double monotonicity model, indicating hierarchical item ordering. The percentages of patients with a difference between the RMI total score and the scores based on the start-and-stop rules were calculated to check the adequacy of these rules. The RMI had good fit of the double monotonicity model (coefficient H(T)=.87). The interpretation of the total score improved. Item hierarchy-based start-and-stop rules were formulated. The percentages of patients with a difference between the RMI total score and the score based on the recommended start-and-stop rules were 3% and 5%, respectively. Ten of the original 15 items had to be scored after applying the start-and-stop rules. Item hierarchy was established, enabling improved interpretation and faster scoring of the RMI. Copyright © 2012 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Increasing the reliability of the fluid/crystallized difference score from the Kaufman Adolescent and Adult Intelligence Test with reliable component analysis.

Science.gov (United States)

Caruso, J C

2001-06-01

The unreliability of difference scores is a well documented phenomenon in the social sciences and has led researchers and practitioners to interpret differences cautiously, if at all. In the case of the Kaufman Adult and Adolescent Intelligence Test (KAIT), the unreliability of the difference between the Fluid IQ and the Crystallized IQ is due to the high correlation between the two scales. The consequences of the lack of precision with which differences are identified are wide confidence intervals and unpowerful significance tests (i.e., large differences are required to be declared statistically significant). Reliable component analysis (RCA) was performed on the subtests of the KAIT in order to address these problems. RCA is a new data reduction technique that results in uncorrelated component scores with maximum proportions of reliable variance. Results indicate that the scores defined by RCA have discriminant and convergent validity (with respect to the equally weighted scores) and that differences between the scores, derived from a single testing session, were more reliable than differences derived from equal weighting for each age group (11-14 years, 15-34 years, 35-85+ years). This reliability advantage results in narrower confidence intervals around difference scores and smaller differences required for statistical significance.
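The link between a high correlation and an unreliable difference score can be made concrete with the standard classical test theory formula for two scales of equal variance: r_dd = ((r_xx + r_yy) / 2 − r_xy) / (1 − r_xy). The sketch below applies it to hypothetical reliabilities and correlations; these are not KAIT values, and the function name is an illustrative choice.

```python
def difference_score_reliability(r_xx, r_yy, r_xy):
    """Reliability of D = X - Y for two scales with equal variances.

    r_xx, r_yy: reliabilities of the two scales; r_xy: their correlation.
    """
    if r_xy >= 1.0:
        raise ValueError("correlation must be below 1")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


# Hypothetical values: two highly reliable scales, two correlation levels.
high_corr = difference_score_reliability(0.95, 0.95, 0.80)
moderate_corr = difference_score_reliability(0.95, 0.95, 0.50)
print(round(high_corr, 2), round(moderate_corr, 2))
```

With both scales at 0.95 reliability, raising the correlation between them from 0.50 to 0.80 lowers the reliability of the difference from about 0.90 to about 0.75, which illustrates why decorrelated component scores yield narrower confidence intervals.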
A process dissociation approach to objective-projective test score interrelationships.

Science.gov (United States)

Bornstein, Robert F

2002-02-01

Even when self-report and projective measures of a given trait or motive both predict theoretically related features of behavior, scores on the 2 tests correlate modestly with each other. This article describes a process dissociation framework for personality assessment, derived from research on implicit memory and learning, which can resolve these ostensibly conflicting results. Research on interpersonal dependency is used to illustrate 3 key steps in the process dissociation approach: (a) converging behavioral predictions, (b) modest test score intercorrelations, and (c) delineation of variables that differentially affect self-report and projective test scores. Implications of the process dissociation framework for personality assessment and test development are discussed.
A prognostic scoring system for arm exercise stress testing.

Science.gov (United States)

Xie, Yan; Xian, Hong; Chandiramani, Pooja; Bainter, Emily; Wan, Leping; Martin, Wade H

2016-01-01

Arm exercise stress testing may be an equivalent or better predictor of mortality outcome than pharmacological stress imaging for the ≥50% of patients unable to perform leg exercise. Thus, our objective was to develop an arm exercise ECG stress test scoring system, analogous to the Duke Treadmill Score, for predicting outcome in these individuals. In this retrospective observational cohort study, arm exercise ECG stress tests were performed in 443 consecutive veterans aged 64.1 (11.1) years (mean (SD)) between 1997 and 2002. From multivariate Cox models, arm exercise scores were developed for prediction of 5-year and 12-year all-cause and cardiovascular mortality and 5-year cardiovascular mortality or myocardial infarction (MI). Arm exercise capacity in resting metabolic equivalents (METs), 1 min heart rate recovery (HRR) and ST segment depression ≥1 mm were the stress test variables independently associated with all-cause and cardiovascular mortality by step-wise Cox analysis (all pstatistic of 0.81 before and 0.88 after adjustment for significant demographic and clinical covariates. Arm exercise scores for the other outcome end points yielded C-statistic values of 0.77-0.79 before and 0.82-0.86 after adjustment for significant covariates versus 0.64-0.72 for best fit pharmacological myocardial perfusion imaging models in a cohort of 1730 veterans who were evaluated over the same time period. Arm exercise scores, analogous to the Duke Treadmill Score, have good power for prediction of mortality or MI in patients who cannot perform leg exercise.
Visual and confocal microscopic interpretation of patch tests to benzethonium chloride and benzalkonium chloride.

Science.gov (United States)

Benjamin, Bohaty; Chris, Fricker; Salvador, González; Melissa, Gill; Susan, Nedorost

2012-08-01

Quaternary ammonium compounds (Quats), such as benzalkonium chloride (BAC) and benzethonium chloride (BEC), are widely used as antibacterial active ingredients and preservatives in personal care products, disinfectants, and ophthalmic preparations. BAC is known to be a marginal irritant when patch tested at 0.15% aq. Data on BEC are limited. To differentiate irritant from allergic patch test reactions to quaternary ammonium compounds. Eight subjects who were considered likely to react based on history of rash after exposure to disinfectants or a history of prior positive patch test to BAC were recruited, as well as two patients undergoing routine patch testing. BAC (0.15% aq), BAC (0.15% pet), BEC (0.05% aq), BEC (0.15% pet), BEC (0.15% aq), BEC (0.5% aq), sodium lauryl sulfate (2.0%), and deionized water were applied under Finn chambers for 48 h. Four days and 7 days after application, the sites were examined visually and then by in vivo reflectance confocal microscopy (RCM) which was interpreted by blinded experts. Two patients with definite allergic reactions according to visual patch test reads and RCM were clinically relevant. Cross-reaction between BEC and BAC was demonstrated in one patient. RCM imaging correlated well with clinical scoring and interpretation of patch test reactions in terms of irritancy vs. allergy for BEC and BAC. Relevant allergic reactions to quats occur in humans. Possible cross-reaction was noted to occur between BAC and BEC. RCM appears to be a useful tool in distinguishing between irritancy and sensitization during patch testing to BAC and BEC. Further study of prevalence and best test concentration and vehicle is needed. © 2011 John Wiley & Sons A/S.
What do educational test scores really measure?

DEFF Research Database (Denmark)

McIntosh, James; D. Munk, Martin

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelate......, and possible incentive problems make it more difficult to elicit true values of what the tests measure....
Increased correlation coefficient between the written test score and tutors' performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia.

Science.gov (United States)

Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

2016-03-01

This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (pcorrelation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
Keeping Your Audience in Mind: Applying Audience Analysis to the Design of Interactive Score Reports

Science.gov (United States)

Zapata-Rivera, Juan Diego; Katz, Irvin R.

2014-01-01

Score reports have one or more intended audiences: the people who use the reports to make decisions about test takers, including teachers, administrators, parents and test takers. Attention to audience when designing a score report supports assessment validity by increasing the likelihood that score users will interpret and use assessment results…
Reference values for spirometry and their use in test interpretation: A Position Statement from the Australian and New Zealand Society of Respiratory Science.

Science.gov (United States)

Brazzale, Danny; Hall, Graham; Swanney, Maureen P

2016-10-01

Traditionally, spirometry testing tended to be confined to the realm of hospital-based laboratories but is now performed in a variety of health care settings. Regardless of the setting in which the test is conducted, the fundamental basis of spirometry is that the test is both performed and interpreted according to the international standards. The purpose of this Australian and New Zealand Society of Respiratory Science (ANZSRS) statement is to provide the background and recommendations for the interpretation of spirometry results in clinical practice. This includes the benchmarking of an individual's results to population reference data, as well as providing the platform for a statistically and conceptually based approach to the interpretation of spirometry results. Given the many limitations of older reference equations, it is imperative that the most up-to-date and relevant reference equations are used for test interpretation. Given this, the ANZSRS recommends the adoption of the Global Lung Function Initiative (GLI) 2012 spirometry reference values throughout Australia and New Zealand. The ANZSRS also recommends that interpretation of spirometry results is based on the lower limit of normal from the reference values and the use of Z-scores where available. © 2016 The Authors. Respirology published by John Wiley & Sons Australia, Ltd on behalf of Asian Pacific Society of Respirology.
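The Z-score and lower-limit-of-normal approach recommended above can be sketched as follows. This is a minimal illustration that assumes an LMS-style reference equation (skewness L, median M, coefficient of variation S) and takes the lower limit of normal as the 5th percentile (z = -1.645). The function names and all numeric values are hypothetical and are not taken from the ANZSRS statement or from published reference tables.

```python
import math


def lms_z_score(measured, L, M, S):
    """Z-score of a measured value under an LMS reference distribution."""
    if L == 0:
        # Limiting case of the Box-Cox transform as L approaches 0
        return math.log(measured / M) / S
    return ((measured / M) ** L - 1.0) / (L * S)


def lms_lower_limit_of_normal(L, M, S, z=-1.645):
    """Value at the given z (default: 5th percentile) under the LMS model."""
    if L == 0:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Hypothetical reference parameters for one individual (illustration only)
L_HYP, M_HYP, S_HYP = 0.9, 4.0, 0.12
lln = lms_lower_limit_of_normal(L_HYP, M_HYP, S_HYP)
z = lms_z_score(3.1, L_HYP, M_HYP, S_HYP)
print(f"LLN = {lln:.2f}, z-score of 3.1 = {z:.2f}")
```

Under these assumptions, a measured value is flagged as below the lower limit of normal when its z-score falls below -1.645, regardless of the raw percentage of the predicted value.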
Effects of white noise on Callsign Acquisition Test and Modified Rhyme Test scores.

Science.gov (United States)

Blue-Terry, Misty; Letowski, Tomasz

2011-02-01

The Callsign Acquisition Test (CAT) is a speech intelligibility test developed by the US Army Research Laboratory. The test has been used to evaluate speech transmission through various communication systems but has not yet been sufficiently standardised and validated. The aim of this study was to compare CAT and Modified Rhyme Test (MRT) performance in the presence of white noise across a range of signal-to-noise ratios (SNRs). A group of 16 normal-hearing listeners participated in the study. The speech items were presented at 65 dB(A) in the background of white noise at SNRs of -18, -15, -12, -9 and -6 dB. The results showed a strong positive association (75.14%) between the two tests, but significant differences between the CAT and MRT absolute scores in the range of investigated SNRs. Based on the data, a function to predict CAT scores based on existing MRT scores and vice versa was formulated. STATEMENT OF RELEVANCE: This work compares performance data of a common speech intelligibility test (MRT) with a new test (CAT) in the presence of white noise. The results here can be used as a part of the standardisation procedures and provide insights to the predictive capabilities of the CAT to quantify speech intelligibility communication in high-noise military environments.
ANOVA Analysis of Student Daily Test Scores in Multi-Day Test Periods

Science.gov (United States)

Mouritsen, Matthew L.; Davis, Jefferson T.; Jones, Steven C.

2016-01-01

Instructors are often concerned when giving multiple-day tests because students taking the test later in the exam period may have an advantage over students taking the test early in the exam period due to information leakage. However, exam scores seemed to decline as students took the same test later in a multi-day exam period (Mouritsen and…
The Effect of Pretest Exercise on Baseline Computerized Neurocognitive Test Scores.

Science.gov (United States)

Pawlukiewicz, Alec; Yengo-Kahn, Aaron M; Solomon, Gary

2017-10-01

Baseline neurocognitive assessment plays a critical role in return-to-play decision making following sport-related concussions. Prior studies have assessed the effect of a variety of modifying factors on neurocognitive baseline test scores. However, relatively little investigation has been conducted regarding the effect of pretest exercise on baseline testing. The aim of our investigation was to determine the effect of pretest exercise on baseline Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores in adolescent and young adult athletes. We hypothesized that athletes undergoing self-reported strenuous exercise within 3 hours of baseline testing would perform more poorly on neurocognitive metrics and would report a greater number of symptoms than those who had not completed such exercise. Cross-sectional study; Level of evidence, 3. The ImPACT records of 18,245 adolescent and young adult athletes were retrospectively analyzed. After application of inclusion and exclusion criteria, participants were dichotomized into groups based on a positive (n = 664) or negative (n = 6609) self-reported history of strenuous exercise within 3 hours of the baseline test. Participants with a positive history of exercise were then randomly matched, based on age, sex, education level, concussion history, and hours of sleep prior to testing, on a 1:2 basis with individuals who had reported no pretest exercise. The baseline ImPACT composite scores of the 2 groups were then compared. Significant differences were observed for the ImPACT composite scores of verbal memory, visual memory, reaction time, and impulse control as well as for the total symptom score. No significant between-group difference was detected for the visual motor composite score. Furthermore, pretest exercise was associated with a significant increase in the overall frequency of invalid test results. Our results suggest a statistically significant difference in ImPACT composite scores between
The Effect of Mock Tests on Iranian EFL learners’ Test Scores

OpenAIRE

Hossein Khodabakhshzadeh; Reza Zardkanloo

2016-01-01

The effect of using tests in test preparation courses has been subject to debate. While some scholars such as Yang and Badger (2015) believe it is a cause of positive washback effect, others argue that this issue is tentative and context-bound (Green, 2007). Therefore, this study investigated the effect of using mock tests in International English Language Testing System (IELTS) preparation courses on students’ overall IELTS scores. Fifty-one IELTS students were selected non-randomly through ...
How to interpret liver function tests

Directory of Open Access Journals (Sweden)

Christina Levick

2017-05-01

Careful interpretation of liver function tests within the clinical context can help elucidate the cause and severity of the underlying pathology. Predominantly raised alkaline phosphatase represents the cholestatic pattern of biliary pathology, whilst predominantly raised alanine aminotransferase and aspartate aminotransferase represent the hepatocellular pattern of hepatocellular pathology. The severity of liver dysfunction or biliary obstruction is reflected in the bilirubin level and the degree of liver synthetic function can also be indicated by the albumin level. Beyond the liver function tests, prothrombin time provides another marker of liver synthetic function and a low platelet count suggests portal hypertension.

Biering-Sorensen test scores in coal miners

Energy Technology Data Exchange (ETDEWEB)

Tekin, Y.; Ortancil, O.; Ankarali, H.; Basaran, A.; Sarikaya, S.; Ozdolap, S. [Zonguldak Karaelmas University, Zonguldak (Turkey)]

2009-05-15

Biering-Sorensen test is an isometric back endurance test. Biering-Sorensen test scores have varied in different cultural and occupational groups. The aims of this study were to collect normative data on Biering-Sorensen holding times, to determine the discriminative ability of the Biering-Sorensen test in Turkish coal miners, and to examine the association between Biering-Sorensen test result and functional disability. One hundred and fifty male coal miners participated in this study. Trunk extensor muscle strength was measured using the Biering-Sorensen test. Oswestry disability index was used to measure the functional disability level of low back pain. The mean Biering-Sorensen holding time for the total subject group was 107.3 ± 22.5 s. The mean time of Biering-Sorensen test of the subjects with and without low back pain were 99.9 ± 19.8 and 128.6 ± 15.2 s, respectively. The difference between the subjects with and without low back pain was statistically significant (p < 0.001). There was a statistically significant negative correlation between Oswestry functional disability score and Biering-Sorensen holding time (R = -0.824, p < 0.001). Turkish coal miners have low mean back extensor endurance holding times. Biering-Sorensen test had a good discriminative ability in our study group. Trunk muscle strength has a significant effect on the disability level of low back pain. Thus trunk muscle endurance training exercise therapy may be effective for the reduction of disability in patients with low back pain.
TRAC, a collaborative computer tool for tracer-test interpretation

Directory of Open Access Journals (Sweden)

Fécamp C.

2013-05-01

Artificial tracer tests are widely used by consulting engineers for demonstrating water circulation, proving the existence of leakage, or estimating groundwater velocity. However, the interpretation of such tests is often very basic, with the result that decision makers and professionals commonly face unreliable results through hasty and empirical interpretation. There is thus an increasing need for a reliable interpretation tool, compatible with the latest operating systems and available in several languages. BRGM, the French Geological Survey, has developed a project together with hydrogeologists from various other organizations to build software assembling several analytical solutions in order to comply with various field contexts. This computer program, called TRAC, is very light and simple, allowing the user to add his own analytical solution if the formula is not yet included. It aims at collaborative improvement by sharing the tool and the solutions. TRAC can be used for interpreting data recovered from a tracer test as well as for simulating the transport of a tracer in the saturated zone (for the time being). Calibration of a site operation is based on considering the hydrodynamic and hydrodispersive features of groundwater flow as well as the amount, nature and injection mode of the artificial tracer. The software is available in French, English and Spanish, and the latest version can be downloaded from the web site http://trac.brgm.fr.
Marviken test-data interpretation, second project

International Nuclear Information System (INIS)

Collen, J.; Johansson, A.

1978-12-01

A brief description is given of the investigations carried out and the conclusions drawn within the MARTIN-II project, which involved the evaluation and interpretation of the data from the full scale containment response tests at the Marviken Power Station. The data from the tests, which were completed in 1976, provide information about the periodic pressure oscillations and rapid pressure spikes induced in the pressure-suppression containment during [...] study comprise the following items: influence of test parameters on pressure oscillations and pressure spikes; pressure spikes in the wetwell pool; high frequency oscillations; comparisons between single-pipe and multi-pipe data. The study was carried out by Studsvik Energiteknik AB with consulting efforts from AB ASEA-ATOM. It was financed by the Swedish Nuclear Power Inspectorate. (Auth.)
Increased correlation coefficient between the written test score and tutors’ performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia

Directory of Open Access Journals (Sweden)

Heethal Jaiprakash

2016-03-01

This paper is aimed at finding if there was a change of correlation between the written test score and tutors’ performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group’s tutors did not receive tutor training, while the second group’s tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors’ performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors’ scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
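The Pearson correlation coefficient used in the study above can be sketched in a few lines. This is a minimal illustration only: the function name and the score lists are hypothetical and are not the study's data, and the study's own software and procedure are not described in the abstract.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical written-test and tutor-assessment scores for one group
written = [62, 71, 55, 80, 67, 74]
tutor = [7, 8, 6, 8, 6, 9]
print(f"r = {pearson_r(written, tutor):.3f}")
```

Computing r separately for each group, as the abstract describes, amounts to calling the function once per group on that group's paired scores.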
Gender, Stereotype Threat and Mathematics Test Scores

OpenAIRE

Ming Tsui; Xiao Y. Xu; Edmond Venator

2011-01-01

Problem statement: Stereotype threat has repeatedly been shown to depress women's scores on difficult math tests. An attempt to replicate these findings in China found no support for the stereotype threat hypothesis. Our math test was characterized as being personally important for the student participants, an atypical condition in most stereotype threat laboratory research. Approach: To evaluate the effects of this personal demand, we conducted three experiments. Results: ...
The Weighted Airman Promotion System: Standardizing Test Scores

Science.gov (United States)

2008-01-01

[Figure: Top 3/E6 ratio, authorized and inventory.] ... AFHRL convened a panel to identify the relevant factors to consider, and then sit as a promotion board and rank... Costs: If the Air Force decided to standardize test scores, there would be three basic types of costs: implementation costs, marketing costs, and
[Interpretation of proverbs and Alzheimer's disease].

Science.gov (United States)

Báez, S; Mendoza, L; Reyes, P; Matallana, D; Montañés, P

To evaluate the performance of patients with Alzheimer's disease (AD) in the mild-moderate stage in a verbal material abstraction task that involves interpreting the implicit meaning of proverbs and sayings. A qualitative-quantitative analysis was carried out of the performance of 30 patients with AD and 30 controls, paired by age, gender and level of education. Patients had significantly greater difficulties than the controls when it came to interpreting proverbs. A high correlation was found between subjects' years of schooling and the overall score on the proverb interpretation test. Results suggest that the processes that may be predominantly affected in patients with AD are the investigation of the conditions of the problem, together with selecting an alternative and formulating a cognitive plan to resolve the task. The results help to further our knowledge of the characteristics of performance of patients with AD in a test involving the interpretation of the implicit meaning of proverbs and also provide information about the processes that may be predominantly affected. Further research is needed, however, on this subject area in order to obtain more conclusive explanations.
Sensitizing Undergraduates to Potential Inaccuracies in Projective Test Interpretation.

Science.gov (United States)

Barret, Robert L.; Wachowiak, Dale G.

This paper describes a methodology developed to provide undergraduate students with direct experience in the process of impressionistic test interpretation. In the experiential exercise, students were shown Thematic Apperception Test cards and then read the responses given by an anonymous client. A discussion of the process by which the students…
Towards Intelligent Interpretation of Low Strain Pile Integrity Testing Results Using Machine Learning Techniques.

Science.gov (United States)

Cui, De-Mi; Yan, Weizhong; Wang, Xiao-Quan; Lu, Lie-Min

2017-10-25

Low strain pile integrity testing (LSPIT), due to its simplicity and low cost, is one of the most popular NDE methods used in pile foundation construction. While performing LSPIT in the field is generally quite simple and quick, determining the integrity of the test piles by analyzing and interpreting the test signals (reflectograms) is still a manual process performed by experienced experts only. For foundation construction sites where the number of piles to be tested is large, it may take days before the expert can complete interpreting all of the piles and delivering the integrity assessment report. Techniques that can automate test signal interpretation, thus shortening the LSPIT's turnaround time, are of great business value and are in great need. Motivated by this need, in this paper, we develop a computer-aided reflectogram interpretation (CARI) methodology that can interpret a large number of LSPIT signals quickly and consistently. The methodology, built on advanced signal processing and machine learning technologies, can be used to assist the experts in performing both qualitative and quantitative interpretation of LSPIT signals. Specifically, the methodology can ease experts' interpretation burden by screening all test piles quickly and identifying a small number of suspected piles for experts to perform manual, in-depth interpretation. We demonstrate the methodology's effectiveness using the LSPIT signals collected from a number of real-world pile construction sites. The proposed methodology can potentially enhance LSPIT and make it even more efficient and effective in quality control of deep foundation construction.
[Interpretation of laboratory tests for allergies in dogs].

Science.gov (United States)

Roosje, P

2010-03-01

There is widespread use of serum allergy tests which are promoted for identifying the reaction against certain allergens in atopic dermatitis, sarcoptes infestation and also food hypersensitivity in dogs. Around 20 years ago the first in-vitro tests were developed to identify allergen-specific IgE in dogs with atopic dermatitis. Since then, technical developments have markedly improved the quality of antibodies as well as the methods. The limitation of serum tests lies in the interpretation of test results as well as the diseases they are used for. This overview discusses usefulness and limitations in different skin diseases.
Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

Science.gov (United States)

Kim, Seonghoon

2013-01-01

With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
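The original Lord-Wingersky recursion for number-correct scores, which the article above generalizes, can be sketched as follows. This is a minimal illustration of the classic dichotomous-item case only, not of the generalized algorithm for real-number item scores. The function name is hypothetical, and the item probabilities are assumed to have already been computed from the IRT item parameters at a fixed proficiency.

```python
def number_correct_distribution(probs):
    """Distribution of the number-correct score given proficiency.

    probs[i] is the probability of answering item i correctly at the fixed
    proficiency. Returns a list whose element x is P(score == x).
    """
    dist = [1.0]  # with zero items, the score is 0 with probability 1
    for p in probs:
        updated = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            updated[score] += mass * (1.0 - p)  # item answered incorrectly
            updated[score + 1] += mass * p      # item answered correctly
        dist = updated
    return dist


# Hypothetical correct-response probabilities for a three-item test
dist = number_correct_distribution([0.8, 0.6, 0.3])
for score, prob in enumerate(dist):
    print(score, round(prob, 4))
```

Each pass adds one item and costs time proportional to the number of score points so far, so the full distribution is obtained without enumerating every response pattern.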
Online pre-race education improves test scores for volunteers at a marathon.

Science.gov (United States)

Maxwell, Shane; Renier, Colleen; Sikka, Robby; Widstrom, Luke; Paulson, William; Christensen, Trent; Olson, David; Nelson, Benjamin

2017-09-01

This study examined whether an online course would lead to increased knowledge about the medical issues volunteers encounter during a marathon. Health care professionals who volunteered to provide medical coverage for an annual marathon were eligible for the study. Demographic information about medical volunteers including profession, specialty, education level and number of marathons they had volunteered for was collected. A 15-question test about the most commonly encountered medical issues was created by the authors and administered before and after the volunteers took the online educational course and compared to a pilot study the previous year. Seventy-four subjects completed the pre-test. Those who participated in the pilot study last year (N = 15) had pre-test scores that were an average of 2.4 points higher than those who did not (mean ranks: pilot study = 51.6 vs. non-pilot = 33.9, p = 0.004). Of the 74 subjects who completed the pre-test, 54 also completed the post-test. The overall post-pre mean score difference was 3.8 ± 2.7 (t = 10.5, df = 53, p [...] online education demonstrated a long-term (one-year) increase in test scores. Testing also continued to show short-term improvement in post-course test scores, compared to pre-course test scores. In general, marathon medical volunteers who had no volunteer experience demonstrated greater improvement than those who had prior volunteer experience.
Using Raters from India to Score a Large-Scale Speaking Test

Science.gov (United States)

Xi, Xiaoming; Mollaun, Pam

2011-01-01

We investigated the scoring of the Speaking section of the Test of English as a Foreign Language™ Internet-based (TOEFL iBT®) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…
Discriminant Validity of the WISC-IV Culture-Language Interpretive Matrix

Science.gov (United States)

Styck, Kara M.; Watkins, Marley W.

2014-01-01

The Culture-Language Interpretive Matrix (C-LIM) was developed to help practitioners determine the validity of test scores obtained from students who are culturally and linguistically different from the normative group of a test. The present study used an idiographic approach to investigate the diagnostic utility of the C-LIM for the Wechsler…
Group differences in the heritability of items and test scores

NARCIS (Netherlands)

Wicherts, J.M.; Johnson, W.

2009-01-01

It is important to understand potential sources of group differences in the heritability of intelligence test scores. On the basis of a basic item response model we argue that heritabilities which are based on dichotomous item scores normally do not generalize from one sample to the next. If groups
Adequate proverb interpretation is associated with performance on the independent living scales.

Science.gov (United States)

Ahmed, Fayeza S; Miller, L Stephen

2015-01-01

The purpose of this study was to examine proverb interpretation performance and functional independence in older adults. From the limited literature on proverb interpretation in aging and its conceptualization as an executive function, it was hypothesized that proverb interpretation would be related to functional independence similar to other executive functions. Tests of proverb interpretation, additional executive functions, and functional ability were administered to nondemented older adults. Results showed that proverb interpretation accounted for a significant amount of unique variance of functional ability scores. This supports including a measure of proverb interpretation in the assessment of older adults.
Interpretation and inverse analysis of the wedge splitting test

DEFF Research Database (Denmark)

Østergaard, Lennart; Stang, Henrik

2002-01-01

to the wedge splitting test and that it is well suited for the interpretation of test results in terms of s(w). A fine agreement between the hinge and FEM-models has been found. It has also been found that the test and the hinge model form a solid basis for inverse analysis. The paper also discusses possible...... three dimensional problems in the experiment as well as the influence of specimen size....
Interpretation of Chemical Pathology Test Results in Paediatrics ...

African Journals Online (AJOL)

Whenever we interpret paediatric chemical pathology test results, we must take into consideration a number of factors that are related and restricted to paediatric patients. Such factors include the paediatric patient's age, which may range from prematurity to above 18 years, and the paediatric patient's body weight ...
Testing the applicability of the SASS5 scoring procedure for ...

African Journals Online (AJOL)

A study was undertaken between 29th January and 17th February 2004 to test the applicability of the South African Scoring System Version 5 (SASS5) scoring and calculation procedure in nutrient-enriched palustrine wetlands in the midlands of KwaZulu-Natal, South Africa. Four reference wetlands and three dairy-effluent ...
Evaluating the Predictive Validity of Graduate Management Admission Test Scores

Science.gov (United States)

Sireci, Stephen G.; Talento-Miller, Eileen

2006-01-01

Admissions data and first-year grade point average (GPA) data from 11 graduate management schools were analyzed to evaluate the predictive validity of Graduate Management Admission Test® (GMAT®) scores and the extent to which predictive validity held across sex and race/ethnicity. The results indicated GMAT verbal and quantitative scores had…

Indications and interpretation of esophageal function testing.

Science.gov (United States)

Gyawali, C Prakash; de Bortoli, Nicola; Clarke, John; Marinelli, Carla; Tolone, Salvatore; Roman, Sabine; Savarino, Edoardo

2018-05-12

Esophageal symptoms are common, and can arise from mucosal, motor, functional, and neoplastic processes, among others. Judicious use of diagnostic testing can help define the etiology of symptoms and can direct management. Endoscopy, esophageal high-resolution manometry (HRM), ambulatory pH or pH-impedance manometry, and barium radiography are commonly used for esophageal function testing; functional lumen imaging probe is an emerging option. Recent consensus guidelines have provided direction in using test findings toward defining mechanisms of esophageal symptoms. The Chicago Classification describes hierarchical steps in diagnosing esophageal motility disorders. The Lyon Consensus characterizes conclusive evidence on esophageal testing for a diagnosis of gastroesophageal reflux disease (GERD), and establishes a motor classification of GERD. Taking these recent advances into consideration, our discussion focuses primarily on the indications, technique, equipment, and interpretation of esophageal HRM and ambulatory reflux monitoring in the evaluation of esophageal symptoms, and describes indications for alternative esophageal tests. © 2018 New York Academy of Sciences.
Design of a single-borehole hydraulic test programme allowing for interpretation-based errors

International Nuclear Information System (INIS)

Black, J.H.

1987-07-01

Hydraulic testing using packers in single boreholes is one of the most important sources of data to safety assessment modelling in connection with the disposal of radioactive waste. It is also one of the most time-consuming and expensive. It is important that the results are as reliable as possible and as accurate as necessary for the use that is made of them. There are many causes of possible error and inaccuracy ranging from poor field practice to inappropriate interpretation procedure. The report examines and attempts to quantify the size of error arising from the accidental use of an inappropriate or inadequate interpretation procedure. In doing so, it can be seen which interpretation procedure or combination of procedures results in least error. Lastly, the report attempts to use the previous conclusions from interpretation to propose forms of field test procedure where interpretation-based errors will be minimised. Hydraulic tests (sometimes known as packer tests) come in three basic forms: slug/pulse, constant flow and constant head. They have different characteristics, some measuring a variable volume of rock (dependent on hydraulic conductivity) and some having a variable duration (dependent on hydraulic conductivity). A combination of different tests in the same interval is seen as desirable. For the purposes of assessing interpretation-based errors, slug and pulse tests are considered together as are constant flow and constant head tests. The same method is used in each case to assess errors. The method assumes that the simplest analysis procedure (cylindrical flow in homogeneous isotropic porous rock) will be used on each set of field data. The error is assessed by calculating synthetic data for alternative configurations (e.g. fissured rock, anisotropic rock, inhomogeneous rock - i.e. skin - etc.) and then analyzing this data using the simplest analysis procedure. 28 refs., 26 figs
Determination of difficult concepts in the interpretation of musculoskeletal radiographs using a web-based learning/teaching tool

International Nuclear Information System (INIS)

Nunn, Heidi; Nunn, David L.

2011-01-01

Aim: To identify which aspects of musculoskeletal radiograph image interpretation users of a web-based learning resource found to be most difficult. Method: The resource provides modular online training, based on twelve musculoskeletal anatomical and pathological areas. At the end of each module is a multiple choice self-test, which users can utilize to consolidate their learning. There are 217 questions within the tests. The results for all questions answered on or before 1st February 2011 were analyzed, and the lowest scoring 25% of questions subsequently reviewed. A low-scoring question implies that the subject was difficult. Results: Users provided a total of 117,097 answers. The range of scores provided by the test questions varied significantly (P < 0.0001), from 15.8% to 93.8%. Topics appearing in the lowest quartile were analyzed in detail. They included interpretation of paediatric radiographs, the Salter-Harris classification, soft-tissue signs and the identification of multiple injuries. The lowest scoring modules were the shoulder and ankle. Conclusion: The results of this study will help to guide educators both within radiography and other health professions in providing more targeted teaching in musculoskeletal image interpretation.
Radiographic interpretation of the appendicular skeleton: A comparison between casualty officers, nurse practitioners and radiographers

International Nuclear Information System (INIS)

Coleman, Liz; Piper, Keith

2009-01-01

Aim: To assess how accurately and confidently casualty officers, nurse practitioners and radiographers, practicing within the emergency department (ED), recognize and describe radiographic trauma within an image test bank of 20 appendicular radiographs. Method: The participants consisted of 7 casualty officers, 13 nurse practitioners and 18 radiographers. All 20 radiographic examinations selected for the image test bank had been acquired following trauma and included some subtle, yet clinically significant abnormalities. The test bank score (maximum 40 marks), sensitivity and specificity percentages were calculated against an agreed radiological diagnosis (reference standard). Alternative Free-response Receiver Operating Characteristic (AFROC) analysis was used to assess the overall performance of the diagnostic accuracy of these professional groups. The variation in performance between each group was measured using the analysis of variance (ANOVA) test, to identify any statistical significant differences in the performance in interpretation between these groups. The relationship between the participants' perceived image interpretation accuracy during clinical practice and the actual accuracy of their image test bank score was examined using Pearson's Correlation Coefficient (r). Results: The results revealed that the radiographers gained the highest mean test bank score (28.5/40; 71%). This score was statistically higher than the mean test bank scores attained by the participating nurse practitioners (21/40; 53%) and casualty officers (21.5/40; 54%), with p < 0.01 and p = 0.02, respectively. When compared with each other, the scores from these latter groups showed no significant difference (p = 0.91). The mean 'area under the curve' (AUC) value achieved by the radiographers was also significantly higher (p < 0.01) in comparison to the AUC values demonstrated by the nurse practitioners and casualty officers, whose results, when compared, showed no significant
Effects of Test Media on Different EFL Test-Takers in Writing Scores and in the Cognitive Writing Process

Science.gov (United States)

Zou, Xiao-Ling; Chen, Yan-Min

2016-01-01

The effects of computer and paper test media on EFL test-takers with different computer familiarity in writing scores and in the cognitive writing process have been comprehensively explored from the learners' aspect as well as on the basis of related theories and practice. The results indicate significant differences in test scores among the…
Does breastfeeding contribute to the racial gap in reading and math test scores?

Science.gov (United States)

Peters, Kristen E; Huang, Jin; Vaughn, Michael G; Witko, Christopher

2013-10-01

The aim of this study was to examine the impact of divergent breastfeeding practices between Caucasian and African American mothers on the lingering achievement test gap between Caucasian and African American children. The Child Development Supplement of the Panel Study of Income Dynamics, beginning in 1997, followed a cohort of 3563 children aged 0-12 years. Reading and math test scores from 2002 for 1928 children were linked with breastfeeding history. Regression analysis was used to examine associations between ever having been breastfed and duration of breastfeeding and test scores, controlling for characteristics of child, mother, and household. African American students scored significantly lower than Caucasian children by 10.6 and 10.9 points on reading and math tests, respectively. After accounting for the impact of having been breastfed during infancy, the racial test gap decreased by 17% for reading scores and 9% for math scores. Study findings indicate that breastfeeding explains 17% and 9% of the observed gaps in reading and math scores, respectively, between African Americans and Caucasians, an effect larger than most recent educational policy interventions. Renewed efforts around policies and clinical practices that promote and remove barriers for African American mothers to breastfeed should be implemented. Copyright © 2013 Elsevier Inc. All rights reserved.
Validation of new prognostic and predictive scores by sequential testing approach

International Nuclear Information System (INIS)

Nieder, Carsten; Haukland, Ellinor; Pawinski, Adam; Dalhaug, Astrid

2010-01-01

Background and Purpose: For practitioners, the question arises how their own patient population differs from that used in large-scale analyses resulting in new scores and nomograms and whether such tools actually are valid at a local level and thus can be implemented. A recent article proposed an easy-to-use method for the in-clinic validation of new prediction tools with a limited number of patients, a so-called sequential testing approach. The present study evaluates this approach in scores related to radiation oncology. Material and Methods: Three different scores were used, each predicting short overall survival after palliative radiotherapy (bone metastases, brain metastases, metastatic spinal cord compression). For each scenario, a limited number of consecutive patients entered the sequential testing approach. The positive predictive value (PPV) was used for validation of the respective score and it was required that the PPV exceeded 80%. Results: For two scores, validity in the own local patient population could be confirmed after entering 13 and 17 patients, respectively. For the third score, no decision could be reached even after increasing the sample size to 30. Conclusion: In-clinic validation of new predictive tools with sequential testing approach should be preferred over uncritical adoption of tools which provide no significant benefit to local patient populations. Often the necessary number of patients can be reached within reasonable time frames even in small oncology practices. In addition, validation is performed continuously as the data are collected. (orig.)
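The validation criterion in this abstract (a positive predictive value above 80%) can be illustrated with a minimal sketch. The function names and the patient counts below are hypothetical and chosen only for illustration; the sketch computes the PPV and checks it against the threshold, and it does not reproduce the sequential stopping rule itself, which the abstract does not describe.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): the share of positive predictions that were correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no positive predictions; PPV is undefined")
    return true_positives / flagged


def exceeds_threshold(true_positives: int, false_positives: int,
                      threshold: float = 0.80) -> bool:
    """Check the abstract's requirement that the PPV exceeds 80%."""
    return positive_predictive_value(true_positives, false_positives) > threshold


# Hypothetical counts (not taken from the study): the score predicted short
# survival for 13 patients and the prediction was correct for 11 of them.
ppv = positive_predictive_value(true_positives=11, false_positives=2)
print(f"PPV = {ppv:.3f}, exceeds 80%: {exceeds_threshold(11, 2)}")
```

Note that the comparison is strict, so a PPV of exactly 80% would not count as exceeding the threshold in this sketch.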
Validation of new prognostic and predictive scores by sequential testing approach

Energy Technology Data Exchange (ETDEWEB)

Nieder, Carsten [Radiation Oncology Unit, Nordland Hospital, Bodo (Norway); Inst. of Clinical Medicine, Univ. of Tromso (Norway); Haukland, Ellinor; Pawinski, Adam; Dalhaug, Astrid [Radiation Oncology Unit, Nordland Hospital, Bodo (Norway)

2010-03-15

Background and Purpose: For practitioners, the question arises how their own patient population differs from that used in large-scale analyses resulting in new scores and nomograms and whether such tools actually are valid at a local level and thus can be implemented. A recent article proposed an easy-to-use method for the in-clinic validation of new prediction tools with a limited number of patients, a so-called sequential testing approach. The present study evaluates this approach in scores related to radiation oncology. Material and Methods: Three different scores were used, each predicting short overall survival after palliative radiotherapy (bone metastases, brain metastases, metastatic spinal cord compression). For each scenario, a limited number of consecutive patients entered the sequential testing approach. The positive predictive value (PPV) was used for validation of the respective score and it was required that the PPV exceeded 80%. Results: For two scores, validity in the own local patient population could be confirmed after entering 13 and 17 patients, respectively. For the third score, no decision could be reached even after increasing the sample size to 30. Conclusion: In-clinic validation of new predictive tools with sequential testing approach should be preferred over uncritical adoption of tools which provide no significant benefit to local patient populations. Often the necessary number of patients can be reached within reasonable time frames even in small oncology practices. In addition, validation is performed continuously as the data are collected. (orig.)
Accountancy, teaching methods, sex, and American College Test scores.

Science.gov (United States)

Heritage, J; Harper, B S; Harper, J P

1990-10-01

This study examines the significance of sex, methodology, academic preparation, and age as related to development of judgmental and problem-solving skills. Sex, American College Test (ACT) Mathematics scores, Composite ACT scores, grades in course work, grade point average (GPA), and age were used in studying the effects of teaching method on 96 students' ability to analyze data in financial statements. Results reflect positively on accounting students compared to the general college population and the women students in particular.
Interpreting Mathematics Scores on the New Jersey College Basic Skills Placement Test.

Science.gov (United States)

Dass, Jane; Pine, Charles

The New Jersey College Basic Skills Placement Test (NJCBSPT) is designed to measure certain basic language and mathematics skills of students entering New Jersey colleges. The primary purpose of the two mathematics sections is to determine whether students are prepared to begin certain college-level work without a handicap in computation or…
Reduce, Reuse, Recycle: The Longitudinal Value of Local Cut Scores Using State Test Data

Science.gov (United States)

Nelson, Peter M.; Van Norman, Ethan R.; VanDerHeyden, Amanda

2017-01-01

We used existing reading (n = 1,498) and math (n = 2,260) data to evaluate state test scores for screening middle school students. In Phase 1, state test data were used to create a research-derived cut score that was optimal for predicting state test performance the following year. In Phase 2, those cut scores were applied with future cohorts.…
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

Science.gov (United States)

Kosinski, Andrzej S

2013-03-15

Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
Validity of GRE General Test scores and TOEFL scores for graduate admission to a technical university in Western Europe

Science.gov (United States)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the Master's programme grade point average (GGPA) with and without the addition of the undergraduate GPA (UGPA) and the TOEFL score, and of GRE scores for study completion and Master's thesis performance. GRE scores explained 20% of the variation in the GGPA, while an additional 7% was explained by the TOEFL score and 3% by the UGPA. Contrary to common belief, the GRE quantitative reasoning score showed only little explanatory power. GRE scores were also weakly related to study progress but not to thesis performance. Nevertheless, GRE and TOEFL scores were found to be sensible admissions instruments. Rigorous methodology was used to obtain highly reliable results.
The Performance of the Upper Limb scores correlate with pulmonary function test measures and Egen Klassifikation scores in Duchenne muscular dystrophy.

Science.gov (United States)

Lee, Ha Neul; Sawnani, Hemant; Horn, Paul S; Rybalsky, Irina; Relucio, Lani; Wong, Brenda L

2016-01-01

The Performance of the Upper Limb scale was developed as an outcome measure specifically for ambulant and non-ambulant patients with Duchenne muscular dystrophy and is implemented in clinical trials needing longitudinal data. The aim of this study is to determine whether this novel tool correlates with functional ability using pulmonary function test, cardiac function test and Egen Klassifikation scale scores as clinical measures. In this cross-sectional study, 43 non-ambulatory Duchenne males from ages 10 to 30 years and on long-term glucocorticoid treatment were enrolled. Cardiac and pulmonary function test results were analyzed to assess cardiopulmonary function, and Egen Klassifikation scores were analyzed to assess functional ability. The Performance of the Upper Limb scores correlated with pulmonary function measures and had inverse correlation with Egen Klassifikation scores. There was no correlation with left ventricular ejection fraction and left ventricular dysfunction. Body mass index and decreased joint range of motion affected total Performance of the Upper Limb scores and should be considered in clinical trial designs. Copyright © 2016 Elsevier B.V. All rights reserved.
Relative User Ratings of MMPI-2 Computer-Based Test Interpretations

Science.gov (United States)

Williams, John E.; Weed, Nathan C.

2004-01-01

There are eight commercially available computer-based test interpretations (CBTIs) for the Minnesota Multiphasic Personality Inventory-2 (MMPI-2), of which few have been empirically evaluated. Prospective users of these programs have little scientific data to guide choice of a program. This study compared ratings of these eight CBTIs. Test users…
Relative Merits of Four Methods for Scoring Cloze Tests.

Science.gov (United States)

Brown, James Dean

1980-01-01

Describes study comparing merits of exact answer, acceptable answer, clozentropy and multiple choice methods for scoring tests. Results show differences among reliability, mean item facility, discrimination and usability, but not validity. (BK)
The evaluation of an open source online training system for teaching 12 lead electrocardiographic interpretation.

Science.gov (United States)

Breen, Cathal; Zhu, Tingting; Bond, Raymond; Finlay, Dewar; Clifford, Gari

2016-01-01

The aim of this study is to present and evaluate the integration of a low resource JavaScript based ECG training interface (CrowdLabel) and a standardised curriculum for self-guided tuition in ECG interpretation. Participants practiced interpreting ECGs weekly using the CrowdLabel interface to assist with the learning of the traditional didactic taught course material during a 6 week training period. To determine competency students were tested during week 7. A total of 245 unique ECG cases were submitted by each student. Accuracy scores during the training period ranged from 0-59.5% (median = 33.3%). Conversely accuracy scores during the test ranged from 30 - 70% (median = 37.5%) (p < 0.05). There was no correlation between students who interpreted high numbers of ECGs during the training period and their marks obtained. CrowdLabel is shown to be a readily accessible dedicated learning platform to support ECG interpretation competency. Copyright © 2016 Elsevier Inc. All rights reserved.
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach.

Science.gov (United States)

Xu, Jian

2017-01-01

The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers' listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms.
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach

Directory of Open Access Journals (Sweden)

Jian Xu

2017-12-01

The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers’ listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms.
The Dental Hygiene Aptitude Tests and the American College Testing Program Tests as Predictors of Scores on the National Board Dental Hygiene Examination.

Science.gov (United States)

Longenbecker, Sueann; Wood, Peter H.

1984-01-01

Scores from the National Board Dental Hygiene Examination (NBDHE) served as the criterion variable in a comparison of the predictive validity of the Dental Hygiene Aptitude Tests (DHAT) and the ACT Assessment tests. The DHAT-Science and Verbal tests combined to produce the highest multiple correlation with NBDHE scores. (Author/DWH)

Contributions of Hamstring Stiffness to Straight-Leg-Raise and Sit-and-Reach Test Scores.

Science.gov (United States)

Miyamoto, Naokazu; Hirata, Kosuke; Kimura, Noriko; Miyamoto-Mikami, Eri

2018-02-01

The passive straight-leg-raise (PSLR) and the sit-and-reach (SR) tests have been widely used to assess hamstring extensibility. However, it remains unclear to what extent hamstring stiffness (a measure of material properties) contributes to PSLR and SR test scores. Therefore, we aimed to clarify the relationship between hamstring stiffness and PSLR and SR scores using ultrasound shear wave elastography. Ninety-eight healthy subjects completed the study. Each subject completed PSLR testing, and classic and modified SR testing of the right leg. Muscle shear modulus of the biceps femoris, semitendinosus, and semimembranosus was quantified as an index of muscle stiffness. The relationships between shear modulus of each muscle and PSLR or SR scores were calculated using Pearson's product-moment correlation coefficients. Shear modulus of the semitendinosus and semimembranosus showed negative correlations with the two PSLR and two SR scores (absolute r value≤0.484). Shear modulus of the biceps femoris was significantly correlated with the PSLR score determined by the examiner and the modified SR score (absolute r value≤0.308). The present findings suggest that PSLR and SR test scores are strongly influenced by factors other than hamstring stiffness and therefore might not accurately evaluate hamstring stiffness. © Georg Thieme Verlag KG Stuttgart · New York.
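The conclusion that factors other than hamstring stiffness dominate these test scores follows from squaring the reported correlation bounds: for a simple linear relationship, r squared is the proportion of variance shared between the two measures. The sketch below is only an illustration of that arithmetic, and the function name is an assumption, not something from the study.

```python
def shared_variance(r: float) -> float:
    """Coefficient of determination for a simple linear relationship (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# Upper bounds on |r| as reported in the abstract.
reported_bounds = (
    ("semitendinosus / semimembranosus", 0.484),
    ("biceps femoris", 0.308),
)
for muscle, r_max in reported_bounds:
    print(f"{muscle}: |r| <= {r_max} gives r^2 <= {shared_variance(r_max):.3f}")
```

Under this reading, even the largest reported correlation leaves more than three quarters of the variance in the test scores unaccounted for by shear modulus.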
Manual for Scoring the Test of Directed Imagination.

Science.gov (United States)

Veldman, Donald J.; And Others

A scoring manual for the Directed Imagination Test, a projective technique wherein the subject is instructed to write four fictional stories (four minutes are allowed for each) about teachers and their experiences, is presented. The manual provides detailed instructions for rating each story by fifteen dimensions relevant to teacher education…
AP Trends: Tests Soar, Scores Slip--Gaps between Groups Spur Equity Concerns

Science.gov (United States)

Cech, Scott J.

2008-01-01

More students are taking Advanced Placement tests, but the proportion of tests receiving what is deemed a passing score has dipped, and the mean score is down for the fourth year in a row. Data released here this week by the New York City-based nonprofit organization that owns the AP brand shows that a greater-than-ever proportion of students…
Simulation and interpretation of inter-well tracer tests

Directory of Open Access Journals (Sweden)

Dugstad Øyvind

2013-05-01

In inter-well tracer tests (IWTT), chemical compounds or radioactive isotopes are used to label injection water and gas to establish well connections and fluid patterns in petroleum reservoirs. Tracer simulation is an invaluable tool to ease the interpretation of IWTT results and is also required for assisted history matching application of tracer data. In this paper we present a new simulation technique to analyse and interpret tracer results. Laboratory results are used to establish and test formulations of the tracer conservation equations, and the technique is used to provide simulated tracer responses that are compared with observed tracer data from an extensive tracer program. The implemented tracer simulation methodology uses fast post-processing of previously simulated reservoir simulation runs. This provides a fast, flexible and powerful method for analysing gas tracer behaviour in reservoirs. We show that simulation time for tracers can be reduced by a factor of 100 compared to solving the tracer flow equations simultaneously with the reservoir fluid flow equations. The post-processing technique, combined with a flexible built-in local tracer-grid refinement, is exploited to reduce numerical smearing, which is particularly severe for narrow tracer pulses.
The Formalization of Fairness: Issues in Testing for Measurement Invariance Using Subtest Scores

Science.gov (United States)

Molenaar, Dylan; Borsboom, Denny

2013-01-01

Measurement invariance is an important prerequisite for the adequate comparison of group differences in test scores. In psychology, measurement invariance is typically investigated by means of linear factor analyses of subtest scores. These subtest scores typically result from summing the item scores. In this paper, we discuss 4 possible problems…
Embodiment in tests of cognitive functioning: A study of an interpreter-mediated dementia evaluation.

Science.gov (United States)

Majlesi, Ali Reza; Plejert, Charlotta

2018-02-01

This study explores how manners of mediation, and the use of embodiment in interpreter-mediated conversation have an impact on tests of cognitive functioning in a dementia evaluation. By a detailed analysis of video recordings, we show how participants-an occupational therapist, an interpreter, and a patient-use embodied practices to make the tasks of a test of cognitive functioning intelligible, and how participants collaboratively put the instructions of the tasks into practice. We demonstrate that both instructions and instructed actions-and the whole procedure of accomplishing the tasks-are shaped co-operatively by embodied practices of all three participants involved in the test situation. Consequently, the accomplishment of the tasks should be viewed as the outcome of a collaborative achievement of instructed actions, rather than an individual product. The result of the study calls attention to issues concerning interpretations of, and the reliability of interpreter-mediated tests and their bearings for diagnostic procedures in dementia evaluations.
ECG interpretation in Emergency Department residents: an update and e-learning as a resource to improve skills.

Science.gov (United States)

Barthelemy, Francois X; Segard, Julien; Fradin, Philippe; Hourdin, Nicolas; Batard, Eric; Pottier, Pierre; Potel, Gilles; Montassier, Emmanuel

2017-04-01

ECG interpretation is a pivotal skill to acquire during residency, especially for Emergency Department (ED) residents. Previous studies reported that ECG interpretation competency among residents was rather low. However, the optimal resource to improve ECG interpretation skills remains unclear. The aim of our study was to compare two teaching modalities to improve the ECG interpretation skills of ED residents: e-learning and lecture-based courses. The participants were first-year and second-year ED residents, assigned randomly to the two groups. The ED residents were evaluated by means of a precourse test at the beginning of the study and a postcourse test after the e-learning and lecture-based courses. These evaluations consisted of the interpretation of 10 different ECGs. We included 39 ED residents from four different hospitals. The precourse test showed that the overall average score of ECG interpretation was 40%. Nineteen participants were then assigned to the e-learning course and 20 to the lecture-based course. Globally, there was a significant improvement in ECG interpretation skills (accuracy score=55%, P=0.0002). However, this difference was not significant between the two groups (P=0.14). Our findings showed that the ECG interpretation was not optimal and that our e-learning program may be an effective tool for enhancing ECG interpretation skills among ED residents. A large European study should be carried out to evaluate ECG interpretation skills among ED residents before the implementation of ECG learning, including e-learning strategies, during ED residency.
Explaining the black-white gap in cognitive test scores: Toward a theory of adverse impact.

Science.gov (United States)

Cottrell, Jonathan M; Newman, Daniel A; Roisman, Glenn I

2015-11-01

In understanding the causes of adverse impact, a key parameter is the Black-White difference in cognitive test scores. To advance theory on why Black-White cognitive ability/knowledge test score gaps exist, and on how these gaps develop over time, the current article proposes an inductive explanatory model derived from past empirical findings. According to this theoretical model, Black-White group mean differences in cognitive test scores arise from the following racially disparate conditions: family income, maternal education, maternal verbal ability/knowledge, learning materials in the home, parenting factors (maternal sensitivity, maternal warmth and acceptance, and safe physical environment), child birth order, and child birth weight. Results from a 5-wave longitudinal growth model estimated on children in the NICHD Study of Early Child Care and Youth Development from ages 4 through 15 years show significant Black-White cognitive test score gaps throughout early development that did not grow significantly over time (i.e., significant intercept differences, but not slope differences). Importantly, the racially disparate conditions listed above can account for the relation between race and cognitive test scores. We propose a parsimonious 3-Step Model that explains how cognitive test score gaps arise, in which race relates to maternal disadvantage, which in turn relates to parenting factors, which in turn relate to cognitive test scores. This model and results offer to fill a need for theory on the etiology of the Black-White ethnic group gap in cognitive test scores, and attempt to address a missing link in the theory of adverse impact. (c) 2015 APA, all rights reserved.
The impact of image test bank construction on radiographic interpretation outcomes: A comparison study

International Nuclear Information System (INIS)

Hardy, M.; Flintham, K.; Snaith, B.; Lewis, E.F.

2016-01-01

Introduction: Assessment of image interpretation competency is commonly undertaken through review of a defined image test bank. Content of these image banks has been criticised for the high percentage of abnormal examinations which contrasts with lower reported incidences of abnormal radiographs in clinical practice. As a result, questions have been raised regarding the influence of prevalence bias on the accuracy of interpretive decision making. This article describes a new and novel approach to the design of musculoskeletal image test banks. Methods: Three manufactured image banks were compiled following a standard academic menu in keeping with previous studies. Three further image test banks were constructed to reflect local clinical workload within a single NHS Trust. Eighteen radiographers, blinded to the method of test bank composition, were randomly assigned 2 test banks to review (1 manufactured, 1 clinical workload). Comparison of interpretive accuracy was undertaken. Results: Inter-rater agreement was moderate to good for all image banks (manufactured: range k = 0.45–0.68; clinical workload: k = 0.49–0.62). A significant difference in mean radiographer sensitivity was noted between test bank designs (manufactured 87.1%; clinical workload 78.5%; p = 0.040, 95% CI = 0.4–16.8; t = 2.223). Relative parity in radiographer specificity and overall accuracy was observed. Conclusion: This study confirms the findings of previous research that high abnormality prevalence image banks over-estimate the ability of observers to identify abnormalities. Assessment of interpretive competency using an image bank that reflects local clinical practice is a better approach to accurately establish interpretive competency and the learning development needs of individual practitioners. - Highlights: • High prevalence image test banks over-estimate the ability of observers. • Clinical workload test banks may better reflect image interpretation competency.
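The accuracy and agreement measures named in this abstract (sensitivity, specificity, and the kappa statistic for inter-rater agreement) can be sketched from simple counts. This is a minimal illustration assuming two raters and a binary normal/abnormal decision; the function names and all counts are hypothetical, and the study's own computation may have differed.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of truly abnormal images that the reader called abnormal."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of truly normal images that the reader called normal."""
    return true_neg / (true_neg + false_pos)


def cohens_kappa(both_abnormal: int, only_a_abnormal: int,
                 only_b_abnormal: int, both_normal: int) -> float:
    """Cohen's kappa for two raters: agreement corrected for chance agreement."""
    n = both_abnormal + only_a_abnormal + only_b_abnormal + both_normal
    observed = (both_abnormal + both_normal) / n
    a_abnormal = (both_abnormal + only_a_abnormal) / n  # rater A's abnormal rate
    b_abnormal = (both_abnormal + only_b_abnormal) / n  # rater B's abnormal rate
    expected = a_abnormal * b_abnormal + (1 - a_abnormal) * (1 - b_abnormal)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
print(f"sensitivity = {sensitivity(27, 4):.3f}")
print(f"specificity = {specificity(18, 2):.3f}")
print(f"kappa       = {cohens_kappa(20, 5, 10, 15):.2f}")
```

Because kappa subtracts the agreement expected by chance, two raters who agree on 70% of images can still show only moderate agreement, which is why it is reported alongside raw accuracy.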
Design and interpretation of anthropometric and fitness testing of basketball players.

Science.gov (United States)

Drinkwater, Eric J; Pyne, David B; McKenna, Michael J

2008-01-01

The volume of literature on fitness testing in court sports such as basketball is considerably less than for field sports or individual sports such as running and cycling. Team sport performance is dependent upon a diverse range of qualities including size, fitness, sport-specific skills, team tactics, and psychological attributes. The game of basketball has evolved to have a high priority on body size and physical fitness by coaches and players. A player's size has a large influence on the position in the team, while the high-intensity, intermittent nature of the physical demands requires players to have a high level of fitness. Basketball coaches and sport scientists often use a battery of sport-specific physical tests to evaluate body size and composition, and aerobic fitness and power. This testing may be used to track changes within athletes over time to evaluate the effectiveness of training programmes or screen players for selection. Sports science research is establishing typical (or 'reference') values for both within-athlete changes and between-athlete differences. Newer statistical approaches such as magnitude-based inferences have emerged that are providing more meaningful interpretation of fitness testing results in the field for coaches and athletes. Careful selection and implementation of tests, and more pertinent interpretation of data, will enhance the value of fitness testing in high-level basketball programmes. This article presents reference values of fitness and body size in basketball players, and identifies practical methods of interpreting changes within players and differences between players beyond the null-hypothesis.
A Maturing Global Testing Regime Meets the World Economy: Test Scores and Economic Growth, 1960-2012

Science.gov (United States)

Kamens, David H.

2015-01-01

This article considers the growth of the international testing regime. It discusses sources of growth and empirically examines two related sets of issues: (1) the stability of countries' achievement scores, and (2) the influence of those national scores on subsequent economic development over different time lags. The article suggests that…
The effects of a novel hostile interpretation bias modification paradigm on hostile interpretations, mood, and aggressive behavior.

Science.gov (United States)

AlMoghrabi, Nouran; Huijding, Jorg; Franken, Ingmar H A

2018-03-01

Cognitive theories of aggression propose that biased information processing is causally related to aggression. To test these ideas, the current study investigated the effects of a novel cognitive bias modification paradigm (CBM-I) designed to target interpretations associated with aggressive behavior. Participants aged 18-33 years old were randomly assigned to either a single session of positive training (n = 40) aimed at increasing prosocial interpretations or negative training (n = 40) aimed at increasing hostile interpretations. The results revealed that the positive training resulted in an increase in prosocial interpretations while the negative training seemed to have no effect on interpretations. Importantly, in the positive condition, a positive change in interpretations was related to lower anger and verbal aggression scores after the training. In this condition, participants also reported an increase in happiness. In the negative training no such effects were found. However, the better participants performed on the negative training, the more their interpretations were changed in a negative direction and the more aggression they showed on the behavioral aggression task. Participants were healthy university students. Therefore, results should be confirmed within a clinical population. These findings provide support for the idea that this novel CBM-I paradigm can be used to modify interpretations, and suggest that these interpretations are related to mood and aggressive behavior. Copyright © 2017 Elsevier Ltd. All rights reserved.
Examining the Impact of Unscorable Item Responses on the Validity and Interpretability of MMPI-2/MMPI-2-RF Restructured Clinical (RC) Scale Scores

Science.gov (United States)

Dragon, Wendy R.; Ben-Porath, Yossef S.; Handel, Richard W.

2012-01-01

This article examined the impact of unscorable item responses on the psychometric validity and practical interpretability of scores on the Restructured Clinical (RC) Scales of the Minnesota Multiphasic Personality Inventory-2/Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2/MMPI-2-RF). In analyses conducted with five…
Guidelines to Interpret Results of Mechanical Blade Test

International Nuclear Information System (INIS)

Arias Vega, F.; Sanz Martin, J. C.

1999-01-01

This report shows the interpretation of full-scale rotor blade test results and describes the engineering testing models and coefficients for any feasible rotor blade design, in order to accept and certify any final manufactured blade as an allowable product, fit for use and working with complete safety throughout the wind turbine's lifetime. This work was carried out at the Wind Energy Division of the CIEMAT.DER and is based on the authors' technical experience in this field, after many years of testing blades. This paper also contains results of the European wind turbine Standards II relevant to the European Project JOULE III R.D., in which the Wind Energy Division also took part. (Author)
Guidelines to Interpret Results of Mechanical Blade Test

Energy Technology Data Exchange (ETDEWEB)

Arias Vega, F.; Sanz Martin, J. C. [Ciemat, Madrid (Spain)]

2000-07-01

This report shows the interpretation of full-scale rotor blade test results and describes the engineering testing models and coefficients for any feasible rotor blade design, in order to accept and certify any final manufactured blade as an allowable product, fit for use and working with complete safety throughout the wind turbine's lifetime. This work was carried out at the Wind Energy Division of the CIEMAT.DER and is based on the authors' technical experience in this field, after many years of testing blades. This paper also contains results of the European wind turbine Standards II relevant to the European Project JOULE III R.D., in which the Wind Energy Division also took part. (Author)
An improved method for interpreting API filter press hydraulic conductivity test results

International Nuclear Information System (INIS)

Heslin, G.M.; Baxter, D.Y.; Filz, G.M.; Davidson, R.R.

1997-01-01

The American Petroleum Institute (API) filter press is frequently used to measure the hydraulic conductivity of soil-bentonite backfill during the mix design process and as part of construction quality controls. However, interpretation of the test results is complicated by the fact that the seepage-induced consolidation pressure varies from zero at the top of the specimen to a maximum value at the bottom of the specimen. An analytical solution is available which relates the stress, compressibility, and hydraulic conductivity in soil consolidated by seepage forces. This paper presents the results of a laboratory investigation undertaken to support application of this theory to API hydraulic conductivity tests. When the API test results are interpreted using seepage consolidation theory, they are in good agreement with the results of consolidometer permeameter tests. Limitations of the API test are also discussed.
Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

Science.gov (United States)

Haverinen-Shaughnessy, Ulla; Shaughnessy, Richard J

2015-01-01

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.
[Do different interpretative methods used for evaluation of checkerboard synergy test affect the results?].

Science.gov (United States)

Ozseven, Ayşe Gül; Sesli Çetin, Emel; Ozseven, Levent

2012-07-01

In recent years, owing to the presence of multi-drug resistant nosocomial bacteria, combination therapies are more frequently applied. Thus there is more need to investigate the in vitro activity of drug combinations against multi-drug resistant bacteria. Checkerboard synergy testing is among the most widely used standard techniques to determine the activity of antibiotic combinations. It is based on microdilution susceptibility testing of antibiotic combinations. Although this test has a standardised procedure, there are many different methods for interpreting the results. In many previous studies carried out with multi-drug resistant bacteria, different rates of synergy have been reported with various antibiotic combinations using the checkerboard technique. These differences might be attributed to the different features of the strains. However, different synergy rates detected by the checkerboard method have also been reported in other studies using the same drug combinations and same types of bacteria. It was thought that these differences in synergy rates might be due to the different methods of interpretation of synergy test results. In recent years, multi-drug resistant Acinetobacter baumannii has been the most commonly encountered nosocomial pathogen, especially in intensive-care units. For this reason, multi-drug resistant A. baumannii has been the subject of a considerable amount of research about antimicrobial combinations. In the present study, the in vitro activities of frequently preferred combinations in A. baumannii infections, such as imipenem plus ampicillin/sulbactam and meropenem plus ampicillin/sulbactam, were tested by the checkerboard synergy method against 34 multi-drug resistant A. baumannii isolates. Minimum inhibitory concentration (MIC) values for imipenem, meropenem and ampicillin/sulbactam were determined by the broth microdilution method. Subsequently the activity of two different combinations was tested in the dilution range of 4 x MIC and 0.03 x MIC in
A randomized control trial comparing use of a novel electrocardiogram simulator with traditional teaching in the acquisition of electrocardiogram interpretation skill.

Science.gov (United States)

Fent, Graham; Gosai, Jivendra; Purva, Makani

2016-01-01

Accurate interpretation of the electrocardiogram (ECG) remains an essential skill for medical students and junior doctors. While many techniques for teaching ECG interpretation are described, no single method has been shown to be superior. This randomized control trial is the first to investigate whether teaching ECG interpretation using a computer simulator program or traditional teaching leads to improved scores in a test of ECG interpretation among medical students and postgraduate doctors immediately after and 3 months following teaching. Participants' opinions of the program were assessed using a questionnaire. There were no differences in ECG interpretation test scores immediately after or 3 months after teaching in the lecture or simulator groups. At present, therefore, there is insufficient evidence to suggest that ECG simulator programs are superior to traditional teaching. Copyright © 2016 Elsevier Inc. All rights reserved.

Optimal Scoring Methods of Hand-Strength Tests in Patients with Stroke

Science.gov (United States)

Huang, Sheau-Ling; Hsieh, Ching-Lin; Lin, Jau-Hong; Chen, Hui-Mei

2011-01-01

The purpose of this study was to determine the optimal scoring methods for measuring strength of the more-affected hand in patients with stroke by examining the effect of reducing measurement errors. Three hand-strength tests of grip, palmar pinch, and lateral pinch were administered at two sessions in 56 patients with stroke. Five scoring methods…
Testing statistical significance scores of sequence comparison methods with structure similarity

Directory of Open Access Journals (Sweden)

Leunissen Jack AM

2006-10-01

Background: In the past years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search is not only determined by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested if this could be validated when applied to existing, evolutionary related protein sequences. Results: All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion: The compute intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
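The Monte-Carlo Z-score mentioned in the abstract above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the scoring values (match, mismatch, linear gap penalty), the number of shuffles, and the function names are all assumptions chosen for illustration. The idea is to compare the observed local alignment score with the distribution of scores obtained after shuffling one of the sequences.

```python
import random
import statistics


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (illustrative scoring)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def monte_carlo_z_score(a, b, n_shuffles=200, seed=0):
    """Z = (observed score - mean shuffled score) / sd of shuffled scores."""
    rng = random.Random(seed)
    observed = smith_waterman_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps composition, destroys residue order
        shuffled_scores.append(smith_waterman_score(a, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        return float("inf") if observed > mean else 0.0
    return (observed - mean) / sd
```

Each Z-score requires one alignment per shuffle, which is why the abstract describes the method as compute intensive compared with an analytically derived e-value.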
The MMPI Assistant: A Microcomputer Based Expert System to Assist in Interpreting MMPI Profiles

Science.gov (United States)

Tanner, Barry A.

1989-01-01

The Assistant is an MS DOS program to aid clinical psychologists in interpreting the results of the Minnesota Multiphasic Personality Inventory (MMPI). Interpretive hypotheses are based on the professional literature and the author's experience. After scores are entered manually, the Assistant produces a hard copy which is intended for use by a psychologist knowledgeable about the MMPI. The rules for each hypothesis appear first on the monitor, and then in the printed output, followed by the patient's scores on the relevant scales, and narrative hypotheses for the scores. The data base includes hypotheses for 23 validity configurations, 45 two-point clinical codes, 10 high scoring single-point clinical scales, and 10 low scoring single-point clinical scales. The program can accelerate the production of test reports, while insuring that actuarial rules are not overlooked. It has been especially useful as a teaching tool with graduate students. The Assistant requires an IBM PC compatible with 128k available memory, DOS 2.x or higher, and a printer.
High Test Scores: The Wrong Road to National Economic Success

Science.gov (United States)

Baker, Keith

2011-01-01

A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…
Relationships between spatial activities and scores on the mental rotation test as a function of sex.

Science.gov (United States)

Ginn, Sheryl R; Pickens, Stefanie J

2005-06-01

Previous results suggested that female college students' scores on the Mental Rotations Test might be related to their prior experience with spatial tasks. For example, women who played video games scored better on the test than their non-game-playing peers, whereas playing video games was not related to men's scores. The present study examined whether participation in different types of spatial activities would be related to women's performance on the Mental Rotations Test. 31 men and 59 women enrolled at a small, private church-affiliated university and majoring in art or music as well as students who participated in intercollegiate athletics completed the Mental Rotations Test. Women's scores on the Mental Rotations Test benefitted from experience with spatial activities; the more types of experience the women had, the better their scores. Thus women who were athletes, musicians, or artists scored better than those women who had no experience with these activities. The opposite results were found for the men. Efforts are currently underway to assess how length of experience and which types of experience are related to scores.
Dichotomous scoring of Trails B in patients referred for a dementia evaluation.

Science.gov (United States)

Schmitt, Andrew L; Livingston, Ronald B; Smernoff, Eric N; Waits, Bethany L; Harris, James B; Davis, Kent M

2010-04-01

The Trail Making Test is a popular neuropsychological test and its interpretation has traditionally used time-based scores. This study examined an alternative approach to scoring that is simply based on the examinees' ability to complete the test. If an examinee is able to complete Trails B successfully, they are coded as "completers"; if not, they are coded as "noncompleters." To assess this approach to scoring Trails B, the performance of 97 diagnostically heterogeneous individuals referred for a dementia evaluation was examined. In this sample, 55 individuals successfully completed Trails B and 42 individuals were unable to complete it. Point-biserial correlations indicated a moderate-to-strong association (r(pb)=.73) between the Trails B completion variable and the Total Scale score of the Repeatable Battery for the Assessment of Neurological Status (RBANS), which was larger than the correlation between the Trails B time-based score and the RBANS Total Scale score (r(pb)=.60). As a screen for dementia status, Trails B completion showed a sensitivity of 69% and a specificity of 100% in this sample. These results suggest that dichotomous scoring of Trails B might provide a brief and clinically useful measure of dementia status.
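The screening figures reported above follow from a 2x2 table of test result against diagnosis. The sketch below uses one table that is consistent with the reported numbers (97 individuals, 42 noncompleters, sensitivity 69%, specificity 100%); the individual cell counts are an inference for illustration and are not given in the abstract. Noncompletion of Trails B is treated as the positive screening result.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # proportion of true cases the screen flags
    specificity = tn / (tn + fp)  # proportion of non-cases the screen clears
    return sensitivity, specificity


# Assumed cell counts (inferred, not reported in the abstract):
# 42 noncompleters with dementia, 19 completers with dementia,
# 36 completers without dementia, 0 noncompleters without dementia.
sens, spec = sensitivity_specificity(tp=42, fn=19, tn=36, fp=0)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints "sensitivity=0.69, specificity=1.00"
```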
Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

Directory of Open Access Journals (Sweden)

Ulla Haverinen-Shaughnessy

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.
Effects of Classroom Ventilation Rate and Temperature on Students’ Test Scores

Science.gov (United States)

2015-01-01

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students’ mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9–7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12–13 points per each 1°C decrease in temperature within the observed range of 20–25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students. PMID:26317643
Comparison of the Classifier Oriented Gait Score and the Gait Profile Score based on imitated gait impairments.

Science.gov (United States)

Christian, Josef; Kröll, Josef; Schwameder, Hermann

2017-06-01

Common summary measures of gait quality such as the Gait Profile Score (GPS) are based on the principle of measuring a distance from the mean pattern of a healthy reference group in a gait pattern vector space. The recently introduced Classifier Oriented Gait Score (COGS) is a pathology specific score that measures this distance in a unique direction, which is indicated by a linear classifier. This approach has potentially improved the discriminatory power to detect subtle changes in gait patterns but does not incorporate a profile of interpretable sub-scores like the GPS. The main aims of this study were to extend the COGS by decomposing it into interpretable sub-scores as realized in the GPS and to compare the discriminative power of the GPS and COGS. Two types of gait impairments were imitated to enable a high level of control of the gait patterns. Imitated impairments were realized by restricting knee extension and inducing leg length discrepancy. The results showed increased discriminatory power of the COGS for differentiating diverse levels of impairment. Comparison of the GPS and COGS sub-scores and their ability to indicate changes in specific variables supports the validity of both scores. The COGS is an overall measure of gait quality with increased power to detect subtle changes in gait patterns and might be well suited for tracing the effect of a therapeutic treatment over time. The newly introduced sub-scores improved the interpretability of the COGS, which is helpful for practical applications. Copyright © 2017 Elsevier B.V. All rights reserved.
Tracers and Tracer Testing: Design, Implementation, Tracer Selection, and Interpretation Methods

Energy Technology Data Exchange (ETDEWEB)

G. Michael Shook; Shannon L.; Allan Wylie

2004-01-01

Conducting a successful tracer test requires adhering to a set of steps. The steps include identifying appropriate and achievable test goals, identifying tracers with the appropriate properties, and implementing the test as designed. When these steps are taken correctly, a host of tracer test analysis methods are available to the practitioner. This report discusses the individual steps required for a successful tracer test and presents methods for analysis. The report is an overview of tracer technology; the Suggested Reading section offers references to the specifics of test design and interpretation.
The importance of proper administration and interpretation of neuropsychological baseline and postconcussion computerized testing.

Science.gov (United States)

Moser, Rosemarie Scolaro; Schatz, Philip; Lichtenstein, Jonathan D

2015-01-01

Media coverage, litigation, and new legislation have resulted in a heightened awareness of the prevalence of sports concussion in both adult and youth athletes. Baseline and postconcussion testing is now commonly used for the assessment and management of sports-related concussion in schools and in youth sports leagues. With increased use of computerized neurocognitive sports concussion testing, there is a need for standards for proper administration and interpretation. To date, there has been a lack of standardized procedures by which assessments are administered. More specifically, individuals who are not properly trained often interpret test results, and their methods of interpretation vary considerably. The purpose of this article is to outline factors affecting the validity of test results, to provide examples of misuse and misinterpretation of test results, and to communicate the need to administer testing in the most effective and useful manner. An increase in the quality of test administration and application may serve to decrease the prevalence of invalid test results and increase the accuracy and utility of baseline test results if an athlete sustains a concussion. Standards for test use should model the American Psychological Association and Centers for Disease Control and Prevention guidelines, as well as the recent findings of the joint position paper on computerized neuropsychological assessment devices.
Effort, symptom validity testing, performance validity testing and traumatic brain injury.

Science.gov (United States)

Bigler, Erin D

2014-01-01

To understand the neurocognitive effects of brain injury, valid neuropsychological test findings are paramount. This review examines the research on what has been referred to as symptom validity testing (SVT). Performance above a designated cut-score signifies a 'passing' SVT performance, which is likely the best indicator of valid neuropsychological test findings. Likewise, substantially below cut-point performance that nears chance or is at chance signifies invalid test performance. Significantly below chance is the sine qua non neuropsychological indicator for malingering. However, the interpretative problems with SVT performance below the cut-point yet far above chance are substantial, as pointed out in this review. This intermediate, border-zone performance on SVT measures is where substantial interpretative challenges exist. Case studies are used to highlight the many areas where additional research is needed. Historical perspectives are reviewed along with the neurobiology of effort. Reasons why performance validity testing (PVT) may be better than the SVT term are reviewed. Advances in neuroimaging techniques may be key in better understanding the meaning of border zone SVT failure. The review demonstrates the problems with rigidity in interpretation with established cut-scores. A better understanding of how certain types of neurological, neuropsychiatric and/or even test conditions may affect SVT performance is needed.
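The distinction drawn above between performance at chance and performance significantly below chance can be made concrete with a one-sided binomial test. The sketch below assumes a two-alternative forced-choice format (chance = 0.5) and hypothetical item counts; neither the format nor the numbers come from the review, and the function name is illustrative.

```python
from math import comb


def below_chance_p_value(n_correct, n_items, p_chance=0.5):
    """One-sided probability of scoring n_correct or fewer by guessing alone."""
    return sum(
        comb(n_items, k) * p_chance**k * (1 - p_chance) ** (n_items - k)
        for k in range(n_correct + 1)
    )


# Hypothetical 50-item test: 23 correct is close to chance,
# while 15 correct is unlikely to arise from guessing alone.
near_chance = below_chance_p_value(23, 50)
well_below_chance = below_chance_p_value(15, 50)
```

A small p-value here only shows that a score is lower than guessing would plausibly produce. Scores below an established cut-score but above chance, the border zone discussed in the review, are not addressed by this calculation.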
Predisposing factors of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy: comparison between CT emphysema score and pulmonary function test

Energy Technology Data Exchange (ETDEWEB)

Lee, Chang Ho; Park, Kyung Joo; Park, Dong Won; Jung, Kyung Il; Suh, Jung Ho [Ajou Univ. College of Medicine, Seoul (Korea, Republic of)

1997-11-01

To compare the CT emphysema score with various factors of pulmonary function test by simple spirometry and to use the result as a predictor of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy. The CT scans of 106 patients who had undergone percutaneous transthoracic fine needle aspiration biopsy of lung lesions within the previous 18 months were retrospectively reviewed. In 75 of these 106 cases, the results of the pulmonary function test were also reviewed. On plain chest radiography, pneumothorax was noted in 20 cases (19%). Emphysema was blindly evaluated. We divided each lung into four segments and determined the severity and involved volume of emphysema, as seen on CT. Severity was classified as one of four grades, as follows: absence of emphysema = 0; low attenuation area of less than 5 mm = 1; low attenuation area of more than 5 mm, and vascular pruning with normal lung intervening = 2; and diffuse low attenuation without intervening normal lung, and larger confluent low attenuation with vascular pruning and distortion of branching pattern occupying all or almost all the involved parenchyma = 3. The involved area was also classified as one of four grades: less than 25% = 1; 25 - 49% = 2; 51 - 74% = 3; and more than 75% = 4. The CT emphysema score was defined as the average of the grade of severity multiplied by the grade of involved area. Pulmonary function tests, consisting of simple spirometry and a pulmonologist's interpretation, were evaluated. We also evaluated depth and size of lesion as known predisposing factors in postbioptic pneumothorax. Statistical analysis was performed using the chi-square test, Wilcoxon rank sum W test and Student's t test. A comparison between the two groups of occurrence (with or without pneumothorax) showed the emphysema scores to be 1.69±2.0 and 1.11±2.9, respectively; there was thus no significant difference between the two groups (z = -0.048, p > 0.10). Nor were differences revealed by the
Predisposing factors of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy: comparison between CT emphysema score and pulmonary function test

International Nuclear Information System (INIS)

Lee, Chang Ho; Park, Kyung Joo; Park, Dong Won; Jung, Kyung Il; Suh, Jung Ho

1997-01-01

To compare the CT emphysema score with various factors of pulmonary function test by simple spirometry and to use the result as a predictor of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy. The CT scans of 106 patients who had undergone percutaneous transthoracic fine needle aspiration biopsy of lung lesions within the previous 18 months were retrospectively reviewed. In 75 of these 106 cases, the results of the pulmonary function test were also reviewed. On plain chest radiography, pneumothorax was noted in 20 cases (19%). Emphysema was blindly evaluated. We divided each lung into four segments and determined the severity and involved volume of emphysema, as seen on CT. Severity was classified as one of four grades, as follows: absence of emphysema = 0; low attenuation area of less than 5 mm = 1; low attenuation area of more than 5 mm, and vascular pruning with normal lung intervening = 2; and diffuse low attenuation without intervening normal lung, and larger confluent low attenuation with vascular pruning and distortion of branching pattern occupying all or almost all the involved parenchyma = 3. The involved area was also classified as one of four grades: less than 25% = 1; 25 - 49% = 2; 51 - 74% = 3; and more than 75% = 4. The CT emphysema score was defined as the average of the grade of severity multiplied by the grade of involved area. Pulmonary function tests, consisting of simple spirometry and a pulmonologist's interpretation, were evaluated. We also evaluated depth and size of lesion as known predisposing factors in postbioptic pneumothorax. Statistical analysis was performed using the chi-square test, Wilcoxon rank sum W test and Student's t test. A comparison between the two groups of occurrence (with or without pneumothorax) showed the emphysema scores to be 1.69±2.0 and 1.11±2.9, respectively; there was thus no significant difference between the two groups (z = -0.048, p > 0.10). Nor were differences revealed by the pulmonary
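The CT emphysema score defined in this record (the average, over lung segments, of severity grade multiplied by involved-area grade) can be sketched as below. The function name and the eight example segments are hypothetical, and the reading that the product is formed per segment before averaging is an interpretation of the abstract's wording.

```python
def ct_emphysema_score(segments):
    """Mean of severity x area over segments.

    segments: iterable of (severity_grade, area_grade) pairs, one per segment.
    Severity grades run 0-3 and area grades 1-4, as in the abstract.
    """
    segments = list(segments)
    if not segments:
        raise ValueError("at least one segment is required")
    for severity, area in segments:
        if severity not in range(0, 4):
            raise ValueError(f"severity grade out of range: {severity}")
        if area not in range(1, 5):
            raise ValueError(f"area grade out of range: {area}")
    # A segment without emphysema (severity 0) contributes 0 whatever its area grade.
    return sum(severity * area for severity, area in segments) / len(segments)


# Hypothetical patient: four segments per lung, eight in total.
example = [(0, 1), (1, 1), (2, 2), (0, 1), (3, 4), (1, 2), (0, 1), (2, 1)]
print(ct_emphysema_score(example))  # prints 2.625
```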
[Interpreting change scores of the Behavioural Rating Scale for Geriatric Inpatients (GIP)].

Science.gov (United States)

Diesfeldt, H F A

2013-09-01

The Behavioural Rating Scale for Geriatric Inpatients (GIP) consists of fourteen, Rasch modelled subscales, each measuring different aspects of behavioural, cognitive and affective disturbances in elderly patients. Four additional measures are derived from the GIP: care dependency, apathy, cognition and affect. The objective of the study was to determine the reproducibility of the 18 measures. A convenience sample of 56 patients in psychogeriatric day care was assessed twice by the same observer (a professional caregiver). The median time interval between rating occasions was 45 days (interquartile range 34-58 days). Reproducibility was determined by calculating intraclass correlation coefficients (ICC agreement) for test-retest reliability. The minimal detectable difference (MDD) was calculated based on the standard error of measurement (SEM agreement). Test-retest reliability expressed by the ICCs varied from 0.57 (incoherent behaviour) to 0.93 (anxious behaviour). Standard errors of measurement varied from 0.28 (anxious behaviour) to 1.63 (care dependency). The results show how the GIP can be applied when interpreting individual change in psychogeriatric day care participants.
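The abstract above states that the minimal detectable difference (MDD) was derived from the standard error of measurement (SEM) but does not give the formula. A common convention, assumed here, is MDD = z × √2 × SEM with z ≈ 1.96 for 95% confidence; the √2 reflects that a change score combines the measurement error of two ratings. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def minimal_detectable_difference(sem, confidence=0.95):
    """MDD = z * sqrt(2) * SEM (assumed convention, two-sided confidence level)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sqrt(2) * sem


# Smallest and largest SEM values reported in the abstract.
for label, sem in [("anxious behaviour", 0.28), ("care dependency", 1.63)]:
    print(f"{label}: MDD = {minimal_detectable_difference(sem):.2f}")
# prints "anxious behaviour: MDD = 0.78"
# prints "care dependency: MDD = 4.52"
```

Under this assumption, an individual change smaller than the MDD for a given subscale cannot be distinguished from measurement error at the chosen confidence level.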
A knowledge-based theory of rising scores on "culture-free" tests.

Science.gov (United States)

Fox, Mark C; Mitchum, Ainsley L

2013-08-01

Secular gains in intelligence test scores have perplexed researchers since they were documented by Flynn (1984, 1987). Gains are most pronounced on abstract, so-called culture-free tests, prompting Flynn (2007) to attribute them to problem-solving skills availed by scientifically advanced cultures. We propose that recent-born individuals have adopted an approach to analogy that enables them to infer higher level relations requiring roles that are not intrinsic to the objects that constitute initial representations of items. This proposal is translated into item-specific predictions about differences between cohorts in pass rates and item-response patterns on the Raven's Matrices (Flynn, 1987), a seemingly culture-free test that registers the largest Flynn effect. Consistent with predictions, archival data reveal that individuals born around 1940 are less able to map objects at higher levels of relational abstraction than individuals born around 1990. Polytomous Rasch models verify predicted violations of measurement invariance, as raw scores are found to underestimate the number of analogical rules inferred by members of the earlier cohort relative to members of the later cohort who achieve the same overall score. The work provides a plausible cognitive account of the Flynn effect, furthers understanding of the cognition of matrix reasoning, and underscores the need to consider how test-takers select item responses. PsycINFO Database Record (c) 2013 APA, all rights reserved.
A Latent Class Approach to Estimating Test-Score Reliability

Science.gov (United States)

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

2011-01-01

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
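For orientation, Cronbach's alpha, one of the single-administration methods named above, can be computed directly from an item-score matrix. The sketch below uses the textbook formula and a small made-up data set; it does not implement the latent class approach the authors propose.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the total (sum) score across respondents.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses of five test-takers to three items.
toy_scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4], [3, 3, 4]]
print(round(cronbach_alpha(toy_scores), 3))
```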
Similar predictions of etravirine sensitivity regardless of genotypic testing method used: comparison of available scoring systems.

Science.gov (United States)

Vingerhoets, Johan; Nijs, Steven; Tambuyzer, Lotke; Hoogstoel, Annemie; Anderson, David; Picchio, Gaston

2012-01-01

The aims of this study were to compare various genotypic scoring systems commonly used to predict virological outcome to etravirine, and examine their concordance with etravirine phenotypic susceptibility. Six etravirine genotypic scoring systems were assessed: Tibotec 2010 (based on 20 mutations; TBT 20), Monogram, Stanford HIVdb, ANRS, Rega (based on 37, 30, 27 and 49 mutations, respectively) and virco(®)TYPE HIV-1 (predicted fold change based on genotype). Samples from treatment-experienced patients who participated in the DUET trials and with both genotypic and phenotypic data (n=403) were assessed using each scoring system. Results were retrospectively correlated with virological response in DUET. κ coefficients were calculated to estimate the degree of correlation between the different scoring systems. Correlation between the five scoring systems and the TBT 20 system was approximately 90%. Virological response by etravirine susceptibility was comparable regardless of which scoring system was utilized, with 70-74% of DUET patients determined as susceptible to etravirine by the different scoring systems achieving plasma viral load <50 HIV-1 RNA copies/ml. In samples classed as phenotypically susceptible to etravirine (fold change in 50% effective concentration ≤3), correlations with genotypic score were consistently high across scoring systems (≥70%). In general, the etravirine genotypic scoring systems produced similar results, and genotype-phenotype concordance was high. As such, phenotypic interpretations, and in their absence all genotypic scoring systems investigated, may be used to reliably predict the activity of etravirine.
Computerized scoring algorithms for the Autobiographical Memory Test.

Science.gov (United States)

Takano, Keisuke; Gutenbrunner, Charlotte; Martens, Kris; Salmon, Karen; Raes, Filip

2018-02-01

Reduced specificity of autobiographical memories is a hallmark of depressive cognition. Autobiographical memory (AM) specificity is typically measured by the Autobiographical Memory Test (AMT), in which respondents are asked to describe personal memories in response to emotional cue words. Due to this free descriptive responding format, the AMT relies on experts' hand scoring for subsequent statistical analyses. This manual coding potentially impedes research activities in big data analytics such as large epidemiological studies. Here, we propose computerized algorithms to automatically score AM specificity for the Dutch (adult participants) and English (youth participants) versions of the AMT by using natural language processing and machine learning techniques. The algorithms showed reliable performances in discriminating specific and nonspecific (e.g., overgeneralized) autobiographical memories in independent testing data sets (area under the receiver operating characteristic curve > .90). Furthermore, outcome values of the algorithms (i.e., decision values of support vector machines) showed a gradient across similar (e.g., specific and extended memories) and different (e.g., specific memory and semantic associates) categories of AMT responses, suggesting that, for both adults and youth, the algorithms well capture the extent to which a memory has features of specific memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Robust joint score tests in the application of DNA methylation data analysis.

Science.gov (United States)

Li, Xuan; Fu, Yuejiao; Wang, Xiaogang; Qiu, Weiliang

2018-05-18

Recently, differential variability has been shown to be valuable in evaluating the association of DNA methylation with the risks of complex human diseases. Statistical tests based on both differential methylation level and differential variability can be more powerful than those based only on differential methylation level. Anh and Wang (2013) proposed a joint score test (AW) to simultaneously detect differential methylation and differential variability. However, AW's method seems to be quite conservative and has not been fully compared with existing joint tests. We proposed three improved joint score tests, namely iAW.Lev, iAW.BF, and iAW.TM, and have made extensive comparisons with the joint likelihood ratio test (jointLRT), the Kolmogorov-Smirnov (KS) test, and the AW test. Systematic simulation studies showed that: 1) the three improved tests performed better (i.e., having larger power, while keeping nominal Type I error rates) than the other three tests for data with outliers and having different variances between cases and controls; 2) for data from normal distributions, the three improved tests had slightly lower power than jointLRT and AW. The analyses of two Illumina HumanMethylation27 data sets, GSE37020 and GSE20080, and one Illumina Infinium MethylationEPIC data set, GSE107080, demonstrated that the three improved tests had higher true validation rates than jointLRT, KS, and AW. The three proposed joint score tests are robust against violation of the normality assumption and the presence of outlying observations in comparison with the other three existing tests. Among the three proposed tests, iAW.BF seems to be the most robust and effective one for all simulated scenarios and also in real data analyses.

In situ impulse test: an experimental and analytical evaluation of data interpretation procedures

International Nuclear Information System (INIS)

1975-08-01

Special experimental field testing and analytical studies were undertaken at Fort Lawton in Seattle, Washington, to study "close-in" wave propagation and evaluate data interpretation procedures for a new in situ impulse test. This test was developed to determine the shear wave velocity and dynamic modulus of soils underlying potential nuclear power plant sites. The test is different from conventional geophysical testing in that the velocity variation with strain is determined for each test. In general, strains between 10^-1 and 10^-3 percent are achieved. The experimental field work consisted of performing special tests in a large test sand fill to obtain detailed "close-in" data. Six recording transducers were placed at various points on the energy source, while approximately 37 different transducers were installed within the soil fill, all within 7 feet of the energy source. Velocity measurements were then taken simultaneously under controlled test conditions to study shear wave propagation phenomenology and help evaluate data interpretation procedures. Typical test data are presented along with detailed descriptions of the results
America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

Science.gov (United States)

Petrilli, Michael J.; Wright, Brandon L.

2016-01-01

At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…
The interpretation of Charpy impact test data using hyper-logistic fitting functions

International Nuclear Information System (INIS)

Helm, J.L.

1996-01-01

The hyperbolic tangent function is used almost exclusively for computer assisted curve fitting of Charpy impact test data. Unfortunately, there is no physical basis to justify the use of this function and it cannot be generalized to test data that exhibits asymmetry. Using simple physical arguments, a semi-empirical model is derived and identified as a special case of the so called hyper-logistic equation. Although one solution of this equation is the hyperbolic tangent, other more physically interpretable solutions are provided. From the mathematics of the family of functions derived from the hyper-logistic equation, several useful generalizations are made such that asymmetric and wavy Charpy data can be physically interpreted
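The symmetry limitation noted above can be seen from the usual four-parameter hyperbolic tangent form, E(T) = A + B·tanh((T − T0)/C). The sketch below assumes that form (the abstract does not write it out) and uses hypothetical parameter values; it does not implement the hyper-logistic model itself.

```python
import math

def charpy_tanh(temp_c, a, b, t0, c):
    """Absorbed energy from the conventional tanh transition curve.

    a: mid-shelf energy, b: half the shelf-to-shelf range,
    t0: transition midpoint temperature, c: transition half-width.
    """
    return a + b * math.tanh((temp_c - t0) / c)

# Hypothetical fit parameters (J and degrees C), for illustration only.
A, B, T0, C = 100.0, 90.0, -20.0, 25.0

for offset in (10.0, 30.0, 60.0):
    below = charpy_tanh(T0 - offset, A, B, T0, C)
    above = charpy_tanh(T0 + offset, A, B, T0, C)
    # The curve is point-symmetric about (T0, A): the two deviations cancel.
    print(f"offset {offset:>4}: below={below:6.1f}, above={above:6.1f}, "
          f"sum of deviations={(below - A) + (above - A):+.1e}")
```

Because the deviations from A at T0 ± ΔT always cancel, no choice of the four parameters can reproduce a transition that is steeper on one side than the other.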
Power and sample size evaluation for the Cochran-Mantel-Haenszel mean score (Wilcoxon rank sum) test and the Cochran-Armitage test for trend.

Science.gov (United States)

Lachin, John M

2011-11-10

The power of a chi-square test, and thus the required sample size, are a function of the noncentrality parameter that can be obtained as the limiting expectation of the test statistic under an alternative hypothesis specification. Herein, we apply this principle to derive simple expressions for two tests that are commonly applied to discrete ordinal data. The Wilcoxon rank sum test for the equality of distributions in two groups is algebraically equivalent to the Mann-Whitney test. The Kruskal-Wallis test applies to multiple groups. These tests are equivalent to a Cochran-Mantel-Haenszel mean score test using rank scores for a set of C-discrete categories. Although various authors have assessed the power function of the Wilcoxon and Mann-Whitney tests, herein it is shown that the power of these tests with discrete observations, that is, with tied ranks, is readily provided by the power function of the corresponding Cochran-Mantel-Haenszel mean scores test for two and R > 2 groups. These expressions yield results virtually identical to those derived previously for rank scores and also apply to other score functions. The Cochran-Armitage test for trend assesses whether there is a monotonically increasing or decreasing trend in the proportions with a positive outcome or response over the C-ordered categories of an ordinal independent variable, for example, dose. Herein, it is shown that the power of the test is a function of the slope of the response probabilities over the ordinal scores assigned to the groups that yields simple expressions for the power of the test. Copyright © 2011 John Wiley & Sons, Ltd.
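To make the link between the noncentrality parameter and power concrete, here is a minimal sketch for the 1-degree-of-freedom case only (for example, a two-group comparison or a trend test). It uses the fact that a noncentral chi-square variable with 1 df and noncentrality λ is distributed as (Z + √λ)², with Z standard normal. The function name and the example values of λ are illustrative; the paper's expressions for λ are not reproduced here.

```python
import math
from statistics import NormalDist

def power_chi2_1df(noncentrality: float, alpha: float = 0.05) -> float:
    """Power of a 1-df chi-square test given its noncentrality parameter."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)   # sqrt of the chi-square critical value
    shift = math.sqrt(noncentrality)
    # P(|Z + shift| > z_crit): upper tail plus (usually negligible) lower tail.
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)

for lam in (0.0, 3.84, 7.85, 10.51):
    print(f"lambda={lam:5.2f} -> power={power_chi2_1df(lam):.3f}")
```

With λ = 0 the function returns the Type I error rate α, which is a quick sanity check on the formula.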
The Implications of Family Size and Birth Order for Test Scores and Behavioral Development

Science.gov (United States)

Silles, Mary A.

2010-01-01

This article, using longitudinal data from the National Child Development Study, presents new evidence on the effects of family size and birth order on test scores and behavioral development at age 7, 11 and 16. Sibling size is shown to have an adverse causal effect on test scores and behavioral development. For any given family size, first-borns…
Can interpreting sediment toxicity tests at mega sites benefit from novel approaches to normalization to address batching of tests?

Science.gov (United States)

Sediment toxicity tests are a key tool used in Ecological Risk Assessments for contaminated sediment sites. Interpreting test results and defining toxicity is often a challenge. This is particularly true at mega sites where the testing regime is large, and by necessity performed ...
Important Details in Performing and Interpreting the Scratch Collapse Test.

Science.gov (United States)

Kahn, Lorna C; Yee, Andrew; Mackinnon, Susan E

2018-02-01

The utility of the scratch collapse test has been demonstrated in examination of patients with carpal and cubital tunnel syndromes and long thoracic and peroneal nerve compressions. In the authors' clinic, this lesser known test plays a key role in peripheral nerve examination where localization of the nerve irritation or injury is not fully understood. Test utility and accuracy in patients with more challenging presentations likely correlate with tester understanding and experience. This article offers a clear outline of all stages of the test to improve interrater reliability. The nuances of test performance are described, including a description of situations where the scratch collapse test is deemed inappropriate. Four clinical scenarios where the scratch collapse test may be useful are included. Corresponding video content is provided to improve performance and interpretation of the scratch collapse test. Diagnostic, V.
Microindentation hardness testing of coatings: techniques and interpretation of data

Science.gov (United States)

Blau, P. J.

1986-09-01

This paper addresses the problems and promises of micro-indentation testing of thin solid films. It has discussed basic penetration hardness testing philosophy, the peculiarities of low load-shallow penetration tests of uncoated metals, and it has compared coated with uncoated behavior so that some of the unique responses of coatings can be distinguished from typical hardness versus load behavior. As the uses of thin solid coatings with technological interest continue to proliferate, microindentation testing methodology will increasingly be challenged to provide useful tools for their characterization. The understanding of microindentation response must go hand-in-hand with machine design so that the capability of measurement precision does not outstrip our abilities to interpret test results in a meaningful way.
Interpretation of Consolidation Test on Søvind Marl

DEFF Research Database (Denmark)

Grønbech, Gitte Lyng; Ibsen, Lars Bo; Nielsen, Benjaminn Nordahl

2012-01-01

The article deals with the interpretation of consolidation test in order to determine the preconsolidation stress; this is done by reviewing different methods. A main point in the article is the interaction between the consolidation and the secondary consolidation strains, and the methods used...... to separate the two strain types. This is in Denmark traditionally done by a √(t)-log(t) description, where the secondary consolidation first starts when the consolidation process is over. This assumption gives an uncertain description of the strain process, since the two processes in reality run...
Measuring Primary Students' Graph Interpretation Skills Via a Performance Assessment: A case study in instrument development

Science.gov (United States)

Peterman, Karen; Cranston, Kayla A.; Pryor, Marie; Kermish-Allen, Ruth

2015-11-01

This case study was conducted within the context of a place-based education project that was implemented with primary school students in the USA. The authors and participating teachers created a performance assessment of standards-aligned tasks to examine 6-10-year-old students' graph interpretation skills as part of an exploratory research project. Fifty-five students participated in a performance assessment interview at the beginning and end of a place-based investigation. Two forms of the assessment were created and counterbalanced within class at pre and post. In situ scoring was conducted such that responses were scored as correct versus incorrect during the assessment's administration. Criterion validity analysis demonstrated an age-level progression in student scores. Tests of discriminant validity showed that the instrument detected variability in interpretation skills across each of three graph types (line, bar, dot plot). Convergent validity was established by correlating in situ scores with those from the Graph Interpretation Scoring Rubric. Students' proficiency with interpreting different types of graphs matched expectations based on age and the standards-based progression of graphs across primary school grades. The assessment tasks were also effective at detecting pre-post gains in students' interpretation of line graphs and dot plots after the place-based project. The results of the case study are discussed in relation to the common challenges associated with performance assessment. Implications are presented in relation to the need for authentic and performance-based instructional and assessment tasks to respond to the Common Core State Standards and the Next Generation Science Standards.
A score based on screening tests to differentiate mild cognitive impairment from subjective memory complaints

Directory of Open Access Journals (Sweden)

Fábio Henrique de Gobbi Porto

2013-09-01

It is not easy to differentiate patients with mild cognitive impairment (MCI) from subjective memory complainers (SMC). Assessments with screening cognitive tools are essential, particularly in primary care where most patients are seen. The objective of this study was to evaluate the diagnostic accuracy of screening cognitive tests and to propose a score derived from screening tests. Elderly subjects with memory complaints were evaluated using the Mini Mental State Examination (MMSE) and the Brief Cognitive Battery (BCB). We added two delayed recalls in the MMSE (a delayed recall and a late-delayed recall, LDR), and also a phonemic fluency test of letter P fluency (LPF). A score was created based on these tests. The diagnoses were made on the basis of clinical consensus and neuropsychological testing. Receiver operating characteristic curve analyses were used to determine the area under the curve (AUC), the sensitivity and the specificity for each test separately and for the final proposed score. MMSE, LDR, LPF and delayed recall of BCB scores showed statistically significant differences between groups (P=0.000, 0.03, 0.001 and 0.01, respectively). Sensitivity, specificity and AUC were MMSE: 64%, 79% and 0.75 (cut off <29); LDR: 56%, 62% and 0.62 (cut off <3); LPF: 71%, 71% and 0.71 (cut off <14); delayed recall of BCB: 56%, 82% and 0.68 (cut off <9). The proposed score reached a sensitivity of 88% and 76% and specificity of 62% and 75% for cut off over 1 and over 2, respectively. The AUC was 0.81. In conclusion, a score created from screening tests is capable of discriminating MCI from SMC with moderate to good accuracy.
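The sensitivity and specificity figures above come from applying a cut-off to a screening score. A minimal sketch of that calculation follows; the scores and diagnoses are invented for illustration and are not the study's data, and the convention that a score below the cut-off counts as a positive screen is taken from the "<29" notation in the abstract.

```python
def sensitivity_specificity(scores, has_mci, cutoff):
    """Treat a score below `cutoff` as a positive screen for MCI."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_mci):
        positive = score < cutoff
        if diseased:
            tp += positive          # correctly flagged MCI
            fn += not positive      # missed MCI
        else:
            fp += positive          # SMC wrongly flagged
            tn += not positive      # SMC correctly cleared
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical MMSE-like scores and consensus diagnoses (True = MCI).
toy_scores = [25, 27, 28, 29, 30, 30, 29, 28, 26, 30]
toy_mci = [True, True, True, True, False, False, False, False, True, False]

sens, spec = sensitivity_specificity(toy_scores, toy_mci, cutoff=29)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```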
Joint interpretation of two tracer tests with reversed flow fields

International Nuclear Information System (INIS)

Kunstmann, H.; Kinzelbach, W.; Marschall, P.; Li, G.

1995-01-01

Two dipole tracer experiments were performed in a fractured rock at the Grimsel Test Site in February/March 1993. In both experiments NaCl was used as a tracer. The extraction rate was twice the injection rate. In the second experiment injection and extraction were interchanged (Reverse-Experiment). Long tailing was characteristic for the breakthrough curves in both experiments. The tests were interpreted using a single fracture flow model. Tracer transport is described by advection/dispersion along the fracture allowing for diffusion into an immobile matrix. The authors were able to interpret the breakthrough curves for both experiments by one unique set of parameters, describing transport and baseflow. Uniqueness could only be achieved when using the information of both experiments. The authors conclude that performing a Reverse-Experiment is an indispensable tool for parameter identification in dipole tracer tests. A sensitivity analysis suggested that not only matrix diffusion is responsible for the tailing in the breakthrough curves but also transversal dispersivity. Further, the typical exchange time between mobile and immobile media was too small to be attributed to matrix diffusion in the strict sense which will cause tailing even at large spatial and temporal scales. Analysis of the covariance matrices showed that the parameters have small errors but high correlation
Diagnostic reliability of MMPI-2 computer-based test interpretations.

Science.gov (United States)

Pant, Hina; McCabe, Brian J; Deskovitz, Mark A; Weed, Nathan C; Williams, John E

2014-09-01

Reflecting the common use of the MMPI-2 to provide diagnostic considerations, computer-based test interpretations (CBTIs) also typically offer diagnostic suggestions. However, these diagnostic suggestions can sometimes be shown to vary widely across different CBTI programs even for identical MMPI-2 profiles. The present study evaluated the diagnostic reliability of 6 commercially available CBTIs using a 20-item Q-sort task developed for this study. Four raters each sorted diagnostic classifications based on these 6 CBTI reports for 20 MMPI-2 profiles. Two questions were addressed. First, do users of CBTIs understand the diagnostic information contained within the reports similarly? Overall, diagnostic sorts of the CBTIs showed moderate inter-interpreter diagnostic reliability (mean r = .56), with sorts for the 1/2/3 profile showing the highest inter-interpreter diagnostic reliability (mean r = .67). Second, do different CBTI programs vary with respect to diagnostic suggestions? It was found that diagnostic sorts of the CBTIs had a mean inter-CBTI diagnostic reliability of r = .56, indicating moderate but not strong agreement across CBTIs in terms of diagnostic suggestions. The strongest inter-CBTI diagnostic agreement was found for sorts of the 1/2/3 profile CBTIs (mean r = .71). Limitations and future directions are discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
The impact of interpreted flow regimes during constant head injection tests on the estimated transmissivity from injection tests and difference flow logging

Energy Technology Data Exchange (ETDEWEB)

Hjerne, Calle; Ludvigsson, Jan-Erik; Harrstroem, Johan [Geosigma AB, Uppsala (Sweden)

2013-04-15

A large number of constant head injection tests were carried out in the site investigation at Forsmark using the Pipe String System, PSS3. During the original evaluation of the tests the dominating transient flow regimes during both the injection and recovery period were interpreted together with estimation of hydraulic parameters. The flow regimes represent different flow and boundary conditions during the tests. Different boreholes or borehole intervals may display different distributions of flow regimes. In some boreholes good agreement was obtained between the results of the injection tests and difference flow logging with Posiva flow log (PFL) but in other boreholes significant discrepancies were found. The main objective of this project is to study the correlation between transient flow regimes from the injection tests and other borehole features such as transmissivity, depth, geology, fracturing etc. Another subject studied is whether observed discrepancies between estimated transmissivity from difference flow logging and injection tests can be correlated to interpreted flow regimes. Finally, a detailed comparison between transient and stationary evaluation of transmissivity from the injection tests in relation to estimated transmissivity from PFL tests in corresponding sections is made. Results from previous injection tests in 5 m sections in boreholes KFM04, KFM08A and KFM10A were used. Only injection tests above the (test-specific) measurement limit regarding flow rate are included in the analyses. For all of these tests transient flow regimes were interpreted. In addition, results from difference flow logging in the corresponding 5 m test sections were used. Finally, geological data of fractures together with rock and fracture zone properties have been used in the correlations. Flow regimes interpreted from the injection period of the tests are generally used in the correlations but deviations between the interpreted flow regimes from the injection and
A physical function test for use in the intensive care unit: validity, responsiveness, and predictive utility of the physical function ICU test (scored).

Science.gov (United States)

Denehy, Linda; de Morton, Natalie A; Skinner, Elizabeth H; Edbrooke, Lara; Haines, Kimberley; Warrillow, Stephen; Berney, Sue

2013-12-01

Several tests have recently been developed to measure changes in patient strength and functional outcomes in the intensive care unit (ICU). The original Physical Function ICU Test (PFIT) demonstrates reliability and sensitivity. The aims of this study were to further develop the original PFIT, to derive an interval score (the PFIT-s), and to test the clinimetric properties of the PFIT-s. A nested cohort study was conducted. One hundred forty-four and 116 participants performed the PFIT at ICU admission and discharge, respectively. Original test components were modified using principal component analysis. Rasch analysis examined the unidimensionality of the PFIT, and an interval score was derived. Correlations tested validity, and multiple regression analyses investigated predictive ability. Responsiveness was assessed using the effect size index (ESI), and the minimal clinically important difference (MCID) was calculated. The shoulder lift component was removed. Unidimensionality of combined admission and discharge PFIT-s scores was confirmed. The PFIT-s displayed moderate convergent validity with the Timed "Up & Go" Test (r=-.60), the Six-Minute Walk Test (r=.41), and the Medical Research Council (MRC) sum score (rho=.49). The ESI of the PFIT-s was 0.82, and the MCID was 1.5 points (interval scale range=0-10). A higher admission PFIT-s score was predictive of: an MRC score of ≥48, increased likelihood of discharge home, reduced likelihood of discharge to inpatient rehabilitation, and reduced acute care hospital length of stay. Scoring of sit-to-stand assistance required is subjective, and cadence cutpoints used may not be generalizable. The PFIT-s is a safe and inexpensive test of physical function with high clinical utility. It is valid, responsive to change, and predictive of key outcomes. It is recommended that the PFIT-s be adopted to test physical function in the ICU.
Allele-sharing models: LOD scores and accurate linkage tests.

Science.gov (United States)

Kong, A; Cox, N J

1997-11-01

Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.
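For readers unfamiliar with the quantity, a LOD score is the base-10 logarithm of a likelihood ratio. The sketch below shows only that generic definition and its relation to the likelihood-ratio chi-square statistic; the log-likelihood values are hypothetical, and the authors' one-parameter allele-sharing model is not implemented.

```python
import math

def lod_score(loglik_alt: float, loglik_null: float) -> float:
    """LOD = log10 of the likelihood ratio, from natural-log likelihoods."""
    return (loglik_alt - loglik_null) / math.log(10)

def lod_to_chi_square(lod: float) -> float:
    """Likelihood-ratio statistic 2 * ln(LR) corresponding to a LOD score."""
    return 2.0 * math.log(10) * lod

# Hypothetical maximised log-likelihoods under linkage and no linkage.
lod = lod_score(loglik_alt=-120.4, loglik_null=-127.3)
print(f"LOD={lod:.2f}, LR chi-square={lod_to_chi_square(lod):.2f}")
```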
Comparing IM Residents with EM Resident for Their Skills of ECG Interpretation and Outlining Management Plan Accordingly

Directory of Open Access Journals (Sweden)

Hamid Reza Karimpoor Tari

2009-06-01

Background and Purpose: The electrocardiogram (ECG) is one of the most commonly performed investigations in cardiac diseases, and ECG abnormalities can reveal the early manifestations of cardiac ischemia, metabolic disorders, or life-threatening dysrhythmias. Misinterpretation of the ECG and consequent mistreatment or unnecessary interventions may cause life-threatening cardiac events. Since EM residents and internal medicine (IM) residents are usually the first to see the patient at the bedside and start treatment based on the patient's ECG, we intended to evaluate the ability of EM residents to interpret ECGs and to compare it with that of IM residents using various ECG samples. Method: 63 participants, including 33 IM residents and 30 EM residents from two teaching hospitals of Shahid Beheshti University of Medical Sciences, were enrolled in our study. A diagnostic test consisting of 15 ECG samples, together with a questionnaire containing questions about gender, academic year and proficiency in ECG interpretation, was administered to all participants. This study was conducted under the supervision of a cardiologist and an emergency specialist who supervised the ECG selection, answers and scoring of each ECG. The maximum score for each ECG was 6, which was given for a completely correct diagnosis, and a negative mark of -0.25 was given if the answer was wrong or any differential diagnosis was mentioned. After the test, the answer sheets were collected and analyzed with the SPSS program by two of the study authors, who were kept blind to the real identities of participants. Results: After classification of groups, the overall mean score was 45.5/100 (38-60). The mean scores of IM and EM residents were 56.0/100 (44.9-72) and 38.9/100 (31.5-45.5), respectively (p < 0.001). No significant correlation was found between the diagnosis scores and participants' self-judgment of their ECG interpretation skills (p=0.897, r=0.017). Five ECGs were considered as the most important and
Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test

KAUST Repository

Cai, T.

2012-06-25

In recent years, genome-wide association studies (GWAS) and gene-expression profiling have generated a large number of valuable datasets for assessing how genetic variations are related to disease outcomes. With such datasets, it is often of interest to assess the overall effect of a set of genetic markers, assembled based on biological knowledge. Genetic marker-set analyses have been advocated as more reliable and powerful approaches compared with the traditional marginal approaches (Curtis and others, 2005. Pathways to the analysis of microarray data. TRENDS in Biotechnology 23, 429-435; Efroni and others, 2007. Identification of key processes underlying cancer phenotypes using biologic pathway analysis. PLoS One 2, 425). Procedures for testing the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an Empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models. Biometrics 63, 1079-1088; Liu and others, 2008. Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models. BMC bioinformatics 9, 292-2; Wu and others, 2010. Powerful SNP-set analysis for case-control genome-wide association studies. American Journal of Human Genetics 86, 929) have been proposed as powerful alternatives to the standard Rao score test (Rao, 1948. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Mathematical Proceedings of the Cambridge Philosophical Society, 44, 50-57). The advantages of these EB-based tests are most apparent when the markers are correlated, due to the reduction in the degrees of freedom. In this paper, we propose an adaptive score test which up- or down-weights the contributions from each member of the marker-set based on the Z-scores of
Rational use and interpretation of urine drug testing in chronic opioid therapy.

Science.gov (United States)

Reisfield, Gary M; Salazar, Elaine; Bertholf, Roger L

2007-01-01

Urine drug testing (UDT) has become an essential feature of pain management, as physicians seek to verify adherence to prescribed opioid regimens and to detect the use of illicit or unauthorized licit drugs. Results of urine drug tests have important consequences in regard to therapeutic decisions and the trust between physician and patient. However, reliance on UDT to confirm adherence can be problematic if the results are not interpreted correctly, and evidence suggests that many physicians lack an adequate understanding of the complexities of UDT and the factors that can affect test results. These factors include metabolic conversion between drugs, genetic variations in drug metabolism, the sensitivity and specificity of the analytical method for a particular drug or metabolite, and the effects of intentional and unintentional interferants. In this review, we focus on the technical features and limitations of analytical methods used for detecting drugs or their metabolites in urine, the statistical constructs that are pertinent to ordering UDT and interpreting test results, and the application of these concepts to the clinical monitoring of patients maintained on chronic opioid therapy.
Test equating, scaling, and linking methods and practices

CERN Document Server

Kolen, Michael J

2014-01-01

This book provides an introduction to test equating, scaling, and linking, including those concepts and practical issues that are critical for developers and all other testing professionals. In addition to statistical procedures, successful equating, scaling, and linking involves many aspects of testing, including procedures to develop tests, to administer and score tests, and to interpret scores earned on tests. Test equating methods are used with many standardized tests in education and psychology to ensure that scores from multiple test forms can be used interchangeably. Test scaling is the process of developing score scales that are used when scores on standardized tests are reported. In test linking, scores from two or more tests are related to one another. Linking has received much recent attention, due largely to investigations of linking similarly named tests from different test publishers or tests constructed for different purposes. In recent years, researchers from the education, psychology, and...

Comparing the Effects of Elementary Music and Visual Arts Lessons on Standardized Mathematics Test Scores

Science.gov (United States)

King, Molly Elizabeth

2016-01-01

The purpose of this quantitative, causal-comparative study was to compare the effect elementary music and visual arts lessons had on third through sixth grade standardized mathematics test scores. Inferential statistics were used to compare the differences between test scores of students who took in-school, elementary, music instruction during the…
Analysis and interpretation of borehole hydraulic tests in deep boreholes: principles, model development, and applications

International Nuclear Information System (INIS)

Pickens, J.F.; Grisak, G.E.; Avis, J.D.; Belanger, D.W.

1987-01-01

A review of the literature on hydraulic testing and interpretive methods, particularly in low-permeability media, indicates a need for a comprehensive hydraulic testing interpretive capability. Physical limitations on boreholes, such as caving and erosion during continued drilling, as well as the high costs associated with deep-hole rigs and testing equipment, often necessitate testing under nonideal conditions with respect to antecedent pressures and temperatures. In these situations, which are common in the high-level nuclear waste programs throughout the world, the interpretive requirements include the ability to quantitatively account for thermally induced pressure responses and borehole pressure history (resulting in a time-dependent pressure profile around the borehole) as well as equipment compliance effects in low-permeability intervals. A numerical model was developed to provide the capability to handle these antecedent conditions. Sensitivity studies and practical applications are provided to illustrate the importance of thermal effects and antecedent pressure history. It is demonstrated theoretically and with examples from the Swiss (National Genossenschaft fuer die Lagerung radioaktiver Abfaelle) regional hydrogeologic characterization program that pressure changes (expressed as hydraulic head) of the order of tens to hundreds of meters can result from 1 to 2 °C temperature variations during shut-in (packer isolated) tests in low-permeability formations. Misinterpreted formation pressures and hydraulic conductivity can also result from inaccurate antecedent pressure history. Interpretation of representative formation properties and pressures requires that antecedent pressure information and test period temperature data be included as an integral part of the hydraulic test analyses.
Reproducibility of scoring emphysema by HRCT

International Nuclear Information System (INIS)

Malinen, A.; Partanen, K.; Rytkoenen, H.; Vanninen, R.; Erkinjuntti-Pekkanen, R.

2002-01-01

Purpose: We evaluated the reproducibility of three visual scoring methods of emphysema and compared these methods with pulmonary function tests (VC, DLCO, FEV1 and FEV%) among farmer's lung patients and farmers. Material and Methods: Three radiologists examined high-resolution CT images of farmer's lung patients and their matched controls (n=70) for chronic interstitial lung diseases. Intraobserver reproducibility and interobserver variability were assessed for three methods: severity, Sanders' (extent) and Sakai. Pulmonary function tests, namely spirometry and diffusing capacity, were measured. Results: Intraobserver κ-values for all three methods were good (0.51-0.74). Interobserver κ-values varied from 0.35 to 0.72. The Sanders' and the severity methods correlated strongly with pulmonary function tests, especially DLCO and FEV1. Conclusion: The Sanders' method proved to be reliable in evaluating emphysema, in terms of good consistency of interpretation and good correlation with pulmonary function tests.
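The observer-agreement values in this record are presumably kappa statistics, which measure agreement beyond what chance alone would produce. The following is a minimal sketch of unweighted Cohen's kappa for two readings; the grades are hypothetical, and the abstract does not say whether the authors used an unweighted or a weighted variant.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two readings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical emphysema grades (0 = none ... 3 = severe), invented for illustration.
first_reading = [0, 1, 1, 2, 3, 0, 2, 1]
second_reading = [0, 1, 2, 2, 3, 0, 1, 1]
kappa = cohen_kappa(first_reading, second_reading)
print(round(kappa, 3))
```

Here the readings agree on 6 of 8 cases, but part of that agreement is expected by chance, so kappa is lower than the raw agreement rate.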
Test and Score Data Summary for TOEFL[R] Internet-Based and Paper-Based Tests. January 2008-December 2008 Test Data

Science.gov (United States)

Educational Testing Service, 2008

2008-01-01

The Test of English as a Foreign Language[TM], better known as TOEFL[R], is designed to measure the English-language proficiency of people whose native language is not English. TOEFL scores are accepted by more than 6,000 colleges, universities, and licensing agencies in 130 countries. The test is also used by governments, and scholarship and…
Use of Standardized Test Scores to Predict Success in a Computer Applications Course

Science.gov (United States)

Harris, Robert V.; King, Stephanie B.

2016-01-01

The purpose of this study was to see if a relationship existed between American College Testing (ACT) scores (i.e., English, reading, mathematics, science reasoning, and composite) and student success in a computer applications course at a Mississippi community college. The study showed that while the ACT scores were excellent predictors of…
A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

Science.gov (United States)

Lee, Guemin; Park, In-Yong

2012-01-01

Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Clock Drawing Test and the diagnosis of amnestic mild cognitive impairment: can more detailed scoring systems do the work?

Science.gov (United States)

Rubínová, Eva; Nikolai, Tomáš; Marková, Hana; Siffelová, Kamila; Laczó, Jan; Hort, Jakub; Vyhnálek, Martin

2014-01-01

The Clock Drawing Test is a frequently used cognitive screening test with several scoring systems in elderly populations. We compare simple and complex scoring systems and evaluate the usefulness of the combination of the Clock Drawing Test with the Mini-Mental State Examination to detect patients with mild cognitive impairment. Patients with amnestic mild cognitive impairment (n = 48) and age- and education-matched controls (n = 48) underwent neuropsychological examinations, including the Clock Drawing Test and the Mini-Mental State Examination. Clock drawings were scored by three blinded raters using one simple (6-point scale) and two complex (17- and 18-point scales) systems. The sensitivity and specificity of these scoring systems used alone and in combination with the Mini-Mental State Examination were determined. Complex scoring systems, but not the simple scoring system, were significant predictors of the amnestic mild cognitive impairment diagnosis in logistic regression analysis. At equal levels of sensitivity (87.5%), the Mini-Mental State Examination showed higher specificity (31.3%, compared with 12.5% for the 17-point Clock Drawing Test scoring scale). The combination of Clock Drawing Test and Mini-Mental State Examination scores increased the area under the curve (0.72; p [...] Drawing Test did not differentiate between healthy elderly and patients with amnestic mild cognitive impairment in our sample. Complex scoring systems were slightly more efficient, yet still were characterized by high rates of false-positive results. We found psychometric improvement using combined scores from the Mini-Mental State Examination and the Clock Drawing Test when complex scoring systems were used. The results of this study support the benefit of using combined scores from simple methods.
Racial Differences in Mathematics Test Scores for Advanced Mathematics Students

Science.gov (United States)

Minor, Elizabeth Covay

2016-01-01

Research on achievement gaps has found that achievement gaps are larger for students who take advanced mathematics courses compared to students who do not. Focusing on the advanced mathematics student achievement gap, this study found that African American advanced mathematics students have significantly lower test scores and are less likely to be…
TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

Science.gov (United States)

Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

2012-01-01

Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…
Association between the gait pattern characteristics of older people and their two-step test scores.

Science.gov (United States)

Kobayashi, Yoshiyuki; Ogata, Toru

2018-04-27

The Two-Step test is one of three official tests authorized by the Japanese Orthopedic Association to evaluate the risk of locomotive syndrome (a condition of reduced mobility caused by an impairment of the locomotive organs). It has been reported that the Two-Step test score has a good correlation with one's walking ability; however, its association with the gait pattern of older people during normal walking is still unknown. Therefore, this study aims to clarify the associations between the gait patterns of older people observed during normal walking and their Two-Step test scores. We analyzed the whole waveforms obtained from the lower-extremity joint angles and joint moments of 26 older people in various stages of locomotive syndrome using principal component analysis (PCA). The PCA was conducted using a 260 × 2424 input matrix constructed from the participants' time-normalized pelvic and right-lower-limb-joint angles along three axes (ten trials of 26 participants, 101 time points, 4 angles, 3 axes, and 2 variable types per trial). The Pearson product-moment correlation coefficient between the scores of the principal component vectors (PCVs) and the scores of the Two-Step test revealed that only one PCV (PCV 2) among the 61 obtained relevant PCVs is significantly related to the score of the Two-Step test. We therefore concluded that the joint angles and joint moments related to PCV 2 (ankle plantar-flexion, ankle plantar-flexor moments during the late stance phase, and ranges of motion and moments on the hip, knee, and ankle joints in the sagittal plane during the entire stance phase) are the motions associated with the Two-Step test.
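The matrix dimensions quoted in this abstract can be checked by simple arithmetic, and the correlation step can be sketched with a plain Pearson coefficient. The component scores and Two-Step values below are hypothetical, invented only to make the example run; the PCA itself is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Dimensions as stated in the abstract.
n_rows = 10 * 26           # ten trials for each of 26 participants
n_cols = 101 * 4 * 3 * 2   # time points x angles x axes x variable types
print(n_rows, n_cols)

# Hypothetical principal-component scores and Two-Step test scores.
pc_scores = [-1.2, -0.4, 0.1, 0.6, 1.3]
two_step_scores = [1.05, 1.20, 1.28, 1.41, 1.55]
print(round(pearson_r(pc_scores, two_step_scores), 3))
```

The products 10 × 26 and 101 × 4 × 3 × 2 give the 260 rows and 2424 columns reported for the input matrix.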
Validity and reliability of Abbreviated Mental Test Score (AMTS) among older Iranian.

Science.gov (United States)

Foroughan, Mahshid; Wahlund, Lars-Olof; Jafari, Zahra; Rahgozar, Mehdi; Farahani, Ida G; Rashedi, Vahid

2017-11-01

Cognitive impairment is common among older people and is associated with increased morbidity and mortality. The main aim of this study was to evaluate the validity of the Persian version of the Abbreviated Mental Test Score (AMTS) as a screening tool for dementia. Data were obtained from a cross-sectional study. One hundred and one older adults who were members of Iranian Alzheimer Association and 101 of their siblings were entered into this study by convenience sampling. The Diagnostic and Statistical Manual of Mental Disorders, 4th edition, criteria for diagnosing dementia and the Mini-Mental State Examination were used as the study tools. The gathered data were analyzed by the Mann-Whitney U-test, the Kruskal-Wallis test, Spearman's rank correlation coefficient, and the receiver-operating characteristic. The AMTS could successfully differentiate the dementia group from the non-dementia group. Scores were significantly correlated with Diagnostic and Statistical Manual of Mental Disorders diagnosis for dementia and Mini-Mental State Examination scores (P < 0.001). Educational level (P < 0.001) and male sex (P = 0.015) were positively associated with AMTS, whereas age (P < 0.001) was negatively associated with AMTS. Total Cronbach's α coefficient was 0.90. The scores 6 and 7 showed the optimum balance between sensitivity (99% and 94%, respectively) and specificity (85% and 86%, respectively). The Persian version of the AMTS is a valid cognitive assessment tool for older Iranian adults and can be used for dementia screening in Iran. © 2017 Japanese Psychogeriatric Society.
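Sensitivity and specificity at a given cut-off, as reported in this abstract, can be computed from a list of scores and reference diagnoses. The sketch below uses invented data and assumes that a score at or below the cut-off counts as screen-positive; the abstract does not state which side of the cut-off the authors treated as positive.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity), treating score <= cutoff as screen-positive."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score <= cutoff  # assumption: low scores indicate impairment
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the condition")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores and reference diagnoses, invented for illustration.
scores = [3, 5, 6, 7, 8, 9, 10, 6]
dementia = [True, True, True, True, False, False, False, False]

for cutoff in (6, 7):
    sens, spec = sensitivity_specificity(scores, dementia, cutoff)
    print(cutoff, sens, spec)
```

Raising the cut-off can only keep or increase sensitivity and keep or decrease specificity, which is the trade-off a receiver-operating characteristic analysis examines across all cut-offs.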
A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

Science.gov (United States)

Bersabé, Rosa; Rivas, Teresa

2010-05-01

The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
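The cut-off idea described above can be illustrated under one common parametrization, the baseline-category model ln(P_j / P_ref) = a_j + b_j·x. Two adjacent categories are equally likely where a_j + b_j·x = a_(j+1) + b_(j+1)·x, which gives x = (a_j − a_(j+1)) / (b_(j+1) − b_j). This is a sketch under that assumption, not necessarily the form of the authors' general equation, and the coefficients are invented.

```python
def equal_probability_cutoff(intercept_j, slope_j, intercept_next, slope_next):
    """Score at which categories j and j+1 are equally likely in a baseline-category logit model."""
    if slope_next == slope_j:
        raise ValueError("equal slopes: the two categories never cross")
    return (intercept_j - intercept_next) / (slope_next - slope_j)


# Hypothetical (intercept, slope) pairs; the reference category has both set to 0.
coefficients = [
    (0.0, 0.0),     # asymptomatic (reference)
    (-4.0, 0.25),   # symptomatic
    (-10.0, 0.5),   # eating disorder
]

cutoffs = [
    equal_probability_cutoff(a_j, b_j, a_next, b_next)
    for (a_j, b_j), (a_next, b_next) in zip(coefficients, coefficients[1:])
]
print(cutoffs)
```

With three ordinal categories this yields two cut-off scores, one per pair of adjacent categories.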
School accountability and the black-white test score gap.

Science.gov (United States)

Gaddis, S Michael; Lauen, Douglas Lee

2014-03-01

Since at least the 1960s, researchers have closely examined the respective roles of families, neighborhoods, and schools in producing the black-white achievement gap. Although many researchers minimize the ability of schools to eliminate achievement gaps, the No Child Left Behind Act (NCLB) increased pressure on schools to do so by 2014. In this study, we examine the effects of NCLB's subgroup-specific accountability pressure on changes in black-white math and reading test score gaps using a school-level panel dataset on all North Carolina public elementary and middle schools between 2001 and 2009. Using difference-in-difference models with school fixed effects, we find that accountability pressure reduces black-white achievement gaps by raising mean black achievement without harming mean white achievement. We find no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black-white math achievement gap. Thus, our findings suggest that school-based interventions have the potential to close test score gaps, but differences in school composition and resources play a significant role in the ability of schools to reduce racial inequality. Copyright © 2013 Elsevier Inc. All rights reserved.
Source Country Differences in Test Score Gaps: Evidence from Denmark

Science.gov (United States)

Rangvid, Beatrice Schindler

2010-01-01

We combine data from three studies for Denmark in the PISA 2000 framework to investigate differences in the native-immigrant test score gap by country of origin. In addition to the controls available from PISA data sources, we use student-level data on home background and individual migration histories linked from administrative registers. We find…
Comparing heat flow models for interpretation of precast quadratic pile heat exchanger thermal response tests

DEFF Research Database (Denmark)

Alberdi Pagola, Maria; Poulsen, Søren Erbs; Loveridge, Fleur

2018-01-01

This paper investigates the applicability of currently available analytical, empirical and numerical heat flow models for interpreting thermal response tests (TRT) of quadratic cross section precast pile heat exchangers. A 3D finite element model (FEM) is utilised for interpreting five TRTs by in...
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.

Science.gov (United States)

Kieffer, Kevin M.; Thompson, Bruce

As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significant tests in a sample size context by conducting so-called "what if" analyses. However, these methods can be inaccurate…
Interpretation of ambiguities by schoolchildren with low birth weight from Embu das Artes, São Paulo state, Brazil.

Science.gov (United States)

Pessoa, Rebeca Rodrigues; Araújo, Sarah Cueva Cândido Soares de; Isotani, Selma Mie; Puccini, Rosana Fiorini; Perissinoto, Jacy

To assess the development of language regarding the ability to recognize and interpret lexical ambiguity in low-birth-weight schoolchildren enrolled at the school system in the municipality of Embu das Artes, São Paulo state, compared with that of schoolchildren with normal birth weight. A case-control, retrospective, cross-sectional study conducted with 378 schoolchildren, both genders, aged 5 to 9.9 years, from the municipal schools of Embu das Artes. Study Group (SG) comprising 210 schoolchildren with birth weight < 2500 g and Control Group (CG) composed of 168 schoolchildren with birth weight ≥ 2500 g. Participants of both groups were compared with respect to the skills of recognition and verbal interpretation of sentences containing lexical ambiguity using the Test of Language Competence. Variables of interest: Age and gender of children; age and schooling of mothers. Statistical analysis: Descriptive analysis to characterize the sample and score per group; Student's t test for comparison between the total scores of each skill/subtest; Chi-square test to compare items within each subtest; multiple regression analysis for the intervening variables. Participants of the SG presented lower scores for ambiguous sentences compared with those of participants of the CG. Multiple regression analysis showed that child's current age was a predictor for all metalinguistic skills regarding interpretation of ambiguities in both groups. Participants of the SG presented lower specific and total scores than those of participants of the CG for ambiguity skills. The child's current age factor positively influenced the ambiguity skills in both groups.
Effects of correcting for prematurity on cognitive test scores in childhood.

Science.gov (United States)

Wilson-Ching, Michelle; Pascoe, Leona; Doyle, Lex W; Anderson, Peter J

2014-03-01

The American Academy of Pediatrics recommends that test scores should be corrected for prematurity up to 3 years of age, but this practice varies greatly in both clinical and research settings. The aim of this study was to contrast the effects of using chronological age and those of using corrected age on measures of cognitive outcome across childhood. A theoretical model was constructed using norms from the Bayley Scales of Infant and Toddler Development, Third Edition; the Wechsler Preschool and Primary Scale of Intelligence, Third Edition Australian; and the Wechsler Intelligence Scales for Children, Fourth Edition Australian. Baseline scores representing different levels of functioning (70, below average; 85, borderline; and 100, average) were recalculated using the normative data for ages 6 months to 16 years to account for 1, 2, 3 and 4 months of prematurity. The model created depicted the difference in standardised scores between chronological and corrected age. Compared with scores corrected for prematurity, the absolute reduction in scores using chronological age was greater for increasing degree of prematurity, younger ages at assessment and higher baseline scores and was substantial even beyond 3 years of age. However, the pattern was erratic, with considerable fluctuation evident across different ages and baseline scores. Chronological age results in a lowering of scores at all ages for preterm-born subjects that is greater in the first few years and in those born at earlier gestational ages. Whether or not to correct for prematurity depends upon the context of the assessment. © 2014 The Authors. Journal of Paediatrics and Child Health © 2014 Paediatrics and Child Health Division (Royal Australasian College of Physicians).
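The age correction contrasted in this abstract amounts to subtracting the degree of prematurity from chronological age before looking up test norms. The sketch below shows only that subtraction and how large the offset is relative to age; it does not reproduce the norm tables or the authors' model, and the function names are hypothetical.

```python
def corrected_age_months(chronological_age_months, months_premature):
    """Corrected age: chronological age minus the degree of prematurity."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    if months_premature < 0:
        raise ValueError("months_premature must be non-negative")
    if chronological_age_months < months_premature:
        raise ValueError("prematurity offset exceeds chronological age")
    return chronological_age_months - months_premature


def correction_share(chronological_age_months, months_premature):
    """Fraction of chronological age removed by the correction."""
    corrected = corrected_age_months(chronological_age_months, months_premature)
    return (chronological_age_months - corrected) / chronological_age_months


# Ages spanning the 6 months to 16 years range mentioned in the abstract.
for age in (6, 36, 192):
    for premature in (1, 2, 3, 4):
        share = correction_share(age, premature)
        print(age, premature, corrected_age_months(age, premature), round(share, 3))
```

The same offset is a far larger share of age at 6 months than at 16 years, which is in line with the larger score differences the abstract reports at younger assessment ages.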
The Effects of Group Members' Personalities on a Test Taker's L2 Group Oral Discussion Test Scores

Science.gov (United States)

Ockey, Gary J.

2009-01-01

The second language group oral is a test of second language speaking proficiency, in which a group of three or more English language learners discuss an assigned topic without interaction with interlocutors. Concerns expressed about the extent to which test takers' personal characteristics affect the scores of others in the group have limited its…
Standardised test protocol (Constant Score) for evaluation of functionality in patients with shoulder disorders

DEFF Research Database (Denmark)

Ban, Ilija; Troelsen, Anders; Christiansen, David Høyrup

2013-01-01

INTRODUCTION: The Constant Score (CS), developed as a scoring system to evaluate overall functionality of patients with shoulder disorders, is widely used but has been criticised for relying on an imprecise terminology and for lack of a standardised methodology. A modified guideline was therefore...... differences. One of the authors of the modified CS approved both the English and the Danish test protocol. CONCLUSION: A simple test protocol of the modified CS was developed in both English and Danish. With precise terminology and definitions, the test protocol is the first of its kind. We suggest its use...

Modeling Floor Effects in Standardized Vocabulary Test Scores in a Sample of Low SES Hispanic Preschool Children under the Multilevel Structural Equation Modeling Framework

Directory of Open Access Journals (Sweden)

Leina Zhu

2017-12-01

Researchers and practitioners often use standardized vocabulary tests such as the Peabody Picture Vocabulary Test-4 (PPVT-4; Dunn and Dunn, 2007) and its companion, the Expressive Vocabulary Test-2 (EVT-2; Williams, 2007), to assess English vocabulary skills as an indicator of children's school readiness. Despite their psychometric excellence in the norm sample, issues arise when standardized vocabulary tests are used to assess children who come from culturally, linguistically and ethnically diverse backgrounds (e.g., Spanish-speaking English language learners) or who are delayed in some manner. One of the biggest challenges is establishing the appropriateness of these measures with non-English or non-standard English speaking children, as they often score one to two standard deviations below expected levels (e.g., Lonigan et al., 2013). This study re-examines the issues in analyzing the PPVT-4 and EVT-2 scores in a sample of 4-to-5-year-old low SES Hispanic preschool children who were part of a larger randomized clinical trial on the effects of a supplemental English shared-reading vocabulary curriculum (Pollard-Durodola et al., 2016). It was found that the data exhibited strong floor effects, and the presence of floor effects made it difficult to differentiate the intervention group and the control group on their vocabulary growth in the intervention. A simulation study is then presented under the multilevel structural equation modeling (MSEM) framework, and results revealed that in regular multilevel data analysis, ignoring floor effects in the outcome variables led to biased results in parameter estimates, standard error estimates, and significance tests. Our findings suggest caution in analyzing and interpreting scores of ethnically and culturally diverse children on standardized vocabulary tests (e.g., floor effects). It is recommended that appropriate analytical methods that take into account floor effects in outcome variables be considered.
Microscopic creep models and the interpretation of stress-dip tests during creep

International Nuclear Information System (INIS)

Poirier, J.P.

1976-09-01

A critical analysis is made of the principal divergent viewpoints concerning stress-dip tests. The raw data are examined and interpreted in the light of various creep models. The following problems are discussed: is the reverse strain anelastic or plastic; is the zero creep rate periodic due to recovery or is it spurious; can the existence or nonexistence of an internal stress be deduced from stress-dip tests; can stress-dip tests determine whether glide is jerky or viscous; can the internal stress be measured by stress-dip tests.
Opportunity to learn: Investigating possible predictors for pre-course Test Of Astronomy STandards TOAST scores

Science.gov (United States)

Berryhill, Katie J.

As astronomy education researchers become more interested in experimentally testing innovative teaching strategies to enhance learning in introductory astronomy survey courses ("ASTRO 101"), scholars are placing increased attention toward better understanding factors impacting student gain scores on the widely used Test Of Astronomy STandards (TOAST). Usually used in a pre-test and post-test research design, one might naturally assume that the pre-course differences observed between high- and low-scoring college students might be due in large part to their pre-existing motivation, interest, experience in science, and attitudes about astronomy. To explore this notion, 11 non-science majoring undergraduates taking ASTRO 101 at west coast community colleges were interviewed in the first few weeks of the course to better understand students' pre-existing affect toward learning astronomy with an eye toward predicting student success. In answering this question, we hope to contribute to our understanding of the incoming knowledge of students taking undergraduate introductory astronomy classes, but also gain insight into how faculty can best meet those students' needs and assist them in achieving success. Perhaps surprisingly, there was only weak correlation between students' motivation toward learning astronomy and their pre-test scores. Instead, the most fruitful predictor of TOAST pre-test scores was the quantity of pre-existing, informal, self-directed astronomy learning experiences.
Individual Differences in Digit Span, Susceptibility to Proactive Interference, and Aptitude/Achievement Test Scores.

Science.gov (United States)

Dempster, Frank N.; Cooney, John B.

1982-01-01

Individual differences in digit span, susceptibility to proactive interference, and various aptitude/achievement test scores were investigated in two experiments with college students. Results indicated that digit span was strongly correlated with aptitude/achievement scores, but did not indicate that susceptibility to proactive interference…
A structured approach to control of Salmonella Dublin in 10 Danish dairy herds based on risk scoring and test-and-manage procedures

DEFF Research Database (Denmark)

Nielsen, Liza Rosenbaum; Nielsen, Søren Saxmose

2012-01-01

routes of infection; 4) interpretation of repeated testing of individual animals to detect high-risk animals for special hygienic management or culling; and 5) diagnostic testing of different age groups and bulk tank milk to evaluate progress of control over time. Serology, true prevalence estimates...... stock and adult cattle in 10 case herds that were followed for more than three years. The five steps in the structured approach were: 1) risk scoring to determine transmission routes within the herd and into the herd; 2) determining a plan of action; 3) performing management changes to close important...... and changes in herd classification in the Danish surveillance programme for Salmonella Dublin were used to assess the progress in the herds during and after the control period. Effective control of Salmonella Dublin was achieved in all participating herds through management that focused on closing infection...
Construction of an Exome-Wide Risk Score for Schizophrenia Based on a Weighted Burden Test.

Science.gov (United States)

Curtis, David

2018-01-01

Polygenic risk scores obtained as a weighted sum of associated variants can be used to explore association in additional data sets and to assign risk scores to individuals. The methods used to derive polygenic risk scores from common SNPs are not suitable for variants detected in whole exome sequencing studies. Rare variants, which may have major effects, are seen too infrequently to judge whether they are associated and may not be shared between training and test subjects. A method is proposed whereby variants are weighted according to their frequency, their annotations and the genes they affect. A weighted sum across all variants provides an individual risk score. Scores constructed in this way are used in a weighted burden test and are shown to be significantly different between schizophrenia cases and controls using a five-way cross-validation procedure. This approach represents a first attempt to summarise exome sequence variation into a summary risk score, which could be combined with risk scores from common variants and from environmental factors. It is hoped that the method could be developed further. © 2017 John Wiley & Sons Ltd/University College London.
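The individual risk score described above is a weighted sum across variants. A minimal sketch of that sum follows, with hypothetical variant identifiers and weights; the abstract says weights depend on variant frequency, annotation and affected gene, but the actual weighting scheme is not given and is not reproduced here.

```python
def risk_score(allele_counts, weights):
    """Weighted sum of allele counts; every variant carried must have a weight."""
    missing = [variant for variant in allele_counts if variant not in weights]
    if missing:
        raise KeyError(f"no weight for variants: {missing}")
    return sum(weights[variant] * count for variant, count in allele_counts.items())


# Hypothetical per-variant weights (rarer or more damaging variants weighted higher).
weights = {"variant_1": 2.0, "variant_2": 5.0, "variant_3": 0.5}

# Hypothetical subject: number of copies (0, 1 or 2) of each variant allele.
subject = {"variant_1": 1, "variant_2": 0, "variant_3": 2}

print(risk_score(subject, weights))
```

Scores computed this way for cases and controls could then be compared with a standard two-sample test, which is the role the weighted burden test plays in the abstract.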
General practitioners' needs for ongoing support for the interpretation of spirometry tests.

NARCIS (Netherlands)

Poels, P.J.P.; Schermer, T.R.J.; Akkermans, R.P.; Jacobs, A.; Bogart-Jansen, M.; Bottema, B.J.A.M.; Weel, C. van

2007-01-01

BACKGROUND: Although one out of three general practitioners (GPs) carries out spirometry, the diagnostic interpretation of spirometric test results appears to be a common barrier for GPs towards its routine application. METHODS: Multivariate cross-sectional analysis of a questionnaire survey among
Reproducibility of scoring emphysema by HRCT

Energy Technology Data Exchange (ETDEWEB)

Malinen, A.; Partanen, K.; Rytkoenen, H.; Vanninen, R. [Kuopio Univ. Hospital (Finland). Dept. of Clinical Radiology]; Erkinjuntti-Pekkanen, R. [Kuopio Univ. Hospital (Finland). Dept. of Pulmonary Diseases]

2002-04-01

Purpose: We evaluated the reproducibility of three visual scoring methods of emphysema and compared these methods with pulmonary function tests (VC, DLCO, FEV1 and FEV%) among farmer's lung patients and farmers. Material and Methods: Three radiologists examined high-resolution CT images of farmer's lung patients and their matched controls (n=70) for chronic interstitial lung diseases. Intraobserver reproducibility and interobserver variability were assessed for three methods: severity, Sanders' (extent) and Sakai. Pulmonary function tests such as spirometry and diffusing capacity were measured. Results: Intraobserver κ-values for all three methods were good (0.51-0.74). Interobserver κ-values varied from 0.35 to 0.72. The Sanders' and the severity methods correlated strongly with pulmonary function tests, especially DLCO and FEV1. Conclusion: The Sanders' method proved to be reliable in evaluating emphysema, in terms of good consistency of interpretation and good correlation with pulmonary function tests.
Pediatric residents' learning styles and temperaments and their relationships to standardized test scores.

Science.gov (United States)

Tuli, Sanjeev Y; Thompson, Lindsay A; Saliba, Heidi; Black, Erik W; Ryan, Kathleen A; Kelly, Maria N; Novak, Maureen; Mellott, Jane; Tuli, Sonal S

2011-12-01

Board certification is an important professional qualification and a prerequisite for credentialing, and the Accreditation Council for Graduate Medical Education (ACGME) assesses board certification rates as a component of residency program effectiveness. To date, research has shown that preresidency measures, including National Board of Medical Examiners scores, Alpha Omega Alpha Honor Medical Society membership, or medical school grades poorly predict postresidency board examination scores. However, learning styles and temperament have been identified as factors that affect test-taking performance. The purpose of this study is to characterize the learning styles and temperaments of pediatric residents and to evaluate their relationships to yearly in-service and postresidency board examination scores. This cross-sectional study analyzed the learning styles and temperaments of current and past pediatric residents by administration of 3 validated tools: the Kolb Learning Style Inventory, the Keirsey Temperament Sorter, and the Felder-Silverman Learning Style test. These results were compared with known, normative, general and medical population data and evaluated for correlation to in-service examination and postresidency board examination scores. The predominant learning style for pediatric residents was converging 44% (33 of 75 residents) and the predominant temperament was guardian 61% (34 of 56 residents). The learning style and temperament distribution of the residents was significantly different from published population data (P = .002 and .04, respectively). Learning styles, with one exception, were found to be unrelated to standardized test scores. The predominant learning style and temperament of pediatric residents is significantly different than that of the populations of general and medical trainees. However, learning styles and temperament do not predict outcomes on standardized in-service and board examinations in pediatric residents.
Spinal appearance questionnaire: factor analysis, scoring, reliability, and validity testing.

Science.gov (United States)

Carreon, Leah Y; Sanders, James O; Polly, David W; Sucato, Daniel J; Parent, Stefan; Roy-Beaudry, Marjolaine; Hopkins, Jeffrey; McClung, Anna; Bratcher, Kelly R; Diamond, Beverly E

2011-08-15

Cross sectional. This study presents the factor analysis of the Spinal Appearance Questionnaire (SAQ) and its psychometric properties. Although the SAQ has been administered to a large sample of patients with adolescent idiopathic scoliosis (AIS) treated surgically, its psychometric properties have not been fully evaluated. This study presents the factor analysis and scoring of the SAQ and evaluates its psychometric properties. The SAQ and the Scoliosis Research Society-22 (SRS-22) were administered to AIS patients who were being observed, braced or scheduled for surgery. Standard demographic data and radiographic measures including Lenke type and curve magnitude were also collected. Of the 1802 patients, 83% were female; with a mean age of 14.8 years and mean initial Cobb angle of 55.8° (range, 0°-123°). From the 32 items of the SAQ, 15 loaded on two factors with consistent and significant correlations across all Lenke types. There is an Appearance (items 1-10) and an Expectations factor (items 12-15). Responses are summed giving a range of 5 to 50 for the Appearance domain and 5 to 20 for the Expectations domain. The Cronbach's α was 0.88 for both domains and Total score with a test-retest reliability of 0.81 for Appearance and 0.91 for Expectations. Correlations with major curve magnitude were higher for the SAQ Appearance and SAQ Total scores compared to correlations between the SRS Appearance and SRS Total scores. The SAQ and SRS-22 Scores were statistically significantly different in patients who were scheduled for surgery compared to those who were observed or braced. The SAQ is a valid measure of self-image in patients with AIS with greater correlation to curve magnitude than SRS Appearance and Total score. It also discriminates between patients who require surgery from those who do not.
Comparison of two teaching methods for cardiac arrhythmia interpretation among nursing students.

Science.gov (United States)

Varvaroussis, Dimitrios P; Kalafati, Maria; Pliatsika, Paraskevi; Castrén, Maaret; Lott, Carsten; Xanthos, Theodoros

2014-02-01

The aim of this study was to compare the six-stage method (SSM) for instructing primary cardiac arrhythmias interpretation to students without basic electrocardiogram (ECG) knowledge with a descriptive teaching method in a single educational intervention. This is a randomized trial. Following a brief instructional session, undergraduate nursing students, assigned to group A (SSM) and group B (descriptive teaching method), undertook a written test in cardiac rhythm recognition, immediately after the educational intervention (initial exam). Participants were also examined with an unannounced retention test (final exam), one month after instruction. Altogether 134 students completed the study. Interpretation accuracy for each cardiac arrhythmia was assessed. Mean score at the initial exam was 8.71±1.285 for group A and 8.74±1.303 for group B. Mean score at the final exam was 8.25±1.46 for group A vs 7.84±1.44 for group B. Overall results showed that the SSM was as effective as the descriptive teaching method. The study showed that in each group bradyarrhythmias were identified correctly by more students than tachyarrhythmias. No significant difference between the two teaching methods was seen for any specific cardiac arrhythmia. The SSM effectively develops competency for interpreting common cardiac arrhythmias in students without ECG knowledge. More research is needed to support this conclusion, and the method's effectiveness must be evaluated if it is implemented with trainee groups that have preexisting basic ECG interpretation knowledge. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
ACER Mathematics Profile Series: Number Test. (Test Booklet, Answer and Record Sheet, Score Key, and Teachers Handbook).

Science.gov (United States)

Cornish, Greg; Wines, Robin

The Number Test of the ACER Mathematics Profile Series contains 30 items for each of three suggested grade levels: 7-8, 8-9, and 9-10. Raw scores on all tests in the ACER Mathematics Profile Series (Number, Operations, Space and Measurement) are converted to a common scale called MAPS, a major feature of the Series. Based on the Rasch Model,…
Validity and reliability of Nintendo Wii Fit balance scores.

Science.gov (United States)

Wikstrom, Erik A

2012-01-01

Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Descriptive laboratory study. Sports medicine research laboratory. Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Participants completed a single-limb-stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r < …). Intrasession reliability of Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT
Normal Variability of Weekly Musculoskeletal Screening Scores and the Influence of Training Load across an Australian Football League Season.

Science.gov (United States)

Esmaeili, Alireza; Stewart, Andrew M; Hopkins, William G; Elias, George P; Lazarus, Brendan H; Rowell, Amber E; Aughey, Robert J

2018-01-01

Aim: The sit and reach test (S&R), dorsiflexion lunge test (DLT), and adductor squeeze test (AST) are commonly used in weekly musculoskeletal screening for athlete monitoring and injury prevention purposes. The aim of this study was to determine the normal week to week variability of the test scores, individual differences in variability, and the effects of training load on the scores. Methods: Forty-four elite Australian rules footballers from one club completed the weekly screening tests on day 2 or 3 post-main training (pre-season) or post-match (in-season) over a 10 month season. Ratings of perceived exertion and session duration for all training sessions were used to derive various measures of training load via both simple summations and exponentially weighted moving averages. Data were analyzed via linear and quadratic mixed modeling and interpreted using magnitude-based inference. Results: Substantial small to moderate variability was found for the tests at both season phases; for example over the in-season, the normal variability ±90% confidence limits were as follows: S&R ±1.01 cm, ±0.12; DLT ±0.48 cm, ±0.06; AST ±7.4%, ±0.6%. Small individual differences in variability existed for the S&R and AST (factor standard deviations between 1.31 and 1.66). All measures of training load had trivial effects on the screening scores. Conclusion: A change in a test score larger than the normal variability is required to be considered a true change. Athlete monitoring and flagging systems need to account for the individual differences in variability. The tests are not sensitive to internal training load when conducted 2 or 3 days post-training or post-match, and the scores should be interpreted cautiously when used as measures of recovery.
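The two families of training-load measures named in the Methods (simple summations and exponentially weighted moving averages) can be illustrated as follows. This is a minimal sketch under stated assumptions: session load is taken as rating of perceived exertion multiplied by duration (the common session-RPE convention, which the abstract does not spell out), and the smoothing factor is taken as 2/(span + 1); the function names and window lengths are hypothetical.

```python
def session_load(rpe, duration_min):
    """Assumed session-RPE load: perceived exertion times duration in minutes."""
    return rpe * duration_min


def rolling_sum(loads, window):
    """Simple summation: total load over the most recent `window` entries."""
    return [sum(loads[max(0, i - window + 1): i + 1]) for i in range(len(loads))]


def ewma(loads, span):
    """Exponentially weighted moving average with smoothing factor 2 / (span + 1)."""
    lam = 2.0 / (span + 1)
    smoothed = []
    prev = None
    for x in loads:
        # The first value seeds the average; later values are blended in.
        prev = x if prev is None else lam * x + (1.0 - lam) * prev
        smoothed.append(prev)
    return smoothed


if __name__ == "__main__":
    daily = [session_load(5, 60), session_load(7, 90), 0, session_load(4, 45)]
    print(rolling_sum(daily, window=3))
    print(ewma(daily, span=3))
```

Compared with a rolling sum, the moving average gives recent sessions more weight and lets older sessions decay gradually instead of dropping out of the window abruptly.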
Standard practice for analysis and interpretation of physics dosimetry results for test reactors

International Nuclear Information System (INIS)

Anon.

1984-01-01

This practice describes the methodology summarized in Annex A1 to be used in the analysis and interpretation of physics-dosimetry results from test reactors. This practice relies on, and ties together, the application of several supporting ASTM standard practices, guides, and methods that are in various stages of completion (see Fig. 1). Support subject areas that are discussed include reactor physics calculations, dosimeter selection and analysis, exposure units, and neutron spectrum adjustment methods. This practice is directed towards the development and application of physics-dosimetry-metallurgical data obtained from test reactor irradiation experiments that are performed in support of the operation, licensing, and regulation of LWR nuclear power plants. It specifically addresses the physics-dosimetry aspects of the problem. Procedures related to the analysis, interpretation, and application of both test and power reactor physics-dosimetry-metallurgy results are addressed in Practice E 853, Practice E 560, Matrix E 706(IE), Practice E 185, Matrix E 706(IG), Guide E 900, and Method E 646.
Linear-rank testing of a non-binary, responder-analysis, efficacy score to evaluate pharmacotherapies for substance use disorders.

Science.gov (United States)

Holmes, Tyson H; Li, Shou-Hua; McCann, David J

2016-11-23

The design of pharmacological trials for management of substance use disorders is shifting toward outcomes of successful individual-level behavior (abstinence or no heavy use). While binary success/failure analyses are common, McCann and Li (CNS Neurosci Ther 2012; 18: 414-418) introduced "number of beyond-threshold weeks of success" (NOBWOS) scores to avoid dichotomized outcomes. NOBWOS scoring employs an efficacy "hurdle" with values reflecting duration of success. Here, we evaluate NOBWOS scores rigorously. Formal analysis of mathematical structure of NOBWOS scores is followed by simulation studies spanning diverse conditions to assess operating characteristics of five linear-rank tests on NOBWOS scores. Simulations include assessment of Fisher's exact test applied to hurdle component. On average, statistical power was approximately equal for five linear-rank tests. Under none of conditions examined did Fisher's exact test exhibit greater statistical power than any of the linear-rank tests. These linear-rank tests provide good Type I and Type II error control for comparing distributions of NOBWOS scores between groups (e.g. active vs. placebo). All methods were applied to re-analyses of data from four clinical trials of differing lengths and substances of abuse. These linear-rank tests agreed across all trials in rejecting (or not) their null (equality of distributions) at ≤ 0.05. © The Author(s) 2016.
Decision making under internal uncertainty: the case of multiple-choice tests with different scoring rules.

Science.gov (United States)

Bereby-Meyer, Yoella; Meyer, Joachim; Budescu, David V

2003-02-01

This paper assesses framing effects on decision making with internal uncertainty, i.e., partial knowledge, by focusing on examinees' behavior in multiple-choice (MC) tests with different scoring rules. In two experiments participants answered a general-knowledge MC test that consisted of 34 solvable and 6 unsolvable items. Experiment 1 studied two scoring rules involving Positive (only gains) and Negative (only losses) scores. Although answering all items was the dominating strategy for both rules, the results revealed a greater tendency to answer under the Negative scoring rule. These results are in line with the predictions derived from Prospect Theory (PT) [Econometrica 47 (1979) 263]. The second experiment studied two scoring rules, which allowed respondents to exhibit partial knowledge. Under the Inclusion-scoring rule the respondents mark all answers that could be correct, and under the Exclusion-scoring rule they exclude all answers that might be incorrect. As predicted by PT, respondents took more risks under the Inclusion rule than under the Exclusion rule. The results illustrate that the basic process that underlies choice behavior under internal uncertainty and especially the effect of framing is similar to the process of choice under external uncertainty and can be described quite accurately by PT. Copyright 2002 Elsevier Science B.V.
An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

Science.gov (United States)

Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

2013-01-01

Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…
The Impact of Correction for Guessing Formula on MC and Yes/No Vocabulary Tests' Scores

Directory of Open Access Journals (Sweden)

Abdollah Baradaran

2009-10-01

A standard correction-for-guessing (cfg) formula for multiple-choice and Yes/No examinations was examined retrospectively using the scores of intermediate female EFL learners in an English language school. The correction was a weighting formula for points awarded for correct answers, incorrect answers, and unanswered questions so that the expected value of the increase in test score due to guessing was zero. The researcher compared uncorrected and corrected scores on examinations using multiple-choice and Yes/No formats. These short-answer formats eliminated or at least greatly reduced the potential for guessing the correct answer. The expectation for students to improve their grade by guessing on multiple-choice and Yes/No format examinations is well known. The researcher examined a method for correcting for random ("no knowledge") guessing on multiple-choice and Yes/No vocabulary examinations by comparing scores with and without application of the cfg formula. This was done to determine whether the test takers really knew the correct answer or had resorted to guessing. This study represented a unique opportunity to compare scores from multiple-choice and Yes/No examinations in a setting in which students were given the same number of questions in each of the two format types, testing their knowledge of the same subject matter. The results indicated significant differences between the subjects' scores when the cfg formula was applied and when it was not.
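The abstract does not state the formula itself. As an illustration only, the sketch below assumes the classical formula-scoring rule, in which a correct answer earns 1 point, an omitted item earns 0, and a wrong answer on an item with k options costs 1/(k − 1) points; the function names are hypothetical. Under that rule the expected gain from a blind guess is zero, which matches the property described in the abstract.

```python
from fractions import Fraction


def corrected_score(n_right, n_wrong, n_options):
    """Assumed classical correction: R - W / (k - 1); omitted items score zero."""
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return Fraction(n_right) - Fraction(n_wrong, n_options - 1)


def expected_gain_per_blind_guess(n_options):
    """Expected points from guessing uniformly at random on one item."""
    p_right = Fraction(1, n_options)
    penalty = Fraction(1, n_options - 1)
    return p_right * 1 - (1 - p_right) * penalty


if __name__ == "__main__":
    # Four-option multiple-choice items versus two-option Yes/No items.
    print(corrected_score(30, 8, n_options=4))
    print(corrected_score(30, 8, n_options=2))
    print(expected_gain_per_blind_guess(4), expected_gain_per_blind_guess(2))
```

For Yes/No items (k = 2) the penalty per wrong answer is a full point, so the same number of errors lowers the corrected score more than on four-option items.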
Association testing for next-generation sequencing data using score statistics

DEFF Research Database (Denmark)

Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders

2012-01-01

computationally feasible due to the use of score statistics. As part of the joint likelihood, we model the distribution of the phenotypes using a generalized linear model framework, which works for both quantitative and discrete phenotypes. Thus, the method presented here is applicable to case-control studies...... of genotype calls into account have been proposed; most require numerical optimization which for large-scale data is not always computationally feasible. We show that using a score statistic for the joint likelihood of observed phenotypes and observed sequencing data provides an attractive approach...... to association testing for next-generation sequencing data. The joint model accounts for the genotype classification uncertainty via the posterior probabilities of the genotypes given the observed sequencing data, which gives the approach higher power than methods based on called genotypes. This strategy remains...

Varying performance in mammographic interpretation across two countries: Do results indicate reader or population variances?

Science.gov (United States)

Soh, BaoLin P.; Lee, Warwick B.; Wong, Jill; Sim, Llewellyn; Hillis, Stephen L.; Tapia, Kriscia A.; Brennan, Patrick C.

2016-03-01

Aim: To compare the performance of Australian and Singapore breast readers interpreting a single test-set that consisted of mammographic examinations collected from the Australian population. Background: In the teleradiology era, breast readers are interpreting mammographic examinations from different populations. The question arises whether two groups of readers with similar training backgrounds, demonstrate the same level of performance when presented with a population familiar only to one of the groups. Methods: Fifty-three Australian and 15 Singaporean breast radiologists participated in this study. All radiologists were trained in mammogram interpretation and had a median of 9 and 15 years of experience in reading mammograms respectively. Each reader interpreted the same BREAST test-set consisting of sixty de-identified mammographic examinations arising from an Australian population. Performance parameters including JAFROC, ROC, case sensitivity as well as specificity were compared between Australian and Singaporean readers using a Mann Whitney U test. Results: A significant difference (P=0.036) was demonstrated between the JAFROC scores of the Australian and Singaporean breast radiologists. No other significant differences were observed. Conclusion: JAFROC scores for Australian radiologists were higher than those obtained by the Singaporean counterparts. Whilst it is tempting to suggest this is down to reader expertise, this may be a simplistic explanation considering the very similar training and audit backgrounds of the two populations of radiologists. The influence of reading images that are different from those that radiologists normally encounter cannot be ruled out and requires further investigation, particularly in the light of increasing international outsourcing of radiologic reporting.
Scoring an Abstract Contemporary Silent Film

OpenAIRE

Frost, Crystal

2014-01-01

I composed an original digital audio film score with full sound design for a contemporary silent film called Apple Tree. The film is highly conceptual and interpretive and required a very involved, intricate score to successfully tell the story. In the process of scoring this film, I learned new ways to convey an array of contrasting emotions through music and sound. After analyzing the film's emotional journey, I determined that six defining emotions were the foundation on which to build an ...
The Impact of Linking Distinct Achievement Test Scores on the Interpretation of Student Growth in Achievement

Science.gov (United States)

Airola, Denise Tobin

2011-01-01

Changes to state tests impact the ability of State Education Agencies (SEAs) to monitor change in performance over time. The purpose of this study was to evaluate the Standardized Performance Growth Index (PGIz), a proposed statistical model for measuring change in student and school performance, across transitions in tests. The PGIz is a…
Effects of Analytical and Holistic Scoring Patterns on Scorer Reliability in Biology Essay Tests

Science.gov (United States)

Ebuoh, Casmir N.

2018-01-01

The literature shows that the patterns/methods of scoring essay tests have been criticized as unreliable, and that this unreliability is likely to be greater in internal examinations than in external examinations. The purpose of this study is to find out the effects of analytical and holistic scoring patterns on scorer reliability in…
[Relationship between unipedal stance test score and center of pressure velocity in elderly].

Science.gov (United States)

Rodrigo Antonio, Guzmán; Rony, Silvestre; Francisco Aniceto, Rodríguez; David Andrés, Arriagada; Pablo Andrés, Ortega

2011-01-01

Frequent falls are one of the most important health problems in the elderly population. The unipedal stance test (UPST) assesses postural stability and is used in fall risk measures. Despite this, there is little information about its relationship with the posturographic parameters (PP) that characterize postural stability. Center of pressure velocity (CoPV) is one of the best PP for describing postural stability. The aim of this study was to analyze the relation between UPST score and CoPV in an elderly population. A sample of 38 healthy elderly subjects were divided into two groups according to their UPST score: low performance (LP, n=11) and high performance (HP, n=27). The correlation between UPST score and COP mean velocity (CoPmV), recorded from a posturographic test, was analyzed in both groups. An inverse correlation between UPST score and CoPmV was found in both groups. However, this was stronger in the LP group (r=-0.69, P=.02) than in the HP group (r=-0.39, P=.04). Based on the results of this investigation, it may be concluded that performance on the UPST has an inverse relationship with CoPmV, especially in subjects with low performance on the UPST. Copyright © 2010 SEGG. Published by Elsevier Espana. All rights reserved.
Linkage analysis in nuclear families. 2: Relationship between affected sib-pair tests and lod score analysis.

Science.gov (United States)

Knapp, M; Seuchter, S A; Baur, M P

1994-01-01

It is believed that the main advantage of affected sib-pair tests is that their application requires no information about the underlying genetic mechanism of the disease. However, here it is proved that the mean test, which can be considered the most prominent of the affected sib-pair tests, is equivalent to lod score analysis for an assumed recessive mode of inheritance, irrespective of the true mode of the disease. Further relationships of certain sib-pair tests and lod score analysis under specific assumed genetic modes are investigated.
A computer-aided detection system for rheumatoid arthritis MRI data interpretation and quantification of synovial activity

DEFF Research Database (Denmark)

Kubassove, Olga; Boesen, Mikael; Cimmino, Marco A

2009-01-01

and interpretation slow down development in this area. Existing scoring systems of especially synovitis are too rigid and insensitive to measure early treatment response and quantify inflammation. This study tested a novel automated, computer system for analysis of dynamic MRI data acquired from patients with RA...
Association of Health Sciences Reasoning Test scores with academic and experiential performance.

Science.gov (United States)

Cox, Wendy C; McLaughlin, Jacqueline E

2014-05-15

To assess the association of scores on the Health Sciences Reasoning Test (HSRT) with academic and experiential performance in a doctor of pharmacy (PharmD) curriculum. The HSRT was administered to 329 first-year (P1) PharmD students. Performance on the HSRT and its subscales was compared with academic performance in 29 courses throughout the curriculum and with performance in advanced pharmacy practice experiences (APPEs). Significant positive correlations were found between course grades in 8 courses and HSRT overall scores. All significant correlations were accounted for by pharmaceutical care laboratory courses, therapeutics courses, and a law and ethics course. There was a lack of moderate to strong correlation between HSRT scores and academic and experiential performance. The usefulness of the HSRT as a tool for predicting student success may be limited.
Impact of Answer-Switching Behavior on Multiple-Choice Test Scores in Higher Education

Directory of Open Access Journals (Sweden)

Ramazan BAŞTÜRK

2011-06-01

The multiple-choice format is one of the most popular selected-response item formats used in educational testing. Researchers have shown that the multiple-choice test is a useful vehicle for student assessment in core university subjects that usually have large student numbers. Even though educators, test experts, and various test resources maintain that the first answer should be retained, many researchers have argued that this advice is not supported by empirical findings. The main aim of this study is to examine how answer-switching behavior affects the multiple-choice test score. Additionally, gender differences and the relationship between the number of answer switches and item parameters (item difficulty and item discrimination) were investigated. The participants in this study were 207 upper-level College of Education students from mid-sized universities. A midterm exam consisting of 20 multiple-choice questions was used. According to the results of this study, answer-switching behavior produces a statistically significant increase in test scores. On the other hand, there is no significant gender difference in answer-switching behavior. Additionally, there is a significant negative relationship between answer-switching behavior and item difficulty.
Do Standardized Tests Penalize Deep-Thinking, Creative, or Conscientious Students?: Some Personality Correlates of Graduate Record Examinations Test Scores

Science.gov (United States)

Powers, Donald E.; Kaufman, James C.

2004-01-01

The objective of the study reported here was to explore the relationship of Graduate Record Examinations (GRE) General Test scores to selected personality traits--conscientiousness, rationality, ingenuity, quickness, creativity, and depth. A sample of 342 GRE test takers completed short personality inventory scales for each trait. Analyses…
Presentation of laboratory test results in patient portals: influence of interface design on risk interpretation and visual search behaviour.

Science.gov (United States)

Fraccaro, Paolo; Vigo, Markel; Balatsoukas, Panagiotis; van der Veer, Sabine N; Hassan, Lamiece; Williams, Richard; Wood, Grahame; Sinha, Smeeta; Buchan, Iain; Peek, Niels

2018-02-12

Patient portals are considered valuable instruments for self-management of long term conditions; however, there are concerns over how patients might interpret and act on the clinical information they access. We hypothesized that visual cues improve patients' abilities to correctly interpret laboratory test results presented through patient portals. We also assessed, by applying eye-tracking methods, the relationship between risk interpretation and visual search behaviour. We conducted a controlled study with 20 kidney transplant patients. Participants viewed three different graphical presentations in each of low, medium, and high risk clinical scenarios composed of results for 28 laboratory tests. After viewing each clinical scenario, patients were asked how they would have acted in real life if the results were their own, as a proxy of their risk interpretation. They could choose between: 1) Calling their doctor immediately (high interpreted risk); 2) Trying to arrange an appointment within the next 4 weeks (medium interpreted risk); 3) Waiting for the next appointment in 3 months (low interpreted risk). For each presentation, we assessed accuracy of patients' risk interpretation, and employed eye tracking to assess and compare visual search behaviour. Misinterpretation of risk was common, with 65% of participants underestimating the need for action across all presentations at least once. Participants found it particularly difficult to interpret medium risk clinical scenarios. Participants who consistently understood when action was needed showed a higher visual search efficiency, suggesting a better strategy to cope with information overload that helped them to focus on the laboratory tests most relevant to their condition. This study confirms patients' difficulties in interpreting laboratory test results, with many patients underestimating the need for action, even when abnormal values were highlighted or grouped together. Our findings raise patient safety
IMPACT OF SHOTS ON FINAL SCORE OF A FOOTBALL MATCH

Directory of Open Access Journals (Sweden)

Miroslav Radoman

2008-08-01

The research was conducted on a sample of the 64 games played at the 2006 FIFA World Cup in Germany, giving 128 game results divided into three groups according to the outcome (win, defeat, or draw). The analysis was based on the total number of shots during the game. Multivariate analysis of variance (MANOVA) and discriminant analysis showed significant differences in shot elements among the game outcomes (win, defeat, or draw). Even though the observed differences are not equally pronounced, the results indicate significant differences in the observed elements of the football game. The analyses applied (Roy's test and t-test) confirmed that every analyzed element of the shot differs significantly by game outcome (win, defeat, or draw), and that the differences in shot elements result from different choices of tactics and techniques, as well as from the ability to execute them in the attack and defense phases.
The acquisition and retention of ECG interpretation skills after a standardized web-based ECG tutorial

DEFF Research Database (Denmark)

Rolskov Bojsen, Signe; Räder, Sune Bernd Emil Werner; Holst, Anders Gaardsdal

2015-01-01

BACKGROUND: Electrocardiogram (ECG) interpretation is of great importance for patient management. However, medical students frequently lack proficiency in ECG interpretation and rate their ECG training as inadequate. Our aim was to examine the effect of a standalone web-based ECG tutorial...... and to assess the retention of skills using multiple follow-up intervals. METHODS: 203 medical students were included in the study. All participants completed a pre-test, an ECG tutorial, and a post-test. The participants were also randomised to complete a retention-test after short (2-4 weeks), medium (10.......6), respectively). When comparing the pre-test to retention-test delta scores, junior students had learned significantly more than senior students (junior students improved 10.7 points and senior students improved 4.7 points, p = 0.003). CONCLUSION: A standalone web-based ECG tutorial can be an effective means...
Electrocardiographic interpretation skills of cardiology residents: are they competent?

Science.gov (United States)

Sibbald, Matthew; Davies, Edward G; Dorian, Paul; Yu, Eric H C

2014-12-01

Achieving competency at electrocardiogram (ECG) interpretation among cardiology subspecialty residents has traditionally focused on interpreting a target number of ECGs during training. However, there is little evidence to support this approach. Further, there are no data documenting the competency of ECG interpretation skills among cardiology residents, who become de facto the gold standard in their practice communities. We tested 29 Cardiology residents from all 3 years in a large training program using a set of 20 ECGs collected from a community cardiology practice over a 1-month period. Residents interpreted half of the ECGs using a standard analytic framework, and half using their own approach. Residents were scored on the number of correct and incorrect diagnoses listed. Overall diagnostic accuracy was 58%. Of 6 potentially life-threatening diagnoses, residents missed 36% (123 of 348) including hyperkalemia (81%), long QT (52%), complete heart block (35%), and ventricular tachycardia (19%). Residents provided additional inappropriate diagnoses on 238 ECGs (41%). Diagnostic accuracy was similar between ECGs interpreted using an analytic framework vs ECGs interpreted without an analytic framework (59% vs 58%; F(1,1333) = 0.26; P = 0.61). Cardiology resident proficiency at ECG interpretation is suboptimal. Despite the use of an analytic framework, there remain significant deficiencies in ECG interpretation among Cardiology residents. A more systematic method of addressing these important learning gaps is urgently needed. Copyright © 2014 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.
CaPTHUS scoring model in primary hyperparathyroidism: can it eliminate the need for ioPTH testing?

Science.gov (United States)

Elfenbein, Dawn M; Weber, Sara; Schneider, David F; Sippel, Rebecca S; Chen, Herbert

2015-04-01

The CaPTHUS model was reported to have a positive predictive value of 100 % to correctly predict single-gland disease in patients with primary hyperparathyroidism, thus obviating the need for intraoperative parathyroid hormone (ioPTH) testing. We sought to apply the CaPTHUS scoring model in our patient population and assess its utility in predicting long-term biochemical cure. We retrospectively reviewed all parathyroidectomies for primary hyperparathyroidism performed at our university hospital from 2003 to 2012. We routinely perform ioPTH testing. Biochemical cure was defined as a normal calcium level at 6 months. A total of 1,421 patients met the inclusion criteria: 78 % of patients had a single adenoma at the time of surgery, 98 % had a normal serum calcium at 1 week postoperatively, and 96 % had a normal serum calcium level 6 months postoperatively. Using the CaPTHUS scoring model, 307 patients (22.5 %) had a score of ≥ 3, with a positive predictive value of 91 % for single adenoma. A CaPTHUS score of ≥ 3 had a positive predictive value of 98 % for biochemical cure at 1 week as well as at 6 months. In our population, where ioPTH testing is used routinely to guide use of bilateral exploration, patients with a preoperative CaPTHUS score of ≥ 3 had good long-term biochemical cure rates. However, the model only predicted adenoma in 91 % of cases. If minimally invasive parathyroidectomy without ioPTH testing had been done for these patients, the cure rate would have dropped from 98 % to an unacceptable 89 %. Even in these patients with high CaPTHUS scores, multigland disease is present in almost 10 %, and ioPTH testing is necessary.
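The positive predictive value quoted in this abstract is the share of patients with a score of ≥ 3 who actually had a single adenoma. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the split of the 307 high-scoring patients into 279 true and 28 false positives is an assumption back-calculated from the reported 91 %, not a figure taken from the study.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return TP / (TP + FP), the fraction of positive calls that are correct."""
    total_positive_calls = true_positives + false_positives
    if total_positive_calls == 0:
        raise ValueError("no positive calls; PPV is undefined")
    return true_positives / total_positive_calls


# 307 patients scored >= 3 (from the abstract). The 279/28 split is an
# assumption chosen to match the reported ~91 %, not study data.
ppv = positive_predictive_value(true_positives=279, false_positives=28)
print(f"PPV = {ppv:.1%}")
```

Read this way, a PPV near 91 % means that roughly one in eleven high-scoring patients would not have single-gland disease, which is the group the authors argue still needs ioPTH testing.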
The use of test scores from large-scale assessment surveys: psychometric and statistical considerations

Directory of Open Access Journals (Sweden)

Henry Braun

2017-11-01

Background: Economists are making increasing use of measures of student achievement obtained through large-scale survey assessments such as NAEP, TIMSS, and PISA. The construction of these measures, employing plausible value (PV) methodology, is quite different from that of the more familiar test scores associated with assessments such as the SAT or ACT. These differences have important implications both for utilization and interpretation. Although much has been written about PVs, it appears that there are still misconceptions about whether and how to employ them in secondary analyses. Methods: We address a range of technical issues, including those raised in a recent article that was written to inform economists using these databases. First, an extensive review of the relevant literature was conducted, with particular attention to key publications that describe the derivation and psychometric characteristics of such achievement measures. Second, a simulation study was carried out to compare the statistical properties of estimates based on the use of PVs with those based on other, commonly used methods. Results: It is shown, through both theoretical analysis and simulation, that under fairly general conditions appropriate use of PVs yields approximately unbiased estimates of model parameters in regression analyses of large-scale survey data. The superiority of the PV methodology is particularly evident when measures of student achievement are employed as explanatory variables. Conclusions: The PV methodology used to report student test performance in large-scale surveys remains the state of the art for secondary analyses of these databases.
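The abstract does not spell out what "appropriate use" of PVs involves. A common convention in secondary analyses is to run the same analysis once per plausible value and then pool the results with Rubin-style combining rules, rather than averaging the PVs first. The Python sketch below illustrates that convention under stated assumptions: the function name and all numbers are hypothetical, and it is not presented as the procedure used in the article.

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Pool one statistic that was computed separately on each plausible value.

    estimates: the statistic (e.g. a regression slope) from each PV.
    sampling_variances: its squared standard error from each PV.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need matching lists with at least two plausible values")
    pooled = mean(estimates)
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # sample variance across the PVs
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical slopes and squared standard errors from five plausible values.
slopes = [0.50, 0.52, 0.48, 0.51, 0.49]
squared_ses = [0.0004] * 5
estimate, total_var = combine_plausible_values(slopes, squared_ses)
print(estimate, total_var ** 0.5)
```

The between-PV term is what carries the measurement uncertainty into the final standard error; it is lost if a single PV, or the average of the PVs, is analysed as if it were an observed score.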
Numerical Well Testing Interpretation Model and Applications in Crossflow Double-Layer Reservoirs by Polymer Flooding

Directory of Open Access Journals (Sweden)

Haiyang Yu

2014-01-01

This work presents a numerical well testing interpretation model and analysis techniques to evaluate formations by using pressure transient data acquired with logging tools in crossflow double-layer reservoirs under polymer flooding. A well testing model is established based on rheology experiments and by considering shear, diffusion, convection, inaccessible pore volume (IPV), permeability reduction, wellbore storage effect, and skin factors. The type curves were then developed based on this model, and parameter sensitivity was analyzed. Our research shows that the type curves have five segments with different flow status: (I) wellbore storage section, (II) intermediate flow section (transient section), (III) mid-radial flow section, (IV) crossflow section (from low permeability layer to high permeability layer), and (V) systematic radial flow section. The polymer flooding field tests prove that our model can accurately determine formation parameters in crossflow double-layer reservoirs under polymer flooding. Moreover, formation damage caused by polymer flooding can also be evaluated by comparing the interpreted permeability with the initial layered permeability before polymer flooding. Comparison of the numerical solution based on flow mechanism with observed polymer flooding field test data highlights the potential for the application of this interpretation method in formation evaluation and enhanced oil recovery (EOR).
Score Gains on g-loaded Tests: No g

NARCIS (Netherlands)

te Nijenhuis, J.; van Vianen, A.E.M.; van der Flier, H.

2007-01-01

IQ scores provide the best general predictor of success in education, job training, and work. However, there are many ways in which IQ scores can be increased, for instance by means of retesting or participation in learning potential training programs. What is the nature of these score gains? Jensen
Evaluation of Veterinary-Specific Interpretive Criteria for Susceptibility Testing of Streptococcus equi Subspecies with Trimethoprim-Sulfamethoxazole and Trimethoprim-Sulfadiazine

DEFF Research Database (Denmark)

Sadaka, Carmen; Kanellos, Theo; Guardabassi, Luca

2017-01-01

Antimicrobial susceptibility test results for trimethoprim-sulfadiazine with Streptococcus equi subspecies are interpreted based on human data for trimethoprim-sulfamethoxazole. The veterinary-specific data generated in this study support a single breakpoint for testing trimethoprim-sulfamethoxaz......
What dementia reveals about proverb interpretation and its neuroanatomical correlates.

Science.gov (United States)

Kaiser, Natalie C; Lee, Grace J; Lu, Po H; Mather, Michelle J; Shapira, Jill; Jimenez, Elvira; Thompson, Paul M; Mendez, Mario F

2013-08-01

Neuropsychologists frequently include proverb interpretation as a measure of executive abilities. A concrete interpretation of proverbs, however, may reflect semantic impairments from anterior temporal lobes, rather than executive dysfunction from frontal lobes. The investigation of proverb interpretation among patients with different dementias with varying degrees of temporal and frontal dysfunction may clarify the underlying brain-behavior mechanisms for abstraction from proverbs. We propose that patients with behavioral variant frontotemporal dementia (bvFTD), who are characteristically more impaired on proverb interpretation than those with Alzheimer's disease (AD), are disproportionately impaired because of anterior temporal-mediated semantic deficits. Eleven patients with bvFTD and 10 with AD completed the Delis-Kaplan Executive Function System (D-KEFS) Proverbs Test and a series of neuropsychological measures of executive and semantic functions. The analysis included both raw and age-adjusted normed data for multiple choice responses on the D-KEFS Proverbs Test using independent samples t-tests. Tensor-based morphometry (TBM) applied to 3D T1-weighted MRI scans mapped the association between regional brain volume and proverb performance. Computations of mean Jacobian values within select regions of interest provided a numeric summary of regional volume, and voxel-wise regression yielded 3D statistical maps of the association between tissue volume and proverb scores. The patients with bvFTD were significantly worse than those with AD in proverb interpretation. The worse performance of the bvFTD patients involved a greater number of concrete responses to common, familiar proverbs, but not to uncommon, unfamiliar ones. These concrete responses to common proverbs correlated with semantic measures, whereas concrete responses to uncommon proverbs correlated with executive functions. After controlling for dementia diagnosis, TBM analyses indicated significant

Effect on intelligence test score of prenatal exposure to ionizing radiation in Hiroshima and Nagasaki

International Nuclear Information System (INIS)

Schull, W.J.; Otake, Masanori; Yoshimaru, Hiroshi.

1988-10-01

Analyses of intelligence test scores (Koga) at 10-11 years of age of individuals exposed prenatally to the atomic bombing of Hiroshima and Nagasaki, using estimates of the uterine absorbed dose based on the recently introduced system of dosimetry, the Dosimetry System 1986 (DS86), reveal the following: 1) there is no evidence of a radiation-related effect on intelligence among those individuals exposed within 0-7 weeks after fertilization or in the 26th or subsequent weeks; 2) for individuals exposed at 8-15 weeks after fertilization, and to a lesser extent those exposed at 16-25 weeks, the mean test scores but not the variances are significantly heterogeneous among exposure categories; 3) the cumulative distribution of test scores suggests a progressive shift downwards in individual scores with increasing exposure; and 4) within the group most sensitive to the occurrence of clinically recognizable severe mental retardation, individuals exposed 8 through 15 weeks after fertilization, the regression of intelligence score on estimated DS86 uterine absorbed dose is more linear than with T65DR fetal dose, and the diminution in intelligence score under the linear model is 21-29 points at 1 Gy. The effect is somewhat greater when the controls receiving less than 0.01 Gy are excluded: 24-33 points at 1 Gy. These findings are discussed in the light of the earlier analysis of the frequency of occurrence of mental retardation among the prenatally exposed survivors of the A-bombing of Hiroshima and Nagasaki. It is suggested that both are the consequences of the same underlying biological process or processes. (author)
Associations of maximal strength and muscular endurance test scores with cardiorespiratory fitness and body composition.

Science.gov (United States)

Vaara, Jani P; Kyröläinen, Heikki; Niemi, Jaakko; Ohrankämmen, Olli; Häkkinen, Arja; Kocay, Sheila; Häkkinen, Keijo

2012-08-01

The purpose of the present study was to assess the relationships between maximal strength and muscular endurance test scores in addition to previously widely studied measures of body composition and maximal aerobic capacity. 846 young men (25.5 ± 5.0 yrs) participated in the study. Maximal strength was measured using isometric bench press, leg extension and grip strength. Muscular endurance tests consisted of push-ups, sit-ups and repeated squats. An indirect graded cycle ergometer test was used to estimate maximal aerobic capacity (V(O2)max). Body composition was determined with bioelectrical impedance. Moreover, waist circumference (WC) and height were measured and body mass index (BMI) calculated. Maximal bench press was positively correlated with push-ups (r = 0.61, p … strength (r = 0.34, p … strength correlated positively (r = 0.36-0.44, p … test scores were related to maximal aerobic capacity and body fat content, while fat free mass was associated with maximal strength test scores and thus is a major determinant for maximal strength. A contributive role of maximal strength to muscular endurance tests could be identified for the upper, but not the lower extremities. These findings suggest that the push-up test is not only indicative of body fat content and maximal aerobic capacity but also of maximal strength of the upper body, whereas the repeated squat test is mainly indicative of body fat content and maximal aerobic capacity, but not of maximal strength of the lower extremities.
ANA Testing: What should we know about the methods, indication and interpretation?

Directory of Open Access Journals (Sweden)

Au Elaine Yuen Ling

2017-11-01

Though ANA is a common test requested in several settings, one may not be aware of the potential traps in its interpretation. Nowadays, there is a trend for autoantibody diagnostics to move from traditional, time-honored manual methods to high-throughput automated platforms. Nevertheless, the clinical significance and assay performance characteristics may differ from those of the “historical” methods. Though indirect immunofluorescence is the gold standard method for ANA tests, laboratories vary in the slides (from different cell lines and commercial sources, e.g., Hep 2, Hep 2000, etc.), screening dilutions, terminology, reporting format and expertise. Hence, discrepancy in results among different laboratories is not uncommon and could be confusing. Knowing the assay characteristics and limitations helps proper interpretation of results and facilitates patient management. Indeed, the titer and pattern by indirect immunofluorescence do provide valuable information in screening patients. In particular, the DFS pattern with the associated anti-DFS70 antibodies has been shown to have a role in risk stratification of cases referred for suspected autoimmune rheumatic disease.
Validity and Reliability of Nintendo Wii Fit Balance Scores

Science.gov (United States)

Wikstrom, Erik A.

2012-01-01

Context: Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. Objective: To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Design: Descriptive laboratory study. Setting: Sports medicine research laboratory. Patients or Other Participants: Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Intervention(s): Participants completed a single-limb–stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Main Outcome Measure(s): Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. Results: All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r … Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with
Puzzle based teaching versus traditional instruction in electrocardiogram interpretation for medical students--a pilot study.

Science.gov (United States)

Rubinstein, Jack; Dhoble, Abhijeet; Ferenchick, Gary

2009-01-13

Most medical professionals are expected to possess basic electrocardiogram (EKG) interpretation skills. However, published data suggest that residents' and physicians' EKG interpretation skills are suboptimal. Learning styles differ among medical students; individualization of teaching methods has been shown to be viable and may result in improved learning. Puzzles have been shown to facilitate learning in a relaxed environment. The objective of this study was to assess the efficacy of a teaching puzzle for EKG interpretation skills among medical students. This is a reader-blinded crossover trial. Third-year medical students from the College of Human Medicine, Michigan State University participated in this study. Two groups (n = 9) received two traditional EKG interpretation skills lectures followed by a standardized exam, and two extra sessions with the teaching puzzle and a different exam. Two other groups (n = 6) received identical courses and exams, with the puzzle session first followed by the traditional teaching. EKG interpretation scores on the final test were used as the main outcome measure. The average score after only traditional teaching was 4.07 +/- 2.08, while after only the puzzle session it was 4.04 +/- 2.36 (p = 0.97). The average improvement after the traditional session was followed up with a puzzle session was 2.53 +/- 1.94, while the average improvement after the puzzle session was followed with the traditional session was 2.08 +/- 1.73 (p = 0.67). The final EKG exam score for this cohort (n = 15) was 84.1 compared to 86.6 (p = 0.22) for a comparable sample of medical students (n = 15) at a different campus. Teaching EKG interpretation with puzzles is comparable to traditional teaching and may be particularly useful for certain subgroups of students. Puzzle sessions are more interactive and relaxing, and warrant further investigation on a larger scale.
Item Response Theory Modeling of the Philadelphia Naming Test

Science.gov (United States)

Fergadiotis, Gerasimos; Kellough, Stacey; Hula, William D.

2015-01-01

Purpose: In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating…
Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

Science.gov (United States)

Jacob, Brian A.

2016-01-01

Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…
The effect of instructional methodology on high school students natural sciences standardized tests scores

Science.gov (United States)

Powell, P. E.

Educators have recently come to consider inquiry-based instruction as a more effective method of instruction than didactic instruction. Experience-based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness in preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify teaching methods of a convenience sample of 57 teacher participants and determined the level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when students were taught using inquiry-based instruction. Subpopulation analyses indicated that all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.
The efficient market hypothesis: problems with interpretations of empirical tests

Directory of Open Access Journals (Sweden)

Denis Alajbeg

2012-03-01

Despite many “refutations” in empirical tests, the efficient market hypothesis (EMH) remains the central concept of financial economics. The EMH’s resistance to the results of empirical testing emerges from the fact that the EMH is not a falsifiable theory. Its axiomatic definition shows how asset prices would behave under assumed conditions. Testing for this price behavior does not make much sense, as the conditions in the financial markets are much more complex than the simplified conditions of perfect competition, zero transaction costs and free information used in the formulation of the EMH. Some recent developments within the tradition of the adaptive market hypothesis are promising regarding the development of a falsifiable theory of price formation in financial markets, but are far from giving assurance that we are approaching a new formulation. The most that can be done in the meantime is to be very cautious when interpreting the empirical evidence that is presented as “testing” the EMH.
Critique of the Watson-Glaser Critical Thinking Appraisal Test: The More You Know, the Lower Your Score

Directory of Open Access Journals (Sweden)

Kevin Possin

2014-12-01

The Watson-Glaser Critical Thinking Appraisal Test is one of the oldest, most frequently used, multiple-choice critical-thinking tests on the market in business, government, and legal settings for purposes of hiring and promotion. I demonstrate, however, that the test has serious construct-validity issues, stemming primarily from its ambiguous, unclear, misleading, and sometimes mysterious instructions, which have remained unaltered for decades. Erroneously scored items further diminish the test’s validity. As a result, having enhanced knowledge of formal and informal logic could well result in test subjects receiving lower scores on the test. That’s not how things should work for a CT assessment test.
Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success.

Science.gov (United States)

Niu, Sunny X; Tienda, Marta

2012-04-01

Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe.
Relationships between the handball-specific complex test, non-specific field tests and the match performance score in elite professional handball players.

Science.gov (United States)

Hermassi, Souhail; Chelly, Mohamed-Souhaiel; Wollny, Rainer; Hoffmeyer, Birgit; Fieseler, Georg; Schulze, Stephan; Irlenbusch, Lars; Delank, Karl-Stefan; Shephard, Roy J; Bartels, Thomas; Schwesig, René

2018-06-01

This study assessed the validity of the handball-specific complex test (HBCT) and two non-specific field tests in professional elite handball athletes, using the match performance score (MPS) as the gold standard of performance. Thirteen elite male handball players (age: 27.4±4.8 years; premier German league) performed the HBCT, the Yo-Yo Intermittent Recovery (YYIR) test and a repeated shuttle sprint ability (RSA) test at the beginning of pre-season training. The RSA results were evaluated in terms of best time, total time, and fatigue decrement. Heart rates (HR) were assessed at selected times throughout all tests; the recovery HR was measured immediately post-test and 10 minutes later. The match performance score was based on various handball specific parameters (e.g., field goals, assists, steals, blocks, and technical mistakes) as seen during all matches of the immediately subsequent season (2015/2016). The parameters of run 1, run 2, and HR recovery at minutes 6 and 10 of the RSA test all showed a variance of more than 10% (range: 11-15%). However, the variance of scores for the YYIR test was much smaller (range: 1-7%). The resting HR (r2=0.18), HR recovery at minute 10 (r2=0.10), lactate concentration at rest (r2=0.17), recovery of heart rate from 0 to 10 minutes (r2=0.15), and velocity of second throw at first trial (r2=0.37) were the most valid HBCT parameters. Much effort is necessary to assess MPS and to develop valid tests. Speed and the rate of functional recovery seem the best predictors of competitive performance for elite handball players.
Attention, interpretation, and memory biases in subclinical depression: a proof-of-principle test of the combined cognitive biases hypothesis.

Science.gov (United States)

Everaert, Jonas; Duyck, Wouter; Koster, Ernst H W

2014-04-01

Emotional biases in attention, interpretation, and memory are viewed as important cognitive processes underlying symptoms of depression. To date, there is a limited understanding of the interplay among these processing biases. This study tested the dependence of memory on depression-related biases in attention and interpretation. Subclinically depressed and nondepressed participants completed a computerized version of the scrambled sentences test (measuring interpretation bias) while their eye movements were recorded (measuring attention bias). This task was followed by an incidental free recall test of previously constructed interpretations (measuring memory bias). Path analysis revealed a good fit for the model in which selective orienting of attention was associated with interpretation bias, which in turn was associated with a congruent bias in memory. Also, a good fit was observed for a path model in which biases in the maintenance of attention and interpretation were associated with memory bias. Both path models attained a superior fit compared with path models without the theorized functional relations among processing biases. These findings enhance understanding of how mechanisms of attention and interpretation regulate what is remembered. As such, they offer support for the combined cognitive biases hypothesis or the notion that emotionally biased cognitive processes are not isolated mechanisms but instead influence each other. Implications for theoretical models and emotion regulation across the spectrum of depressive symptoms are discussed.
College Math Assessment: SAT Scores vs. College Math Placement Scores

Science.gov (United States)

Foley-Peres, Kathleen; Poirier, Dawn

2008-01-01

Many colleges and universities use SAT math scores or math placement tests to place students in the appropriate math course. This study compares the use of math placement scores and SAT scores for 188 freshman students. The students' grades and faculty observations were analyzed to determine if the SAT scores and/or college math assessment scores…
Does the Test Work? Evaluating a Web-Based Language Placement Test

Science.gov (United States)

Long, Avizia Y.; Shin, Sun-Young; Geeslin, Kimberly; Willis, Erik W.

2018-01-01

In response to the need for examples of test validation from which everyday language programs can benefit, this paper reports on a study that used Bachman's (2005) assessment use argument (AUA) framework to examine evidence to support claims made about the intended interpretations and uses of scores based on a new web-based Spanish language…
A Comparison of Scores on the WISC-R and Lorge-Thorndike Intelligence Test for Disadvantaged Black Elementary School Children

Science.gov (United States)

Lowe, James D.; Karnes, Frances A.

1976-01-01

It is indicated that, although the scores [obtained on both tests] are significantly correlated, the tests yield significantly different scores with the Lorge-Thorndike consistently overestimating the WISC-R full scale I.Q. (Author)
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach

OpenAIRE

Xu, Jian

2017-01-01

The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating...
International Test Score Comparisons and Educational Policy: A Review of the Critiques

Science.gov (United States)

Carnoy, Martin

2015-01-01

Stanford education professor Martin Carnoy examines four main critiques of how international test results are used in policymaking. Of particular interest are critiques of the policy analyses published by the Program for International Student Assessment (PISA). Using average PISA scores as a comparative measure of student achievement is misleading…
Parent Ratings of Impulsivity and Inhibition Predict State Testing Scores

Directory of Open Access Journals (Sweden)

Rebecca A. Lundwall

2018-03-01

One principle of cognitive development is that earlier intervention for educational difficulties tends to improve outcomes such as future educational and career success. One possible way to help students who struggle is to determine if they process information differently. Such determination might lead to clues for interventions. For example, early information processing requires attention before the information can be identified, encoded, and stored. The aim of the present study was to investigate whether parent ratings of inattention, inhibition, and impulsivity, and error rate on a reflexive attention task, could be used to predict child scores on state standardized tests. Finding such an association could provide assistance to educators in identifying academically struggling children who might require targeted educational interventions. Children (N = 203) were invited to complete a peripheral cueing task (which measures the automatic reorienting of the brain’s attentional resources from one location to another). While the children completed the task, their parents completed a questionnaire. The questionnaire gathered information on broad indicators of child functioning, including observable behaviors of impulsivity, inattention, and inhibition, as well as state academic scores (which the parent retrieved online from their school). We used sequential regression to analyze contributions of error rate and parent-rated behaviors in predicting six academic scores. In one of the six analyses (for science), we found that the improvement was significant from the simplified model (with only family income, child age, and sex as predictors) to the full model (adding error rate and three parent-rated behaviors). Two additional analyses (reading and social studies) showed near significant improvement from simplified to full models. Parent-rated behaviors were significant predictors in all three of these analyses. In the reading score analysis
The Impact of the Use of Hierarchical Teaching on Test Scores of Students’ Technology

Directory of Open Access Journals (Sweden)

Zhao Guorong

2015-01-01

Test scores of students’ technology is the main basis for physical examination of college students’ physical, fitness evaluation based on test results. To change the view by the stratified teaching method consistent system of teaching mode, special movement technical level of students is improved significantly.

The Effects of Listening to Music Just Before Reading Test on Students’ Test Score

OpenAIRE

MAHDAVI, Mojtaba

2015-01-01

In this study the researcher examined the effect on reading comprehension of music played just before the test. Because the emotional consequences of music listening are evident in stress and anxiety removal, it was used as a tool to pacify the minds of the test takers and boost their memory and the related cognitive processes. Experimental group did well with the mean score of) and control group (). This study confirmed that using multimedia devices such as music can not only i...
The Apgar score has survived the test of time.

Science.gov (United States)

Finster, Mieczyslaw; Wood, Margaret

2005-04-01

In 1953, Virginia Apgar, M.D. published her proposal for a new method of evaluation of the newborn infant. The avowed purpose of this paper was to establish a simple and clear classification of newborn infants which can be used to compare the results of obstetric practices, types of maternal pain relief and the results of resuscitation. Having considered several objective signs pertaining to the condition of the infant at birth she selected five that could be evaluated and taught to the delivery room personnel without difficulty. These signs were heart rate, respiratory effort, reflex irritability, muscle tone and color. Sixty seconds after the complete birth of the baby a rating of zero, one or two was given to each sign, depending on whether it was absent or present. Virginia Apgar reviewed anesthesia records of 1025 infants born alive at Columbia Presbyterian Medical Center during the period of this report. All had been rated by her method. Infants in poor condition scored 0-2, infants in fair condition scored 3-7, while scores 8-10 were achieved by infants in good condition. The most favorable score 1 min after birth was obtained by infants delivered vaginally with the occiput the presenting part (average 8.4). Newborns delivered by version and breech extraction had the lowest score (average 6.3). Infants delivered by cesarean section were more vigorous (average score 8.0) when spinal was the method of anesthesia versus an average score of 5.0 when general anesthesia was used. Correlating the 60 s score with neonatal mortality, Virginia found that mature infants receiving 0, 1 or 2 scores had a neonatal death rate of 14%; those scoring 3, 4, 5, 6 or 7 had a death rate of 1.1%; and those in the 8-10 score group had a death rate of 0.13%. She concluded that the prognosis of an infant is excellent if he receives one of the upper three scores, and poor if one of the lowest three scores.
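The scoring arithmetic described in this abstract (five signs, each rated zero, one or two, summed to a 0-10 total and grouped into the three bands reported) can be sketched as follows. This is a minimal illustration only: the function and key names are hypothetical, and the bands simply restate the ranges given in the abstract (0-2 poor, 3-7 fair, 8-10 good).

```python
# Illustrative sketch only; names are assumptions, not taken from the 1953 paper.
SIGNS = (
    "heart_rate",
    "respiratory_effort",
    "reflex_irritability",
    "muscle_tone",
    "color",
)


def apgar_total(ratings):
    """Sum the five sign ratings (each 0, 1 or 2) into a 0-10 total."""
    for sign in SIGNS:
        if sign not in ratings:
            raise ValueError(f"missing rating for {sign}")
        if ratings[sign] not in (0, 1, 2):
            raise ValueError(f"rating for {sign} must be 0, 1 or 2")
    return sum(ratings[sign] for sign in SIGNS)


def condition_band(total):
    """Map a total to the bands reported in the abstract."""
    if not 0 <= total <= 10:
        raise ValueError("total must be between 0 and 10")
    if total <= 2:
        return "poor"
    if total <= 7:
        return "fair"
    return "good"
```

A record would be rated once, 60 seconds after birth, and the resulting band compared across delivery or anesthesia groups as the abstract describes.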
Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test

KAUST Repository

Cai, T.; Lin, X.; Carroll, R. J.

2012-01-01

the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an Empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least
Gender Gaps in High School GPA and ACT Scores: High School Grade Point Average and ACT Test Score by Subject and Gender. Information Brief 2014-12

Science.gov (United States)

ACT, Inc., 2014

2014-01-01

Female students who graduated from high school in 2013 averaged higher grades than their male counterparts in all subjects, but male graduates earned higher scores on the math and science sections of the ACT. This information brief looks at high school grade point average and ACT test score by subject and gender.
Parallel approach to identifying the well-test interpretation model using a neurocomputer

Science.gov (United States)

May, Edward A., Jr.; Dagli, Cihan H.

1996-03-01

The well test is one of the primary diagnostic and predictive tools used in the analysis of oil and gas wells. In these tests, a pressure recording device is placed in the well and the pressure response is recorded over time under controlled flow conditions. The interpreted results are indicators of the well's ability to flow and the damage done to the formation surrounding the wellbore during drilling and completion. The results are used for many purposes, including reservoir modeling (simulation) and economic forecasting. The first step in the analysis is the identification of the Well-Test Interpretation (WTI) model, which determines the appropriate solution method. Mis-identification of the WTI model occurs due to noise and non-ideal reservoir conditions. Previous studies have shown that a feed-forward neural network using the backpropagation algorithm can be used to identify the WTI model. One of the drawbacks to this approach is, however, training time, which can run into days of CPU time on personal computers. In this paper a similar neural network is applied using both a personal computer and a neurocomputer. Input data processing, network design, and performance are discussed and compared. The results show that the neurocomputer greatly eases the burden of training and allows the network to outperform a similar network running on a personal computer.
Normative data for the Maryland CNC Test.

Science.gov (United States)

Mendel, Lisa Lucks; Mustain, William D; Magro, Jessica

2014-09-01

The Maryland consonant-vowel nucleus-consonant (CNC) Test is routinely used in Veterans Administration medical centers, yet there is a paucity of published normative data for this test. The purpose of this study was to provide information on the means and distribution of word-recognition scores on the Maryland CNC Test as a function of degree of hearing loss for a veteran population. A retrospective, descriptive design was conducted. The sample consisted of records from veterans who had Compensation and Pension (C&P) examinations at a Veterans Administration medical center (N = 1,760 ears). Audiometric records of veterans who had C&P examinations during a 10 yr period were reviewed, and the pure-tone averages (PTA4) at four frequencies (1000, 2000, 3000, and 4000 Hz) were documented. The maximum word-recognition score (PBmax) was determined from the performance-intensity functions obtained using the Maryland CNC Test. Correlations were made between PBmax and PTA4. A wide range of word-recognition scores were obtained at all levels of PTA4 for this population. In addition, a strong negative correlation between the PBmax and the PTA4 was observed, indicating that as PTA4 increased, PBmax decreased. Word-recognition scores decreased significantly as hearing loss increased beyond a mild hearing loss. Although threshold was influenced by age, no statistically significant relationship was found between word-recognition score and the age of the participants. Results from this study provide normative data in table and figure format to assist audiologists in interpreting patient results on the Maryland CNC test for a veteran population. These results provide a quantitative method for audiologists to use to interpret word-recognition scores based on pure-tone hearing loss. American Academy of Audiology.
Interpreter-mediated dentistry.

Science.gov (United States)

Bridges, Susan; Drew, Paul; Zayts, Olga; McGrath, Colman; Yiu, Cynthia K Y; Wong, H M; Au, T K F

2015-05-01

The global movements of healthcare professionals and patient populations have increased the complexities of medical interactions at the point of service. This study examines interpreter mediated talk in cross-cultural general dentistry in Hong Kong where assisting para-professionals, in this case bilingual or multilingual Dental Surgery Assistants (DSAs), perform the dual capabilities of clinical assistant and interpreter. An initial language use survey was conducted with Polyclinic DSAs (n = 41) using a logbook approach to provide self-report data on language use in clinics. Frequencies of mean scores using a 10-point visual analogue scale (VAS) indicated that the majority of DSAs spoke mainly Cantonese in clinics and interpreted for postgraduates and professors. Conversation Analysis (CA) examined recipient design across a corpus (n = 23) of video-recorded review consultations between non-Cantonese speaking expatriate dentists and their Cantonese L1 patients. Three patterns of mediated interpreting indicated were: dentist designated expansions; dentist initiated interpretations; and assistant initiated interpretations to both the dentist and patient. The third, rather than being perceived as negative, was found to be framed either in response to patient difficulties or within the specific task routines of general dentistry. The findings illustrate trends in dentistry towards personalized care and patient empowerment as a reaction to product delivery approaches to patient management. Implications are indicated for both treatment adherence and the education of dental professionals. Copyright © 2015 Elsevier Ltd. All rights reserved.
From Test Scores to Language Use: Emergent Bilinguals Using English to Accomplish Academic Tasks

Science.gov (United States)

Rodriguez-Mojica, Claudia

2018-01-01

Prominent discourses about emergent bilinguals' academic abilities tend to focus on performance as measured by test scores and perpetuate the message that emergent bilinguals trail far behind their peers. When we remove the constraints of formal testing situations, what can emergent bilinguals do in English as they engage in naturally occurring…
A Case Study About Why It Can Be Difficult To Test Whether Propensity Score Analysis Works in Field Experiments

Directory of Open Access Journals (Sweden)

Thomas D. Cook

2012-01-01

Peikes, Moreno and Orzol (2008) sensibly caution researchers that propensity score analysis may not lead to valid causal inference in field applications. But at the same time, they made the far stronger claim to have performed an ideal test of whether propensity score matching in quasi-experimental data is capable of approximating the results of a randomized experiment in their dataset, and that this ideal test showed that such matching could not do so. In this article we show that their study does not support that conclusion because it failed to meet a number of basic criteria for an ideal test. By implication, many other purported tests of the effectiveness of propensity score analysis probably also fail to meet these criteria, and are therefore questionable contributions to the literature on the effects of propensity score analysis.
Interpretation of large-strain geophysical crosshole tests

International Nuclear Information System (INIS)

Drnevich, V.P.; Salgado, R.; Ashmawy, A.; Grant, W.P.; Vallenas, P.

1995-10-01

At sites in earthquake-prone areas, the nonlinear dynamic stress-strain behavior of soil with depth is essential for earthquake response analyses. A seismic crosshole test has been developed where large dynamic forces are applied in a borehole. These forces generate shear strains in the surrounding soil that are well into the nonlinear range. The shear strain amplitudes decrease with distance from the source. Velocity sensors located in three additional holes at various distances from the source hole measure the particle velocity and the travel time of the shear wave from the source. This paper provides an improved, systematic interpretation scheme for the data from these large-strain geophysical crosshole tests. Use is made of both the measured velocities at each sensor and the travel times. The measured velocity at each sensor location is shown to be a good measure of the soil particle velocity at that location. Travel times to specific features on the velocity time history, such as first crossover, are used to generate travel time curves for the waves which are nonlinear. At some distance the amplitudes reduce to where the stress-strain behavior is essentially linear and independent of strain amplitude. This fact is used together with the measurements at the three sensor locations in a rational approach for fitting curves of shear wave velocity versus distance from the source hole that allow the determination of the shear wave velocity and the shear strain amplitude at each of the sensor locations as well as the shear wave velocity associated with small-strain (linear) behavior. The method is automated using off-the-shelf PC-based software. The method is applied to large-strain crosshole tests performed as part of the studies for the design and construction of the proposed Multi-Function Waste Tank Facility planned for Hanford Site.
Puzzle based teaching versus traditional instruction in electrocardiogram interpretation for medical students – a pilot study

Science.gov (United States)

Rubinstein, Jack; Dhoble, Abhijeet; Ferenchick, Gary

2009-01-01

Background: Most medical professionals are expected to possess basic electrocardiogram (EKG) interpretation skills. However, published data suggest that residents' and physicians' EKG interpretation skills are suboptimal. Learning styles differ among medical students; individualization of teaching methods has been shown to be viable and may result in improved learning. Puzzles have been shown to facilitate learning in a relaxed environment. The objective of this study was to assess the efficacy of a teaching puzzle for EKG interpretation skills among medical students. Methods: This is a reader blinded crossover trial. Third year medical students from College of Human Medicine, Michigan State University participated in this study. Two groups (n = 9) received two traditional EKG interpretation skills lectures followed by a standardized exam and two extra sessions with the teaching puzzle and a different exam. Two other groups (n = 6) received identical courses and exams with the puzzle session first followed by the traditional teaching. EKG interpretation scores on the final test were used as the main outcome measure. Results: The average score after only traditional teaching was 4.07 ± 2.08 while after only the puzzle session it was 4.04 ± 2.36 (p = 0.97). The average improvement after the traditional session was followed up with a puzzle session was 2.53 ± 1.94 while the average improvement after the puzzle session was followed with the traditional session was 2.08 ± 1.73 (p = 0.67). The final EKG exam score for this cohort (n = 15) was 84.1 compared to 86.6 (p = 0.22) for a comparable sample of medical students (n = 15) at a different campus. Conclusion: Teaching EKG interpretation with puzzles is comparable to traditional teaching and may be particularly useful for certain subgroups of students. Puzzle sessions are more interactive and relaxing, and warrant further investigation on a larger scale. PMID:19144134
Puzzle based teaching versus traditional instruction in electrocardiogram interpretation for medical students – a pilot study

Directory of Open Access Journals (Sweden)

Dhoble Abhijeet

2009-01-01

Background: Most medical professionals are expected to possess basic electrocardiogram (EKG) interpretation skills. However, published data suggest that residents' and physicians' EKG interpretation skills are suboptimal. Learning styles differ among medical students; individualization of teaching methods has been shown to be viable and may result in improved learning. Puzzles have been shown to facilitate learning in a relaxed environment. The objective of this study was to assess the efficacy of a teaching puzzle for EKG interpretation skills among medical students. Methods: This is a reader blinded crossover trial. Third year medical students from College of Human Medicine, Michigan State University participated in this study. Two groups (n = 9) received two traditional EKG interpretation skills lectures followed by a standardized exam and two extra sessions with the teaching puzzle and a different exam. Two other groups (n = 6) received identical courses and exams with the puzzle session first followed by the traditional teaching. EKG interpretation scores on the final test were used as the main outcome measure. Results: The average score after only traditional teaching was 4.07 ± 2.08 while after only the puzzle session it was 4.04 ± 2.36 (p = 0.97). The average improvement after the traditional session was followed up with a puzzle session was 2.53 ± 1.94 while the average improvement after the puzzle session was followed with the traditional session was 2.08 ± 1.73 (p = 0.67). The final EKG exam score for this cohort (n = 15) was 84.1 compared to 86.6 (p = 0.22) for a comparable sample of medical students (n = 15) at a different campus. Conclusion: Teaching EKG interpretation with puzzles is comparable to traditional teaching and may be particularly useful for certain subgroups of students. Puzzle sessions are more interactive and relaxing, and warrant further investigation on a larger scale.
The Disaggregation of Value-Added Test Scores to Assess Learning Outcomes in Economics Courses

Science.gov (United States)

Walstad, William B.; Wagner, Jamie

2016-01-01

This study disaggregates posttest, pretest, and value-added or difference scores in economics into four types of economic learning: positive, retained, negative, and zero. The types are derived from patterns of student responses to individual items on a multiple-choice test. The micro and macro data from the "Test of Understanding in College…
MMPI-2 and MMPI-A Computerized Interpretation: An Adjunct to Quality Mental Health Service.

Science.gov (United States)

Phelps, LeAdelle

1994-01-01

Provides reviews of computerized scoring and interpretive systems for the Minnesota Multiphasic Personality Inventory (MMPI-2 and MMPI-A): Caldwell Report, the Psychological Assessment Resources MMPI-2 Interpretive System, and the National Computer Systems Programs. Concludes that when used appropriately, such scoring systems enhance a counselor's…
The TSCA interagency testing committee's approaches to screening and scoring chemicals and chemical groups: 1977-1983

Energy Technology Data Exchange (ETDEWEB)

Walker, J.D. [Environmental Protection Agency, Washington, DC (United States)

1990-12-31

This paper describes the TSCA interagency testing committee's (ITC) approaches to screening and scoring chemicals and chemical groups between 1977 and 1983. During this time the ITC conducted five scoring exercises to select chemicals and chemical groups for detailed review and to determine which of these chemicals and chemical groups should be added to the TSCA Section 4(e) Priority Testing List. 29 refs., 1 fig., 2 tabs.
A Case Study About Why It Can Be Difficult To Test Whether Propensity Score Analysis Works in Field Experiments

Directory of Open Access Journals (Sweden)

William R. Shadish

2013-02-01

Peikes, Moreno and Orzol (2008) sensibly caution researchers that propensity score analysis may not lead to valid causal inference in field applications. But at the same time, they made the far stronger claim to have performed an ideal test of whether propensity score matching in quasi-experimental data is capable of approximating the results of a randomized experiment in their dataset, and that this ideal test showed that such matching could not do so. In this article we show that their study does not support that conclusion because it failed to meet a number of basic criteria for an ideal test. By implication, many other purported tests of the effectiveness of propensity score analysis probably also fail to meet these criteria, and are therefore questionable contributions to the literature on the effects of propensity score analysis. DOI: 10.2458/azu_jmmss.v3i2.16475
Are students' impressions of improved learning through active learning methods reflected by improved test scores?

Science.gov (United States)

Everly, Marcee C

2013-02-01

To report the transformation from lecture to more active learning methods in a maternity nursing course and to evaluate whether student perception of improved learning through active-learning methods is supported by improved test scores. The process of transforming a course into an active-learning model of teaching is described. A voluntary mid-semester survey for student acceptance of the new teaching method was conducted. Course examination results, from both a standardized exam and a cumulative final exam, among students who received lecture in the classroom and students who had active learning activities in the classroom were compared. Active learning activities were very acceptable to students. The majority of students reported learning more from having active-learning activities in the classroom rather than lecture-only and this belief was supported by improved test scores. Students who had active learning activities in the classroom scored significantly higher on a standardized assessment test than students who received lecture only. The findings support the use of student reflection to evaluate the effectiveness of active-learning methods and help validate the use of student reflection of improved learning in other research projects. Copyright © 2011 Elsevier Ltd. All rights reserved.
SIGI: score-based identification of genomic islands

Directory of Open Access Journals (Sweden)

Merkl Rainer

2004-03-01

Background: Genomic islands can be observed in many microbial genomes. These stretches of DNA have a conspicuous composition with regard to sequence or encoded functions. Genomic islands are assumed to be frequently acquired via horizontal gene transfer. For the analysis of genome structure and the study of horizontal gene transfer, it is necessary to reliably identify and characterize these islands. Results: A scoring scheme on codon frequencies, Score_G1G2(cdn) = log(f_G2(cdn) / f_G1(cdn)), was utilized. To analyse genes of a species G1 and to test their relatedness to species G2, scores were determined by applying the formula to log-odds derived from mean codon frequencies of the two genomes. A non-redundant set of nearly 400 codon usage tables comprising microbial species was derived; its members were used alternatively at position G2. Genes having at least one score value above a species-specific and dynamically determined cut-off value were analysed further. By means of cluster analysis, genes were identified that comprise clusters of statistically significant size. These clusters were predicted as genomic islands. Finally and individually for each of these genes, the taxonomical relation among those species responsible for significant scores was interpreted. The validity of the approach and its limitations were made plausible by an extensive analysis of natural genes and synthetic ones aimed at modelling the process of gene amelioration. Conclusions: The method allows reliable identification of genomic islands and of the likely origin of alien genes.
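The codon log-odds score in this abstract, Score_G1G2(cdn) = log(f_G2(cdn) / f_G1(cdn)), can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the SIGI implementation: the function names are hypothetical, the frequency tables are plain dictionaries, and scoring a gene by summing the scores of its codons is an assumption, since the abstract does not say how codon scores are aggregated per gene.

```python
import math


def codon_scores(freq_g1, freq_g2):
    """Per-codon log-odds: log(f_G2(cdn) / f_G1(cdn)).

    Codons missing from either table, or with a non-positive frequency,
    are skipped because the logarithm is undefined for them.
    """
    scores = {}
    for cdn, f1 in freq_g1.items():
        f2 = freq_g2.get(cdn)
        if f2 is None or f1 <= 0 or f2 <= 0:
            continue
        scores[cdn] = math.log(f2 / f1)
    return scores


def gene_score(sequence, scores):
    """Assumed aggregation: sum the codon scores over the gene's reading frame."""
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    codons = (sequence[i:i + 3] for i in range(0, usable, 3))
    return sum(scores[c] for c in codons if c in scores)
```

A positive gene score means the gene's codon usage looks more like G2 than G1. The cut-off selection and the cluster analysis described in the abstract are not covered by this sketch.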
Changes in Student Populations and Average Test Scores of Dutch Primary Schools

Science.gov (United States)

Luyten, Hans; de Wolf, Inge

2011-01-01

This article focuses on the relation between student population characteristics and average test scores per school in the final grade of primary education from a dynamic perspective. Aggregated data of over 5,000 Dutch primary schools covering a 6-year period were used to study the relation between changes in school populations and shifts in mean…
Effect of Mindfulness Meditation on Perceived Stress Scores and Autonomic Function Tests of Pregnant Indian Women.

Science.gov (United States)

Muthukrishnan, Shobitha; Jain, Reena; Kohli, Sangeeta; Batra, Swaraj

2016-04-01

Various pregnancy complications like hypertension, preeclampsia have been strongly correlated with maternal stress. One of the connecting links between pregnancy complications and maternal stress is mind-body intervention which can be part of Complementary and Alternative Medicine (CAM). Biologic measures of stress during pregnancy may get reduced by such interventions. To evaluate the effect of Mindfulness meditation on perceived stress scores and autonomic function tests of pregnant Indian women. Pregnant Indian women of 12 weeks gestation were randomised to two treatment groups: Test group with Mindfulness meditation and control group with their usual obstetric care. The effect of Mindfulness meditation on perceived stress scores and cardiac sympathetic functions and parasympathetic functions (Heart rate variation with respiration, lying to standing ratio, standing to lying ratio and respiratory rate) were evaluated on pregnant Indian women. There was a significant decrease in perceived stress scores, a significant decrease of blood pressure response to cold pressor test and a significant increase in heart rate variability in the test group (pwomen. The results of this study suggest that mindfulness meditation improves parasympathetic functions in pregnant women and is a powerful modulator of the sympathetic nervous system during pregnancy.

The Impact of the 2004 Hurricanes on Florida Comprehensive Assessment Test Scores: Implications for School Counselors

Science.gov (United States)

Baggerly, Jennifer; Ferretti, Larissa K.

2008-01-01

What is the impact of natural disasters on students' statewide assessment scores? To answer this question, Florida Comprehensive Assessment Test (FCAT) scores of 55,881 students in grades 4 through 10 were analyzed to determine if there were significant decreases after the 2004 hurricanes. Results reveal that there was statistical but no practical…
Lower Quarter Y-Balance Test Scores and Lower Extremity Injury in NCAA Division I Athletes.

Science.gov (United States)

Lai, Wilson C; Wang, Dean; Chen, James B; Vail, Jeremy; Rugg, Caitlin M; Hame, Sharon L

2017-08-01

Functional movement tests that are predictive of injury risk in National Collegiate Athletic Association (NCAA) athletes are useful tools for sports medicine professionals. The Lower Quarter Y-Balance Test (YBT-LQ) measures single-leg balance and reach distances in 3 directions. To assess whether the YBT-LQ predicts the laterality and risk of sports-related lower extremity (LE) injury in NCAA athletes. Case-control study; Level of evidence, 3. The YBT-LQ was administered to 294 NCAA Division I athletes from 21 sports during preparticipation physical examinations at a single institution. Athletes were followed prospectively over the course of the corresponding season. Correlation analysis was performed between the laterality of reach asymmetry and composite scores (CS) versus the laterality of injury. Receiver operating characteristic (ROC) analysis was used to determine the optimal asymmetry cutoff score for YBT-LQ. A multivariate regression analysis adjusting for sex, sport type, body mass index, and history of prior LE surgery was performed to assess predictors of earlier and higher rates of injury. Neither the laterality of reach asymmetry nor the CS correlated with the laterality of injury. ROC analysis found optimal cutoff scores of 2, 9, and 3 cm for anterior, posteromedial, and posterolateral reach, respectively. All of these potential cutoff scores, along with a cutoff score of 4 cm used in the majority of prior studies, were associated with poor sensitivity and specificity. Furthermore, none of the asymmetric cutoff scores were associated with earlier or increased rate of injury in the multivariate analyses. YBT-LQ scores alone do not predict LE injury in this collegiate athlete population. Sports medicine professionals should be cautioned against using the YBT-LQ alone to screen for injury risk in collegiate athletes.
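The abstract does not say how the "optimal" ROC cutoff was chosen. One common convention, assumed here purely for illustration, is to pick the threshold that maximizes Youden's J (sensitivity + specificity - 1). The function name and the toy data in the usage note are hypothetical.

```python
def youden_cutoff(values, injured):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    values  : reach asymmetry per athlete (e.g. in cm)
    injured : 1/True if the athlete was injured, 0/False otherwise
    An athlete is test-positive when value >= cutoff.
    """
    positives = sum(1 for y in injured if y)
    negatives = len(injured) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one injured and one uninjured athlete")

    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, injured) if y and v >= cutoff)
        tn = sum(1 for v, y in zip(values, injured) if not y and v < cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]
```

For perfectly separable made-up data such as `youden_cutoff([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])`, the function returns a cutoff of 4 with sensitivity and specificity both equal to 1.0. The study reports the opposite situation, in which every candidate cutoff had poor sensitivity and specificity.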
RENZI SCORE FOR OBSTRUCTED DEFECATION SYNDROME - VALIDATION OF THE PORTUGUESE VERSION ACCORDING TO THE COSMIN CHECKLIST.

Science.gov (United States)

Caetano, Ana Celia; Dias, Sara; Santa-Cruz, André; Rolanda, Carla

2018-01-01

Recently, the Obstructed Defecation Syndrome score (ODS score) was developed and validated by Renzi to assess clinical staging and to allow evaluation and comparison of the efficacy of treatment of this disorder. Our goal is to validate the Portuguese version of the Renzi ODS score according to the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) checklist. Following guidelines for cross-cultural validity, the Renzi ODS score was translated into Portuguese. Then, a group of patients and healthy controls were invited to fill in the Renzi ODS score at baseline, after 2 weeks, and after 3 months. We assessed internal consistency, reliability and measurement error, content and construct validity, responsiveness, and interpretability. A total of 113 individuals (77 patients; 36 healthy controls) completed the questionnaire. Seventy and 30 patients repeated the Renzi ODS score after 2 weeks and 3 months, respectively. Factor analysis confirmed the unidimensionality of the scale. A Cronbach's α coefficient of 0.77 supported item homogeneity. A weighted quadratic kappa of 0.89 established test-retest reliability. The smallest detectable change was 2.66 at the individual level and 0.30 at the group level. The Renzi ODS score correlated negatively with the total (-0.32) and physical (-0.43) SF-36 scores. The patient and control groups differed significantly (11 points). The change in Renzi ODS score between baseline and 3 months correlated negatively with the clinical evolution (-0.86). ROC analysis showed a minimal important change of 2.00 with an AUC of 0.97. Neither floor nor ceiling effects were observed. This work validated the Portuguese version of the Renzi ODS score. We can now use this reliable, responsive, and interpretable (at the group level) tool to evaluate Portuguese ODS patients.
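The internal-consistency statistic reported above, Cronbach's α, can be computed as in the following minimal sketch. It uses the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores); the function name and the toy responses are illustrative and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: three respondents, two perfectly consistent items.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

The sketch does not guard against a total-score variance of zero, which would make α undefined.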
Interpretative commenting.

Science.gov (United States)

Vasikaran, Samuel

2008-08-01

* Clinical laboratories should be able to offer interpretation of the results they produce.
* At a minimum, contact details for interpretative advice should be available on laboratory reports. Interpretative comments may be verbal or written and printed.
* Printed comments on reports should be offered judiciously, only where they would add value; no comment preferred to inappropriate or dangerous comment.
* Interpretation should be based on locally agreed or nationally recognised clinical guidelines where available.
* Standard tied comments ("canned" comments) can have some limited use. Individualised narrative comments may be particularly useful in the case of tests that are new, complex or unfamiliar to the requesting clinicians and where clinical details are available.
* Interpretative commenting should only be provided by appropriately trained and credentialed personnel.
* Audit of comments and continued professional development of personnel providing them are important for quality assurance.
Depressive status explains a significant amount of the variance in COPD assessment test (CAT) scores.

Science.gov (United States)

Miravitlles, Marc; Molina, Jesús; Quintano, José Antonio; Campuzano, Anna; Pérez, Joselín; Roncero, Carlos

2018-01-01

COPD assessment test (CAT) is a short, easy-to-complete health status tool that has been incorporated into the multidimensional assessment of COPD in order to guide therapy; therefore, it is important to understand the factors determining CAT scores. This is a post hoc analysis of a cross-sectional, observational study conducted in respiratory medicine departments and primary care centers in Spain with the aim of identifying the factors determining CAT scores, focusing particularly on the cognitive status measured by the Mini-Mental State Examination (MMSE) and levels of depression measured by the short Beck Depression Inventory (BDI). A total of 684 COPD patients were analyzed; 84.1% were men, the mean age of patients was 68.7 years, and the mean forced expiratory volume in 1 second (%) was 55.1%. Mean CAT score was 21.8. CAT scores correlated with the MMSE score (Pearson's coefficient r =-0.371) and the BDI ( r =0.620), both p CAT scores and explained 45% of the variability. However, a model including only MMSE and BDI scores explained up to 40% and BDI alone explained 38% of the CAT variance. CAT scores are associated with clinical variables of severity of COPD. However, cognitive status and, in particular, the level of depression explain a larger percentage of the variance in the CAT scores than the usual COPD clinical severity variables.
Sequential Neighborhood Effects: The Effect of Long-Term Exposure to Concentrated Disadvantage on Children's Reading and Math Test Scores.

Science.gov (United States)

Hicks, Andrew L; Handcock, Mark S; Sastry, Narayan; Pebley, Anne R

2018-02-01

Prior research has suggested that children living in a disadvantaged neighborhood have lower achievement test scores, but these studies typically have not estimated causal effects that account for neighborhood choice. Recent studies used propensity score methods to account for the endogeneity of neighborhood exposures, comparing disadvantaged and nondisadvantaged neighborhoods. We develop an alternative propensity function approach in which cumulative neighborhood effects are modeled as a continuous treatment variable. This approach offers several advantages. We use our approach to examine the cumulative effects of neighborhood disadvantage on reading and math test scores in Los Angeles. Our substantive results indicate that recency of exposure to disadvantaged neighborhoods may be more important than average exposure for children's test scores. We conclude that studies of child development should consider both average cumulative neighborhood exposure and the timing of this exposure.
Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer

Science.gov (United States)

Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Deeks, Jon

2016-01-01

Introduction Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). Methods and analysis ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests will be evaluated, and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios. PMID:27507231
Assessing working memory in children with ADHD: Minor administration and scoring changes may improve digit span backward's construct validity.

Science.gov (United States)

Wells, Erica L; Kofler, Michael J; Soto, Elia F; Schaefer, Hillary S; Sarver, Dustin E

2018-01-01

Pediatric ADHD is associated with impairments in working memory, but these deficits often go undetected when using clinic-based tests such as digit span backward. The current study pilot-tested minor administration/scoring modifications to improve digit span backward's construct and predictive validities in a well-characterized sample of children with ADHD. WISC-IV digit span was modified to administer all trials (i.e., ignore discontinue rule) and count digits rather than trials correct. Traditional and modified scores were compared to a battery of criterion working memory (construct validity) and academic achievement tests (predictive validity) for 34 children with ADHD ages 8-13 (M=10.41; 11 girls). Traditional digit span backward scores failed to predict working memory or KTEA-2 achievement (all ns). Alternate administration/scoring of digit span backward significantly improved its associations with working memory reordering (r=.58), working memory dual-processing (r=.53), working memory updating (r=.28), and KTEA-2 achievement (r=.49). Consistent with prior work, these findings urge caution when interpreting digit span performance. Minor test modifications may address test validity concerns, and should be considered in future test revisions. Digit span backward becomes a valid measure of working memory at exactly the point that testing is traditionally discontinued. Copyright © 2017 Elsevier Ltd. All rights reserved.
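The difference between the two scoring approaches described above can be sketched as follows. This is a hypothetical illustration only: it assumes two trials per span length, a discontinue rule that stops after both trials at one length are failed, and that "counting digits" means counting digits recalled in the correct backward position. The abstract does not specify these details.

```python
def traditional_score(trials):
    """Count fully correct trials; stop once both trials at a span length are failed.

    trials: list of (target_digits, response_digits) in administration order,
    two trials per span length (assumed layout).
    """
    score = 0
    for i in range(0, len(trials), 2):
        pair = trials[i:i + 2]
        correct = [list(reversed(target)) == list(response) for target, response in pair]
        score += sum(correct)
        if not any(correct):  # assumed discontinue rule
            break
    return score


def modified_score(trials):
    """Administer all trials and count digits recalled in the correct backward position."""
    total = 0
    for target, response in trials:
        expected = list(reversed(target))
        total += sum(1 for e, r in zip(expected, response) if e == r)
    return total


# Made-up protocol: the child fails both 3-digit trials but passes both 4-digit trials.
example_trials = [
    ([1, 2], [2, 1]), ([3, 4], [4, 3]),
    ([1, 2, 3], [3, 1, 2]), ([4, 5, 6], [6, 4, 5]),
    ([1, 2, 3, 4], [4, 3, 2, 1]), ([5, 6, 7, 8], [8, 7, 6, 5]),
]
print(traditional_score(example_trials), modified_score(example_trials))
```

In this toy case the traditional score never credits the 4-digit trials because testing would have stopped earlier, while the modified score does.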
Chest radiograph interpretation by medical students

International Nuclear Information System (INIS)

Jeffrey, D.R.; Goddard, P.R.; Callaway, M.P.; Greenwood, R.

2003-01-01

AIM: To assess the ability of final year medical students to interpret conventional chest radiographs. MATERIALS AND METHODS: Ten conventional chest radiographs were selected from a teaching hospital radiology department library that were good radiological examples of common conditions. All were conditions that a medical student should be expected to recognize by the end of their training. One normal radiograph was included. The radiographs were shown to 52 final year medical students who were asked to describe their findings. RESULTS: The median score achieved was 12.5 out of 20 (range 6-18). There was no difference between the median scores of male and female students (12.5 and 12.3, respectively, p=0.82) but male students were more likely to be certain of their answers than female students (median certainty scores 23.0 and 14.0, respectively). The overall degree of certainty was low. On no radiograph were more than 25% of students definite about their answer. Students had received little formal radiology teaching (2-42 h, median 21) and few expressed an interest in radiology as a career. Only two (3.8%) students thought they were good at interpreting chest radiographs, 17 (32.7%) thought they were bad or awful. CONCLUSION: Medical students reaching the end of their training do not perform well at interpreting simple chest radiographs. They lack confidence and have received little formal radiological tuition. Perhaps as a result, few are interested in radiology as a career, which is a matter for concern in view of the current shortage of radiologists in the UK
Using College Admission Test Scores to Clarify High School Placement. Leading Indicator Spotlight

Science.gov (United States)

Flug, Susanna

2010-01-01

In "Beyond Test Scores: Leading Indicators for Education," Foley and colleagues (2008) define leading indicators as those that "provide early signals of progress toward academic achievement" (p. 1) and stress that educators "need leading indicators to help them see the direction their efforts are going in and to take…
Differences of wells scores accuracy, caprini scores and padua scores in deep vein thrombosis diagnosis

Science.gov (United States)

Gatot, D.; Mardia, A. I.

2018-03-01

Deep vein thrombosis (DVT) is venous thrombus in the lower limbs. Diagnosis is made by venography or compression ultrasound. However, these examinations are not yet available in some health facilities. Therefore, many scoring systems have been developed for the diagnosis of DVT. The scoring method is practical and safe to use, in addition to being efficacious and effective in terms of treatment and costs. The existing scoring systems are the Wells, Caprini, and Padua scores. Many studies have compared the accuracy of these scores, but none in Medan. Therefore, we are interested in comparative research on the Wells, Caprini, and Padua scores in Medan. An observational, analytical, case-control study was conducted to perform diagnostic tests on the Wells, Caprini, and Padua scores to predict the risk of DVT. The study was conducted at H. Adam Malik Hospital in Medan. Of a total of 72 subjects, 39 (54.2%) were men, and the mean age was 53.14 years. The Wells, Caprini, and Padua scores had sensitivities of 80.6%, 61.1%, and 50%, respectively; specificities of 80.65, 66.7%, and 75%, respectively; and accuracies of 87.5%, 64.3%, and 65.7%, respectively. The Wells score has better sensitivity, specificity, and accuracy than the Caprini and Padua scores in diagnosing DVT.
Psychometric properties of the Neck OutcOme Score, Neck Disability Index, and Short Form-36 were evaluated in patients with neck pain

DEFF Research Database (Denmark)

Juul, Tina; Søgaard, Karen; Davis, Aileen M.

2016-01-01

Objective:To assess reliability, construct validity, responsiveness, and interpretability for Neck OutcOme Score (NOOS), Neck Disability Index (NDI), and Short Form–36 (SF-36) in neck pain patients. Study Design and Setting: Internal consistency was assessed by Cronbach alpha. Test-retest reliabi...
A high COPD assessment test score may predict anxiety in COPD

Directory of Open Access Journals (Sweden)

Harryanto H

2018-03-01

Hilman Harryanto (1), Sally Burrows (2), Yuben Moodley (1, 2). (1) Department of Respiratory Medicine, Fiona Stanley Hospital, Perth, WA, Australia; (2) Faculty of Health and Medical Sciences, Medical School, University of Western Australia, Perth, WA, Australia. The prevalence of anxiety is 55% in patients with COPD [1], and it is associated with worse disease control. Therefore, early recognition and institution of treatment of this comorbidity significantly improve patients' quality of life. Recently, a questionnaire called the COPD assessment test (CAT) has been incorporated into the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines for the management of COPD, and a higher score is associated with increased COPD symptoms [2]. Considering the regular use of CAT, it was evaluated whether this tool can also be used to identify anxiety. The CAT score was correlated with the Hospital Anxiety and Depression Scale (HADS) to determine the level at which CAT may predict anxiety.
A study of low scores in Canadian children and adolescents on the Wechsler Intelligence Scale For Children, Fourth Edition (WISC-IV).

Science.gov (United States)

Brooks, Brian L

2011-01-01

Knowing the prevalence of low neurocognitive scores for the WISC-IV Canadian normative sample (WISC-IV(CDN)) is an important supplement for clinical interpretation of test performance. On the WISC-IV(CDN), it is uncommon for children and adolescents to have 4 or more subtest scores or 2 or more Index scores ≤ 9th percentile when all scores on the battery are considered simultaneously. As the level of the child's intelligence increases or the number of years of parental education increases, the prevalence of low scores decreases. These results are consistent with existing studies of the base rates of low scores in children and adolescents on pediatric cognitive batteries, including the WISC-IV American normative sample. Tables provided are ready for clinical use.
Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography.

Science.gov (United States)

Mallett, Susan; Halligan, Steve; Collins, Gary S; Altman, Doug G

2014-01-01

Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
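One of the problems listed above, the use of zero confidence scores when no polyp is reported, can be made concrete with a small sketch. It computes an empirical (non-parametric) ROC AUC as the probability that a randomly chosen positive case receives a higher confidence score than a randomly chosen negative case, with ties counted as one half. This is a generic estimator and not necessarily the ROC method used in the study; the scores are invented.

```python
def empirical_auc(positive_scores, negative_scores):
    """Mann-Whitney estimate of ROC AUC; tied scores contribute 0.5."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Invented reader data: a score of 0 means "no polyp identified".
with_polyps = [0, 80, 90]      # one missed polyp scored 0
without_polyps = [0, 0, 10]    # one low-confidence false positive
print(empirical_auc(with_polyps, without_polyps))
```

With many cases scored zero, most of the comparison pairs are ties, so the AUC is driven by a handful of non-zero scores, which is consistent with the undue influence of a few false positives noted in the abstract.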
Comprehensive School Reform and Standardized Test Scores in Illinois Elementary and Middle Schools

Science.gov (United States)

McEnroe, James D.

2010-01-01

The study examined the effects of the federally funded Comprehensive School Reform (CSR) program on student performance on mandated standardized tests. The study focused on the mathematics and reading scores of Illinois public elementary and middle and junior high school students. The federal CSR program provided Illinois schools with an annual…
Evaluation of Factors Affecting Continuous Performance Test Identical Pairs Version Score of Schizophrenic Patients in a Japanese Clinical Sample

Directory of Open Access Journals (Sweden)

Takayoshi Koide

2012-01-01

Aim. Cognitive impairment in schizophrenia strongly relates to social outcome and is a good candidate for endophenotypes. When we accurately measure drug efficacy or effects of genes or variants relevant to schizophrenia on cognitive impairment, clinical factors that can affect scores on cognitive tests, such as age and severity of symptoms, should be considered. To elucidate the effect of clinical factors, we conducted multiple regression analysis using scores of the Continuous Performance Test Identical Pairs Version (CPT-IP), which is often used to measure attention/vigilance in schizophrenia. Methods. We conducted the CPT-IP (4-4 digit) and examined clinical information (sex, age, education years, onset age, duration of illness, chlorpromazine-equivalent dose, and Positive and Negative Symptom Scale (PANSS) scores) in 126 schizophrenia patients in a Japanese population. Multiple regression analysis was used to evaluate the effect of clinical factors. Results. Age, chlorpromazine-equivalent dose, and PANSS negative symptom score were associated with mean d′ score in patients. These three clinical factors explained about 28% of the variance in mean d′ score. Conclusions. The CPT-IP score in schizophrenia patients is influenced by age, chlorpromazine-equivalent dose, and PANSS negative symptom score.
Good validity and reliability of the forgotten joint score in evaluating the outcome of total knee arthroplasty

DEFF Research Database (Denmark)

Thomsen, Morten G; Latifi, Roshan; Kallemose, Thomas

2016-01-01

. We investigated the validity and reliability of the FJS. Patients and methods - A Danish version of the FJS questionnaire was created according to internationally accepted standards. 360 participants who underwent primary TKA were invited to participate in the study. Of these, 315 were included...... in a validity study and 150 in a reliability study. Correlation between the Oxford knee score (OKS) and the FJS was examined and test-retest evaluation was performed. A ceiling effect was defined as participants reaching a score within 15% of the maximum achievable score. Results - The validity study revealed...... of the FJS (ICC? 0.79). We found a high level of internal consistency (Cronbach's α = 0.96). The ceiling effect for the FJS was 16%, as compared to 37% for the OKS. Interpretation - The FJS showed good construct validity and test-retest reliability. It had a lower ceiling effect than the OKS. The FJS appears...
Specific algorithm method of scoring the Clock Drawing Test applied in cognitively normal elderly

Directory of Open Access Journals (Sweden)

Liana Chaves Mendes-Santos

The Clock Drawing Test (CDT) is an inexpensive, fast and easily administered measure of cognitive function, especially in the elderly. This instrument is a popular clinical tool widely used in screening for cognitive disorders and dementia. The CDT can be applied in different ways and scoring procedures also vary. OBJECTIVE: The aims of this study were to analyze the performance of elderly on the CDT and evaluate inter-rater reliability of the CDT scored by using a specific algorithm method adapted from Sunderland et al. (1989). METHODS: We analyzed the CDT of 100 cognitively normal elderly aged 60 years or older. The CDT ("free-drawn") and Mini-Mental State Examination (MMSE) were administered to all participants. Six independent examiners scored the CDT of 30 participants to evaluate inter-rater reliability. RESULTS AND CONCLUSION: A score of 5 on the proposed algorithm ("Numbers in reverse order or concentrated"), equivalent to 5 points on the original Sunderland scale, was the most frequent (53.5%). The CDT specific algorithm method used had high inter-rater reliability (p<0.01), and mean score ranged from 5.06 to 5.96. The high frequency of an overall score of 5 points may suggest the need to create more nuanced evaluation criteria, which are sensitive to differences in levels of impairment in visuoconstructive and executive abilities during aging.
An immunohistochemical and fluorescence in situ hybridization-based comparison between the Oracle HER2 Bond Immunohistochemical System, Dako HercepTest, and Vysis PathVysion HER2 FISH using both commercially validated and modified ASCO/CAP and United Kingdom HER2 IHC scoring guidelines.

LENUS (Irish Health Repository)

O'Grady, Anthony

2010-12-01

Immunohistochemistry (IHC) is used as the frontline assay to determine HER2 status in invasive breast cancer patients. The aim of the study was to compare the performance of the Leica Oracle HER2 Bond IHC System (Oracle) with the current most readily accepted Dako HercepTest (HercepTest), using both commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. A total of 445 breast cancer samples from 3 international clinical HER2 referral centers were stained with the 2 test systems and scored in a blinded fashion by experienced pathologists. The overall agreement between the 2 tests in a 3×3 (negative, equivocal and positive) analysis shows a concordance of 86.7% and 86.3%, respectively when analyzed using commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. There is a good concordance between the Oracle and the HercepTest. The advantages of a complete fully automated test such as the Oracle include standardization of key analytical factors and improved turn around time. The implementation of the modified ASCO/CAP and UK HER2 IHC scoring guidelines has minimal effect on either assay interpretation, showing that Oracle can be used as a methodology for accurately determining HER2 IHC status in formalin fixed, paraffin-embedded breast cancer tissue.

An immunohistochemical and fluorescence in situ hybridization-based comparison between the Oracle HER2 Bond Immunohistochemical System, Dako HercepTest, and Vysis PathVysion HER2 FISH using both commercially validated and modified ASCO/CAP and United Kingdom HER2 IHC scoring guidelines.

Science.gov (United States)

O'Grady, Anthony; Allen, David; Happerfield, Lisa; Johnson, Nicola; Provenzano, Elena; Pinder, Sarah E; Tee, Lilian; Gu, Mai; Kay, Elaine W

2010-12-01

Immunohistochemistry (IHC) is used as the frontline assay to determine HER2 status in invasive breast cancer patients. The aim of the study was to compare the performance of the Leica Oracle HER2 Bond IHC System (Oracle) with the current most readily accepted Dako HercepTest (HercepTest), using both commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. A total of 445 breast cancer samples from 3 international clinical HER2 referral centers were stained with the 2 test systems and scored in a blinded fashion by experienced pathologists. The overall agreement between the 2 tests in a 3×3 (negative, equivocal and positive) analysis shows a concordance of 86.7% and 86.3%, respectively when analyzed using commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. There is a good concordance between the Oracle and the HercepTest. The advantages of a complete fully automated test such as the Oracle include standardization of key analytical factors and improved turn around time. The implementation of the modified ASCO/CAP and UK HER2 IHC scoring guidelines has minimal effect on either assay interpretation, showing that Oracle can be used as a methodology for accurately determining HER2 IHC status in formalin fixed, paraffin-embedded breast cancer tissue.
Interpretive criteria for mupirocin susceptibility testing of Staphylococcus spp. using CLSI guidelines.

LENUS (Irish Health Repository)

Creagh, S

2012-02-03

Mupirocin is an antimicrobial agent commonly used to treat staphylococcal infection or to eliminate persistent carriage. To date, interpretive criteria have not been established to define susceptibility or resistance when performing mupirocin susceptibility testing. In this evaluation, using CLSI guidelines, a total of 502 staphylococci comprising 219 methicillin-sensitive Staphylococcus aureus, 222 methicillin-resistant S. aureus and 61 coagulase-negative staphylococci were tested by broth microdilution, disc diffusion and E-test. Disc diffusion using 5 µg mupirocin discs was found to be a reliable method to distinguish susceptible and resistant strains. Minimum inhibitory concentration (MIC) determination was required to differentiate low-level and high-level resistance to mupirocin. E-test was found to be an accurate alternative to broth microdilution for the routine determination of MIC values of staphylococci to mupirocin. Broth microdilution and disc-diffusion results were plotted on a scattergram, and error rates were calculated. No errors were found using susceptibility criteria of < 4 µg/mL (MIC) and > 19 mm (zone diameter).
Using Automated Essay Scores as an Anchor When Equating Constructed Response Writing Tests

Science.gov (United States)

Almond, Russell G.

2014-01-01

Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…
Implications of Deployed and Nondeployed Fathers on Seventh Graders' California Achievement Test Scores during a Military Crisis.

Science.gov (United States)

Pisano, Mark C.

The differences in California Achievement Test (CAT) scores from 1990 to 1991 in seventh graders, currently enrolled in Albritton Junior High School in the Fort Bragg Schools, of deployed and nondeployed fathers were analyzed. CAT percentile scores from 1990 and 1991 (1991 being the year of "Desert Storm") were obtained in reading, math…
How Well Does the Sum Score Summarize the Test? Summability as a Measure of Internal Consistency

NARCIS (Netherlands)

Goeman, J.J.; De, Jong N.H.

2018-01-01

Many researchers use Cronbach's alpha to demonstrate internal consistency, even though it has been shown numerous times that Cronbach's alpha is not suitable for this. Because the intention of questionnaire and test constructers is to summarize the test by its overall sum score, we advocate
Zertifikat Deutsch als Fremdsprache and the Oral Proficiency Interview: A Comparison of Test Scores and Examinations.

Science.gov (United States)

Lalande, John F.; Schweckendiek, Jurgen

1986-01-01

Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)
Normative Data for the Balance Error Scoring System in Adults

Directory of Open Access Journals (Sweden)

Grant L. Iverson

2013-01-01

Background. The balance error scoring system (BESS) is a brief, easily administered test of static balance. The purpose of this study is to develop normative data for this test. Study Design. Cross-sectional, descriptive, and cohort design. Methods. The sample was drawn from a population of clients taking part in a comprehensive preventive health screen at a multidisciplinary healthcare center. Community-dwelling adults aged 20–69 (N=1,236) were administered the BESS within the context of a fitness evaluation. They did not have significant medical, neurological, or lower extremity problems that might have an adverse effect on balance. Results. There was a significant positive correlation between BESS scores and age (r=.34). BESS performance was similar for participants between the ages of 20 and 49 and significantly declined between ages 50 and 69. Men performed slightly better than women on the BESS. Women who were overweight performed significantly more poorly on the test compared to women who were not overweight (P<.0001; Cohen's d=.62). The BESS normative data are stratified by age and sex. Conclusions. These normative data provide a frame of reference for interpreting BESS performance in adults who sustain traumatic brain injuries and adults with diverse neurological or vestibular problems.
Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease

Directory of Open Access Journals (Sweden)

Elaheh Moradi

2017-01-01

Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory, which is widely used for the cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that an impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), thus making RAVLT an effective early marker to detect AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to comprehensively study to what extent the RAVLT scores are predictable based on structural magnetic resonance imaging (MRI) data using machine learning approaches as well as to find the most important brain regions for the estimation of RAVLT scores. For this, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) in a dataset consisting of 806 AD, mild cognitive impairment (MCI), or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for the estimation of RAVLT based on MRI data. The top predictors were medial temporal lobe structures and amygdala for the estimation of RAVLT Immediate and angular gyrus, hippocampus and amygdala for the estimation of RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted based on either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.
An Automated, High-Throughput Method for Interpreting the Tandem Mass Spectra of Glycosaminoglycans

Science.gov (United States)

Duan, Jiana; Jonathan Amster, I.

2018-05-01

The biological interactions between glycosaminoglycans (GAGs) and other biomolecules are heavily influenced by structural features of the glycan. The structure of GAGs can be assigned using tandem mass spectrometry (MS2), but analysis of these data, to date, requires manual interpretation, a slow process that presents a bottleneck to the broader deployment of this approach to solving biologically relevant problems. Automated interpretation remains a challenge, as GAG biosynthesis is not template-driven, and therefore, one cannot predict structures from genomic data, as is done with proteins. The lack of a structure database, a consequence of the non-template biosynthesis, requires a de novo approach to interpretation of the mass spectral data. We propose a model for rapid, high-throughput GAG analysis by using an approach in which candidate structures are scored for the likelihood that they would produce the features observed in the mass spectrum. To make this approach tractable, a genetic algorithm is used to greatly reduce the search space of isomeric structures that are considered. The time required for analysis is significantly reduced compared to an approach in which every possible isomer is considered and scored. The model is coded in a software package using the MATLAB environment. This approach was tested on tandem mass spectrometry data for long-chain, moderately sulfated chondroitin sulfate oligomers that were derived from the proteoglycan bikunin. The bikunin data were previously interpreted manually. Our approach examines glycosidic fragments to localize SO3 modifications to specific residues and yields the same structures reported in the literature, only much more quickly.
Clinical score and rapid antigen detection test to guide antibiotic use for sore throats: randomised controlled trial of PRISM (primary care streptococcal management).

Science.gov (United States)

Little, Paul; Hobbs, F D Richard; Moore, Michael; Mant, David; Williamson, Ian; McNulty, Cliodna; Cheng, Ying Edith; Leydon, Geraldine; McManus, Richard; Kelly, Joanne; Barnett, Jane; Glasziou, Paul; Mullee, Mark

2013-10-10

To determine the effect of clinical scores that predict streptococcal infection or rapid streptococcal antigen detection tests compared with delayed antibiotic prescribing. Open adaptive pragmatic parallel group randomised controlled trial. Primary care in United Kingdom. Patients aged ≥ 3 with acute sore throat. An internet programme randomised patients to targeted antibiotic use according to: delayed antibiotics (the comparator group for analyses), clinical score, or antigen test used according to clinical score. During the trial a preliminary streptococcal score (score 1, n=1129) was replaced by a more consistent score (score 2, n=631; features: fever during previous 24 hours; purulence; attends rapidly (within three days after onset of symptoms); inflamed tonsils; no cough/coryza (acronym FeverPAIN)). Symptom severity reported by patients on a 7 point Likert scale (mean severity of sore throat/difficulty swallowing for days two to four after the consultation (primary outcome)), duration of symptoms, use of antibiotics. For score 1 there were no significant differences between groups. For score 2, symptom severity was documented in 80% (168/207 (81%) in delayed antibiotics group; 168/211 (80%) in clinical score group; 166/213 (78%) in antigen test group). Reported severity of symptoms was lower in the clinical score group (-0.33, 95% confidence interval -0.64 to -0.02; P=0.04), equivalent to one in three rating sore throat a slight versus moderate problem, with a similar reduction for the antigen test group (-0.30, -0.61 to -0.00; P=0.05). Symptoms rated moderately bad or worse resolved significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63) but not the antigen test group (1.11, 0.88 to 1.40). In the delayed antibiotics group, 75/164 (46%) used antibiotics. Use of antibiotics in the clinical score group (60/161) was 29% lower (adjusted risk ratio 0.71, 95% confidence interval 0.50 to 0.95; P=0.02) and in the
Towards reporting standards for neuropsychological study results: A proposal to minimize communication errors with standardized qualitative descriptors for normalized test scores.

Science.gov (United States)

Schoenberg, Mike R; Rum, Ruba S

2017-11-01

Rapid, clear and efficient communication of neuropsychological results is essential to benefit patient care. Errors in communication are a leading cause of medical errors; nevertheless, there remains a lack of consistency in how neuropsychological scores are communicated. A major limitation in the communication of neuropsychological results is the inconsistent use of qualitative descriptors for standardized test scores and the use of vague terminology. A PubMed search from 1 Jan 2007 to 1 Aug 2016 was conducted to identify guidelines or consensus statements for the description and reporting of qualitative terms to communicate neuropsychological test scores. The review found the use of confusing and overlapping terms to describe various ranges of percentile standardized test scores. In response, we propose a simplified set of qualitative descriptors for normalized test scores (Q-Simple) as a means to reduce errors in communicating test results. The Q-Simple qualitative terms are: 'very superior', 'superior', 'high average', 'average', 'low average', 'borderline' and 'abnormal/impaired'. A case example illustrates the proposed Q-Simple qualitative classification system to communicate neuropsychological results for neurosurgical planning. The Q-Simple qualitative descriptor system is intended as a means to improve and standardize communication of standardized neuropsychological test scores. Research is needed to further evaluate neuropsychological communication errors. Conveying the clinical implications of neuropsychological results in a manner that minimizes risk for communication errors is a quintessential component of evidence-based practice.
Repeatability of an automated Landolt C test, compared with the early treatment of diabetic retinopathy study (ETDRS) chart testing.

Science.gov (United States)

Ruamviboonsuk, Paisan; Tiensuwan, Montip; Kunawut, Catleya; Masayaanon, Patcharapim

2003-10-01

To evaluate the repeatability of visual acuity scores from the automated test and compare them with the Early Treatment of Diabetic Retinopathy Study (ETDRS) chart. Instrument validation study based on a model of repeatability study in two observations. Methods: a prospective, clinic-based, comparative study. A total of 206 participants without ocular diseases and refractive errors in their right eyes were randomly enrolled in the automated group, in which 107 participants performed the automated test, and the ETDRS group, in which 99 participants read the ETDRS chart. All participants were tested with only their right eyes without corrections at 4 meters and came back to have the same tests 1 week later. The automated test used the Landolt rings as optotypes and was conducted by a low-end personal computer with a 15-inch monitor and a wireless keyboard. The "letter" score was calculated by counting every correct response to optotypes, and the "threshold curve" score was interpreted from the optotype size at the midpoint of a visual acuity threshold curve. The 95% confidence intervals of test-retest of visual acuity scores from the automated test are comparable to the ETDRS chart (.143 compared with .125 for letter scores, .145 compared with .122 for threshold curve scores). The score repeatabilities, calculated from the standard deviations of test-retest, from the automated test are also comparable to the ETDRS chart (.201 compared with .177 for letter scores, .206 compared with .172 for threshold curve scores). All comparisons demonstrated no statistical difference (P > .05). The automated testing system in this study enables practical measurement of visual acuity by the Landolt rings. The system's repeatability, which is comparable to the ETDRS chart, supports its role as an alternative tool for measuring outcome in new clinical research. Its ability to practically generate visual acuity threshold curves may also be useful in future clinical research studies.
Challenges in interpretation of thyroid hormone test results

Directory of Open Access Journals (Sweden)

Lalić Tijana

2016-01-01

Introduction. In interpreting thyroid hormone results it is advisable to consider interference and changes in the concentration of their carrier proteins. Outline of Cases. We present two patients with a discrepancy between the results of thyroid function tests and clinical status. The first case presents a 62-year-old patient with a nodular goiter and Hashimoto thyroiditis. Thyroid function tests showed low thyroid-stimulating hormone (TSH) and normal to low fT4. By determining thyroid status (TSH, T4, fT4, T3, fT3) in two laboratories, basal and after dilution, as well as thyroxine-binding globulin (TBG), it was concluded that the thyroid hormone levels were normal. The results were influenced by heterophile antibodies leading to a falsely low TSH level and suspected secondary hypothyroidism. The second case, a 40-year-old patient, was examined and followed because of a thyroid nodule of variable size and initially borderline elevated TSH, after which thyroid status showed low levels of total thyroid hormones and normal TSH. Based on additional analysis it was concluded that low T4 and T3 were a result of low TBG. It is a hereditary genetic disorder with no clinical significance. Conclusion. Erroneous diagnosis of thyroid disorders and potentially harmful treatment could be avoided by proving the interference or TBG deficiency whenever there is a discrepancy between the thyroid function results and the clinical picture.
Noninvasive testing in coronary artery disease. Selection of procedures and interpretation of results

International Nuclear Information System (INIS)

Sox, H.C. Jr.

1983-01-01

In patients with acute chest pain, selection of diagnostic tests and admission to and discharge from the coronary care unit are critical decisions for which useful empirical guidelines are now available. In hospitalized patients, the serum level of the MB fraction of creatine kinase is particularly useful when the history strongly suggests infarction but the ECG is nondiagnostic. In patients with chronic chest pain, the gender of the patient and the character of the pain are the most important guides to selecting and interpreting exercise tests. In women and in men with nonanginal chest pain, the myocardial scintiscan is preferred to the exercise ECG because of its greater diagnostic accuracy. In men with atypical angina, the two tests are nearly equivalent, and the added cost of the scintiscan is a factor in test selection. Since nearly all men with typical angina have coronary artery disease, diagnostic tests are usually not needed
A Note on the Use of the Hiskey-Nebraska Test of Learning Aptitude with Deaf Children.

Science.gov (United States)

Watson, Betty U.; Goldgar, David E.

1985-01-01

Comparing distribution of scores on the Hiskey-Nebraska Test of Learning Aptitude (H-NTLA) with those from the Wechsler Performance Scales for 71 hearing impaired Ss revealed a correlation of .85. However, the H-NTLA yielded more Ss with extreme scores. Findings stress the need for caution in interpreting extreme H-NTLA scores. (CL)
Evaluation of the Discrepancy between the European Pharmacopoeia Test and an Adopted United States Pharmacopoeia Test Regarding the Weight Uniformity of Scored Tablet Halves: Is Harmonization Required?

Science.gov (United States)

Zaid, Abdel Naser; Ghoush, Abeer Abu; Al-Ramahi, Rowa'; Are'r, Mohammed

2012-01-01

The aim of this study was to evaluate whether there exists any discrepancy between the European Pharmacopoeia (Ph. Eur.) and adopted United States Pharmacopeia (USP) tests concerning the weight uniformity measurements of tablet halves after splitting. The USP method does not contain provisions to evaluate split tablets, so here we adopt their whole tablet weight uniformity method. Twenty-nine different commercial scored tablets (local and imported) were divided. The split units were individually weighed and the relative standard deviation (RSD) for each product was calculated and then evaluated according to both the adopted USP and the Ph. Eur. tests of weight uniformity. Twenty out of the 29 products tested failed the USP test, while 14 of them failed the Ph. Eur. test. Nine products passed both the USP and Ph. Eur. tests. Six products passed the Ph. Eur. test but failed the USP test, with all of these products having an RSD greater than 6%. The correlation coefficient between the weight and content of split halves for three randomly selected products-corotenol 100 mg, corotenol 50 mg, and lorazepam 2.5 mg-was found to be 0.986, 0.998, and 0.72, respectively. A clear difference can be seen between outcomes obtained by the two compendial tablet splitting methods with regard to weight uniformity. Results from the USP test showed that tighter measures are needed to pass the test. Our results argue that the Ph. Eur. should revise the existing weight uniformity test on scored tablets to include the RSD parameter in it. The USP should include this adopted test as a specific test for scored tablet halves, not just whole tablets. Manufacturers in some cases will need to improve the quality of the produced scored tablets in order to pass the USP test, especially those with low therapeutic indices. Finally, harmonization between the pharmacopoeias regarding the weight uniformity testing of split tablets is warranted.
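The relative standard deviation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the usual definition (sample standard deviation divided by the mean, as a percentage); the function name and the half-tablet weights are hypothetical, and no pharmacopoeial acceptance limit is encoded because the abstract does not state one.

```python
import statistics


def relative_standard_deviation(weights):
    """Return the RSD (%) of a list of weights: 100 * sample SD / mean."""
    if len(weights) < 2:
        raise ValueError("at least two weights are needed")
    mean = statistics.mean(weights)
    if mean == 0:
        raise ValueError("mean weight must be non-zero")
    # statistics.stdev uses the n - 1 (sample) denominator.
    return 100.0 * statistics.stdev(weights) / mean


# Hypothetical weights (mg) of split tablet halves, for illustration only.
example_halves = [100.0, 102.0, 98.0, 101.0, 99.0]
example_rsd = relative_standard_deviation(example_halves)
```

The resulting percentage would then be compared against whichever compendial limit applies.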
Interpreting patient decisional conflict scores: behavior and emotions in decisions about treatment

NARCIS (Netherlands)

Knops, Anouk M.; Goossens, Astrid; Ubbink, Dirk T.; Legemate, Dink A.; Stalpers, Lukas J.; Bossuyt, Patrick M.

2013-01-01

Patient decision aids facilitate treatment decisions. They are often evaluated in terms of their effect on decisional conflict, as measured by the Decisional Conflict Scale (DCS). It is unclear to what extent lower DCS scores are accompanied by observable patient behavior or emotions. To help
Interpretation of ongoing thermal response tests of vertical (BHE) borehole heat exchangers with predictive uncertainty based stopping criterion

DEFF Research Database (Denmark)

Poulsen, Søren Erbs; Alberdi Pagola, Maria

2015-01-01

A method for real-time interpretation of ongoing thermal response tests of vertical borehole heat exchangers is presented. The method utilizes a statistically based stopping criterion for ongoing tests. The study finds minimum testing times for synthetic and actual TRTs to be in the interval 12–2...
Unexplained Graft Dysfunction after Heart Transplantation—Role of Novel Molecular Expression Test Score and QTc-Interval: A Case Report

Directory of Open Access Journals (Sweden)

Khurram Shahzad

2010-01-01

In the current era of immunosuppressive medications there is an increased observed incidence of graft dysfunction in the absence of known histological criteria of rejection after heart transplantation. A noninvasive molecular expression diagnostic test was developed and validated to rule out histological acute cellular rejection. In this paper we present, for the first time, the longitudinal pattern of changes in this novel diagnostic test score along with QTc-interval in a patient who was admitted with unexplained graft dysfunction. The patient presented with graft failure with negative findings on all known criteria of rejection including acute cellular rejection, antibody mediated rejection and cardiac allograft vasculopathy. The molecular expression test score showed a gradual increase and the QTc-interval showed gradual prolongation with the gradual decline in graft function. This paper exemplifies that in patients presenting with unexplained graft dysfunction, GEP test score and QTc-interval correlate with the changes in the graft function.
Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer.

Science.gov (United States)

Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Snell, Kym; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Menon, Usha; Deeks, Jon

2016-08-09

Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests are evaluated and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios.

Test-retest reliability and minimal detectable change scores for sit-to-stand-to-sit tests, the six-minute walk test, the one-leg heel-rise test, and handgrip strength in people undergoing hemodialysis.

Science.gov (United States)

Segura-Ortí, Eva; Martínez-Olmos, Francisco José

2011-08-01

Determining the relative and absolute reliability of outcomes of physical performance tests for people undergoing hemodialysis is necessary to discriminate between the true effects of exercise interventions and the inherent variability of this cohort. The aims of this study were to assess the relative reliability of sit-to-stand-to-sit tests (the STS-10, which measures the time [in seconds] required to complete 10 full stands from a sitting position, and the STS-60, which measures the number of repetitions achieved in 60 seconds), the Six-Minute Walk Test (6MWT), the one-leg heel-rise test, and the handgrip strength test and to calculate minimal detectable change (MDC) scores in people undergoing hemodialysis. This study was a prospective, nonexperimental investigation. Thirty-nine people undergoing hemodialysis at 2 clinics in Spain were contacted. Study participants performed the STS-10 (n=37), the STS-60 (n=37), and the 6MWT (n=36). At one of the settings, the participants also performed the one-leg heel-rise test (n=21) and the handgrip strength test (n=12) on both the right and the left sides. Participants attended 2 testing sessions 1 to 2 weeks apart. High intraclass correlation coefficients (≥.88) were found for all tests, suggesting good relative reliability. The MDC scores at 90% confidence intervals were as follows: 8.4 seconds for the STS-10, 4 repetitions for the STS-60, 66.3 m for the 6MWT, 3.4 kg for handgrip strength (force-generating capacity), 3.7 repetitions for the one-leg heel-rise test with the right leg, and 5.2 repetitions for the one-leg heel-rise test with the left leg. Limitations: A limited sample of patients was used in this study. The STS-10, STS-60, 6MWT, one-leg heel-rise test, and handgrip strength test are reliable outcome measures. The MDC scores at 90% confidence intervals for these tests will help to determine whether a change is due to error or to an intervention.
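How an MDC score of this kind is commonly derived can be sketched as follows. The abstract does not give its formula, so this assumes the conventional one: the standard error of measurement is SEM = SD × √(1 − ICC), and MDC90 = 1.645 × √2 × SEM. The function name and the input values are hypothetical, not taken from the study.

```python
import math


def minimal_detectable_change(sd, icc, z=1.645):
    """Conventional MDC: z * sqrt(2) * SEM, with SEM = SD * sqrt(1 - ICC).

    z = 1.645 corresponds to a 90% confidence level (assumed convention).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    if sd < 0:
        raise ValueError("SD must be non-negative")
    sem = sd * math.sqrt(1.0 - icc)  # standard error of measurement
    return z * math.sqrt(2.0) * sem  # sqrt(2): two measurements are compared


# Hypothetical between-subject SD (seconds) and ICC, for illustration only.
example_mdc90 = minimal_detectable_change(sd=10.0, icc=0.88)
```

Under this convention, an observed change smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.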
Rugby versus Soccer in South Africa: Content Familiarity Contributes to Cross-Cultural Differences in Cognitive Test Scores

Science.gov (United States)

Malda, Maike; van de Vijver, Fons J. R.; Temane, Q. Michael

2010-01-01

In this study, cross-cultural differences in cognitive test scores are hypothesized to depend on a test's cultural complexity (Cultural Complexity Hypothesis: CCH), here conceptualized as its content familiarity, rather than on its cognitive complexity (Spearman's Hypothesis: SH). The content familiarity of tests assessing short-term memory,…
Text-interpreter language for flexible generation of patient notes and instructions.

Science.gov (United States)

Forker, T S

1992-01-01

An interpreted computer language has been developed along with a windowed user interface and multi-printer-support formatter to allow preparation of documentation of patient visits, including progress notes, prescriptions, excuses for work/school, outpatient laboratory requisitions, and patient instructions. Input is by trackball or mouse with little or no keyboard skill required. For clinical problems with specific protocols, the clinician can be prompted with problem-specific items of history, exam, and lab data to be gathered and documented. The language implements a number of text-related commands as well as branching logic and arithmetic commands. In addition to generating text, it is simple to implement arithmetic calculations such as weight-specific drug dosages; multiple branching decision-support protocols for paramedical personnel (or physicians); and calculation of clinical scores (e.g., coma or trauma scores) while simultaneously documenting the status of each component of the score. ASCII text files produced by the interpreter are available for computerized quality audit. Interpreter instructions are contained in text files users can customize with any text editor.
Hydraulic testing of Salado Formation evaporites at the Waste Isolation Pilot Plant site: Second interpretive report

Energy Technology Data Exchange (ETDEWEB)

Beauheim, R.L. [Sandia National Labs., Albuquerque, NM (United States); Roberts, R.M.; Dale, T.F.; Fort, M.D.; Stensrud, W.A. [INTERA, Inc., Austin, TX (United States)

1993-12-01

Pressure-pulse, constant-pressure flow, and pressure-buildup tests have been performed in bedded evaporites of the Salado Formation at the Waste Isolation Pilot Plant (WIPP) site to evaluate the hydraulic properties controlling brine flow through the Salado. Transmissivities have been interpreted from six sequences of tests conducted on five stratigraphic intervals within 15 m of the WIPP underground excavations.
Hydraulic testing of Salado Formation evaporites at the Waste Isolation Pilot Plant site: Second interpretive report

International Nuclear Information System (INIS)

Beauheim, R.L.; Roberts, R.M.; Dale, T.F.; Fort, M.D.; Stensrud, W.A.

1993-12-01

Pressure-pulse, constant-pressure flow, and pressure-buildup tests have been performed in bedded evaporites of the Salado Formation at the Waste Isolation Pilot Plant (WIPP) site to evaluate the hydraulic properties controlling brine flow through the Salado. Transmissivities have been interpreted from six sequences of tests conducted on five stratigraphic intervals within 15 m of the WIPP underground excavations
Applicability of Various Load Test Interpretation Criteria in Measuring Driven Precast Concrete Pile Uplift Capacity

Directory of Open Access Journals (Sweden)

Maria Cecilia M. Marcos

2018-04-01

This paper presents a comprehensive analysis of load test interpretation criteria to determine their suitability for driven precast concrete (PC) pile uplift capacity. A database containing static pile load tests was developed and utilized for the evaluation. The piles had round and square cross-sections and were tested under drained and undrained loading. To explore and compare their behavior, the stored data were categorized into four groups. In general, the trends of every criterion for the four groups were notably the same. In both drained and undrained loading, slightly larger interpreted capacities were demonstrated by square piles than by round piles. Moreover, round piles demonstrated a more ductile load-displacement response than square piles, especially in undrained loading. Statistical analyses showed that smaller values of displacement exhibited a higher coefficient of variation. The drained and undrained tests were compared, and the results showed less variability in drained than in undrained loading; capacity ratios (Qx/QCHIN) in drained loading were slightly higher than in undrained loading. The interrelationship and applicability of these criteria as well as the design recommendations in terms of normalized capacity and displacement were given based on the analyses.
Differences in distribution of T-scores and Z-scores among bone densitometry tests in postmenopausal women (a comparative study)

International Nuclear Information System (INIS)

Wendlova, J.

2002-01-01

To determine the character of T-score and Z-score value distribution in individually selected methods of bone densitometry and to compare them using statistical analysis. We examined 56 postmenopausal women with an age between 43 and 68 years with osteopenia or osteoporosis according to the WHO classification. The following measurements were made in each patient: T-score and Z-score for: 1) Stiffness index (S) of the left heel bone, USM (index). 2) Bone mineral density of the left heel bone (BMDh), DEXA (g of Ca hydroxyapatite per cm²). 3) Bone mineral density of trabecular bone of the L1 vertebra (BMDL1), QCT (mg of Ca hydroxyapatite per cm³). The densitometers used in the study were: ultrasonometer to measure heel bone, Achilles Plus, LUNAR, USA; DEXA to measure heel bone, PIXI, LUNAR, USA; QCT to measure the L1 vertebra, CT, SOMATOM Plus, Siemens, Germany. Statistical analysis: differences between measured values of T-scores (Z-scores) were evaluated by parametric or non-parametric methods of determining the 95% confidence intervals (C.I.). Differences between Z-score and T-score values for compared measurements were statistically significant; however, these differences were lower for Z-scores. Largest differences in 95% C.I., characterizing individual measurements of T-score values (in comparison with Z-scores), were found for those densitometers whose age range of the reference groups of young adults differed the most, and conversely, the smallest differences in T-score values were found when the differences between the age ranges of reference groups were smallest. The higher variation in T-score values in comparison to Z-scores is also caused by a non-standard selection of the reference groups of young adults for the QCT, PIXI and Achilles Plus densitometers used in the study. Age characteristics of the reference group for T-scores should be standardized for all types of densitometers. (author)
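The dependence of T-scores on the chosen reference group can be made concrete with a short sketch. It assumes the standard densitometry definitions: a T-score standardizes a measurement against a young-adult reference group, and a Z-score against an age-matched one. All reference means, standard deviations and the measured value below are hypothetical.

```python
def standardized_score(value, ref_mean, ref_sd):
    """Return (value - reference mean) / reference SD.

    With a young-adult reference group this is a T-score;
    with an age-matched reference group it is a Z-score.
    """
    if ref_sd <= 0:
        raise ValueError("reference SD must be positive")
    return (value - ref_mean) / ref_sd


# Hypothetical BMD measurement (g/cm^2) and hypothetical reference groups.
measured_bmd = 0.80
# Two devices whose young-adult reference groups differ (assumed values).
t_score_device_a = standardized_score(measured_bmd, ref_mean=1.00, ref_sd=0.10)
t_score_device_b = standardized_score(measured_bmd, ref_mean=0.95, ref_sd=0.10)
# Age-matched reference group (assumed values).
z_score = standardized_score(measured_bmd, ref_mean=0.85, ref_sd=0.10)
```

In this toy case the same measurement yields different T-scores purely because the young-adult reference means differ, which is the kind of effect the abstract attributes to non-standardized reference groups.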
Scoring CT/HRCT findings among asbestos-exposed workers: effects of patient's age, body mass index and common laboratory test results

Energy Technology Data Exchange (ETDEWEB)

Vehmas, T.; Huuskonen, M.S. [Finnish Institute of Occupational Health, Department of Radiology, Helsinki (Finland); Kivisaari, L. [Helsinki University Central Hospital, Department of Radiology, Helsinki (Finland); Jaakkola, M.S. [Finnish Institute of Occupational Health, Department of Radiology, Helsinki (Finland); University of Birmingham, Institute of Occupational and Environmental Medicine, Birmingham (United Kingdom)

2005-02-01

We studied the effects of age, body mass index (BMI) and some common laboratory test results on several pulmonary CT/HRCT signs. Five hundred twenty-eight construction workers (age 38-80, mean 63 years) were imaged with spiral and high resolution CT. Images were scored by three radiologists for solitary pulmonary nodules, signs indicative of fibrosis and emphysema, ground glass opacities, bronchial wall thickness and bronchiectasis. Multivariate statistical analyses were adjusted for smoking and asbestos exposure. Increasing age, blood haemoglobin value and erythrocyte sedimentation rate correlated positively with several HRCT signs. Increasing BMI was associated with a decrease in several signs, especially parenchymal bands, honeycombing, all kinds of emphysema and bronchiectasis. The latter finding might be due to the suboptimal image quality in obese individuals, which may cause suspicious findings to be overlooked. Background data, including patient's age and body constitution, should be considered when CT/HRCT images are interpreted. (orig.)
A Study on Variables that Affect Class Scores of Primary Education Students in Placement Test

OpenAIRE

Yavuz, Mustafa

2010-01-01

This study aims to determine the variables that predict class scores which are obtained by adding 70 % of the Placement Test (PT) scores of the primary education sixth and seventh grade students who took it for the first time in the 2007-2008 academic year within the framework of the system of passing to secondary education reorganized by the MNE, 25 % of their end-of-the-year passing grades. The study is of general survey model. The study group consists of students who took the PT in the 200...
Reaffirming normal: the high risk of pathologizing healthy adults when interpreting the MMPI-2-RF.

Science.gov (United States)

Odland, Anthony P; Lammy, Andrew B; Perle, Jonathan G; Martin, Phillip K; Grote, Christopher L

2015-01-01

Monte Carlo simulations were utilized to determine the proportion of the normal population expected to have scale elevations on the MMPI-2-RF when multiple scores are interpreted. Results showed that when all 40 MMPI-2-RF scales are simultaneously considered, approximately 70% of normal adults are likely to have at least one scale elevation at or above 65 T, and as many as 20% will have five or more elevated scales. When the Restructured Clinical (RC) Scales are under consideration, 34% of normal adults have at least one elevated score. Interpretation of the Specific Problem Scales and Personality Psychopathology Five Scales--Revised also yielded higher than expected rates of significant scores, with as many as one in four normal adults possibly being miscategorized as having features of a personality disorder by the latter scales. These findings are consistent with the growing literature on rates of apparently abnormal scores in the normal population due to multiple score interpretation. Findings are discussed in relation to clinical assessment, as well as in response to recent work suggesting that the MMPI-2-RF's multiscale composition does not contribute to high rates of elevated scores.
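A simulation in this spirit can be sketched as follows. It is not the authors' procedure: it assumes normally distributed T-scores (mean 50, SD 10) and a single hypothetical equal correlation `rho` between all scales, induced by one shared factor. With `rho = 0` the scales are independent, which overstates the elevation rate relative to the roughly 70% reported above, since real scales are intercorrelated.

```python
import random


def proportion_with_elevation(n_scales=40, cutoff=65.0, rho=0.0,
                              n_sim=20000, seed=0):
    """Estimate the share of simulated normal adults with at least one
    scale at or above `cutoff`, for T-scores with mean 50 and SD 10.

    `rho` is an assumed common correlation between scales (hypothetical).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    rng = random.Random(seed)
    shared_weight = rho ** 0.5
    unique_weight = (1.0 - rho) ** 0.5
    hits = 0
    for _ in range(n_sim):
        shared = rng.gauss(0.0, 1.0)  # factor common to all scales
        for _ in range(n_scales):
            z = shared_weight * shared + unique_weight * rng.gauss(0.0, 1.0)
            if 50.0 + 10.0 * z >= cutoff:
                hits += 1
                break  # one elevation is enough for this person
    return hits / n_sim


independent_rate = proportion_with_elevation(rho=0.0)
correlated_rate = proportion_with_elevation(rho=0.5)
```

Raising `rho` lowers the estimated proportion, which shows why the intercorrelation structure of the scales matters when interpreting multiple scores at once.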
Anthropometric adjustments are helpful in the interpretation of BMD and BMC Z-scores of pediatric patients with Prader-Willi syndrome.

Science.gov (United States)

Hangartner, T N; Short, D F; Eldar-Geva, T; Hirsch, H J; Tiomkin, M; Zimran, A; Gross-Tsur, V

2016-12-01

Anthropometric adjustments of bone measurements are necessary in Prader-Willi syndrome patients to correctly assess the bone status of these patients. This enables physicians to get a more accurate diagnosis of normal versus abnormal bone, allow for early and effective intervention, and achieve better therapeutic results. Bone mineral density (BMD) is decreased in patients with Prader-Willi syndrome (PWS). Because of largely abnormal body height and weight, traditional BMD Z-scores may not provide accurate information in this patient group. The goal of the study was to assess a cohort of individuals with PWS and characterize the development of low bone density based on two adjustment models applied to a dataset of BMD and bone mineral content (BMC) from dual-energy X-ray absorptiometry (DXA) measurements. Fifty-four individuals, aged 5-20 years with genetically confirmed PWS, underwent DXA scans of spine and hip. Thirty-one of them also underwent total body scans. Standard Z-scores were calculated for BMD and BMC of spine and total hip based on race, sex, and age for all patients, as well as of whole body and whole-body less head for those patients with total-body scans. Additional Z-scores were generated based on anthropometric adjustments using weight, height, and percentage body fat and a second model using only weight and height in addition to race, sex, and age. As many PWS patients have abnormal anthropometrics, addition of explanatory variables weight, height, and fat resulted in different bone classifications for many patients. Thus, 25-70 % of overweight patients, previously diagnosed as normal, were subsequently diagnosed as below normal, and 40-60 % of patients with below-normal body height changed from below normal to normal depending on bone parameter. This is the first study to include anthropometric adjustments into the interpretation of BMD and BMC in children and adolescents with PWS. This enables physicians to get a more accurate diagnosis of
An analysis of aviation test scores to characterize Student Naval Aviator disqualification

OpenAIRE

Wahl, Erich J.

1998-01-01

Approved for public release; distribution is unlimited. The U.S. Navy uses the Aviation Selection Test Battery (ASTB) to identify those Student Naval Aviator (SNA) applicants most likely to succeed in flight training. Using classification and regression trees, this thesis concludes that individual answers to an ASTB subtest, the Biographical Inventory, are not good predictors of SNA primary flight grades. It also concludes that those SNA who score less than a 6 on the Pilot Biographical Inv...
The influence of a continuing education program on the image interpretation accuracy of rural radiographers.

Science.gov (United States)

Smith, Tony N; Traise, Peter; Cook, Aiden

2009-01-01

In regional, rural and remote clinical practice, radiographers work closely with medical members of the acute care team in the interpretation of radiographic images, particularly when no radiologist is available. However, the misreading of radiographs by non-radiologist physicians has been shown to be the most common type of clinical error in the emergency department. Further, in Australia few rural radiographers are specifically trained to interpret and report on images. This study aimed to evaluate the accuracy of a group of rural radiographers in interpreting musculoskeletal plain radiographs, and to assess the effectiveness of continuing education (CE) in improving their accuracy within a short time frame. Following ethics approval, 16 rural radiographers were recruited to the study. At inception a purpose-designed 'test-object' of 25 cases compiled by a radiologist was used to assess image interpretation accuracy. The cases were categorised into three grades of complexity. The radiographers entered their answers on a structured radiographer opinion form (ROF) that had three levels of response - 'general opinion', 'observations' and 'open comment'. Subsequent to base-line testing, the radiographers participated in a CE program aimed at improving their image interpretation skills. After a 4 month period they were re-tested using the same methodology. The ROFs were scored by the radiologist and the pooled results analysed for statistically significant changes at all ROF levels and grades of complexity. While for the small number of less complex grade 1 cases there was no change in image interpretation accuracy, for the more numerous and more complex grade 2 and grade 3 cases there was a statistically significant improvement at the 'general opinion' and 'observation' levels (paired t-test, p radiologist. However, radiographers' ability to use radiological vocabulary needs improvement. The complementary role that exists between radiographers and other members of
Validating High-Stakes Testing Programs.

Science.gov (United States)

Kane, Michael

2002-01-01

Makes the point that the interpretations and use of high-stakes test scores rely on policy assumptions about what should be taught and the content standards and performance standards that should be applied. The assumptions built into an assessment need to be subjected to scrutiny and criticism if a strong case is to be made for the validity of the…
Supporting Accurate Interpretation of Self-Administered Medical Test Results for Mobile Health: Assessment of Design, Demographics, and Health Condition.

Science.gov (United States)

Hohenstein, Jess C; Baumer, Eric Ps; Reynolds, Lindsay; Murnane, Elizabeth L; O'Dell, Dakota; Lee, Seoho; Guha, Shion; Qi, Yu; Rieger, Erin; Gay, Geri

2018-02-28

Technological advances in personal informatics allow people to track their own health in a variety of ways, representing a dramatic change in individuals' control of their own wellness. However, research regarding patient interpretation of traditional medical tests highlights the risks in making complex medical data available to a general audience. This study aimed to explore how people interpret medical test results, examined in the context of a mobile blood testing system developed to enable self-care and health management. In a preliminary investigation and main study, we presented 27 and 303 adults, respectively, with hypothetical results from several blood tests via one of the several mobile interface designs: a number representing the raw measurement of the tested biomarker, natural language text indicating whether the biomarker's level was low or high, or a one-dimensional chart illustrating this level along a low-healthy axis. We measured respondents' correctness in evaluating these results and their confidence in their interpretations. Participants also told us about any follow-up actions they would take based on the result and how they envisioned, generally, using our proposed personal health system. We find that a majority of participants (242/328, 73.8%) were accurate in their interpretations of their diagnostic results. However, 135 of 328 participants (41.1%) expressed uncertainty and confusion about their ability to correctly interpret these results. We also find that demographics and interface design can impact interpretation accuracy, including false confidence, which we define as a respondent having above average confidence despite interpreting a result inaccurately. Specifically, participants who saw a natural language design were the least likely (421.47 times, P=.02) to exhibit false confidence, and women who saw a graph design were less likely (8.67 times, P=.04) to have false confidence. On the other hand, false confidence was more likely
Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score.

Science.gov (United States)

Hung, Man; Hon, Shirley D; Cheng, Christine; Franklin, Jeremy D; Aoki, Stephen K; Anderson, Mike B; Kapron, Ashley L; Peters, Christopher L; Pelt, Christopher E

2014-12-01

The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Cohort study (diagnosis); Level of evidence, 2. Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidences were examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (ie, ceiling and floor effects), and reliability evidences were examined through Cronbach alpha and person separation index. All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior in all psychometric aspects examined in this study. Future
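Two of the quantities named in the record above, Cronbach alpha and the ceiling effect, are simple to compute from raw item responses. The following is a minimal sketch under stated assumptions: the function names and the tiny data set are hypothetical, the person separation index (a Rasch-model quantity) is not covered, and nothing here reproduces the study's own analysis.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(item_scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([resp[i] for resp in item_scores])
                 for i in range(n_items)]
    # Variance of the summed (total) score across respondents.
    total_var = pvariance([sum(resp) for resp in item_scores])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_vars) / total_var)


def ceiling_effect_pct(total_scores, max_score):
    """Percentage of respondents whose total equals the maximum possible score."""
    at_ceiling = sum(1 for s in total_scores if s == max_score)
    return 100.0 * at_ceiling / len(total_scores)


if __name__ == "__main__":
    # Hypothetical responses: 5 respondents, 3 items scored 0-4.
    responses = [[3, 4, 3], [2, 2, 3], [4, 4, 4], [1, 2, 1], [4, 4, 4]]
    totals = [sum(r) for r in responses]
    print(cronbach_alpha(responses))
    print(ceiling_effect_pct(totals, max_score=12))
```

A high ceiling percentage means many respondents cannot be distinguished at the top of the scale, which is the concern the abstract raises for the HOS-ADL and mHHS.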
Assessment of the level of agreement in the interpretation of plain radiographs of lumbar spondylosis among clinical physiotherapists in Ghana

International Nuclear Information System (INIS)

Bello, Ajediran I; Ofori, Eric K; Alabi, Oluwasegun J; Adjei, David N

2014-01-01

Objective physical assessment of patients with lumbar spondylosis involves plain film radiographs (PFR) viewing and interpretation by the radiologists. Physiotherapists also routinely assess PFR within the scope of their practice. However, studies appraising the level of agreement of physiotherapists’ PFR interpretation with radiologists are not common in Ghana. Forty-one (41) physiotherapists took part in the cross-sectional survey. An assessment guide was developed from findings of the interpretation of three PFR of patients with lumbar spondylosis by a radiologist. The three PFR were selected from a pool of different radiographs based on clarity, common visible pathological features, coverage of body segments and short post production period. Physiotherapists were required to view the same PFR after which they were assessed with the assessment guide according to the number of features identified correctly or incorrectly. The score range on the assessment form was 0–24, interpreted as follows: 0–8 points (low), 9–16 points (moderate) and 17–24 points (high) levels of agreement. Data were analyzed using one-sample t-test and Fisher’s exact test at α = 0.05. The mean score of interpretation for the physiotherapists was 12.7 ± 2.6 points compared to the radiologist’s interpretation of 24 points (assessment guide). The physiotherapists’ levels were found to be significantly associated with their academic qualification (p = 0.006) and sex (p = 0.001). However, their levels of agreement were not significantly associated with their age group (p = 0.098), work settings (p = 0.171), experience (p = 0.666), preferred PFR view (p = 0.088) and continuing education (p = 0.069). The physiotherapists’ skills fall short of expectation for interpreting PFR of patients with lumbar spondylosis. The levels of agreement with radiologist’s interpretation have no link with years of clinical practice, age, work settings and continuing education. Thus, routine PFR viewing
Assessment of the level of agreement in the interpretation of plain radiographs of lumbar spondylosis among clinical physiotherapists in Ghana.

Science.gov (United States)

Bello, Ajediran I; Ofori, Eric K; Alabi, Oluwasegun J; Adjei, David N

2014-03-29

Objective physical assessment of patients with lumbar spondylosis involves plain film radiographs (PFR) viewing and interpretation by the radiologists. Physiotherapists also routinely assess PFR within the scope of their practice. However, studies appraising the level of agreement of physiotherapists' PFR interpretation with radiologists are not common in Ghana. Forty-one (41) physiotherapists took part in the cross-sectional survey. An assessment guide was developed from findings of the interpretation of three PFR of patients with lumbar spondylosis by a radiologist. The three PFR were selected from a pool of different radiographs based on clarity, common visible pathological features, coverage of body segments and short post production period. Physiotherapists were required to view the same PFR after which they were assessed with the assessment guide according to the number of features identified correctly or incorrectly. The score range on the assessment form was 0-24, interpreted as follows: 0-8 points (low), 9-16 points (moderate) and 17-24 points (high) levels of agreement. Data were analyzed using one-sample t-test and Fisher's exact test at α = 0.05. The mean score of interpretation for the physiotherapists was 12.7 ± 2.6 points compared to the radiologist's interpretation of 24 points (assessment guide). The physiotherapists' levels were found to be significantly associated with their academic qualification (p = 0.006) and sex (p = 0.001). However, their levels of agreement were not significantly associated with their age group (p = 0.098), work settings (p = 0.171), experience (p = 0.666), preferred PFR view (p = 0.088) and continuing education (p = 0.069). The physiotherapists' skills fall short of expectation for interpreting PFR of patients with lumbar spondylosis. The levels of agreement with radiologist's interpretation have no link with years of clinical practice, age, work settings and continuing education. Thus
GalaxyDock BP2 score: a hybrid scoring function for accurate protein-ligand docking

Science.gov (United States)

Baek, Minkyung; Shin, Woong-Hee; Chung, Hwan Won; Seok, Chaok

2017-07-01

Protein-ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein-ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein-ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.
Poisson Approximation-Based Score Test for Detecting Association of Rare Variants.

Science.gov (United States)

Fang, Hongyan; Zhang, Hong; Yang, Yaning

2016-07-01

Genome-wide association study (GWAS) has achieved great success in identifying genetic variants, but the nature of GWAS has determined its inherent limitations. Under the common disease rare variants (CDRV) hypothesis, the traditional association analysis methods commonly used in GWAS for common variants do not have enough power for detecting rare variants with a limited sample size. As a solution to this problem, pooling rare variants by their functions provides an efficient way for identifying susceptible genes. Rare variants typically have low frequencies of minor alleles, and the distribution of the total number of minor alleles of the rare variants can be approximated by a Poisson distribution. Based on this fact, we propose a new test method, the Poisson Approximation-based Score Test (PAST), for association analysis of rare variants. Two testing methods, namely, ePAST and mPAST, are proposed based on different strategies of pooling rare variants. Simulation results and application to the CRESCENDO cohort data show that our methods are more powerful than the existing methods. © 2016 John Wiley & Sons Ltd/University College London.
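The Poisson approximation that the record above relies on can be checked numerically. The sketch below is illustrative only and does not implement PAST, ePAST, or mPAST, which the abstract does not specify. It assumes every allele draw is an independent Bernoulli trial with that variant's minor allele frequency, computes the exact distribution of the pooled minor allele count, and compares it with a Poisson distribution of the same mean. The allele frequencies and sample size are hypothetical.

```python
import math


def poisson_binomial_pmf(probs):
    """Exact pmf of a sum of independent Bernoulli(p_i) draws (dynamic programming)."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this allele is not the minor allele
            nxt[k + 1] += mass * p       # this allele is the minor allele
        pmf = nxt
    return pmf


def poisson_pmf(k, lam):
    """Poisson pmf evaluated in log space to avoid overflow for large k."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def total_variation(probs):
    """Total variation distance between the exact count and Poisson(sum of p_i)."""
    lam = sum(probs)
    exact = poisson_binomial_pmf(probs)
    diff = sum(abs(e - poisson_pmf(k, lam)) for k, e in enumerate(exact))
    # Poisson mass beyond the largest possible count also contributes.
    tail = 1.0 - sum(poisson_pmf(k, lam) for k in range(len(exact)))
    return 0.5 * (diff + max(tail, 0.0))


if __name__ == "__main__":
    # Hypothetical minor allele frequencies for four pooled rare variants.
    mafs = [0.001, 0.002, 0.005, 0.003]
    n_individuals = 100
    probs = [maf for maf in mafs for _ in range(2 * n_individuals)]
    print(total_variation(probs), sum(p * p for p in probs))
```

Le Cam's inequality bounds this distance by the sum of the squared probabilities, so the approximation tightens as the allele frequencies shrink, which is why it suits rare variants.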

Effects of handcuffs on neuropsychological testing: Implications for criminal forensic evaluations.

Science.gov (United States)

Biddle, Christine M; Fazio, Rachel L; Dyshniku, Fiona; Denney, Robert L

2018-01-01

Neuropsychological evaluations are increasingly performed in forensic contexts, including in criminal settings where security sometimes cannot be compromised to facilitate evaluation according to standardized procedures. Interpretation of nonstandardized assessment results poses significant challenges for the neuropsychologist. Research is limited in regard to the validation of neuropsychological test accommodation and modification practices that deviate from standard test administration; there is no published research regarding the effects of hand restraints upon neuropsychological evaluation results. This study provides preliminary results regarding the impact of restraints on motor functioning and common neuropsychological tests with a motor component. When restrained, performance on nearly all tests utilized was significantly impacted, including Trail Making Test A/B, a coding test, and several tests of motor functioning. Significant performance decline was observed in both raw scores and normative scores. Regression models are also provided in order to help forensic neuropsychologists adjust for the effect of hand restraints on raw scores of these tests, as the hand restraints also resulted in significant differences in normative scores; in the most striking case there was nearly a full standard deviation of discrepancy.
Utilizing the Six Realms of Meaning in Improving Campus Standardized Test Scores through Team Teaching and Strategic Planning

Science.gov (United States)

Stevenson, Rosnisha D.; Kritsonis, William Allan

2009-01-01

This article will seek to utilize Dr. William Allan Kritsonis' book "Ways of Knowing Through the Realms of Meaning" (2007) as a framework to improve a campus's standardized test scores, more specifically, their TAKS (Texas Assessment of Knowledge and Skills) scores. Many campuses have an improvement plan, also known as a Campus…
Integrating GIS in the Middle School Curriculum: Impacts on Diverse Students' Standardized Test Scores

Science.gov (United States)

Goldstein, Donna; Alibrandi, Marsha

2013-01-01

This case study conducted with 1,425 middle school students in Palm Beach County, Florida, included a treatment group receiving GIS instruction (256) and a control group without GIS instruction (1,169). Quantitative analyses on standardized test scores indicated that inclusion of GIS in middle school curriculum had a significant effect on student…
Virginia Tech freshman class becoming more competitive; Rise in grades and test scores noted

OpenAIRE

Virginia Tech News

2004-01-01

Admission to Virginia Tech continues to become more competitive as applicants report higher grade point averages and test scores than in previous years. The incoming class of 4,975 students has an average grade point average (GPA) of 3.68 and an average SAT score of 1203, up from a 3.60 GPA and a 1197 SAT score in 2003.
The Health Professions Admission Test (HPAT) score and leaving certificate results can independently predict academic performance in medical school: do we need both tests?

LENUS (Irish Health Repository)

Halpenny, D

2010-11-01

A recent study raised concerns regarding the ability of the health professions admission test (HPAT) Ireland to improve the selection process in Irish medical schools. We aimed to establish whether performance in a mock HPAT correlated with academic success in medicine. A modified HPAT examination and a questionnaire were administered to a group of doctors and medical students. There was a significant correlation between HPAT score and college results (r²: 0.314, P = 0.018, Spearman Rank) and between leaving cert score and college results (r²: 0.306, P = 0.049, Spearman Rank). There was no correlation between leaving cert points score and HPAT score. There was no difference in HPAT score across a number of other variables including gender, age and medical speciality. Our results suggest that both the HPAT Ireland and the leaving certificate examination could act as independent predictors of academic achievement in medicine.
A Comparative Investigation into Understandings and Uses of the "TOEFL iBT"® Test, the International English Language Testing Service (Academic) Test, and the Pearson Test of English for Graduate Admissions in the United States and Australia: A Case Study of Two University Contexts. "TOEFL iBT"® Research Report. TOEFL iBT-24. ETS Research Report. RR-14-44

Science.gov (United States)

Ginther, April; Elder, Catherine

2014-01-01

In line with expanded conceptualizations of validity that encompass the interpretations and uses of test scores in particular policy contexts, this report presents results of a comparative analysis of institutional understandings and uses of 3 international English proficiency tests widely used for tertiary selection--the "TOEFL iBT"®…
Test Review: An Interview with Amy Gabel--About the WISC-V

Science.gov (United States)

Greathouse, Dan; Shaughnessy, Michael F.

2016-01-01

Whenever a major intelligence or achievement test is revised, there is always renewed interest in the underlying structure of the test as well as a renewed interest in the scoring, administration, and interpretation changes. In this interview, Amy Gabel discusses the most recent revision of the "Wechsler Intelligence Scale for Children-Fifth…
Science Teacher Efficacy and Outcome Expectancy as Predictors of Students' End-of-Instruction (EOI) Biology I Test Scores

Science.gov (United States)

Angle, Julie; Moseley, Christine

2009-01-01

The purpose of this study was to compare teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the statewide End-of-Instruction (EOI) Biology I test met or exceeded the state academic proficiency level (Proficient Group) to teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the…
A mathematical model for interpretable clinical decision support with applications in gynecology.

Directory of Open Access Journals (Sweden)

Vanya M C A Van Belle

Over time, methods for the development of clinical decision support (CDS) systems have evolved from interpretable and easy-to-use scoring systems to very complex and non-interpretable mathematical models. In order to accomplish effective decision support, CDS systems should provide information on how the model arrives at a certain decision. To address the issue of incompatibility between performance, interpretability and applicability of CDS systems, this paper proposes an innovative model structure, automatically leading to interpretable and easily applicable models. The resulting models can be used to guide clinicians when deciding upon the appropriate treatment, estimating patient-specific risks and to improve communication with patients. We propose the interval coded scoring (ICS) system, which imposes that the effect of each variable on the estimated risk is constant within consecutive intervals. The number and position of the intervals are automatically obtained by solving an optimization problem, which additionally performs variable selection. The resulting model can be visualised by means of appealing scoring tables and color bars. ICS models can be used within software packages, in smartphone applications, or on paper, which is particularly useful for bedside medicine and home-monitoring. The ICS approach is illustrated on two gynecological problems: diagnosis of malignancy of ovarian tumors using a dataset containing 3,511 patients, and prediction of first trimester viability of pregnancies using a dataset of 1,435 women. Comparison of the performance of the ICS approach with a range of prediction models proposed in the literature illustrates the ability of ICS to combine optimal performance with the interpretability of simple scoring systems. The ICS approach can improve patient-clinician communication and will provide additional insights in the importance and influence of available variables. Future challenges include extensions of the
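The core idea in the record above, that each variable contributes a constant number of points within consecutive intervals, can be sketched as a table lookup. Everything below is hypothetical and for illustration only: the variable names, cut points, point values, and the logistic link with its coefficients are assumptions, not the published ICS models, and the optimization step that selects intervals and variables is not shown.

```python
import bisect
import math

# Hypothetical scoring table: for each variable, ascending cut points and the
# points awarded in each of the len(cuts) + 1 consecutive intervals.
SCORING_TABLE = {
    "age_years": {"cuts": [40.0, 60.0], "points": [0, 1, 3]},
    "lesion_diameter_mm": {"cuts": [50.0, 100.0], "points": [0, 2, 4]},
}


def interval_points(value, cuts, points):
    """Points for the interval containing `value`; a cut point belongs to the upper interval."""
    return points[bisect.bisect_right(cuts, value)]


def total_score(patient):
    """Sum the interval points over every variable in the scoring table."""
    return sum(interval_points(patient[name], spec["cuts"], spec["points"])
               for name, spec in SCORING_TABLE.items())


def estimated_risk(score, intercept=-4.0, slope=0.8):
    """Assumed logistic link from total score to risk; coefficients are made up."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))


if __name__ == "__main__":
    patient = {"age_years": 45, "lesion_diameter_mm": 120}
    score = total_score(patient)
    print(score, round(estimated_risk(score), 3))
```

Because the contribution is constant inside each interval, the whole model fits in a printed table, which is the property the abstract highlights for bedside use.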
Apparently abnormal Wechsler Memory Scale index score patterns in the normal population.

Science.gov (United States)

Carrasco, Roman Marcus; Grups, Josefine; Evans, Brittney; Simco, Edward; Mittenberg, Wiley

2015-01-01

Interpretation of the Wechsler Memory Scale-Fourth Edition may involve examination of multiple memory index score contrasts and similar comparisons with Wechsler Adult Intelligence Scale-Fourth Edition ability indexes. Standardization sample data suggest that 15-point differences between any specific pair of index scores are relatively uncommon in normal individuals, but these base rates refer to a comparison between a single pair of indexes rather than multiple simultaneous comparisons among indexes. This study provides normative data for the occurrence of multiple index score differences calculated by using Monte Carlo simulations and validated against standardization data. Differences of 15 points between any two memory indexes or between memory and ability indexes occurred in 60% and 48% of the normative sample, respectively. Wechsler index score discrepancies are normally common and therefore not clinically meaningful when numerous such comparisons are made. Explicit prior interpretive hypotheses are necessary to reduce the number of index comparisons and associated false-positive conclusions. Monte Carlo simulation accurately predicts these false-positive rates.
Associations between cadmium exposure and neurocognitive test scores in a cross-sectional study of US adults.

Science.gov (United States)

Ciesielski, Timothy; Bellinger, David C; Schwartz, Joel; Hauser, Russ; Wright, Robert O

2013-02-05

Low-level environmental cadmium exposure and neurotoxicity have not been well studied in adults. Our goal was to evaluate associations between neurocognitive exam scores and a biomarker of cumulative cadmium exposure among adults in the Third National Health and Nutrition Examination Survey (NHANES III). NHANES III is a nationally representative cross-sectional survey of the U.S. population conducted between 1988 and 1994. We analyzed data from a subset of participants, age 20-59, who participated in a computer-based neurocognitive evaluation. There were four outcome measures: the Simple Reaction Time Test (SRTT: visual motor speed), the Symbol Digit Substitution Test (SDST: attention/perception), the Serial Digit Learning Test (SDLT) trials-to-criterion, and the SDLT total-error-score (SDLT-tests: learning recall/short-term memory). We fit multivariable-adjusted models to estimate associations between urinary cadmium concentrations and test scores. 5662 participants underwent neurocognitive screening, and 5572 (98%) of these had a urinary cadmium level available. Prior to multivariable-adjustment, higher urinary cadmium concentration was associated with worse performance in each of the 4 outcomes. After multivariable-adjustment most of these relationships were not significant, and age was the most influential variable in reducing the association magnitudes. However among never-smokers with no known occupational cadmium exposure the relationship between urinary cadmium and SDST score (attention/perception) was significant: a 1 μg/L increase in urinary cadmium corresponded to a 1.93% (95%CI: 0.05, 3.81) decrement in performance. These results suggest that higher cumulative cadmium exposure in adults may be related to subtly decreased performance in tasks requiring attention and perception, particularly among those adults whose cadmium exposure is primarily through diet (no smoking or work-based cadmium exposure). This association was observed among exposure levels
Validating Automated Essay Scoring: A (Modest) Refinement of the "Gold Standard"

Science.gov (United States)

Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P.

2015-01-01

By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
COMPARISON BETWEEN WOOD DRYING DEFECT SCORES: SPECIMEN TESTING X ANALYSIS OF KILN-DRIED BOARDS

Directory of Open Access Journals (Sweden)

Djeison Cesar Batista

2015-04-01

It is important to develop drying technologies for Eucalyptus grandis lumber, which is one of the most planted species of this genus in Brazil and plays an important role as raw material for the wood industry. The general aim of this work was to assess the conventional kiln drying of juvenile wood of three clones of Eucalyptus grandis. The specific aims were to compare the behavior between: (i) drying defects indicated by tests with wood specimens and conventional kiln-dried boards; and (ii) physical properties and the drying quality. Five 11-year-old trees of each clone were felled, and only flatsawn boards of the first log were used. Basic density and total shrinkage were determined, and the drying test with wood specimens at 100 °C was carried out. Kiln drying of boards was performed, and initial and final moisture content, moisture gradient in thickness, drying stresses and drying defects were assessed. The defect scoring method was used to verify the behavior between the defects detected by specimen testing and the defects detected in kiln-dried boards. As main results, the drying schedule was too severe for the wood, resulting in a high level of boards with defects. The defects in the drying test with specimens and the defects of kiln-dried boards behaved differently: according to the defect scoring method, there was no correspondence between them.
The effect of an intervention program on functional movement screen test scores in mixed martial arts athletes.

Science.gov (United States)

Bodden, Jamie G; Needham, Robert A; Chockalingam, Nachiappan

2015-01-01

This study assessed the basic fundamental movements of mixed martial arts (MMA) athletes using the functional movement screen (FMS) assessment and determined if an intervention program was successful at improving results. Participants were placed into 1 of the 2 groups: intervention and control groups. The intervention group was required to complete a corrective exercise program 4 times per week, and all participants were asked to continue their usual MMA training routine. A mid-intervention FMS test was included to examine if successful results were noticed sooner than the 8-week period. Results highlighted differences in FMS test scores between the control group and intervention group (p = 0.006). Post hoc testing revealed a significant increase in the FMS score of the intervention group between weeks 0 and 8 (p = 0.00) and weeks 0 and 4 (p = 0.00) and no significant increase between weeks 4 and 8 (p = 1.00). A χ² analysis revealed that the intervention group participants were more likely to have an FMS score >14 than participants in the control group at week 4 (χ² = 7.29, p < 0.01) and week 8 (χ² = 5.2, p ≤ 0.05). Finally, a greater number of participants in the intervention group were free from asymmetry at week 4 and week 8 compared with the initial test period. The results of the study suggested that a 4-week intervention program was sufficient at improving FMS scores. Most, if not all, of the movements covered on the FMS relate to many aspects of MMA training. The knowledge that the FMS can identify movement dysfunctions and, furthermore, the fact that the issues can be improved through a standardized intervention program could be advantageous to MMA coaches, thus, providing the opportunity to adapt and implement new additions to training programs.
Neurocognitive performance and symptom profiles of Spanish-speaking Hispanic athletes on the ImPACT test.

Science.gov (United States)

Ott, Summer; Schatz, Philip; Solomon, Gary; Ryan, Joseph J

2014-03-01

This study documented baseline neurocognitive performance of 23,815 athletes on the Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) test. Specifically, 9,733 Hispanic, Spanish-speaking athletes who completed the ImPACT test in English and 2,087 Hispanic, Spanish-speaking athletes who completed the test in Spanish were compared with 11,955 English-speaking athletes who completed the test in English. Athletes were assigned to age groups (13-15, 16-18). Results revealed a significant effect of language group (p < …). Spanish-speaking athletes completing the test in Spanish scored more poorly than Spanish-speaking and English-speaking athletes completing the test in English, on all Composite scores and Total Symptom scores. Spanish-speaking athletes completing the test in English also performed more poorly than English-speaking athletes completing the test in English on three Composite scores. These differences in performance and reported symptoms highlight the need for caution in interpreting ImPACT test data for Hispanic Americans.
Interpretation of time-domain electromagnetic soundings in the Calico Hills area, Nevada Test Site, Nye County, Nevada

Science.gov (United States)

Kauahikaua, J.

A controlled source, time domain electromagnetic (TDEM) sounding survey was conducted in the Calico Hills area of the Nevada Test Site (NTS). The geoelectric structure was determined as an aid in the evaluation of the site for possible future storage of spent nuclear fuel or high level nuclear waste. The data were initially interpreted with a simple scheme that produces an apparent resistivity versus depth curve from the vertical magnetic field data. These curves are qualitatively interpreted much like standard Schlumberger resistivity sounding curves. Final interpretation made use of a layered earth Marquardt inversion computer program. The results combined with those from a set of Schlumberger soundings in the area show that there is a moderately resistive basement at a depth no greater than 800 meters. The basement resistivity is greater than 100 ohm meters.
Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS.

Science.gov (United States)

Waldfogel, Jane; Zhai, Fuhua

2008-02-01

This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries (Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S.), using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and that preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children.
The Validity of Graduate Management Admission Test Scores: A Summary of Studies Conducted from 1997 to 2004

Science.gov (United States)

Talento-Miller, Eileen; Rudner, Lawrence M.

2008-01-01

The validity of Graduate Management Admission Test (GMAT) scores is examined by summarizing 273 studies conducted between 1997 and 2004. Each of the studies was conducted through the Validity Study Service of the test sponsor and contained identical variables and statistical methods. Validity coefficients from each of the studies were corrected…
Standardized Computer-based Organized Reporting of EEG: SCORE

Science.gov (United States)

Beniczky, Sándor; Aurlien, Harald; Brøgger, Jan C; Fuglsang-Frederiksen, Anders; Martins-da-Silva, António; Trinka, Eugen; Visser, Gerhard; Rubboli, Guido; Hjalgrim, Helle; Stefan, Hermann; Rosén, Ingmar; Zarubova, Jana; Dobesberger, Judith; Alving, Jørgen; Andersen, Kjeld V; Fabricius, Martin; Atkins, Mary D; Neufeld, Miri; Plouin, Perrine; Marusic, Petr; Pressler, Ronit; Mameniskiene, Ruta; Hopfengärtner, Rüdiger; Emde Boas, Walter; Wolf, Peter

2013-01-01

The electroencephalography (EEG) signal has a high complexity, and the process of extracting clinically relevant features is achieved by visual analysis of the recordings. The interobserver agreement in EEG interpretation is only moderate. This is partly due to the method of reporting the findings in free-text format. The purpose of our endeavor was to create a computer-based system for EEG assessment and reporting, where the physicians would construct the reports by choosing from predefined elements for each relevant EEG feature, as well as the clinical phenomena (for video-EEG recordings). A working group of EEG experts took part in consensus workshops in Dianalund, Denmark, in 2010 and 2011. The faculty was approved by the Commission on European Affairs of the International League Against Epilepsy (ILAE). The working group produced a consensus proposal that went through a pan-European review process, organized by the European Chapter of the International Federation of Clinical Neurophysiology. The Standardised Computer-based Organised Reporting of EEG (SCORE) software was constructed based on the terms and features of the consensus statement and it was tested in the clinical practice. The main elements of SCORE are the following: personal data of the patient, referral data, recording conditions, modulators, background activity, drowsiness and sleep, interictal findings, “episodes” (clinical or subclinical events), physiologic patterns, patterns of uncertain significance, artifacts, polygraphic channels, and diagnostic significance. The following specific aspects of the neonatal EEGs are scored: alertness, temporal organization, and spatial organization. For each EEG finding, relevant features are scored using predefined terms. Definitions are provided for all EEG terms and features. SCORE can potentially improve the quality of EEG assessment and reporting; it will help incorporate the results of computer-assisted analysis into the report, it will make
Your move: The effect of chess on mathematics test scores.

Science.gov (United States)

Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla

2017-01-01

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.

Relationship Between Working Memory and English-Chinese Consecutive Interpreting

Institute of Scientific and Technical Information of China (English)

王磊; 陈莉; 徐晓娟

2016-01-01

Working memory is the system that actively holds multiple pieces of transitory information in the mind, where they can be manipulated. In interpreting, working memory is in charge of the storage and processing of immediate information, which makes it an important factor influencing interpreting quality. The role played by working memory capacity in interpreting remains a hot topic in the field of interpreting research. This thesis aims to investigate the relationship between working memory capacity and E-C consecutive interpreting by conducting two tests. The first is a working memory span test and the second is an E-C consecutive interpreting test. By comparing and analyzing the results of the two tests, this thesis comes to the conclusion that working memory capacity is positively correlated with E-C consecutive interpreting in terms of fluency and logic.
Speech-discrimination scores modeled as a binomial variable.

Science.gov (United States)

Thornton, A R; Raffin, M J

1978-09-01

Many studies have reported variability data for tests of speech discrimination, and the disparate results of these studies have not been given a simple explanation. Arguments over the relative merits of 25- vs 50-word tests have ignored the basic mathematical properties inherent in the use of percentage scores. The present study models performance on clinical tests of speech discrimination as a binomial variable. A binomial model was developed, and some of its characteristics were tested against data from 4120 scores obtained on the CID Auditory Test W-22. A table for determining significant deviations between scores was generated and compared to observed differences in half-list scores for the W-22 tests. Good agreement was found between predicted and observed values. Implications of the binomial characteristics of speech-discrimination scores are discussed.
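The binomial argument in the abstract above can be illustrated with a short calculation. The sketch below is not the authors' table or procedure; it only assumes that each word on a list is an independent trial with the same probability of a correct response, and the function name binomial_sd_percent is hypothetical. Under that assumption, the standard deviation of a percentage score depends on both the true score and the list length, which is why 25-word and 50-word tests differ in variability.

```python
import math


def binomial_sd_percent(p: float, n_words: int) -> float:
    """Standard deviation, in percentage points, of a percent-correct score.

    Assumes (for illustration only) that each of the n_words items is an
    independent trial with the same success probability p.
    """
    return 100.0 * math.sqrt(p * (1.0 - p) / n_words)


if __name__ == "__main__":
    # Compare a 25-word half list with a 50-word full list at several true scores.
    for p in (0.50, 0.80, 0.96):
        sd_25 = binomial_sd_percent(p, 25)
        sd_50 = binomial_sd_percent(p, 50)
        print(f"true score {p:.0%}: SD = {sd_25:.1f} points (25 words), "
              f"{sd_50:.1f} points (50 words)")
```

Halving the list length multiplies the standard deviation by the square root of 2, and variability is largest for scores near 50% and smallest near 0% or 100%.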
Multicultural issues in test interpretation.

Science.gov (United States)

Langdon, Henriette W; Wiig, Elisabeth H

2009-11-01

Designing the ideal test or series of tests to assess individuals who speak languages other than English is difficult. This article first describes some of the roadblocks, one of which is the lack of identification criteria for language and learning disabilities in monolingual and bilingual populations in most countries of the non-English-speaking world. This lag exists, in part, because access to general education is often limited. The second section describes tests that have been developed in the United States, primarily for Spanish-speaking individuals because they now represent the largest first-language majority in the United States (80% of English-language learners [ELLs] speak Spanish at home). We discuss tests developed for monolingual and bilingual English-Spanish speakers in the United States and divide this coverage into two parts: The first addresses assessment of students' first language (L1) and second language (L2), usually English, with different versions of the same test; the second describes assessment of L1 and L2 using the same version of the test, administered in the two languages. Examples of tests that fit a priori-determined criteria are briefly discussed throughout the article. Suggestions on how to develop tests for speakers of languages other than English are also provided. In conclusion, we maintain that there will never be a perfect test or set of tests to adequately assess the communication skills of a bilingual individual. This is not surprising because we have yet to develop an ideal test or set of tests that fits monolingual Anglo speakers perfectly. Tests are tools, and the speech-language pathologist needs to know how to use those tools most effectively and equitably. The goal of this article is to provide such guidance. Thieme Medical Publishers.
The Sinonasal Outcome Test 22 score in persons without chronic rhinosinusitis

DEFF Research Database (Denmark)

Lange, Bibi; Thilsing, T; Baelum, J

2016-01-01

-67 with a mean score of 10.5 (CI: 9.1 - 11.9) and the median score was 7. Persons with allergic rhinitis and blue collar workers had a significantly higher score. CONCLUSION: The median value of 7 is taken as the normal SNOT 22 score in persons without CRS and can be used as a reference in clinical settings and research. Allergic rhinitis and occupation affect SNOT 22 in persons without CRS. This article is protected by copyright. All rights reserved.
Free digital image analysis software helps to resolve equivocal scores in HER2 immunohistochemistry.

Science.gov (United States)

Helin, Henrik O; Tuominen, Vilppu J; Ylinen, Onni; Helin, Heikki J; Isola, Jorma

2016-02-01

Evaluation of human epidermal growth factor receptor 2 (HER2) immunohistochemistry (IHC) is subject to interobserver variation and lack of reproducibility. Digital image analysis (DIA) has been shown to improve the consistency and accuracy of the evaluation and its use is encouraged in current testing guidelines. We studied whether digital image analysis using a free software application (ImmunoMembrane) can assist in interpreting HER2 IHC in equivocal 2+ cases. We also compared digital photomicrographs with whole-slide images (WSI) as material for ImmunoMembrane DIA. We stained 750 surgical resection specimens of invasive breast cancers immunohistochemically for HER2 and analysed staining with ImmunoMembrane. The ImmunoMembrane DIA scores were compared with the originally responsible pathologists' visual scores, a researcher's visual scores and in situ hybridisation (ISH) results. The originally responsible pathologists reported 9.1 % positive 3+ IHC scores, for the researcher this was 8.4 % and for ImmunoMembrane 9.5 %. Equivocal 2+ scores were 34 % for the pathologists, 43.7 % for the researcher and 10.1 % for ImmunoMembrane. Negative 0/1+ scores were 57.6 % for the pathologists, 46.8 % for the researcher and 80.8 % for ImmunoMembrane. There were six false positive cases, which were classified as 3+ by ImmunoMembrane and negative by ISH. Six cases were false negative defined as 0/1+ by IHC and positive by ISH. ImmunoMembrane DIA using digital photomicrographs and WSI showed almost perfect agreement. In conclusion, digital image analysis by ImmunoMembrane can help to resolve a majority of equivocal 2+ cases in HER2 IHC, which reduces the need for ISH testing.
Interpretations of Tracer Tests Performed in the Culebra Dolomite at the Waste Isolation Pilot Plant Site

International Nuclear Information System (INIS)

MEIGS, LUCY C.; BEAUHEIM, RICHARD L.; JONES, TOYA L.

2000-01-01

This report provides (1) an overview of all tracer testing conducted in the Culebra Dolomite Member of the Rustler Formation at the Waste Isolation Pilot Plant (WIPP) site, (2) a detailed description of the important information about the 1995-96 tracer tests and the current interpretations of the data, and (3) a summary of the knowledge gained to date through tracer testing in the Culebra. Tracer tests have been used to identify transport processes occurring within the Culebra and quantify relevant parameters for use in performance assessment of the WIPP. The data, especially those from the tests performed in 1995-96, provide valuable insight into transport processes within the Culebra. Interpretations of the tracer tests in combination with geologic information, hydraulic-test information, and laboratory studies have resulted in a greatly improved conceptual model of transport processes within the Culebra. At locations where the transmissivity of the Culebra is low (< 4 × 10⁻⁶ m²/s), we conceptualize the Culebra as a single-porosity medium in which advection occurs largely through the primary porosity of the dolomite matrix. At locations where the transmissivity of the Culebra is high (> 4 × 10⁻⁶ m²/s), we conceptualize the Culebra as a heterogeneous, layered, fractured medium in which advection occurs largely through fractures and solutes diffuse between fractures and matrix at multiple rates. The variations in diffusion rate can be attributed to both variations in fracture spacing (or the spacing of advective pathways) and matrix heterogeneity. Flow and transport appear to be concentrated in the lower Culebra. At all locations, diffusion is the dominant transport process in the portions of the matrix that tracer does not access by flow.
[Development of a proverb test for assessment of concrete thinking problems in schizophrenic patients].

Science.gov (United States)

Barth, A; Küfferle, B

2001-11-01

Concretism is considered an important aspect of schizophrenic thought disorder. Traditionally it is measured using the method of proverb interpretation, in which metaphoric proverbs are presented with the request that the subject explain their meaning. Interpretations are recorded and scored for concretistic tendencies. However, this method has two problems: its reliability is doubtful and it is rather complicated to perform. In this paper, a new version of a multiple choice proverb test is presented which can solve these problems in a reliable and economical manner. Using the new test, it has been shown that schizophrenic patients have greater deficits in proverb interpretation than depressive patients.
Methods for interpreting change over time in patient-reported outcome measures.

Science.gov (United States)

Wyrwich, K W; Norquist, J M; Lenderking, W R; Acaster, S

2013-04-01

Interpretation guidelines are needed for patient-reported outcome (PRO) measures' change scores to evaluate efficacy of an intervention and to communicate PRO results to regulators, patients, physicians, and providers. The 2009 Food and Drug Administration (FDA) Guidance for Industry Patient-Reported Outcomes (PRO) Measures: Use in Medical Product Development to Support Labeling Claims (hereafter referred to as the final FDA PRO Guidance) provides some recommendations for the interpretation of change in PRO scores as evidence of treatment efficacy. This article reviews the evolution of the methods and the terminology used to describe and aid in the communication of meaningful PRO change score thresholds. Anchor- and distribution-based methods have played important roles, and the FDA has recently stressed the importance of cross-sectional patient global assessments of concept as anchor-based methods for estimation of the responder definition, which describes an individual-level treatment benefit. The final FDA PRO Guidance proposes the cumulative distribution function (CDF) of responses as a useful method to depict the effect of treatments across the study population. While CDFs serve an important role, they should not be a replacement for the careful investigation of a PRO's relevant responder definition using anchor-based methods and providing stakeholders with a relevant threshold for the interpretation of change over time.
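As a rough illustration of how a cumulative distribution function of change scores relates to a responder definition, the sketch below computes both for two small groups. It is a minimal example rather than a method taken from the article or the FDA guidance: the change scores, the threshold of 5 points, and the function names are all hypothetical, and higher change is assumed to mean improvement.

```python
from bisect import bisect_right


def empirical_cdf(change_scores):
    """Return a function x -> share of patients with change score <= x."""
    ordered = sorted(change_scores)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def responder_proportion(change_scores, threshold):
    """Share of patients whose change score is at least the threshold."""
    hits = sum(1 for change in change_scores if change >= threshold)
    return hits / len(change_scores)


# Hypothetical change-from-baseline scores (higher = more improvement).
treatment = [-2, 0, 3, 5, 6, 8, 9, 12]
placebo = [-4, -1, 0, 1, 2, 3, 5, 6]
threshold = 5  # hypothetical responder definition, e.g. from an anchor-based analysis

if __name__ == "__main__":
    for name, scores in (("treatment", treatment), ("placebo", placebo)):
        share = responder_proportion(scores, threshold)
        print(f"{name}: {share:.1%} of patients improved by at least {threshold} points")
```

Plotting the two empirical CDFs against each other shows the separation between groups at every possible threshold, while the responder proportion reads off that separation at one chosen threshold.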
Classroom Organizational Structure in Fifth Grade Math Classrooms and the Effect on Standardized Test Scores

Science.gov (United States)

Lane, Dallas Marie

2017-01-01

The purpose of this study was to determine if there is a relationship between the classroom organizational structure and MCT2 test scores of fifth-grade math students. The researcher gained insight regarding which structure teachers believe is most beneficial to them and students, and whether or not their belief of classroom organizational…
A risk score for predicting coronary artery disease in women with angina pectoris and abnormal stress test finding.

Science.gov (United States)

Lo, Monica Y; Bonthala, Nirupama; Holper, Elizabeth M; Banks, Kamakki; Murphy, Sabina A; McGuire, Darren K; de Lemos, James A; Khera, Amit

2013-03-15

Women with angina pectoris and abnormal stress test findings commonly have no epicardial coronary artery disease (CAD) at catheterization. The aim of the present study was to develop a risk score to predict obstructive CAD in such patients. Data were analyzed from 337 consecutive women with angina pectoris and abnormal stress test findings who underwent cardiac catheterization at our center from 2003 to 2007. Forward selection multivariate logistic regression analysis was used to identify the independent predictors of CAD, defined by ≥50% diameter stenosis in ≥1 epicardial coronary artery. The independent predictors included age ≥55 years (odds ratio 2.3, 95% confidence interval 1.3 to 4.0), body mass index stress imaging (odds ratio 2.8, 95% confidence interval 1.5 to 5.5), and exercise capacity statistic of 0.745 (95% confidence interval 0.70 to 0.79), and an optimized cutpoint of a score of ≤2 included 62% of the subjects and had a negative predictive value of 80%. In conclusion, a simple clinical risk score of 7 characteristics can help differentiate those more or less likely to have CAD among women with angina pectoris and abnormal stress test findings. This tool, if validated, could help to guide testing strategies in women with angina pectoris. Copyright © 2013 Elsevier Inc. All rights reserved.
Examining Testlet Effects in the TestDaF Listening Section: A Testlet Response Theory Modeling Approach

Science.gov (United States)

Eckes, Thomas

2014-01-01

Testlets are subsets of test items that are based on the same stimulus and are administered together. Tests that contain testlets are in widespread use in language testing, but they also share a fundamental problem: Items within a testlet are locally dependent with possibly adverse consequences for test score interpretation and use. Building on…
Your move: The effect of chess on mathematics test scores

DEFF Research Database (Denmark)

Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla Trille

2017-01-01

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1–3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
Your move: The effect of chess on mathematics test scores.

Directory of Open Access Journals (Sweden)

Michael Rosholm

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
Visual-Constructional Ability in Individuals with Severe Obesity: Rey Complex Figure Test Accuracy and the Q-Score

Directory of Open Access Journals (Sweden)

Hanna L. Sargénius

2017-09-01

The aims of this study were to investigate visual-construction and organizational strategy among individuals with severe obesity, as measured by the Rey Complex Figure Test (RCFT), and to examine the validity of the Q-score as a measure for the quality of performance on the RCFT. Ninety-six non-demented morbidly obese (MO) patients and 100 healthy controls (HC) completed the RCFT. Their performance was calculated by applying the standard scoring criteria. The quality of the copying process was evaluated per the directions of the Q-score scoring system. Results revealed that the MO did not perform significantly lower than the HC on Copy accuracy (mean difference −0.302, CI −1.374 to 0.769, p = 0.579). In contrast, the groups did statistically differ from each other, with MO performing poorer than the HC on the Q-score (mean −1.784, CI −3.237 to −0.331, p = 0.016) and the Unit points (mean −1.409, CI −2.291 to −0.528, p = 0.002), but not on the Order points score (mean −0.351, CI −0.994 to 0.293, p = 0.284). Differences on the Unit score and the Q-score were slightly reduced when adjusting for gender, age, and education. This study presents evidence supporting the presence of inefficiency in visuospatial constructional ability among MO patients. We believe we have found an indication that the Q-score captures a wider range of cognitive processes that are not described by traditional scoring methods. Rather than considering accuracy and placement of the different elements only, the Q-score focuses more on how the subject has approached the task.
Role of human neurobehavioural tests in regulatory activity on chemicals

Science.gov (United States)

Stephens, R.; Barker, P.

1998-01-01

Psychological performance tests have been used since the mid-1960s in occupational and environmental health toxicology. The interpretation of significantly different test scores in neurobehavioural studies is not straightforward in the regulation of chemicals. This paper sets out some issues which emerged from discussions at an international workshop, organised by the United Kingdom Health and Safety Executive (HSE), to discuss differences in interpretation of human neurobehavioural test data in regulatory risk assessments. The difficulties encountered by regulators confronted with neurobehavioural studies seem to be twofold; some studies lack scientific rigor; other studies, although scientifically sound, are problematic because it is not clear what interpretation to place on the results. Issues relating to each of these points are discussed. Next, scenarios within which to consider the outcomes of neurobehavioural studies are presented. Finally, conclusions and recommendations for further work are put forward. PMID:9624273
Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

Science.gov (United States)

Haberman, Shelby J.

2011-01-01

Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…
Detection of acute deterioration in health status visit among COPD patients by monitoring COPD assessment test score

Directory of Open Access Journals (Sweden)

Pothirat C

2015-02-01

Chaicharn Pothirat, Warawut Chaiwong, Atikun Limsukon, Athavudh Deesomchok, Chalerm Liwsrisakun, Chaiwat Bumroongkit, Theerakorn Theerakittikul, Nittaya Phetsuk. Division of Pulmonary, Critical Care and Allergy, Department of Internal Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand. Background: The Chronic Obstructive Pulmonary Disease Assessment Test (CAT) could play a role in detecting acute deterioration in health status during monitoring visits in routine clinical practice. Objective: To evaluate the discriminative property of a change in CAT score from a stable baseline visit for detecting acute deterioration in health status visits of chronic obstructive pulmonary disease (COPD) patients. Methods: The CAT questionnaire was administered to stable COPD patients routinely attending the chest clinic of Chiang Mai University Hospital who were monitored using the CAT score every 1–3 months for 15 months. Acute deterioration in health status was defined as worsening or exacerbation. CAT scores at baseline, and subsequent visits with acute deterioration in health status were analyzed using the t-test. Receiver operating characteristic curve analysis was performed to evaluate the discriminative property of change in CAT score for detecting acute deterioration during a health status visit. Results: A total of 354 follow-up visits were made by 140 patients, aged 71.1±8.4 years, with a forced expiratory volume in 1 second of 47.49%±18.2% predicted, who were monitored for 15 months. The mean CAT score changes between stable baseline visits, by patients’ and physicians’ global assessments, were 0.05 (95% confidence interval [CI], -0.37–0.46) and 0.18 (95% CI, -0.23–0.60), respectively. At worsening visits, as assessed by patients, there was a significant increase in CAT score (6.07; 95% CI, 4.95–7.19). There were also significant increases in CAT scores at visits with mild and moderate exacerbation (5.51 [95% CI, 4.39–6
Reassessing the "traditional background hypothesis" for elevated MMPI and MMPI-2 Lie-scale scores.

Science.gov (United States)

Rosen, Gerald M; Baldwin, Scott A; Smith, Ronald E

2016-10-01

The Lie (L) scale of the Minnesota Multiphasic Personality Inventory (MMPI) is widely regarded as a measure of conscious attempts to deny common human foibles and to present oneself in an unrealistically positive light. At the same time, the current MMPI-2 manual states that "traditional" and religious backgrounds can account for elevated L scale scores as high as 65T-79T, thereby tempering impression management interpretations for faith-based individuals. To assess the validity of the traditional background hypothesis, we reviewed 11 published studies that employed the original MMPI with religious samples and found that only 1 obtained an elevated mean L score. We then conducted a meta-analysis of 12 published MMPI-2 studies in which we compared L scores of religious samples to the test normative group. The meta-analysis revealed large between-study heterogeneity (I² = 87.1), L scale scores for religious samples that were somewhat higher but did not approach the upper limits specified in the MMPI-2 manual, and an overall moderate effect size (d̄ = 0.54, p < .001; 95% confidence interval [0.37, 0.70]). Our analyses indicated that religious-group membership accounts, on average, for elevations on L of about 5 T-score points. Whether these scores reflect conscious "fake good" impression management or religious-based virtuousness remains unanswered. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
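The step from the pooled effect size to "about 5" score points can be checked with simple arithmetic. The sketch below assumes that the standardized mean difference is expressed in units of the normative standard deviation and relies on the standard convention that T-scores have a standard deviation of 10; the abstract does not state how the authors computed the figure, and the function name d_to_t_points is hypothetical.

```python
T_SCORE_SD = 10.0  # T-scores are scaled to mean 50 and standard deviation 10


def d_to_t_points(d: float) -> float:
    """Convert a standardized mean difference to T-score points.

    Assumes d was standardized by the normative standard deviation.
    """
    return d * T_SCORE_SD


if __name__ == "__main__":
    # Values reported in the abstract above: pooled d and its 95% confidence limits.
    for label, d in (("point estimate", 0.54), ("lower limit", 0.37), ("upper limit", 0.70)):
        print(f"{label}: d = {d:.2f} corresponds to {d_to_t_points(d):.1f} T-score points")
```

On this reading, the pooled estimate corresponds to roughly 5 points, with the confidence limits spanning roughly 4 to 7 points, well short of the 65T-79T range mentioned in the manual.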
Interpretation of shared culture of Baba and Nyonya for tourism linkage of four countries in the ASEAN community

Directory of Open Access Journals (Sweden)

Umaporn Muneenam

2017-09-01

The results showed that there were personal and non-personal interpretations providing differences in the Baba and Nyonya tourism areas. The t-test between the treatment and control groups showed that before the treatment group had read the 10 postcards, their knowledge was minimal; however, after they had read the 10 postcards for self-guided interpretation, their knowledge was significantly different at the .05 level. Moreover, the treatment group recorded “satisfied” gradings for the 10 postcards overall, with a score of 4.49 out of 5 on a Likert scale; the highest satisfaction was with the quality of printing (4.80), and the lowest was with increased concern for and awareness of Southeast Asian culture (4.07).
Interpretation of time-domain electromagnetic soundings in the Calico Hills area, Nevada Test Site, Nye County, Nevada

International Nuclear Information System (INIS)

Kauahikaua, J.

1981-01-01

A controlled source, time-domain electromagnetic (TDEM) sounding survey was conducted in the Calico Hills area of the Nevada Test Site (NTS). The goal of this survey was the determination of the geoelectric structure as an aid in the evaluation of the site for possible future storage of spent nuclear fuel or high-level nuclear waste. The data were initially interpreted with a simple scheme that produces an apparent resistivity versus depth curve from the vertical magnetic field data. These curves can be qualitatively interpreted much like standard Schlumberger resistivity sounding curves. Final interpretation made use of a layered-earth Marquardt inversion computer program (Kauahikaua, 1980). The results combined with those from a set of Schlumberger soundings in the area show that there is a moderately resistive basement at a depth no greater than 800 meters. The basement resistivity is greater than 100 ohm-meters

Interobserver agreement in interpretation of radiographic pulmonary changes in dogs in relation to radiology training

Directory of Open Access Journals (Sweden)

Tilde Rodrigues Froes

2014-09-01

Interpretation of pulmonary radiographs is one of the most difficult aspects of radiology and interobserver variability is high. The aim of this study was to assess variations in interpretation of pulmonary pathology amongst Brazilian veterinarians with different levels of training and experience, using the interpretation by American board-certified radiologists as a reference. We identified areas where interpretation is particularly challenging. Sixty digital canine thoracic radiographic examinations were interpreted by four groups of three Brazilian observers, each group being defined by different levels of training and experience. The radiographic findings of the four groups of observers in the study were compared to a reference interpretation established from the findings of three ACVR board-certified radiologists. The degree of discrepancy for each list between each group and the reference interpretation was assessed according to a three-level scoring system: no discrepancy, minor discrepancy, or major discrepancy. Data were analyzed using Kappa and Cochran-Mantel-Haenszel tests. Brazilian veterinarians with the most training and experience showed the least interobserver variation and best performance when compared to the reference interpretation, followed by those with practical training, but with little work experience in professional practice. The radiographic patterns that were associated with the highest interobserver variability were the vascular, unstructured interstitial and bronchial patterns. Major interobserver discrepancies occurred in all groups, but were more evident in the group with the least training (44.4%) and the general practitioners group (26.7%). It can be concluded that training positively influences the accuracy of radiographic interpretation and is recommended to reduce erroneous diagnoses.
Proposal for agar disk diffusion interpretive criteria for susceptibility testing of bovine mastitis pathogens using cefoperazone 30μg disks.

Science.gov (United States)

Feßler, Andrea T; Kaspar, Heike; Lindeman, Cynthia J; Peters, Thomas; Watts, Jeffrey L; Schwarz, Stefan

2017-02-01

Cefoperazone is a third generation cephalosporin which is commonly used for bovine mastitis therapy. Bacterial pathogens involved in bovine mastitis are frequently tested for their susceptibility to cefoperazone. So far, the cefoperazone susceptibility testing using 30μg disks has been hampered by the lack of quality control (QC) ranges as well as the lack of interpretive criteria. In 2014, QC ranges for 30 μg cefoperazone disks have been established for Staphylococcus aureus ATCC ® 25923 and Escherichia coli ATCC ® 25922. As a next step, interpretive criteria for the susceptibility testing of bovine mastitis pathogens should be developed. For this, 637 bovine mastitis pathogens (including 112 S. aureus, 121 coagulase-negative staphylococci (CoNS), 103 E. coli, 101 Streptococcus agalactiae, 100 Streptococcus dysgalactiae and 100 Streptococcus uberis) were investigated by agar disk diffusion according to the document Vet01-A4 of the Clinical and Laboratory Standards Institute (CLSI) using 30μg cefoperazone disks and the results were compared to the corresponding MIC values as determined by broth microdilution also according to the aforementioned CLSI document. Based on the results obtained and taking into account the achievable milk concentration of cefoperazone after regular dosing, the following interpretive criteria were proposed as a guidance for mastitis diagnostic laboratories: for staphylococci and E. coli ≥23mm (susceptible), 18-22mm (intermediate) and ≤17mm (resistant) and for streptococci ≥18mm (susceptible), and ≤17mm (non-susceptible). These proposed interpretive criteria shall contribute to a harmonization of cefoperazone susceptibility testing of bovine mastitis pathogens. Copyright © 2016 Elsevier B.V. All rights reserved.
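The proposed breakpoints in the abstract above amount to a small lookup rule, which may be easier to follow as code. The following is only a minimal sketch: the function name `interpret_zone` and the organism-group labels are hypothetical, and only the millimetre thresholds come from the abstract.

```python
def interpret_zone(diameter_mm: int, organism_group: str) -> str:
    """Classify a cefoperazone 30 ug disk zone diameter (whole mm).

    Thresholds are the criteria proposed in the abstract; the group labels
    ("staphylococci", "e_coli", "streptococci") are illustrative assumptions.
    """
    if organism_group in ("staphylococci", "e_coli"):
        if diameter_mm >= 23:
            return "susceptible"
        if diameter_mm >= 18:  # 18-22 mm
            return "intermediate"
        return "resistant"  # <= 17 mm
    if organism_group == "streptococci":
        # Only two categories are proposed for streptococci.
        return "susceptible" if diameter_mm >= 18 else "non-susceptible"
    raise ValueError(f"no proposed criteria for group: {organism_group!r}")


if __name__ == "__main__":
    for group, zone in [("staphylococci", 24), ("e_coli", 20), ("streptococci", 17)]:
        print(group, zone, interpret_zone(zone, group))
```

Raising an error for any other organism group, rather than guessing, reflects that the abstract proposes criteria only for the groups listed.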
A simple scoring system for breast MRI interpretation: does it compensate for reader experience?

International Nuclear Information System (INIS)

Marino, Maria Adele; Clauser, Paola; Woitek, Ramona; Wengert, Georg J.; Kapetas, Panagiotis; Bernathova, Maria; Pinker-Domenig, Katja; Helbich, Thomas H.; Baltzer, Pascal A.T.; Preidler, Klaus

2016-01-01

To investigate the impact of a scoring system (Tree) on inter-reader agreement and diagnostic performance in breast MRI reading. This IRB-approved, single-centre study included 100 patients with 121 consecutive histopathologically verified lesions (52 malignant, 68 benign). Four breast radiologists with different levels of MRI experience and blinded to histopathology retrospectively evaluated all examinations. Readers independently applied two methods to classify breast lesions: BI-RADS and Tree. BI-RADS provides a reporting lexicon that is empirically translated into likelihoods of malignancy; Tree is a scoring system that results in a diagnostic category. Readings were compared by ROC analysis and kappa statistics. Inter-reader agreement was substantial to almost perfect (kappa: 0.643-0.896) for Tree and moderate (kappa: 0.455-0.657) for BI-RADS. Diagnostic performance using Tree (AUC: 0.889-0.943) was similar to BI-RADS (AUC: 0.872-0.953). Less experienced radiologists achieved AUC improvements of up to 4.7 % using Tree (P-values: 0.042-0.698); an expert's performance did not change (P = 0.526). The least experienced reader improved in specificity using Tree (16 %, P = 0.001). No further sensitivity and specificity differences were found (P > 0.1). The Tree scoring system improves inter-reader agreement and achieves a diagnostic performance similar to that of BI-RADS. Less experienced radiologists, in particular, benefit from Tree. (orig.)
The NeBoP score - a clinical prediction test for evaluation of children with Lyme Neuroborreliosis in Europe.

Science.gov (United States)

Skogman, Barbro H; Sjöwall, Johanna; Lindgren, Per-Eric

2015-12-17

The diagnosis of Lyme neuroborreliosis (LNB) in Europe is based on clinical symptoms and laboratory data, such as pleocytosis and anti-Borrelia antibodies in serum and CSF according to guidelines. However, the decision to start antibiotic treatment on admission cannot be based on Borrelia serology since results are not available at the time of lumbar puncture. Therefore, an early prediction test would be useful in clinical practice. The aim of the study was to develop and evaluate a clinical prediction test for children with LNB in a relevant European setting. Clinical and laboratory data were collected retrospectively from a cohort of children being evaluated for LNB in Southeast Sweden. A clinical neuroborreliosis prediction test, the NeBoP score, was designed to differentiate between a high and a low risk of having LNB. The NeBoP score was then prospectively validated in a cohort of children being evaluated for LNB in Central and Southeast Sweden (n = 190) and controls with other specific diagnoses (n = 49). The sensitivity of the NeBoP score was 90 % (CI 95 %; 82-99 %) and the specificity was 90 % (CI 95 %; 85-96 %). Thus, the diagnostic accuracy (i.e. how the test correctly discriminates patients from controls) was 90 % and the area under the curve in a ROC analysis was 0.95. The positive predictive value (PPV) was 0.83 (CI 95 %; 0.75-0.93) and the negative predictive value (NPV) was 0.95 (CI 95 %; 0.90-0.99). The overall diagnostic performance of the NeBoP score is high (90 %) and the test is suggested to be useful for decision-making about early antibiotic treatment in children being evaluated for LNB in European Lyme endemic areas.
Test Score Gaps between Private and Government Sector Students at School Entry Age in India

Science.gov (United States)

Singh, Abhijeet

2014-01-01

Various studies have noted that students enrolled in private schools in India perform better on average than students in government schools. In this paper, I show that large gaps in the test scores of children in private and public sector education are evident even at the point of initial enrollment in formal schooling and are associated with…
Preoperative differentiation between T1a and ≥T1b gallbladder cancer: combined interpretation of high-resolution ultrasound and multidetector-row computed tomography

International Nuclear Information System (INIS)

Joo, Ijin; Baek, Jee Hyun; Kim, Jung Hoon; Han, Joon Koo; Choi, Byung Ihn; Lee, Jae Young; Park, Hee Sun

2014-01-01

To determine the diagnostic value of combined interpretation of high-resolution ultrasound (HRUS) and multidetector-row computed tomography (MDCT) for preoperative differentiation between T1a and ≥T1b gallbladder (GB) cancer. Eighty-seven patients with pathologically confirmed GB cancers (T1a, n = 15; ≥T1b, n = 72), who preoperatively underwent both HRUS and MDCT, were included in this retrospective study. Two reviewers independently determined the T-stages of the GB cancers on HRUS and MDCT using a five-point confidence scale (5, definitely T1a; 1, definitely ≥T1b). For individual modality interpretation, the lesions with scores ≥4 were classified as T1a, and, for combined modality interpretation, the lesions with all scores ≥4 in both modalities were classified as T1a. The McNemar test was used to compare diagnostic performance. The diagnostic accuracy of differentiation between T1a and ≥T1b GB cancer was higher using combined interpretation (90.8 % and 88.5 % for reviewers 1 and 2, respectively) than individual interpretation of HRUS (82.8 % and 83.9 %) or MDCT (74.7 % and 82.8 %) (P < 0.05, reviewer 1). Combined interpretations demonstrated 100 % specificity for both reviewers, which was significantly higher than individual interpretations (P < 0.05, both reviewers). Combined HRUS and MDCT interpretation may improve the diagnostic accuracy and specificity for differentiating between T1a and ≥T1b GB cancers. Differentiating between T1a and ≥T1b gallbladder cancer can help surgical planning. (orig.)
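The classification rule described in this abstract (a score of 4 or 5 means T1a for a single modality; the combined reading calls a lesion T1a only when both modalities score at least 4) can be sketched as follows. The function names are hypothetical and the sketch covers only the decision rule, not the study's statistical comparison.

```python
def classify_single(score: int) -> str:
    """Single-modality rule: confidence score >= 4 on the 1-5 scale means T1a."""
    if not 1 <= score <= 5:
        raise ValueError("confidence score must be between 1 and 5")
    return "T1a" if score >= 4 else ">=T1b"


def classify_combined(hrus_score: int, mdct_score: int) -> str:
    """Combined rule: T1a only if both HRUS and MDCT scores are >= 4."""
    both_t1a = (
        classify_single(hrus_score) == "T1a"
        and classify_single(mdct_score) == "T1a"
    )
    return "T1a" if both_t1a else ">=T1b"


if __name__ == "__main__":
    # One modality confident of T1a, the other not: combined reading is >=T1b.
    print(classify_combined(5, 3))
```

Because the combined rule requires agreement, it can only reduce the number of lesions called T1a relative to either modality alone.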
The Effects of Video Game Experience on Computer-Based Air Traffic Controller Specialist, Air Traffic Scenario Test Scores.

Science.gov (United States)

1997-02-01

application with a strong resemblance to a video game, concern has been raised that prior video game experience might have a moderating effect on scores. Much...such as spatial ability. The effects of computer or video game experience on work sample scores have not been systematically investigated. The purpose...of this study was to evaluate the incremental validity of prior video game experience over that of general aptitude as a predictor of work sample test
Interpretations of Tracer Tests Performed in the Culebra Dolomite at the Waste Isolation Pilot Plant Site

Energy Technology Data Exchange (ETDEWEB)

MEIGS,LUCY C.; BEAUHEIM,RICHARD L.; JONES,TOYA L.

2000-08-01

This report provides (1) an overview of all tracer testing conducted in the Culebra Dolomite Member of the Rustler Formation at the Waste Isolation Pilot Plant (WIPP) site, (2) a detailed description of the important information about the 1995-96 tracer tests and the current interpretations of the data, and (3) a summary of the knowledge gained to date through tracer testing in the Culebra. Tracer tests have been used to identify transport processes occurring within the Culebra and quantify relevant parameters for use in performance assessment of the WIPP. The data, especially those from the tests performed in 1995-96, provide valuable insight into transport processes within the Culebra. Interpretations of the tracer tests in combination with geologic information, hydraulic-test information, and laboratory studies have resulted in a greatly improved conceptual model of transport processes within the Culebra. At locations where the transmissivity of the Culebra is low (< 4 × 10⁻⁶ m²/s), we conceptualize the Culebra as a single-porosity medium in which advection occurs largely through the primary porosity of the dolomite matrix. At locations where the transmissivity of the Culebra is high (> 4 × 10⁻⁶ m²/s), we conceptualize the Culebra as a heterogeneous, layered, fractured medium in which advection occurs largely through fractures and solutes diffuse between fractures and matrix at multiple rates. The variations in diffusion rate can be attributed to both variations in fracture spacing (or the spacing of advective pathways) and matrix heterogeneity. Flow and transport appear to be concentrated in the lower Culebra. At all locations, diffusion is the dominant transport process in the portions of the matrix that tracer does not access by flow.
NCACO-score: An effective main-chain dependent scoring function for structure modeling

Directory of Open Access Journals (Sweden)

Dong Xiaoxi

2011-05-01

Background: Development of effective scoring functions is a critical component to the success of protein structure modeling. Previously, many efforts have been dedicated to the development of scoring functions. Despite these efforts, development of an effective scoring function that can achieve both good accuracy and fast speed still presents a grand challenge. Results: Based on a coarse-grained representation of a protein structure by using only four main-chain atoms: N, Cα, C and O, we develop a knowledge-based scoring function, called NCACO-score, that integrates different structural information to rapidly model protein structure from sequence. In testing on the Decoys'R'Us sets, we found that NCACO-score can effectively recognize native conformers from their decoys. Furthermore, we demonstrate that NCACO-score can effectively guide fragment assembly for protein structure prediction, which has achieved a good performance in building the structure models for hard targets from CASP8 in terms of both accuracy and speed. Conclusions: Although NCACO-score is developed based on a coarse-grained model, it is able to discriminate native conformers from decoy conformers with high accuracy. NCACO is a very effective scoring function for structure modeling.
Derivation and Cross-Validation of Cutoff Scores for Patients With Schizophrenia Spectrum Disorders on WAIS-IV Digit Span-Based Performance Validity Measures.

Science.gov (United States)

Glassmire, David M; Toofanian Ross, Parnian; Kinney, Dominique I; Nitch, Stephen R

2016-06-01

Two studies were conducted to identify and cross-validate cutoff scores on the Wechsler Adult Intelligence Scale-Fourth Edition Digit Span-based embedded performance validity (PV) measures for individuals with schizophrenia spectrum disorders. In Study 1, normative scores were identified on Digit Span-embedded PV measures among a sample of patients (n = 84) with schizophrenia spectrum diagnoses who had no known incentive to perform poorly and who put forth valid effort on external PV tests. Previously identified cutoff scores resulted in unacceptable false positive rates and lower cutoff scores were adopted to maintain specificity levels ≥90%. In Study 2, the revised cutoff scores were cross-validated within a sample of schizophrenia spectrum patients (n = 96) committed as incompetent to stand trial. Performance on Digit Span PV measures was significantly related to Full Scale IQ in both studies, indicating the need to consider the intellectual functioning of examinees with psychotic spectrum disorders when interpreting scores on Digit Span PV measures. © The Author(s) 2015.
The interpretation of diagnostic tests

International Nuclear Information System (INIS)

Lamk, M.; Lamki, M.D.

1987-01-01

The progress of nuclear and other diagnostic imaging is near rampant. With almost every issue of the major journals in this field, a new diagnostic test, or at least a new utility of an old test, is described. Before we accept these innovations, we have to have a clear understanding of the clinical performance of the test. The major criteria are the sensitivity and the specificity of the test. From these are derived other statistical parameters such as the accuracy or efficiency of that test; also, the receiver operating characteristic (ROC) curves may then be evaluated and used in comparison of different tests. When we know the prevalence of the disease tested in the population we are investigating, we can then derive the predictive value of a positive or a negative result. This introduction tries to explain these parameters to help the reader understand the literature dealing with the subject of efficacy of imaging procedures. It is not intended as a critical review of the literature on the subject or a comprehensive overview of the subject matter. The benefit derived from explanation of statistical concepts to physicians is documented in a recent publication. Explanation of these basic statistical parameters will be followed by a demonstration of the utility of multiple testing with these parameters. The reader is thereby introduced to relevant statistical concepts that must be grasped for full comprehension of published results of a new diagnostic imaging modality, or before clinical decision making.
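As a worked illustration of the step this abstract describes (deriving predictive values once sensitivity, specificity, and prevalence are known), the sketch below applies Bayes' rule. It is not taken from the cited article; the function names and the example values are assumptions chosen for illustration.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """P(disease | positive test) from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """P(no disease | negative test) from Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # Hypothetical test with 90 % sensitivity and 90 % specificity,
    # evaluated at two illustrative prevalences.
    for prev in (0.10, 0.50):
        ppv = positive_predictive_value(0.90, 0.90, prev)
        npv = negative_predictive_value(0.90, 0.90, prev)
        print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these illustrative inputs, the same test gives a positive predictive value of about 0.5 at 10 % prevalence and about 0.9 at 50 % prevalence, which is why prevalence has to be known before a positive result can be interpreted.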
Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS

Science.gov (United States)

Waldfogel, Jane; Zhai, Fuhua

2011-01-01

This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries (Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S.) using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and that preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children. PMID:21442008
Are WISC IQ scores in children with mathematical learning disabilities underestimated? The influence of a specialized intervention on test performance.

Science.gov (United States)

Lambert, Katharina; Spinath, Birgit

2018-01-01

Intelligence measures play a pivotal role in the diagnosis of mathematical learning disabilities (MLD). Probably as a result of math-related material in IQ tests, children with MLD often display reduced IQ scores. However, it remains unclear whether the effects of math remediation extend to IQ scores. The present study investigated the impact of a special remediation program compared to a control group receiving private tutoring (PT) on the WISC IQ scores of children with MLD. We included N=45 MLD children (7-12 years) in a study with a pre- and post-test control group design. Children received remediation for two years on average. The analyses revealed significantly greater improvements in the experimental group on the Full-Scale IQ, and the Verbal Comprehension, Perceptual Reasoning, and Working Memory indices, but not Processing Speed, compared to the PT group. Children in the experimental group showed an average WISC IQ gain of more than ten points. Results indicate that the WISC IQ scores of MLD children might be underestimated and that an effective math intervention can improve WISC IQ test performance. Taking limitations into account, we discuss the use of IQ measures more generally for defining MLD in research and practice. Copyright © 2017 Elsevier Ltd. All rights reserved.
Interpretation of the gamma interferon test for diagnosis of subclinical paratuberculosis in cattle

DEFF Research Database (Denmark)

Jungersen, Gregers; Huda, A.; Hansen, J.J.

2002-01-01

A group of 252 cattle without clinical signs of paratuberculosis (paraTB) in 10 herds infected with paraTB and a group of 117 cattle in 5 herds without paraTB were selected. Whole-blood samples were stimulated with bovine, avian, and johnin purified protein derivative (PPD) and examined for gamma...... interferon (IFN-gamma) release. For diagnosis of paraTB, satisfactory estimated specificities (95 to 99%) could be obtained by johnin PPD stimulation irrespective of interpretation relative to bovine PPD or no-antigen stimulation alone, but numbers of test positives in the infected herds varied from 64...
Causal Indicators Can Help to Interpret Factors

Science.gov (United States)

Bentler, Peter M.

2016-01-01

The latent factor in a causal indicator model is no more than the latent factor of the factor part of the model. However, if the causal indicator variables are well-understood and help to improve the prediction of individuals' factor scores, they can help to interpret the meaning of the latent factor. Aguirre-Urreta, Rönkkö, and Marakas (2016)…
The Impact of Scholastic Instrumental Music and Scholastic Chess Study on the Standardized Test Scores of Students in Grades Three, Four, and Five

Science.gov (United States)

Martinez, Edwin E.

2012-01-01

This study examines the impact of instrumental music study and group chess lessons on the standardized test scores of suburban elementary public school students (grades three through five) in Levittown, New York. The study divides the students into the following groups and compares the standardized test scores of each: a) instrumental music…
Equating error in observed-score equating

NARCIS (Netherlands)

van der Linden, Willem J.

2006-01-01

Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and population of test takers. But it is argued that if the goal of equating is to adjust the scores of
Use of Automated Scoring in Spoken Language Assessments for Test Takers with Speech Impairments. Research Report. ETS RR-17-42

Science.gov (United States)

Loukina, Anastassia; Buzick, Heather

2017-01-01

This study is an evaluation of the performance of automated speech scoring for speakers with documented or suspected speech impairments. Given that the use of automated scoring of open-ended spoken responses is relatively nascent and there is little research to date that includes test takers with disabilities, this small exploratory study focuses…
A simple approximation of productivity scores of fuzzy production plans

DEFF Research Database (Denmark)

Hougaard, Jens Leth

2005-01-01

This paper suggests a simple approximation procedure for the assessment of productivity scores with respect to fuzzy production plans. The procedure has a clear economic interpretation and all the necessary calculations can be performed in a spreadsheet making it highly operational...
Characterization of Rock Mechanical Properties Using Lab Tests and Numerical Interpretation Model of Well Logs

Directory of Open Access Journals (Sweden)

Hao Xu

2016-01-01

The tight gas reservoir in the fifth member of the Xujiahe formation contains heterogeneous interlayers of sandstone and shale that are low in both porosity and permeability. Elastic characteristics of sandstone and shale are analyzed in this study based on petrophysics tests. The tests indicate that sandstone and mudstone samples have different stress-strain relationships. The rock tends to exhibit elastic-plastic deformation. The compressive strength correlates with confinement pressure and elastic modulus. The results based on thin-bed log interpretation match dynamic Young’s modulus and Poisson’s ratio predicted by theory. The compressive strength is calculated from density, elastic impedance, and clay contents. The tensile strength is calibrated using compressive strength. Shear strength is calculated with an empirical formula. Finally, log interpretation of rock mechanical properties is performed on the fifth member of the Xujiahe formation. Natural fractures in downhole cores and rock microscopic failure in the samples in the cross section demonstrate that tensile fractures were primarily observed in sandstone, and shear fractures can be observed in both mudstone and sandstone. Based on different elasticity and plasticity of different rocks, as well as the characteristics of natural fractures, a fracture propagation model was built.

REPRODUCIBILITY OF THE MODIFIED STAR EXCURSION BALANCE TEST COMPOSITE AND SPECIFIC REACH DIRECTION SCORES.

Science.gov (United States)

van Lieshout, Remko; Reijneveld, Elja A E; van den Berg, Sandra M; Haerkens, Gijs M; Koenders, Niek H; de Leeuw, Arina J; van Oorsouw, Roel G; Paap, Davy; Scheffer, Else; Weterings, Stijn; Stukstette, Mirelle J

2016-06-01

The mSEBT is a screening tool used to evaluate dynamic balance. Most research investigating measurement properties focused on intrarater reliability and was done in small samples. To know whether the mSEBT is useful to discriminate dynamic balance between persons and to evaluate changes in dynamic balance, more research into intra- and interrater reliability and smallest detectable change (synonymous with minimal detectable change) is needed. To estimate intra- and interrater reliability and smallest detectable change of the mSEBT in adults at risk for ankle sprain. Cross-sectional, test-retest design. Fifty-five healthy young adults participating in sports at risk for ankle sprain participated (mean ± SD age, 24.0 ± 2.9 years). Each participant performed three test sessions within one hour and was rated by two physical therapists (session 1, rater 1; session 2, rater 2; session 3, rater 1). Participants and raters were blinded for previous measurements. Normalized composite and reach direction scores for the right and left leg were collected. Analysis of variance was used to calculate intraclass correlation coefficient values for intra- and interrater reliability. Smallest detectable change values were calculated based on the standard error of measurement. Intra- and interrater reliability for both legs was good to excellent (intraclass correlation coefficient ranging from 0.87 to 0.94). The intrarater smallest detectable change for the composite score of the right leg was 7.2% and for the left 6.2%. The interrater smallest detectable change for the composite score of the right leg was 6.9% and for the left 5.0%. The mSEBT is a reliable measurement instrument to discriminate dynamic balance between persons. Most smallest detectable change values of the mSEBT appear to be large. More research is needed to investigate if the mSEBT is usable for evaluative purposes. Level 2.
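The abstract states that smallest detectable change (SDC) values were derived from the standard error of measurement (SEM) but does not give the formulas. The sketch below uses the conventional definitions, SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM, which are an assumption here rather than something the abstract confirms; the function names and input numbers are hypothetical and are not the study's data.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """Conventional SEM from a between-subject SD and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem: float) -> float:
    """Conventional 95 % SDC for a difference between two measurements."""
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical normalized composite scores: SD of 10 % and ICC of 0.91.
    sem = standard_error_of_measurement(sd=10.0, icc=0.91)
    print(f"SEM = {sem:.2f} %, SDC = {smallest_detectable_change(sem):.2f} %")
```

The √2 factor accounts for measurement error being present in both the first and the second measurement, which is how a test with high reliability can still have a comparatively large SDC.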
Polytrauma Defined by the New Berlin Definition: A Validation Test Based on Propensity-Score Matching Approach.

Science.gov (United States)

Rau, Cheng-Shyuan; Wu, Shao-Chun; Kuo, Pao-Jen; Chen, Yi-Chun; Chien, Peng-Chen; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua

2017-09-11

Background: Polytrauma patients are expected to have a higher risk of mortality than that obtained by the summation of expected mortality owing to their individual injuries. This study was designed to investigate the outcome of patients with polytrauma, which was defined using the new Berlin definition, as cases with an Abbreviated Injury Scale (AIS) ≥ 3 for two or more different body regions and one or more additional variables from five physiologic parameters (hypotension [systolic blood pressure ≤ 90 mmHg], unconsciousness [Glasgow Coma Scale score ≤ 8], acidosis [base excess ≤ -6.0], coagulopathy [partial thromboplastin time ≥ 40 s or international normalized ratio ≥ 1.4], and age [≥70 years]). Methods: We retrieved detailed data on 369 polytrauma patients and 1260 non-polytrauma patients with an overall Injury Severity Score (ISS) ≥ 18 who were hospitalized between 1 January 2009 and 31 December 2015 for the treatment of all traumatic injuries, from the Trauma Registry System at a level I trauma center. Patients with burn injury or incomplete registered data were excluded. Categorical data were compared with two-sided Fisher exact or Pearson chi-square tests. The unpaired Student t-test and the Mann-Whitney U-test were used to analyze normally distributed continuous data and non-normally distributed data, respectively. A propensity-score matched cohort in a 1:1 ratio was allocated using the NCSS software with logistic regression to evaluate the effect of polytrauma on patient outcomes. Results: The polytrauma patients had a significantly higher ISS than non-polytrauma patients (median (interquartile range Q1-Q3), 29 (22-36) vs. 24 (20-25), respectively; p Polytrauma patients had a 1.9-fold higher odds of mortality than non-polytrauma patients (95% CI 1.38-2.49; p polytrauma patients, polytrauma patients had a substantially longer hospital length of stay (LOS). In addition, a higher proportion of polytrauma patients were admitted to the intensive
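The Berlin definition as quoted in this abstract is a two-part rule, and a short sketch may make it easier to check. The `Patient` dataclass, its field names, and the example values are hypothetical; only the thresholds come from the abstract.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Patient:
    ais_by_region: Dict[str, int]  # body region -> highest AIS in that region
    systolic_bp_mmhg: float
    glasgow_coma_scale: int
    base_excess: float
    ptt_seconds: float
    inr: float
    age_years: float


def is_polytrauma(p: Patient) -> bool:
    """Berlin definition as stated in the abstract above."""
    # Part 1: AIS >= 3 in two or more different body regions.
    severe_regions = sum(1 for ais in p.ais_by_region.values() if ais >= 3)
    # Part 2: at least one of the five physiologic parameters.
    physiologic = [
        p.systolic_bp_mmhg <= 90,               # hypotension
        p.glasgow_coma_scale <= 8,              # unconsciousness
        p.base_excess <= -6.0,                  # acidosis
        p.ptt_seconds >= 40 or p.inr >= 1.4,    # coagulopathy
        p.age_years >= 70,                      # age
    ]
    return severe_regions >= 2 and any(physiologic)


if __name__ == "__main__":
    example = Patient({"head": 4, "chest": 3, "abdomen": 1},
                      systolic_bp_mmhg=85, glasgow_coma_scale=14,
                      base_excess=-2.0, ptt_seconds=30, inr=1.0, age_years=45)
    print(is_polytrauma(example))
```

Keeping the anatomical and physiologic parts as separate checks mirrors the wording of the definition and makes each threshold easy to audit.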
The quantitative LOD score: test statistic and sample size for exclusion and linkage of quantitative traits in human sibships.

Science.gov (United States)

Page, G P; Amos, C I; Boerwinkle, E

1998-04-01

We present a test statistic, the quantitative LOD (QLOD) score, for the testing of both linkage and exclusion of quantitative-trait loci in randomly selected human sibships. As with the traditional LOD score, the boundary values of 3, for linkage, and -2, for exclusion, can be used for the QLOD score. We investigated the sample sizes required for inferring exclusion and linkage, for various combinations of linked genetic variance, total heritability, recombination distance, and sibship size, using fixed-size sampling. The sample sizes required for both linkage and exclusion were not qualitatively different and depended on the percentage of variance being linked or excluded and on the total genetic variance. Information regarding linkage and exclusion in sibships larger than size 2 increased approximately as the number of all possible pairs, n(n-1)/2, up to sibships of size 6. Increasing the recombination distance (θ) between the marker and the trait loci empirically reduced the power for both linkage and exclusion, approximately as a function of (1-2θ)^4.
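The two approximate scaling relations mentioned in this abstract, the number of sib pairs n(n-1)/2 and a power factor of roughly (1 - 2θ)^4, can be tabulated quickly. This is only an arithmetic illustration of those two expressions, reading the trailing "4" as an exponent; it does not compute a QLOD score, and the function names are hypothetical.

```python
def sib_pairs(sibship_size: int) -> int:
    """Number of distinct sib pairs in a sibship of the given size: n(n-1)/2."""
    if sibship_size < 2:
        raise ValueError("a sibship needs at least two siblings")
    return sibship_size * (sibship_size - 1) // 2


def recombination_attenuation(theta: float) -> float:
    """Approximate attenuation factor (1 - 2*theta)**4 for 0 <= theta <= 0.5."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    return (1.0 - 2.0 * theta) ** 4


if __name__ == "__main__":
    for n in range(2, 7):
        print(f"sibship size {n}: {sib_pairs(n)} pairs")
    for theta in (0.0, 0.05, 0.10, 0.20):
        print(f"theta={theta:.2f}: factor={recombination_attenuation(theta):.4f}")
```

The fourth-power dependence means that even a modest recombination fraction removes a large share of the available information; at θ = 0.10 the factor is already about 0.41.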
Predicting Freshman Grade Point Average From College Admissions Test Scores and State High School Test Scores

Directory of Open Access Journals (Sweden)

Daniel Koretz

2016-09-01

The current focus on assessing “college and career readiness” raises an empirical question: How do high school tests compare with college admissions tests in predicting performance in college? We explored this using data from the City University of New York and public colleges in Kentucky. These two systems differ in the choice of college admissions test, the stakes for students on the high school test, and demographics. We predicted freshman grade point average (FGPA) from high school GPA and both college admissions and high school tests in mathematics and English. In both systems, the choice of tests had only trivial effects on the aggregate prediction of FGPA. Adding either test to an equation that included the other had only trivial effects on prediction. Although the findings suggest that the choice of test might advantage or disadvantage different students, it had no substantial effect on the over- and underprediction of FGPA for students classified by race-ethnicity or poverty.
Evaluation of airway protection: Quantitative timing measures versus penetration/aspiration score.

Science.gov (United States)

Kendall, Katherine A

2017-10-01

Quantitative measures of swallowing function may improve the reliability and accuracy of modified barium swallow (MBS) study interpretation. Quantitative study analysis has not been widely instituted, however, secondary to concerns about the time required to make measures and a lack of research demonstrating impact on MBS interpretation. This study compares the accuracy of the penetration/aspiration (PEN/ASP) scale (an observational visual-perceptual assessment tool) to quantitative measures of airway closure timing relative to the arrival of the bolus at the upper esophageal sphincter in identifying a failure of airway protection during deglutition. Retrospective review of clinical swallowing data from a university-based outpatient clinic. Swallowing data from 426 patients were reviewed. Patients with normal PEN/ASP scores were identified, and the results of quantitative airway closure timing measures for three liquid bolus sizes were evaluated. The incidence of significant airway closure delay with and without a normal PEN/ASP score was determined. Inter-rater reliability for the quantitative measures was calculated. In patients with a normal PEN/ASP score, 33% demonstrated a delay in airway closure on at least one swallow during the MBS study. There was no correlation between PEN/ASP score and airway closure delay. Inter-rater reliability for the quantitative measure of airway closure timing was nearly perfect (intraclass correlation coefficient = 0.973). The use of quantitative measures of swallowing function, in conjunction with traditional visual perceptual methods of MBS study interpretation, improves the identification of airway closure delay, and hence, potential aspiration risk, even when no penetration or aspiration is apparent on the MBS study. 4. Laryngoscope, 127:2314-2318, 2017. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
The Score Reliability of Draw-a-Person Intellectual Ability Test (DAP: IQ) for Rural Malawi Students

Science.gov (United States)

Khasu, Denis S.; Williams, Thomas O., Jr.

2016-01-01

In this brief article, the reliability of scores for the Draw-A-Person Intellectual Ability Test for Children, Adolescents, and Adults (DAP: IQ; Reynolds & Hickman, 2004) was examined through several analyses with a sample of 147 children from rural Malawi, Africa using a Chichewa translation of instructions. Cronbach alpha coefficients for…
Automated Cervical Screening and Triage, Based on HPV Testing and Computer-Interpreted Cytology.

Science.gov (United States)

Yu, Kai; Hyun, Noorie; Fetterman, Barbara; Lorey, Thomas; Raine-Bennett, Tina R; Zhang, Han; Stamps, Robin E; Poitras, Nancy E; Wheeler, William; Befano, Brian; Gage, Julia C; Castle, Philip E; Wentzensen, Nicolas; Schiffman, Mark

2018-04-11

State-of-the-art cervical cancer prevention includes human papillomavirus (HPV) vaccination among adolescents and screening/treatment of cervical precancer (CIN3/AIS and, less strictly, CIN2) among adults. HPV testing provides sensitive detection of precancer but, to reduce overtreatment, secondary "triage" is needed to predict women at highest risk. Those with the highest-risk HPV types or abnormal cytology are commonly referred to colposcopy; however, expert cytology services are critically lacking in many regions. To permit completely automatable cervical screening/triage, we designed and validated a novel triage method, a cytologic risk score algorithm based on computer-scanned liquid-based slide features (FocalPoint, BD, Burlington, NC). We compared it with abnormal cytology in predicting precancer among 1839 women testing HPV positive (HC2, Qiagen, Germantown, MD) in 2010 at Kaiser Permanente Northern California (KPNC). Precancer outcomes were ascertained by record linkage. As additional validation, we compared the algorithm prospectively with cytology results among 243 807 women screened at KPNC (2016-2017). All statistical tests were two-sided. Among HPV-positive women, the algorithm matched the triage performance of abnormal cytology. Combined with HPV16/18/45 typing (Onclarity, BD, Sparks, MD), the automatable strategy referred 91.7% of HPV-positive CIN3/AIS cases to immediate colposcopy while deferring 38.4% of all HPV-positive women to one-year retesting (compared with 89.1% and 37.4%, respectively, for typing and cytology triage). In the 2016-2017 validation, the predicted risk scores strongly correlated with cytology (P < .001). High-quality cervical screening and triage performance is achievable using this completely automated approach. Automated technology could permit extension of high-quality cervical screening/triage coverage to currently underserved regions.
Interpreting the Customary Rules on Interpretation

NARCIS (Netherlands)

Merkouris, Panos

2017-01-01

International courts have at times interpreted the customary rules on interpretation. This is interesting because what is being interpreted is: i) rules of interpretation, which sounds dangerously tautological, and ii) customary law, the interpretation of which has not been the object of critical
Timed up & go test score in patients with hip fracture is related to the type of walking aid

DEFF Research Database (Denmark)

Kristensen, Morten T; Bandholm, Thomas; Holm, Bente

2009-01-01

Kristensen MT, Bandholm T, Holm B, Ekdahl C, Kehlet H. Timed Up & Go test score in patients with hip fracture is related to the type of walking aid. OBJECTIVE: To determine the relationship between Timed Up & Go (TUG) test scores and type of walking aid used during the test, and to determine...... the feasibility of using the rollator as a standardized walking aid during the TUG in patients with hip fracture who were allowed full weight-bearing (FWB). DESIGN: Prospective methodological study. SETTING: An acute orthopedic hip fracture unit at a university hospital. PARTICIPANTS: Patients (N=126; 90 women......, 36 men) with hip fracture with a mean age +/- SD of 74.8+/-12.7 years performed the TUG the day before discharge from the orthopedic ward. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: The TUG was performed with the walking aid the patient was to be discharged with: a walker (n=88) or elbow...
Zero Calcium Score as a Filter for Further Testing in Patients Admitted to the Coronary Care Unit with Chest Pain.

Science.gov (United States)

Correia, Luis Cláudio Lemos; Esteves, Fábio P; Carvalhal, Manuela; Souza, Thiago Menezes Barbosa de; Sá, Nicole de; Correia, Vitor Calixto de Almeida; Alexandre, Felipe Kalil Beirão; Lopes, Fernanda; Ferreira, Felipe; Noya-Rabelo, Márcia

2017-06-12

The accuracy of zero coronary calcium score as a filter in patients with chest pain has been demonstrated at the emergency room and outpatient clinics, populations with low prevalence of coronary artery disease (CAD). To test the gatekeeping role of zero calcium score in patients with chest pain admitted to the coronary care unit (CCU), where the pretest probability of CAD is higher than that of other populations. Patients underwent computed tomography for calcium scoring, and obstructive CAD was defined by a minimum 70% stenosis on invasive angiography. In 146 patients studied, the prevalence of CAD was 41%. A zero calcium score was present in 35% of the patients. The sensitivity and specificity of zero calcium score yielded a negative likelihood ratio of 0.16. After logistic regression adjustment for pretest probability, zero calcium score was independently associated with lower odds of CAD (OR = 0.12, 95%CI = 0.04-0.36), increasing the area under the ROC curve of the clinical model from 0.76 to 0.82 (p = 0.006). Zero calcium score provided a net reclassification improvement of 0.20 (p = 0.0018) over the clinical model when using a pretest probability threshold of 10% for discharging without further testing. In patients with pretest probability zero calcium score had a negative predictive value of 95% (95%CI = 83%-99%), with a number needed to test of 2.1 for obtaining one additional discharge. Zero calcium score substantially reduces the pretest probability of obstructive CAD in patients admitted to the CCU with acute chest pain. (Arq Bras Cardiol. 2017; [online].ahead print, PP.0-0).
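As a rough illustration of how a negative likelihood ratio shifts a pretest probability, the sketch below applies the standard odds form of Bayes' rule to two figures quoted in the abstract above (prevalence 41%, negative likelihood ratio 0.16). Using the cohort prevalence as the pretest probability is an assumption made here for illustration, and the function name is hypothetical; this is not the authors' regression model.

```python
def post_test_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must lie strictly between 0 and 1")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Assumption: the cohort prevalence (41%) stands in for the pretest probability.
probability_after_zero_score = post_test_probability(0.41, 0.16)
print(f"Post-test probability of CAD: {probability_after_zero_score:.3f}")
```

Under these assumptions, a zero calcium score moves a 41% pretest probability to roughly 10%.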
Extension of the lod score: the mod score.

Science.gov (United States)

Clerget-Darpoux, F

2001-01-01

In 1955 Morton proposed the lod score method both for testing linkage between loci and for estimating the recombination fraction between them. If a disease is controlled by a gene at one of these loci, the lod score computation requires the prior specification of an underlying model that assigns the probabilities of genotypes from the observed phenotypes. To address the case of linkage studies for diseases with unknown mode of inheritance, we suggested (Clerget-Darpoux et al., 1986) extending the lod score function to a so-called mod score function. In this function, the variables are both the recombination fraction and the disease model parameters. Maximizing the mod score function over all these parameters amounts to maximizing the probability of marker data conditional on the disease status. Under the absence of linkage, the mod score conforms to a chi-square distribution, with extra degrees of freedom in comparison to the lod score function (MacLean et al., 1993). The mod score is asymptotically maximum for the true disease model (Clerget-Darpoux and Bonaïti-Pellié, 1992; Hodge and Elston, 1994). Consequently, the power to detect linkage through mod score will be highest when the space of models where the maximization is performed includes the true model. On the other hand, one must avoid overparametrization of the model space. For example, when the approach is applied to affected sibpairs, only two constrained disease model parameters should be used (Knapp et al., 1994) for the mod score maximization. It is also important to emphasize the existence of a strong correlation between the disease gene location and the disease model. Consequently, there is poor resolution of the location of the susceptibility locus when the disease model at this locus is unknown. Of course, this is true regardless of the statistics used. The mod score may also be applied in a candidate gene strategy to model the potential effect of this gene in the disease. Since, however, it
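The lod score itself can be sketched for the simplest textbook case. The example below assumes phase-known, fully informative meioses, so that the likelihood depends only on the number of recombinants; that simplification, the function names, and the counts are illustrative assumptions rather than anything taken from the abstract. A mod score would additionally maximize over disease model parameters, which this sketch does not attempt.

```python
import math


def lod_score(theta: float, recombinants: int, meioses: int) -> float:
    """log10 likelihood ratio of recombination fraction theta versus 0.5."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_lik = (recombinants * math.log10(theta)
               + non_recombinants * math.log10(1.0 - theta))
    log_lik_null = meioses * math.log10(0.5)
    return log_lik - log_lik_null


def max_lod(recombinants: int, meioses: int, steps: int = 500):
    """Grid search over theta in (0, 0.5]; returns (best theta, lod there)."""
    grid = [0.5 * i / steps for i in range(1, steps + 1)]
    best = max(grid, key=lambda t: lod_score(t, recombinants, meioses))
    return best, lod_score(best, recombinants, meioses)


# Hypothetical counts: 1 recombinant among 10 informative meioses.
theta_hat, z_max = max_lod(recombinants=1, meioses=10)
print(f"theta_hat = {theta_hat:.3f}, maximum lod = {z_max:.3f}")
```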
Developing a Cloze Procedure as a Reading Comprehension Achievement Test

Directory of Open Access Journals (Sweden)

I Ketut Seken

2004-01-01

The project was concerned with developing a cloze procedure as a reading comprehension achievement test. The subjects were students of the English Department of the Faculty of Letters, State University of Malang, who were halfway through the semester of the Reading II course. The test was planned and constructed on the foundation of existing theory of cloze test construction. A review of theory concerning reading comprehension, testing reading comprehension, and cloze testing led to the construction of the test, including the decision concerning how to score the test and to interpret the scores. Using a class of 28 students, the test was tried out a week after the mid-semester test was administered by the Reading II teacher. It was found that the test is sufficiently reliable on the basis of a reliability coefficient of .79 through the split-half procedure and a coefficient value of .78 by K-R 20. The test also showed high inter-section correlation. The validity of the test was viewed in terms of face, content, and construct. The test scores correlate moderately with those obtained from the mid-semester test by the teacher. Some problems are discussed and a suggestion is made with regard to a possible solution to these problems.
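For readers unfamiliar with the two reliability coefficients named above, the following minimal sketch computes K-R 20 from a matrix of dichotomous (0/1) item scores and applies the Spearman-Brown step-up that is commonly paired with a split-half correlation. The response matrix is invented, the function names are hypothetical, and whether the study used the Spearman-Brown correction is an assumption; the abstract does not say.

```python
import statistics


def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores."""
    n_items = len(responses[0])
    n_people = len(responses)
    totals = [sum(row) for row in responses]
    total_variance = statistics.pvariance(totals)  # population variance
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        sum_pq += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / total_variance)


def spearman_brown(half_test_correlation):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2.0 * half_test_correlation / (1.0 + half_test_correlation)


# Invented data: 5 examinees by 4 items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(f"KR-20 = {kr20(data):.2f}")
```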
Characterizing neuropathic pain profiles: enriching interpretation of painDETECT

Directory of Open Access Journals (Sweden)

Cappelleri JC

2016-07-01

Conclusion: painDETECT differentiates pain profiles across the range of scores such that, for a particular score, the probability of experiencing at least a moderate sensation of each symptom was determined and compared. These results can help characterize NeP symptomatology, enrich interpretation of painDETECT scores, and provide a basis for individualizing NeP management. Keywords: neuropathic pain, painDETECT, sensory symptoms, pain profile, interpretation, patient-reported outcomes
Evaluation of Two Methods for Modeling Measurement Errors When Testing Interaction Effects with Observed Composite Scores

Science.gov (United States)

Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C.

2018-01-01

Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…
Normal variability of children's scaled scores on subtests of the Dutch Wechsler Preschool and Primary scale of Intelligence - third edition.

Science.gov (United States)

Hurks, P P M; Hendriksen, J G M; Dek, J E; Kooij, A P

2013-01-01

Intelligence tests are included in millions of assessments of children and adults each year (Watkins, Glutting, & Lei, 2007a, Applied Neuropsychology, 14, 13). Clinicians often interpret large amounts of subtest scatter, or large differences between the highest and lowest scaled subtest scores, on an intelligence test battery as an index for abnormality or cognitive impairment. The purpose of the present study is to characterize "normal" patterns of variability among subtests of the Dutch Wechsler Preschool and Primary Scale of Intelligence - Third Edition (WPPSI-III-NL; Wechsler, 2010). Therefore, the frequencies of WPPSI-III-NL scaled subtest scatter were reported for 1039 healthy children aged 4:0-7:11 years. Results indicated that large differences between highest and lowest scaled subtest scores (or subtest scatter) were common in this sample. Furthermore, degree of subtest scatter was related to: (a) the magnitude of the highest scaled subtest score, i.e., more scatter was seen in children with the highest WPPSI-III-NL scaled subtest scores, (b) Full Scale IQ (FSIQ) scores, i.e., higher FSIQ scores were associated with an increase in subtest scatter, and (c) sex differences, with boys showing a tendency to display more scatter than girls. In conclusion, viewing subtest scatter as an index for abnormality in WPPSI-III-NL scores is an oversimplification as this fails to recognize disparate subtest heterogeneity that occurs within a population of healthy children aged 4:0-7:11 years.
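Subtest scatter, as defined in the abstract above, is the difference between a child's highest and lowest scaled subtest scores. A minimal sketch follows; the subtest names and score values are hypothetical and do not come from the WPPSI-III-NL data.

```python
def subtest_scatter(scaled_scores):
    """Highest minus lowest scaled subtest score for one child."""
    values = list(scaled_scores.values())
    if not values:
        raise ValueError("at least one subtest score is required")
    return max(values) - min(values)


# Hypothetical profile of scaled scores for one child.
profile = {"subtest_a": 12, "subtest_b": 8, "subtest_c": 14, "subtest_d": 10}
print("Scatter:", subtest_scatter(profile))
```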
A comparison of likelihood ratio tests and Rao's score test for three separable covariance matrix structures.

Science.gov (United States)

Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha

2017-01-01

The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The Implementation of Role-Playing Model in Principles of Finance Accounting Learning to Improve Students’ Enjoyment and Students’ Test Scores

Directory of Open Access Journals (Sweden)

L. Saptono

2010-01-01

This research is a classroom action research study. The goal of this research is to improve students’ enjoyment level and their test scores by implementing the role-playing method. The research was conducted in the Accounting Education Study Program of Sanata Dharma University in the odd semester of the 2010/2011 academic year. The participants were divided into two classes. The first class received the treatment, while the second class was the control class. The results of the study showed that there was an improvement in students’ enjoyment level and test scores in the class that implemented the role-playing method.
Cardiac Society of Australia and New Zealand Position Statement: Coronary Artery Calcium Scoring.

Science.gov (United States)

Liew, Gary; Chow, Clara; van Pelt, Niels; Younger, John; Jelinek, Michael; Chan, Jonathan; Hamilton-Craig, Christian

2017-12-01

Coronary Artery Calcium Scoring (CAC) is a non-invasive quantitation of coronary artery calcification using computed tomography (CT). It is a marker of atherosclerotic plaque burden and an independent predictor of future myocardial infarction and mortality. Coronary Artery Calcium Scoring provides incremental risk information beyond traditional risk calculators (eg. Framingham Risk Score). Its use for risk stratification is confined to primary prevention of cardiovascular events, and can be considered as "individualised coronary risk scoring" for those not considered to be of high or low risk. Medical practitioners should carefully counsel patients prior to CAC. Coronary Artery Calcium Scoring should only be undertaken if an alteration in therapy including embarking on pharmacotherapy is being considered based on the test result. Patient Groups to Consider Coronary Calcium Scoring: Patient Groups in Whom Coronary Calcium Scoring Should Not be Considered: Coronary Artery Calcium Scoring is not recommended for patients who are: Interpretation of CAC CAC=0 A zero score confers a very low risk of death, 75th centile. Moderately high risk, 15-20% CAC >400 High risk, >20% Management Recommendations Based on CAC Optimal diet and lifestyle measures are encouraged in all risk groups and form the basis of primary prevention strategies. Patients with moderately-high or high risk based on CAC score are recommended to receive preventative medical therapy such as aspirin and statins. The evidence for pharmacotherapy is less robust in patients at intermediate levels of CAC 100-400, with modest benefit for aspirin use; though statins may be reasonable if they are above 75th centile. Aspirin and statins are generally not recommended in patients with CAC calcium score, routine re-scanning is not currently recommended. However, an annual increase in CAC of >15% or annual increase of CAC >100 units are predictive of future myocardial infarction and mortality. Cost Effectiveness of CAC
Heart valve surgery: EuroSCORE vs. EuroSCORE II vs. Society of Thoracic Surgeons score

Directory of Open Access Journals (Sweden)

Muhammad Sharoz Rabbani

2014-12-01

Background: This is a validation study comparing the European System for Cardiac Operative Risk Evaluation (EuroSCORE II) with the previous additive (AES) and logistic EuroSCORE (LES) and the Society of Thoracic Surgeons’ (STS) risk prediction algorithm, for patients undergoing valve replacement with or without bypass in Pakistan. Patients and Methods: Clinical data of 576 patients undergoing valve replacement surgery between 2006 and 2013 were retrospectively collected and individual expected risks of death were calculated by all four risk prediction algorithms. Performance of these risk algorithms was evaluated in terms of discrimination and calibration. Results: There were 28 deaths (4.8%) among 576 patients, which was lower than the predicted mortality of 5.16%, 6.96% and 4.94% by AES, LES and EuroSCORE II but was higher than the 2.13% predicted by the STS scoring system. For single and double valve replacement procedures, EuroSCORE II was the best predictor of mortality with the highest Hosmer and Lemeshow test (H-L) p value (0.346 to 0.689) and area under the receiver operating characteristic (ROC) curve (0.637 to 0.898). For valve plus concomitant coronary artery bypass grafting (CABG) patients, actual mortality was 1.88%. The STS calculator was the best predictor of mortality for this subgroup with H-L p value (0.480 to 0.884) and ROC (0.657 to 0.775). Conclusions: For the Pakistani population, EuroSCORE II is an accurate predictor for individual operative risk in patients undergoing isolated valve surgery, whereas STS performs better in the valve plus CABG group.
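One simple way to read the overall mortality figures in this abstract is as observed-to-expected (O/E) ratios, where a ratio below 1 means the score over-predicted mortality and a ratio above 1 means it under-predicted. The sketch below uses only the counts and predicted percentages quoted above; the O/E framing is an illustration added here, not the calibration test (Hosmer-Lemeshow) that the authors report.

```python
deaths = 28
patients = 576
observed_mortality = deaths / patients  # about 0.0486; the abstract reports 4.8%

# Predicted overall mortality for each risk model, as quoted in the abstract.
predicted_mortality = {
    "AES": 0.0516,
    "LES": 0.0696,
    "EuroSCORE II": 0.0494,
    "STS": 0.0213,
}

oe_ratio = {
    model: observed_mortality / predicted
    for model, predicted in predicted_mortality.items()
}

for model, ratio in oe_ratio.items():
    direction = "over-predicts" if ratio < 1.0 else "under-predicts"
    print(f"{model}: O/E = {ratio:.2f} ({direction})")
```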
Interpretive criteria of antimicrobial disk susceptibility tests with flomoxef.

Science.gov (United States)

Grimm, H

1991-01-01

A total of 320 recently isolated pathogens, 20 strains from each of 16 species, were investigated using Mueller-Hinton agar and DIN as well as NCCLS standards. The geometric means of the agar dilution MICs of flomoxef were 0.44 mg/l for Staphylococcus aureus, 0.05 mg/l (Klebsiella oxytoca) to 12.6 mg/l (Enterobacter spp.) for Enterobacteriaceae, 33.1 mg/l for Acinetobacter anitratus, 64 mg/l for Enterococcus faecalis, and more than 256 mg/l for Pseudomonas aeruginosa. For disk susceptibility testing of flomoxef, a 30-microgram disk loading and the following interpretation of inhibition zones using the DIN method were recommended: resistant, up to 22 mm (corresponding to MICs of 8 mg/l or more); moderately susceptible, 23 to 29 mm (corresponding to MICs from 1 to 4 mg/l); and susceptible, 30 mm or more (corresponding to MICs of 0.5 mg/l or less). The respective values for the NCCLS method using the American high MIC breakpoints are: resistant, up to 14 mm (corresponding to MICs of 32 mg/l or more); moderately susceptible, 15 to 17 mm (corresponding to MICs of 16 mg/l); and susceptible, 18 mm or more (corresponding to MICs of 8 mg/l or less).
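The zone-diameter criteria in this abstract amount to a small lookup rule. The sketch below encodes the DIN and NCCLS breakpoints exactly as quoted; the function and dictionary names are hypothetical, and diameters are assumed to be read in whole millimetres, as the quoted ranges imply.

```python
# (largest "resistant" diameter, smallest "susceptible" diameter) in mm,
# taken from the flomoxef criteria quoted in the abstract.
BREAKPOINTS_MM = {
    "DIN": (22, 30),
    "NCCLS": (14, 18),
}


def interpret_zone(diameter_mm: int, method: str = "DIN") -> str:
    """Classify an inhibition zone diameter for a 30-microgram flomoxef disk."""
    resistant_max, susceptible_min = BREAKPOINTS_MM[method]
    if diameter_mm <= resistant_max:
        return "resistant"
    if diameter_mm >= susceptible_min:
        return "susceptible"
    return "moderately susceptible"


for zone in (14, 16, 22, 25, 30):
    print(zone, interpret_zone(zone, "DIN"), "|", interpret_zone(zone, "NCCLS"))
```

Note that the same zone can fall into different categories under the two standards; a 25 mm zone is moderately susceptible by DIN but susceptible by NCCLS.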

Interpretation of coagulation test results using a web-based reporting system.

Science.gov (United States)

Quesada, Andres E; Jabcuga, Christine E; Nguyen, Alex; Wahed, Amer; Nedelcu, Elena; Nguyen, Andy N D

2014-01-01

Web-based synoptic reporting has been successfully integrated into diverse fields of pathology, improving efficiency and reducing typographic errors. Coagulation is a challenging field for practicing pathologists and pathologists-in-training alike. To develop a Web-based program that can expedite the generation of an individualized interpretive report for a variety of coagulation tests. We developed a Web-based synoptic reporting system composed of 119 coagulation report templates and 38 thromboelastography (TEG) report templates covering a wide range of findings. Our institution implemented this reporting system in July 2011; it is currently used by pathology residents and attending pathologists. Feedback from the users of these reports has been overwhelmingly positive. Surveys note the time saved and reduced errors. Our easily accessible, user-friendly, Web-based synoptic reporting system for coagulation is a valuable asset to our laboratory services. Copyright © by the American Society for Clinical Pathology (ASCP).
A computer-aided detection system for rheumatoid arthritis MRI data interpretation and quantification of synovial activity

DEFF Research Database (Denmark)

Kubassove, Olga; Boesen, Mikael; Cimmino, Marco A

2009-01-01

RATIONALE AND OBJECTIVE: Disease assessment and follow-up of rheumatoid arthritis (RA) patients require objective evaluation and quantification. Magnetic resonance imaging (MRI) has a large potential to supplement such information for the clinician; however, time spent on data reading...... and interpretation slows down development in this area. Existing scoring systems, especially of synovitis, are too rigid and insensitive to measure early treatment response and quantify inflammation. This study tested a novel automated computer system for analysis of dynamic MRI data acquired from patients with RA......, Dynamika-RA, which incorporates efficient data processing and analysis techniques....
Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance.

Science.gov (United States)

McCarthy, Julie M; Van Iddekinge, Chad H; Lievens, Filip; Kung, Mei-Chuan; Sinar, Evan F; Campion, Michael A

2013-09-01

Considerable evidence suggests that how candidates react to selection procedures can affect their test performance and their attitudes toward the hiring organization (e.g., recommending the firm to others). However, very few studies of candidate reactions have examined one of the outcomes organizations care most about: job performance. We attempt to address this gap by developing and testing a conceptual framework that delineates whether and how candidate reactions might influence job performance. We accomplish this objective using data from 4 studies (total N = 6,480), 6 selection procedures (personality tests, job knowledge tests, cognitive ability tests, work samples, situational judgment tests, and a selection inventory), 5 key candidate reactions (anxiety, motivation, belief in tests, self-efficacy, and procedural justice), 2 contexts (industry and education), 3 continents (North America, South America, and Europe), 2 study designs (predictive and concurrent), and 4 occupational areas (medical, sales, customer service, and technological). Consistent with previous research, candidate reactions were related to test scores, and test scores were related to job performance. Further, there was some evidence that reactions affected performance indirectly through their influence on test scores. Finally, in no cases did candidate reactions affect the prediction of job performance by increasing or decreasing the criterion-related validity of test scores. Implications of these findings and avenues for future research are discussed. PsycINFO Database Record (c) 2013 APA, all rights reserved
Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing.

Science.gov (United States)

Cai, Li

2015-06-01

Lord and Wingersky's (Appl Psychol Meas 8:453-461, 1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined on a grid formed by direct products of quadrature points. However, the increase in computational burden remains exponential in the number of dimensions, making the implementation of the recursive algorithm cumbersome for truly high-dimensional models. In this paper, a dimension reduction method that is specific to the Lord-Wingersky recursions is developed. This method can take advantage of the restrictions implied by hierarchical item factor models, e.g., the bifactor model, the testlet model, or the two-tier model, such that a version of the Lord-Wingersky recursive algorithm can operate on a dramatically reduced set of quadrature points. For instance, in a bifactor model, the dimension of integration is always equal to 2, regardless of the number of factors. The new algorithm not only provides an effective mechanism to produce summed score to IRT scaled score translation tables properly adjusted for residual dependence, but leads to new applications in test scoring, linking, and model fit checking as well. Simulated and empirical examples are used to illustrate the new applications.
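To make the recursion concrete, here is a minimal sketch of the original unidimensional Lord-Wingersky idea for dichotomous items at a single quadrature point: items are added one at a time, and the probability of each summed score is updated from the previous step. The item probabilities are invented and the function name is hypothetical; this is not the dimension-reduced version 2.0 algorithm described in the abstract.

```python
def summed_score_likelihoods(correct_probs):
    """Probability of each summed score 0..n at one fixed ability value.

    correct_probs[i] is the probability of a correct response to item i
    at that ability value (for example, from a fitted IRT model).
    """
    dist = [1.0]  # before any item is added, the summed score is 0
    for p in correct_probs:
        updated = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            updated[score] += mass * (1.0 - p)   # item answered incorrectly
            updated[score + 1] += mass * p       # item answered correctly
        dist = updated
    return dist


# Invented item probabilities at one quadrature point.
likelihoods = summed_score_likelihoods([0.8, 0.6, 0.3])
print([round(value, 3) for value in likelihoods])
```

Repeating this at every quadrature point and weighting by the prior density would give the summed-score posteriors used for score translation tables.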
Italian normative data and validation of two neuropsychological tests of face recognition: Benton Facial Recognition Test and Cambridge Face Memory Test.

Science.gov (United States)

Albonico, Andrea; Malaspina, Manuela; Daini, Roberta

2017-09-01

The Benton Facial Recognition Test (BFRT) and Cambridge Face Memory Test (CFMT) are two of the most common tests used to assess face discrimination and recognition abilities and to identify individuals with prosopagnosia. However, recent studies highlighted that participant-stimulus match ethnicity, as much as gender, has to be taken into account in interpreting results from these tests. Here, in order to obtain more appropriate normative data for an Italian sample, the CFMT and BFRT were administered to a large cohort of young adults. We found that scores from the BFRT are not affected by participants' gender and are only slightly affected by participant-stimulus ethnicity match, whereas both these factors seem to influence the scores of the CFMT. Moreover, the inclusion of a sample of individuals with suspected face recognition impairment allowed us to show that the use of more appropriate normative data can increase the BFRT efficacy in identifying individuals with face discrimination impairments; by contrast, the efficacy of the CFMT in classifying individuals with a face recognition deficit was confirmed. Finally, our data show that the lack of inversion effect (the difference between the total score of the upright and inverted versions of the CFMT) could be used as further index to assess congenital prosopagnosia. Overall, our results confirm the importance of having norms derived from controls with a similar experience of faces as the "potential" prosopagnosic individuals when assessing face recognition abilities.
Evaluation of an Automated System for Reading and Interpreting Disk Diffusion Antimicrobial Susceptibility Testing of Fastidious Bacteria.

Science.gov (United States)

Idelevich, Evgeny A; Becker, Karsten; Schmitz, Janne; Knaack, Dennis; Peters, Georg; Köck, Robin

2016-01-01

Results of disk diffusion antimicrobial susceptibility testing depend on individual visual reading of inhibition zone diameters. Therefore, automated reading using camera systems might represent a useful tool for standardization. In this study, the ADAGIO automated system (Bio-Rad) was evaluated for reading disk diffusion tests of fastidious bacteria. A total of 144 clinical isolates (68 β-haemolytic streptococci, 28 Streptococcus pneumoniae, 18 viridans group streptococci, 13 Haemophilus influenzae, 7 Moraxella catarrhalis, and 10 Campylobacter jejuni) were tested on Mueller-Hinton agar supplemented with 5% defibrinated horse blood and 20 mg/L β-NAD (MH-F, Oxoid) according to EUCAST. Plates were read manually with a ruler and automatically using the ADAGIO system. Inhibition zone diameters, indicated by the automated system, were visually controlled and adjusted, if necessary. Among 1548 isolate-antibiotic combinations, comparison of automated vs. manual reading yielded categorical agreement (CA) without visual adjustment of the automatically determined zone diameters in 81.4%. In 20% (309 of 1548) of tests it was deemed necessary to adjust the automatically determined zone diameter after visual control. After adjustment, CA was 94.8%; very major errors (false susceptible interpretation), major errors (false resistant interpretation) and minor errors (false categorization involving intermediate result), calculated according to the ISO 20776-2 guideline, accounted for 13.7% (13 of 95 resistant results), 3.3% (47 of 1424 susceptible results) and 1.4% (21 of 1548 total results), respectively, compared to manual reading. The ADAGIO system allowed for automated reading of disk diffusion testing in fastidious bacteria and, after visual validation of the automated results, yielded good categorical agreement with manual reading.
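The error rates in this abstract use different denominators: very major errors are expressed relative to resistant results, major errors relative to susceptible results, and minor errors relative to all results. The short sketch below recomputes the quoted percentages from the quoted counts; the variable and function names are hypothetical.

```python
def percent(count: int, total: int) -> float:
    """Return count / total as a percentage."""
    return 100.0 * count / total


# Counts quoted in the abstract (after visual adjustment, versus manual reading).
very_major = percent(13, 95)     # false susceptible, out of resistant results
major = percent(47, 1424)        # false resistant, out of susceptible results
minor = percent(21, 1548)        # involving an intermediate result, out of all results
adjusted = percent(309, 1548)    # tests needing zone adjustment after visual control

print(f"very major: {very_major:.1f}%, major: {major:.1f}%, "
      f"minor: {minor:.1f}%, adjusted: {adjusted:.0f}%")
```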
Comparison of physical therapy anatomy performance and anxiety scores in timed and untimed practical tests.

Science.gov (United States)

Schwartz, Sarah M; Evans, Cathy; Agur, Anne M R

2015-01-01

Students in health care professional programs face many stressful tests that determine successful completion of their program. Test anxiety during these high stakes examinations can affect working memory and lead to poor outcomes. Methods of decreasing test anxiety include lengthening the time available to complete examinations or evaluating students using untimed examinations. There is currently no consensus in the literature regarding whether untimed examinations provide a benefit to test performance in clinical anatomy. This study aimed to determine the impact of timed versus untimed practical tests on Master of Physical Therapy student anatomy performance and test anxiety. Test anxiety was measured using the State-Trait Anxiety Inventory (STAI). Differences in performance, anxiety scores, and time taken were compared using paired sample Student's t-tests. Eighty-one of the 84 students completed the study and provided feedback. Students performed significantly higher on the untimed test (P = 0.005), with a significant reduction in test anxiety (P anxiety. If the intended goal of evaluating health care professional students is to determine fundamental competencies, these factors should be considered when designing future curricula. © 2014 American Association of Anatomists.
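The paired-sample t-test mentioned above compares each student's two scores by working on the per-student differences. A minimal sketch of the test statistic follows; the scores are invented for illustration, the function name is hypothetical, and no p-value is computed.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """t statistic for paired samples: mean difference over its standard error."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Invented scores for three students on a timed and an untimed test.
timed = [70, 65, 80]
untimed = [71, 67, 83]
t_value = paired_t_statistic(timed, untimed)
print(f"t = {t_value:.3f} with {len(timed) - 1} degrees of freedom")
```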
Interpretation of the CABRI LT1 test with SAS4A-code analysis

International Nuclear Information System (INIS)

Sato, Ikken; Onoda, Yu-uichi

2001-03-01

In the CABRI-FAST LT1 test, simulating a ULOF (Unprotected Loss of Flow) accident of LMFBR, pin failure took place rather early during the transient. No fuel melting is expected at this failure because the energy injection was too low, and a rapid gas-release-like response leading to coolant-channel voiding was observed. This channel voiding was followed by a gradual fuel breakup and axial relocation. With the aid of SAS4A analysis, interpretation of this test was performed. Although the original SAS4A model was not well fitted to this type of early pin failure, the global behavior after the pin failure was reasonably simulated with temporary modifications. Through this study, gas release behavior from the failed fuel pin and its effect on the subsequent transient were well understood. It was also demonstrated that the SAS4A code has the potential to simulate the post-failure behavior initiated by a very early pin failure, provided that the necessary model modifications are made. (author)
Predicting Freshman Grade Point Average From College Admissions Test Scores and State High School Test Scores

OpenAIRE

Koretz, Daniel; Yu, C; Mbekeani, Preeya Pandya; Langi, M.; Dhaliwal, Tasminda Kaur; Braslow, David Arthur

2016-01-01

The current focus on assessing “college and career readiness” raises an empirical question: How do high school tests compare with college admissions tests in predicting performance in college? We explored this using data from the City University of New York and public colleges in Kentucky. These two systems differ in the choice of college admissions test, the stakes for students on the high school test, and demographics. We predicted freshman grade point average (FGPA) from high school GPA an...
Evaluation of the validity of osteoporosis and fracture risk assessment tools (IOF One Minute Test, SCORE, and FRAX) in postmenopausal Palestinian women.

Science.gov (United States)

Kharroubi, Akram; Saba, Elias; Ghannam, Ibrahim; Darwish, Hisham

2017-12-01

Simple self-assessment tools are needed to identify women at high risk of developing osteoporosis. In this study, tools like the IOF One Minute Test, Fracture Risk Assessment Tool (FRAX), and Simple Calculated Osteoporosis Risk Estimation (SCORE) were found to be valid for Palestinian women. The threshold for predicting women at risk for each tool was estimated. The purpose of this study is to evaluate the validity of the updated IOF (International Osteoporosis Foundation) One Minute Osteoporosis Risk Assessment Test, FRAX, SCORE as well as age alone to detect the risk of developing osteoporosis in postmenopausal Palestinian women. Three hundred eighty-two women 45 years and older were recruited including 131 women with osteoporosis and 251 controls following bone mineral density (BMD) measurement, 287 completed questionnaires of the different risk assessment tools. Receiver operating characteristic (ROC) curves were evaluated for each tool using BMD as the gold standard for osteoporosis. The area under the ROC curve (AUC) was the highest for FRAX calculated with BMD for predicting hip fractures (0.897) followed by FRAX for major fractures (0.826) with cut-off values >1.5 and >7.8%, respectively. The IOF One Minute Test AUC (0.629) was the lowest compared to other tested tools but with sufficient accuracy for predicting the risk of developing osteoporosis with a cut-off value >4 total yes questions out of 18. The SCORE test and age alone were also good predictors of risk for developing osteoporosis. According to the ROC curve for age, women ≥64 years had a higher risk of developing osteoporosis. Higher percentage of women with low BMD (T-score ≤-1.5) or osteoporosis (T-score ≤-2.5) was found among women who were not exposed to the sun, who had menopause before the age of 45 years, or had lower body mass index (BMI) compared to controls. Women who often fall had lower BMI and approximately 27% of the recruited postmenopausal
Intelligence Test Scores and Birth Order among Young Norwegian Men (Conscripts) Analyzed within and between Families

Science.gov (United States)

Bjerkedal, Tor; Kristensen, Petter; Skjeret, Geir A.; Brevik, John I.

2007-01-01

The present paper reports the results of a within and between family analysis of the relation between birth order and intelligence. The material comprises more than a quarter of a million test scores for intellectual performance of Norwegian male conscripts recorded during 1984-2004. Conscripts, mostly 18-19 years of age, were born to women for…
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Science.gov (United States)

Uno, Yota; Mizukami, Hitomi; Ando, Masahiko; Yukihiro, Ryoji; Iwasaki, Yoko; Ozaki, Norio

2014-01-01

The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
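The abstract notes that a post-test probability can be calculated from the stratum-specific likelihood ratios. A minimal sketch of that calculation, using the standard odds form of Bayes' rule, is given below. The function name and the 20% pre-test probability are hypothetical choices for illustration; only the likelihood ratios 13.8 and 0.1 come from the abstract.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# Hypothetical pre-test probability of intellectual disability (not from the study).
pre = 0.20

# Stratum-specific likelihood ratios reported in the abstract.
p_low_biq = post_test_probability(pre, 13.8)   # BIQ <= 65 stratum
p_high_biq = post_test_probability(pre, 0.1)   # BIQ >= 76 stratum

print(f"BIQ<=65: {p_low_biq:.3f}, BIQ>=76: {p_high_biq:.3f}")
```

Under this assumed prevalence, a score in the low stratum raises the probability to roughly three in four, while a score in the high stratum lowers it to a few percent, which is the sense in which the abstract says disability "could be ruled out or determined".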
Evaluation of Veterinary-Specific Interpretive Criteria for Susceptibility Testing of Streptococcus equi Subspecies with Trimethoprim-Sulfamethoxazole and Trimethoprim-Sulfadiazine.

Science.gov (United States)

Sadaka, Carmen; Kanellos, Theo; Guardabassi, Luca; Boucher, Joseph; Watts, Jeffrey L

2017-01-01

Antimicrobial susceptibility test results for trimethoprim-sulfadiazine with Streptococcus equi subspecies are interpreted based on human data for trimethoprim-sulfamethoxazole. The veterinary-specific data generated in this study support a single breakpoint for testing trimethoprim-sulfamethoxazole and/or trimethoprim-sulfadiazine with S. equi. This study indicates trimethoprim-sulfamethoxazole as an acceptable surrogate for trimethoprim-sulfadiazine with S. equi. Copyright © 2016 Sadaka et al.
Concordance of Motion Sensor and Clinician-Rated Fall Risk Scores in Older Adults.

Science.gov (United States)

Elledge, Julie

2017-12-01

As the older adult population in the United States continues to grow, developing reliable, valid, and practical methods for identifying fall risk is a high priority. Falls are prevalent in older adults and contribute significantly to morbidity and mortality rates and rising health costs. Identifying at-risk older adults and intervening in a timely manner can reduce falls. Conventional fall risk assessment tools require a health professional trained in the use of each tool for administration and interpretation. Motion sensor technology, which uses three-dimensional cameras to measure patient movements, is promising for assessing older adults' fall risk because it could eliminate or reduce the need for provider oversight. The purpose of this study was to assess the concordance of fall risk scores as measured by a motion sensor device, the OmniVR Virtual Rehabilitation System, with clinician-rated fall risk scores in older adult outpatients undergoing physical rehabilitation. Three standardized fall risk assessments were administered by the OmniVR and by a clinician. Validity of the OmniVR was assessed by measuring the concordance between the two assessment methods. Stability of the OmniVR fall risk ratings was assessed by measuring test-retest reliability. The OmniVR scores showed high concordance with the clinician-rated scores and high stability over time, demonstrating comparability with provider measurements.
Ordinal convolutional neural networks for predicting RDoC positive valence psychiatric symptom severity scores.

Science.gov (United States)

Rios, Anthony; Kavuluru, Ramakanth

2017-11-01

The CEGS N-GRID 2016 Shared Task in Clinical Natural Language Processing (NLP) provided a set of 1000 neuropsychiatric notes to participants as part of a competition to predict psychiatric symptom severity scores. This paper summarizes our methods, results, and experiences based on our participation in the second track of the shared task. Classical methods of text classification usually fall into one of three problem types: binary, multi-class, and multi-label classification. In this effort, we study ordinal regression problems with text data where misclassifications are penalized differently based on how far apart the ground truth and model predictions are on the ordinal scale. Specifically, we present our entries (methods and results) in the N-GRID shared task in predicting research domain criteria (RDoC) positive valence ordinal symptom severity scores (absent, mild, moderate, and severe) from psychiatric notes. We propose a novel convolutional neural network (CNN) model designed to handle ordinal regression tasks on psychiatric notes. Broadly speaking, our model combines an ordinal loss function, a CNN, and conventional feature engineering (wide features) into a single model which is learned end-to-end. Given interpretability is an important concern with nonlinear models, we apply a recent approach called locally interpretable model-agnostic explanation (LIME) to identify important words that lead to instance specific predictions. Our best model entered into the shared task placed third among 24 teams and scored a macro mean absolute error (MMAE) based normalized score (100·(1-MMAE)) of 83.86. Since the competition, we improved our score (using basic ensembling) to 85.55, comparable with the winning shared task entry. Applying LIME to model predictions, we demonstrate the feasibility of instance specific prediction interpretation by identifying words that led to a particular decision. In this paper, we present a method that successfully uses wide features and
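The normalized score 100·(1-MMAE) quoted in this abstract can be illustrated with a short sketch. The abstract does not spell out how the macro mean absolute error is computed, so the version below is an assumption: each error is divided by the largest error possible for its true class, errors are averaged within each true class, and the class means are then averaged. The function names and the toy labels are hypothetical.

```python
from collections import defaultdict

def macro_mae(y_true, y_pred, n_levels=4):
    """Macro-averaged MAE on an ordinal scale 0..n_levels-1.

    Assumption: each absolute error is normalized by the worst possible
    error for its true level, so the result lies in [0, 1].
    """
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        worst = max(t, n_levels - 1 - t)  # farthest a prediction can be from t
        errors[t].append(abs(t - p) / worst)
    per_class = [sum(v) / len(v) for v in errors.values()]
    return sum(per_class) / len(per_class)

def normalized_score(mmae: float) -> float:
    """The 100*(1 - MMAE) score form quoted in the abstract."""
    return 100.0 * (1.0 - mmae)

# Toy labels: 0 = absent, 1 = mild, 2 = moderate, 3 = severe.
y_true = [0, 0, 1, 2, 3]
y_pred = [0, 1, 1, 1, 3]
score = normalized_score(macro_mae(y_true, y_pred))
print(round(score, 2))
```

Averaging per class before averaging overall keeps rare severity levels from being swamped by common ones, which is the usual motivation for a macro-averaged metric on imbalanced ordinal labels.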
Use of Verbal Descriptors, Thermal Scores and Electrical Pulp Testing Scores as Predictors of Tooth Pain Before and After Application of Benzocaine Gels into Cavities of Teeth with Pulpitis

Science.gov (United States)

Gangarosa, Louis P.; Ciarlone, Alfred E.; Neaverth, Elmer J.; Johnston, Carey A.; Snowden, J. Douglas; Thompson, William O.

1989-01-01

A double-blind pilot study was conducted on 27 consenting human volunteers who had irreversible pulpitis associated with persistent toothache pain from open carious lesions. Formulations tested contained either 0, 10%, or 20% benzocaine and were identified only by a numbered code. Before the experiment started, a small amount of a known 5% benzocaine gel was placed for 1 minute on the tongue of each patient to assure a sensation of numbness within the oral cavity. Then the test tooth was washed with a gentle stream of warm water and dried with gauze. A randomly selected test medication was placed into the open cavity and around the gingival margins for 5 minutes. Pre- and posttreatment tests were conducted at the following timed intervals: 0, 5, 15, 30, 45, 60, 75 and 90 minutes. The tests included degree of pain (rated: 0 = none, 1 = mild, 2 = moderate, 3 = severe); electrical pulp testing (EPT) by a modified, voltage-ramping instrument; and ice water testing (0.5 mL directed quickly onto sound enamel of the tooth and rated: 0 to 4, with 4 being intolerable). After testing, or when pain returned to baseline, endodontic procedures were performed. There was a significant increase (p pulpitis and control teeth, 3) there were no correlations between direction of EPT scores and pain relief, 4) cold water testing was a good predictor of whether or not a tooth had pulpitis, and 5) changes in cold water testing scores after treatment could not be correlated to relief of pain according to verbal descriptors. The effectiveness of benzocaine in relieving toothache pain verifies previous studies; however, a difference between 10% and 20% benzocaine could not be demonstrated probably because of two factors: 1) the present experiment had a small sample size, and 2) there was no direct measurement of duration of local anesthesia. PMID:2490060
Investigating the Value of Section Scores for the "TOEFL iBT"® Test. "TOEFL iBT"® Research Report. TOEFL iBT-21. ETS Research Report RR-13-35

Science.gov (United States)

Sawaki, Yasuyo; Sinharay, Sandip

2013-01-01

This study investigates the value of reporting the reading, listening, speaking, and writing section scores for the "TOEFL iBT"® test, focusing on 4 related aspects of the psychometric quality of the TOEFL iBT section scores: reliability of the section scores, dimensionality of the test, presence of distinct score profiles, and the…
7 CFR 201.56 - Interpretation.

Science.gov (United States)

2010-01-01

... REGULATIONS Germination Tests in the Administration of the Act § 201.56 Interpretation. (a) A seed shall be... and the final count. During the progress of the germination test, seeds which are obviously dead and... evaluation of germination tests made on approved artificial media. This is intended to provide a method of...
Predicting Pre-Service Classroom Teachers' Civil Servant Recruitment Examination's Educational Sciences Test Scores Using Artificial Neural Networks

Science.gov (United States)

Demir, Metin

2015-01-01

This study predicts the number of correct answers given by pre-service classroom teachers in Civil Servant Recruitment Examination's (CSRE) educational sciences test based on their high school grade point averages, university entrance scores, and grades (mid-term and final exams) from their undergraduate educational courses. This study was…
[Prognostic scores for pulmonary embolism].

Science.gov (United States)

Junod, Alain

2016-03-23

Nine prognostic scores for pulmonary embolism (PE), based on retrospective and prospective studies and published between 2000 and 2014, have been analyzed and compared. Most of them aim at identifying PE cases with a low risk, in order to validate ambulatory care for these patients. Important differences exist between these scores in the outcomes considered (global mortality, PE-specific mortality, other complications) and in the sizes of the low-risk groups. The most popular score appears to be the PESI and its simplified version. Few good-quality studies have tested the applicability of these scores to PE outpatient care, although this approach is already becoming widespread in medical practice.

Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children

Science.gov (United States)

Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.

2018-01-01

Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…
Differences in physical-fitness test scores between actively and passively recruited older adults : Consequences for norm-based classification

NARCIS (Netherlands)

van Heuvelen, M.J.G.; Stevens, M.; Kempen, G.I.J.M.

This study investigated differences in physical-fitness test scores between actively and passively recruited older adults and the consequences thereof for norm-based classification of individuals. Walking endurance, grip strength, hip flexibility, balance, manual dexterity, and reaction time were
Factors with independent influence on the 'timed up and go' test in patients with hip fracture

DEFF Research Database (Denmark)

Kristensen, Morten Tange; Foss, Nicolai Bang; Kehlet, Henrik

2009-01-01

an intertrochanteric fracture (B = 7), performing TUG with a walker (B = 15), and performing TUG in the later postoperative period (B = 0.39) were independently associated with having a poorer TUG score. CONCLUSIONS: These preliminary normative reference values of TUG performances in patients with hip fracture can...... be used as references, to which individuals can expect to perform. Multivariate testing suggests that clinicians should use age, pre-fracture function, fracture type and walking-aid specific data when interpreting the TUG test results. Physiotherapists should be aware of this if TUG scores are to be used...
Neutral vs positive oral contrast in diagnosing acute appendicitis with contrast-enhanced CT: sensitivity, specificity, reader confidence and interpretation time

Science.gov (United States)

Naeger, D M; Chang, S D; Kolli, P; Shah, V; Huang, W; Thoeni, R F

2011-01-01

Objective: The study compared the sensitivity, specificity, confidence and interpretation time of readers of differing experience in diagnosing acute appendicitis with contrast-enhanced CT using neutral vs positive oral contrast agents. Methods: Contrast-enhanced CT for right lower quadrant or right flank pain was performed in 200 patients with neutral and 200 with positive oral contrast including 199 with proven acute appendicitis and 201 with other diagnoses. Test set disease prevalence was 50%. Two experienced gastrointestinal radiologists, one fellow and two first-year residents blindly assessed all studies for appendicitis (2000 readings) and assigned confidence scores (1=poor to 4=excellent). Receiver operating characteristic (ROC) curves were generated. Total interpretation time was recorded. Each reader's interpretation with the two agents was compared using standard statistical methods. Results: Average reader sensitivity was found to be 96% (range 91–99%) with positive and 95% (89–98%) with neutral oral contrast; specificity was 96% (92–98%) and 94% (90–97%). For each reader, no statistically significant difference was found between the two agents (sensitivity p-values > 0.6; specificity p-values > 0.08), in the area under the ROC curve (range 0.95–0.99) or in average interpretation times. In cases without appendicitis, positive oral contrast demonstrated improved appendix identification (average 90% vs 78%) and higher confidence scores for three readers. Average interpretation times showed no statistically significant differences between the agents. Conclusion: Neutral vs positive oral contrast does not affect the accuracy of contrast-enhanced CT for diagnosing acute appendicitis. Although positive oral contrast might help to identify normal appendices, we continue to use neutral oral contrast given its other potential benefits. PMID:20959365
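Sensitivity and specificity, as used in this abstract, are simple ratios of reading counts. The sketch below shows the calculation for a single reader; the four counts are hypothetical, chosen only to give round percentages, since the abstract reports per-reader percentages rather than raw counts. The reading total, by contrast, follows from the abstract: five readers each assessed 400 studies.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased cases correctly called positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased cases correctly called negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts for one reader and one contrast agent (not from the study).
tp, fn = 96, 4   # readings of cases with proven appendicitis
tn, fp = 94, 6   # readings of cases with other diagnoses

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)

# Five readers (two radiologists, one fellow, two residents) x 400 studies.
total_readings = 5 * (200 + 200)

print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, readings={total_readings}")
```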
The Effect of a Reading Accommodation on Standardized Test Scores of Learning Disabled and Non Learning Disabled Students.

Science.gov (United States)

Meloy, Linda L.; Deville, Craig; Frisbie, David

The effect of the Read Aloud accommodation on the performances of learning disabled in reading (LD-R) and non-learning disabled (non LD) middle school students was studied using selected texts from the Iowa Tests of Basic Skills (ITBS) achievement battery. Science, Usage and Expression, Math Problem Solving and Data Interpretation, and Reading…
Interactive or static reports to guide clinical interpretation of cancer genomics.

Science.gov (United States)

Gray, Stacy W; Gagan, Jeffrey; Cerami, Ethan; Cronin, Angel M; Uno, Hajime; Oliver, Nelly; Lowenstein, Carol; Lederman, Ruth; Revette, Anna; Suarez, Aaron; Lee, Charlotte; Bryan, Jordan; Sholl, Lynette; Van Allen, Eliezer M

2018-05-01

Misinterpretation of complex genomic data presents a major challenge in the implementation of precision oncology. We sought to determine whether interactive genomic reports with embedded clinician education and optimized data visualization improved genomic data interpretation. We conducted a randomized, vignette-based survey study to determine whether exposure to interactive reports for a somatic gene panel, as compared to static reports, improves physicians' genomic comprehension and report-related satisfaction (overall scores calculated across 3 vignettes, range 0-18 and 1-4, respectively, higher score corresponding with improved endpoints). One hundred and five physicians at a tertiary cancer center participated (29% participation rate): 67% medical, 20% pediatric, 7% radiation, and 7% surgical oncology; 37% female. Prior to viewing the case-based vignettes, 34% of the physicians reported difficulty making treatment recommendations based on the standard static report. After vignette/report exposure, physicians' overall comprehension scores did not differ by report type (mean score: interactive 11.6 vs static 10.5, difference = 1.1, 95% CI, -0.3, 2.5, P = .13). However, physicians exposed to the interactive report were more likely to correctly assess sequencing quality (P < .001) and understand when reports needed to be interpreted with caution (eg, low tumor purity; P = .02). Overall satisfaction scores were higher in the interactive group (mean score 2.5 vs 2.1, difference = 0.4, 95% CI, 0.2-0.7, P = .001). Interactive genomic reports may improve physicians' ability to accurately assess genomic data and increase report-related satisfaction. Additional research in users' genomic needs and efforts to integrate interactive reports into electronic health records may facilitate the implementation of precision oncology.
Relationship Between Broiler Body Weights, Eimeria maxima Gross Lesion Scores, and Microscores in Three Anticoccidial Sensitivity Tests.

Science.gov (United States)

Barrios, Miguel A; Da Costa, Manuel; Kimminau, Emily; Fuller, Lorraine; Clark, Steven; Pesti, Gene; Beckstead, Robert

2017-06-01

Anticoccidial sensitivity tests (ASTs) serve to determine the efficacy of anticoccidial drugs against Eimeria field isolates in a controlled laboratory setting. The most commonly measured parameters are body weight gain, feed conversion ratio, gross intestinal lesion scores, and mortality. Due to the difficulty in reliably scoring gross lesion scores of Eimeria maxima , microscopic analysis of intestinal scrapings (microscores) can be used in the field to indicate the presence of this particular Eimeria. The goal of this study was to determine the relationship between E. maxima microscores and broiler body weights and gross E. maxima lesion scores in three ASTs. Day-old broiler chicks were raised for 12 days on a standard corn-soy diet. On Day 12, chicks were placed in Petersime batteries and treatment diets were provided. There were six birds per pen, four pens per treatment, and 12 treatments, for a total of 288 chicks per AST. The treatments were as follows: 1) nonmedicated, noninfected; 2) nonmedicated, infected; 3) lasalocid, infected; 4) salinomycin, infected; 5) diclazuril, infected; 6) monensin, infected; 7) decoquinate, infected; 8) narasin + nicarbazin, infected; 9) narasin, infected; 10) nicarbazin, infected; 11) robenidine, infected; and 12) zoalene, infected. On Day 14, chicks were challenged with an Eimeria field isolate by oral gavage. On Day 20, broilers were weighed, and gross lesion scores and microscores were classified from 0 to 4 depending on the severity of the gross lesion scores and E. maxima microscores. Data from three trials using different field isolates were statistically analyzed using a logarithmic regression model. There was no relationship (P = 0.1224) between microscores and body weight gain. There was a positive relationship between microscores and gross lesion scores (P = 0.004). However, there was also an interaction between isolate and treatment (P Eimeria or the amount of E. maxima in the inoculum.
Stroop Color-Word Interference Test: Normative data for Spanish-speaking pediatric population.

Science.gov (United States)

Rivera, D; Morlett-Paredes, A; Peñalver Guia, A I; Irías Escher, M J; Soto-Añari, M; Aguayo Arelis, A; Rute-Pérez, S; Rodríguez-Lorenzana, A; Rodríguez-Agudelo, Y; Albaladejo-Blázquez, N; García de la Cadena, C; Ibáñez-Alfonso, J A; Rodriguez-Irizarry, W; García-Guerrero, C E; Delgado-Mejía, I D; Padilla-López, A; Vergara-Moragues, E; Barrios Nevado, M D; Saracostti Schwartzman, M; Arango-Lasprilla, J C

2017-01-01

To generate normative data for the Stroop Word-Color Interference test in Spanish-speaking pediatric populations. The sample consisted of 4,373 healthy children from nine countries in Latin America (Chile, Cuba, Ecuador, Guatemala, Honduras, Mexico, Paraguay, Peru, and Puerto Rico) and Spain. Each participant was administered the Stroop Word-Color Interference test as part of a larger neuropsychological battery. The Stroop Word, Stroop Color, Stroop Word-Color, and Stroop Interference scores were normed using multiple linear regressions and standard deviations of residual values. Age, age², sex, and mean level of parental education (MLPE) were included as predictors in the analyses. The final multiple linear regression models showed main effects for age on all scores, except on Stroop Interference for Guatemala, such that scores increased linearly as a function of age. Age² affected Stroop Word scores for all countries, Stroop Color scores for Ecuador, Mexico, Peru, and Spain; Stroop Word-Color scores for Ecuador, Mexico, and Paraguay; and Stroop Interference scores for Cuba, Guatemala, and Spain. MLPE affected Stroop Word scores for Chile, Mexico, and Puerto Rico; Stroop Color scores for Mexico, Puerto Rico, and Spain; Stroop Word-Color scores for Ecuador, Guatemala, Mexico, Puerto Rico and Spain; and Stroop-Interference scores for Ecuador, Mexico, and Spain. Sex affected Stroop Word scores for Spain, Stroop Color scores for Mexico, and Stroop Interference for Honduras. This is the largest Spanish-speaking pediatric normative study in the world, and it will allow neuropsychologists from these countries to have a more accurate approach to interpret the Stroop Word-Color Interference test in pediatric populations.
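Regression-based norming of the kind described in this abstract is typically applied by predicting a child's expected score from the demographic predictors and then scaling the difference from the observed score by the residual standard deviation. The sketch below illustrates that idea only. Every coefficient, the residual SD, the variable coding, and the function names are hypothetical placeholders, not values from the study, and the published models may code or center the predictors differently.

```python
import math

def predicted_score(age: float, sex: int, mlpe: int, coef: dict) -> float:
    """Expected score from a linear model with age, age squared, sex and MLPE."""
    return (coef["intercept"] + coef["age"] * age + coef["age2"] * age ** 2
            + coef["sex"] * sex + coef["mlpe"] * mlpe)

def z_score(raw: float, predicted: float, sd_residual: float) -> float:
    """Standardized residual: how far the raw score is from expectation."""
    return (raw - predicted) / sd_residual

def percentile(z: float) -> float:
    """Percentile under a normal distribution of residuals (an assumption)."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical coefficients and residual SD for one score in one country.
coef = {"intercept": 10.0, "age": 5.0, "age2": -0.1, "sex": 1.0, "mlpe": 2.0}
sd_residual = 8.0

expected = predicted_score(age=10, sex=1, mlpe=0, coef=coef)
z = z_score(raw=45.0, predicted=expected, sd_residual=sd_residual)
pct = percentile(z)

print(expected, z, round(pct, 1))
```

A separate set of coefficients and a separate residual SD would be needed for each score and each country, since the abstract reports that the significant predictors differ across both.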
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Directory of Open Access Journals (Sweden)

Yota Uno

OBJECTIVE: The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. METHODS: The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. RESULTS: The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. CONCLUSION: The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
Teachers' perceptions of the effectiveness of an urban health sciences curriculum in closing the Black-White test score gap: A participatory case study

Science.gov (United States)

Prince, Joan Marie

1999-12-01

Over the past years, progress in Black academic achievement, particularly in the area of science, has generally slowed or ceased. According to the 1994 NAEP assessment, twelfth-grade Black students are performing at the level of White eighth-grade students in the discipline of science (Department of Education, 1996). These students, in their last year of required schooling, are about to graduate, yet they lag at least four years behind their white counterparts in science achievement. Despite the establishment and implementation of numerous science intervention programs, Black students still suffer from a disparate gap in standardized test score achievement. The purpose of this research is to investigate teachers' perceptions of the effectiveness of an urban sciences intervention tool that was designed to assist in narrowing the Black-White science academic achievement gap. Specifically, what factors affect teachers' personal sense of instructional efficacy, and how does this translate into their outcome expectancy for student academic success? A multiple-case, replicative design, grounded in descriptive theory, was selected for the study. Multiple sources of evidence were queried to provide robust findings. These sources included a validated health sciences self-efficacy instrument, an interview protocol, a classroom observation, and a review of archival material that included case study participants' personnel files and meeting minutes. A cross-comparative analytic approach was selected for interpretation (Yin, 1994). Findings indicate that teachers attribute the success or failure of educational intervention tools in closing the Black-White test score gap to a variety of internal and external factors. These factors included a perceived lack of both monetary and personal support by the school leadership, as well as a perceived lack of parental involvement which impacted negatively on student achievement patterns. The case study participants displayed a depressed
Objective interpretation as conforming interpretation

Directory of Open Access Journals (Sweden)

Lidka Rodak

2011-12-01

The practical discourse willingly uses the formula of “objective interpretation”, with no regard to its controversial nature, which has been discussed in the literature. The main aim of the article is to investigate what “objective interpretation” could mean and how it could be understood in the practical discourse, focusing on the understanding offered by judicature. The thesis of the article is that objective interpretation, as identified with the textualists’ position, is not possible to uphold, and should rather be linked with conforming interpretation. What this actually implies is that it is not the virtues of certainty and predictability, which are usually associated with objectivity, but coherence that forms the foundation of the applicability of objectivity in law. What can be observed from the analyses is that both conforming interpretation and objective interpretation play the role of arguments in the interpretive discourse, arguments that provide justification that an interpretation is not arbitrary or subjective. With regard to an important part of the ideology of legal application, namely the conviction that decisions should be taken on the basis of law in order to exclude arbitrariness, objective interpretation could be read as the question “what kind of authority ‘supports’ a certain interpretation?”, an interpretation that is almost never free of judicial creativity and judicial activism. One can say that objective and conforming interpretation are just further arguments used in legal discourse.
Associations between MMPI-2-RF validity scale scores and extra-test measures of personality and psychopathology.

Science.gov (United States)

Forbey, Johnathan D; Lee, Tayla T C; Ben-Porath, Yossef S; Arbisi, Paul A; Gartland, Diane

2013-08-01

The current study explored associations between two potentially invalidating self-report styles detected by the Validity scales of the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF), over-reporting and under-reporting, and scores on the MMPI-2-RF substantive, as well as eight collateral self-report measures administered either at the same time or within 1 to 10 days of MMPI-2-RF administration. Analyses were conducted with data provided by college students, male prisoners, and male psychiatric outpatients from a Veterans Administration facility. Results indicated that if either an over- or under-reporting response style was suggested by the MMPI-2-RF Validity scales, scores on the majority of the MMPI-2-RF substantive scales, as well as a number of collateral measures, were significantly affected in all three groups in the expected directions. Test takers who were identified as potentially engaging in an over- or under-reporting response style by the MMPI-2-RF Validity scales appeared to approach extra-test measures similarly regardless of when these measures were administered in relation to the MMPI-2-RF. Limitations and suggestions for future study are discussed.
Detection and validation of unscalable item score patterns using item response theory: an illustration with Harter's Self-Perception Profile for Children.

Science.gov (United States)

Meijer, Rob R; Egberink, Iris J L; Emons, Wilco H M; Sijtsma, Klaas

2008-05-01

We illustrate the usefulness of person-fit methodology for personality assessment. For this purpose, we use person-fit methods from item response theory. First, we give a nontechnical introduction to existing person-fit statistics. Second, we analyze data from Harter's (1985) Self-Perception Profile for Children (Harter, 1985) in a sample of children ranging from 8 to 12 years of age (N = 611) and argue that for some children, the scale scores should be interpreted with care and caution. Combined information from person-fit indexes and from observation, interviews, and self-concept theory showed that similar score profiles may have a different interpretation. For some children in the sample, item scores did not adequately reflect their trait level. Based on teacher interviews, this was found to be due most likely to a less developed self-concept and/or problems understanding the meaning of the questions. We recommend investigating the scalability of score patterns when using self-report inventories to help the researcher interpret respondents' behavior correctly.
The Role of Policy Assumptions in Validating High-stakes Testing Programs.

Science.gov (United States)

Kane, Michael

L. Cronbach has made the point that for validity arguments to be convincing to diverse audiences, they need to be based on assumptions that are credible to these audiences. The interpretations and uses of high stakes test scores rely on a number of policy assumptions about what should be taught in schools, and more specifically, about the content…
Differential Predictive Validity of High School GPA and College Entrance Test Scores for University Students in Yemen

Science.gov (United States)

Al-Hattami, Abdulghani Ali Dawod

2012-01-01

High school grade point average and college entrance test scores are two admission criteria that are currently used by most colleges in Yemen to select their prospective students. Given their widespread use, it is important to investigate their predictive validity to ensure the accuracy of the admission decisions in these institutions. This study…
Chronic obstructive pulmonary disease (COPD) assessment test scores corresponding to modified Medical Research Council grades among COPD patients.

Science.gov (United States)

Lee, Chang-Hoon; Lee, Jinwoo; Park, Young Sik; Lee, Sang-Min; Yim, Jae-Joon; Kim, Young Whan; Han, Sung Koo; Yoo, Chul-Gyu

2015-09-01

In assigning patients with chronic obstructive pulmonary disease (COPD) to subgroups according to the updated guidelines of the Global Initiative for Chronic Obstructive Lung Disease, discrepancies have been noted between the COPD assessment test (CAT) criteria and modified Medical Research Council (mMRC) criteria. We investigated the determinants of symptom and risk groups and sought to identify a better CAT criterion. This retrospective study included COPD patients seen between June 20, 2012, and December 5, 2012. The CAT score that can accurately predict an mMRC grade ≥ 2 versus COPD patients, the percentages of patients classified into subgroups A, B, C, and D were 24.5%, 47.2%, 4.2%, and 24.1% based on CAT criteria and 49.3%, 22.4%, 8.9%, and 19.4% based on mMRC criteria, respectively. More than 90% of the patients who met the mMRC criteria for the 'more symptoms group' also met the CAT criteria. AUROC and CART analyses suggested that a CAT score ≥ 15 predicted an mMRC grade ≥ 2 more accurately than the current CAT score criterion. During follow-up, patients with CAT scores of 10 to 14 did not have a different risk of exacerbation versus those with CAT scores COPD patients.
Optimization and Interpretation of Serial QuantiFERON Testing to Measure Acquisition of Mycobacterium tuberculosis Infection.

Science.gov (United States)

Nemes, Elisa; Rozot, Virginie; Geldenhuys, Hennie; Bilek, Nicole; Mabwe, Simbarashe; Abrahams, Deborah; Makhethe, Lebohang; Erasmus, Mzwandile; Keyser, Alana; Toefy, Asma; Cloete, Yolundi; Ratangee, Frances; Blauenfeldt, Thomas; Ruhwald, Morten; Walzl, Gerhard; Smith, Bronwyn; Loxton, Andre G; Hanekom, Willem A; Andrews, Jason R; Lempicki, Maria D; Ellis, Ruth; Ginsberg, Ann M; Hatherill, Mark; Scriba, Thomas J

2017-09-01

Conversion from a negative to positive QuantiFERON-TB test is indicative of Mycobacterium tuberculosis (Mtb) infection, which predisposes individuals to tuberculosis disease. Interpretation of serial tests is confounded by immunological and technical variability. To improve the consistency of serial QuantiFERON-TB testing algorithms and provide a data-driven definition of conversion. Sources of QuantiFERON-TB variability were assessed, and optimal procedures were identified. Distributions of IFN-γ response levels were analyzed in healthy adolescents, Mtb-unexposed control subjects, and patients with pulmonary tuberculosis. Individuals with no known Mtb exposure had IFN-γ values less than 0.2 IU/ml. Among individuals with IFN-γ values less than 0.2 IU/ml, 0.2-0.34 IU/ml, 0.35-0.7 IU/ml, and greater than 0.7 IU/ml, tuberculin skin test positivity results were 15%, 53%, 66%, and 91% (P 0.7 IU/ml) would allow more definitive detection of recent Mtb infection and potentially improve identification of those more likely to develop disease.
An Argument against Using Standardized Test Scores for Placement of International Undergraduate Students in English as a Second Language (ESL) Courses

Science.gov (United States)

Kokhan, Kateryna

2013-01-01

Development and administration of institutional ESL placement tests require a great deal of financial and human resources. Due to a steady increase in the number of international students studying in the United States, some US universities have started to consider using standardized test scores for ESL placement. The English Placement Test (EPT)…
Cross cultural comparison of JTCI inventory of temperament and character scores of 11-13 year olds

Directory of Open Access Journals (Sweden)

Dukanac Vesna

2008-01-01

The study compares characteristics of Serbian and American children on the dimensions of temperament and character on the Junior TCI (JTCI) for assessment of 9 to 13 year olds, based on Robert Cloninger’s Psychobiological model of temperament and character. Given the lack of assessment tools for this age group, the goal of the present study was to test the factor structure and main psychometric characteristics of the JTCI in order to determine the applicability of this questionnaire to Serbian children. The sample consisted of 222 boys and girls from the normal population, aged 11 to 13, who attended grades 6 to 8. The results showed significant differences between the Serbian and American samples. Namely, Serbian children had higher scores on Novelty seeking and Harm Avoidance and lower scores on Reward Dependence and Persistency. As to the Character Dimensions, Serbian children had lower scores on Reward dependence and persistency, and significantly lower on Self-directedness and Cooperativeness. Scores on Self-transcendence were higher among the Serbian children. The differences on Character dimensions between children from different cultures are presumed to be primarily a result of the socialization process. They reflect a lower level of maturity and cooperation, and probably a compensatory reliance on religion. Although it is a temperament dimension, being prone to negative emotions (higher scores on Danger avoidance) may also be a result of situational sensitivity. This result could be interpreted as a reflection of the negative effects that the general sociocultural milieu had on the children who grew up during the social crisis and transitional periods of our society. The results did not confirm a seven-factor personality structure of children in this age group. It is likely that at the age of 11 to 13, dimensions of character and temperament are not yet clearly differentiated. Finally, poor reliability of the JTCI scales imposes
Vocational students' learning preferences: the interpretability of ipsative data.

Science.gov (United States)

Smith, P J

2000-02-01

A number of researchers have argued that ipsative data are not suitable for statistical procedures designed for normative data. Others have argued that the interpretability of such analyses of ipsative data are little affected where the number of variables and the sample size are sufficiently large. The research reported here represents a factor analysis of the scores on the Canfield Learning Styles Inventory for 1,252 students in vocational education. The results of the factor analysis of these ipsative data were examined in a context of existing theory and research on vocational students and lend support to the argument that the factor analysis of ipsative data can provide sensibly interpretable results.

The interpretation of proverbs by elderly with high, medium and low educational level: Abstract reasoning as an aspect of executive functions

Science.gov (United States)

Wachholz, Thalita Bianchi de Oliveira; Yassuda, Mônica Sanches

2011-01-01

It is now known that cognitive functions tend to decline with age. Executive functions (EF) are among the first abilities to decline with aging. A subcomponent of the EF is abstract reasoning. The Test of Proverbs is an instrument that can be used to evaluate the capacity of abstract reasoning. Objective: To examine the association of performance in interpretation of proverbs with education and with episodic memory and EF tasks. Methods: A total of 67 individuals aged between 60 and 75 years were evaluated, and divided into three categories of education: 1-4 years, 5-8 years, and 9 or more years of schooling. The instruments used were a sociodemographic questionnaire (gender, age, marital status, education, income, previous occupation, current occupation and health perception), the Mini Mental State Examination, the Brief Cognitive Screening Battery, the Geriatric Depression Scale, Forward and Backward Digit Span (WAIS-III), and the Test of Proverbs. Results: A high impact of education was seen on the interpretation of proverbs, with lower performance among the elderly with less education. A significant association between performance on the Test of Proverbs and scores on the MMSE, GDS, and verbal fluency tests was found. There was a modest association with incidental memory. Conclusions: The capacity to interpret proverbs is strongly associated with education and with performance on other EF tasks. PMID:29213717
Application of a numerical model in the interpretation of a leaky aquifer test

International Nuclear Information System (INIS)

Schroth, B.; Narasimhan, T.N.

1997-01-01

The potential use of numerical models in aquifer analysis is by no means a new concept; yet relatively few engineers and scientists are taking advantage of this powerful tool that is more convenient to use now than ever before. In this technical note the authors present an example of using a numerical model in an integrated analysis of data from a three-layer leaky aquifer system involving well-bore storage, skin effects, variable discharge, and observation wells in the pumped aquifer and in an unpumped aquifer. The modeling detail may differ for other cases. The intent is to show that interpretation can be achieved with reduced bias by reducing assumptions in regard to system geometry, flow rate, and other details. A multiwell aquifer test was carried out at a site on the western part of the Lawrence Livermore National Laboratory (LLNL), located about 60 kilometers east of San Francisco. The test was conducted to hydraulically characterize one part of the site and thus help develop remediation strategies to alleviate the ground-water contamination
Interpretation bias and anxiety in childhood: stability, specificity and longitudinal associations.

Science.gov (United States)

Creswell, Cathy; O'Connor, Thomas G

2011-03-01

Biases in the interpretation of ambiguous material are central to cognitive models of anxiety; however, understanding of the association between interpretation and anxiety in childhood is limited. To address this, a prospective investigation of the stability and specificity of anxious cognitions and anxiety and the relationship between these factors was conducted. Sixty-five children (10-11 years) from a community sample completed measures of self-reported anxiety, depression, and conduct problems, and responded to ambiguous stories at three time points over one-year. Individual differences in biases in interpretation of ambiguity (specifically "anticipated distress" and "threat interpretation") were stable over time. Furthermore, anticipated distress and threat interpretation were specifically associated with anxiety symptoms. Distress anticipation predicted change in anxiety symptoms over time. In contrast, anxiety scores predicted change in threat interpretation over time. The results suggest that different cognitive constructs may show different longitudinal links with anxiety. These preliminary findings extend research and theory on anxious cognitions and their link with anxiety in children, and suggest that these cognitive processes may be valuable targets for assessment and intervention.
Soetomo score: score model in early identification of acute haemorrhagic stroke

Directory of Open Access Journals (Sweden)

Moh Hasan Machfoed

2016-06-01

Aim of the study: Where brain imaging is limited by financial or facility constraints, a score model is used to predict the occurrence of acute haemorrhagic stroke. Accordingly, this study attempts to develop a new score model, called the Soetomo score. Material and methods: The researchers performed a cross-sectional study of 176 acute stroke patients with onset of ≤24 hours who visited the emergency unit of Dr. Soetomo Hospital from July 14th to December 14th, 2014. The diagnosis of haemorrhagic stroke was confirmed by head computed tomography scan. Seven predictors of haemorrhagic stroke were analysed using bivariate and multivariate analyses. Furthermore, a multiple discriminant analysis resulted in an equation for the Soetomo score model. The receiver operating characteristic procedure yielded the area under the curve and the intersection point identifying haemorrhagic stroke. Afterward, the diagnostic test value was determined. Results: The equation of the Soetomo score model was (3 × loss of consciousness) + (3.5 × headache) + (4 × vomiting) − 4.5. The area under the curve of this score was 88.5% (95% confidence interval = 83.3–93.7%). At a Soetomo score model value of ≥−0.75, the score reached a sensitivity of 82.9%, specificity of 83%, positive predictive value of 78.8%, negative predictive value of 86.5%, positive likelihood ratio of 4.88, negative likelihood ratio of 0.21, false negative rate of 17.1%, false positive rate of 17%, and accuracy of 83%. Conclusions: A Soetomo score model value of ≥−0.75 can properly identify acute haemorrhagic stroke where brain imaging is limited by financial or facility constraints.
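The scoring rule in the abstract above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the abstract does not say how the three predictors are coded, so the sketch assumes each is 0 (absent) or 1 (present), and the function names are hypothetical. The second helper recomputes the likelihood ratios from the reported sensitivity and specificity as a consistency check.

```python
# Illustrative sketch only; assumes each predictor is coded 0 (absent) / 1 (present).
# Function names are hypothetical and do not come from the article.

CUTOFF = -0.75  # threshold reported in the abstract (score >= -0.75)


def soetomo_score(loss_of_consciousness: bool, headache: bool, vomiting: bool) -> float:
    """Score = 3*LOC + 3.5*headache + 4*vomiting - 4.5 (equation from the abstract)."""
    return 3.0 * loss_of_consciousness + 3.5 * headache + 4.0 * vomiting - 4.5


def suggests_haemorrhagic(score: float) -> bool:
    """Apply the reported cutoff to a computed score."""
    return score >= CUTOFF


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Standard definitions: LR+ = sens / (1 - spec), LR- = (1 - sens) / spec."""
    return sensitivity / (1.0 - specificity), (1.0 - sensitivity) / specificity


if __name__ == "__main__":
    s = soetomo_score(loss_of_consciousness=True, headache=True, vomiting=False)
    print(s, suggests_haemorrhagic(s))
    lr_pos, lr_neg = likelihood_ratios(0.829, 0.83)
    print(round(lr_pos, 2), round(lr_neg, 2))
```

Under the 0/1 coding assumption, headache alone gives −1.0 (below the cutoff) while vomiting alone gives −0.5 (above it), and the reported sensitivity and specificity reproduce the stated likelihood ratios of 4.88 and 0.21.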
Scoring of radiation-induced micronuclei in cytokinesis-blocked human lymphocytes by automated image analysis

International Nuclear Information System (INIS)

Verhaegen, F.; Seuntjens, J.; Thierens, H.

1994-01-01

The micronucleus assay in human lymphocytes is, at present, frequently used to assess chromosomal damage caused by ionizing radiation or mutagens. Manual scoring of micronuclei (MN) by trained personnel is very time-consuming, tiring work, and the results depend on subjective interpretation of scoring criteria. More objective scoring can be accomplished only if the test can be automated. Furthermore, an automated system allows scoring of large numbers of cells, thereby increasing the statistical significance of the results. This is of special importance for screening programs for low doses of chromosome-damaging agents. In this paper, the first results of our effort to automate the micronucleus assay with an image-analysis system are presented. The method we used is described in detail, and the results are compared to those of other groups. Our system is able to detect 88% of the binucleated lymphocytes on the slides. The procedure consists of a fully automated localization of binucleated cells and counting of the MN within these cells, followed by a simple and fast manual operation in which the false positives are removed. Preliminary measurements for blood samples irradiated with a dose of 1 Gy X-rays indicate that the automated system can find 89% ± 12% of the micronuclei within the binucleated cells compared to a manual screening. 18 refs., 8 figs., 1 tab
Handbook of univariate and multivariate data analysis and interpretation with SPSS

CERN Document Server

Ho, Robert

2006-01-01

Many statistics texts tend to focus more on the theory and mathematics underlying statistical tests than on their applications and interpretation. This can leave readers with little understanding of how to apply statistical tests or how to interpret their findings. While the SPSS statistical software has done much to alleviate the frustrations of social science professionals and students who must analyze data, they still face daunting challenges in selecting the proper tests, executing the tests, and interpreting the test results. With emphasis firmly on such practical matters, this handbook se
Educational technology improves ECG interpretation of acute myocardial infarction among medical students and emergency medicine residents.

Science.gov (United States)

Pourmand, Ali; Tanski, Mary; Davis, Steven; Shokoohi, Hamid; Lucas, Raymond; Zaver, Fareen

2015-01-01

Asynchronous online training has become an increasingly popular educational format in the new era of technology-based professional development. We sought to evaluate the impact of an online asynchronous training module on the ability of medical students and emergency medicine (EM) residents to detect electrocardiogram (ECG) abnormalities of an acute myocardial infarction (AMI). We developed an online ECG training and testing module on AMI, with emphasis on recognizing ST elevation myocardial infarction (MI) and early activation of cardiac catheterization resources. Study participants included senior medical students and EM residents at all post-graduate levels rotating in our emergency department (ED). Participants were given a baseline set of ECGs for interpretation. This was followed by a brief interactive online training module on normal ECGs as well as abnormal ECGs representing an acute MI. Participants then underwent a post-test with a set of ECGs in which they had to interpret and decide appropriate intervention including catheterization lab activation. 148 students and 35 EM residents participated in this training in the 2012-2013 academic year. Students and EM residents showed significant improvements in recognizing ECG abnormalities after taking the asynchronous online training module. The mean score on the testing module for students improved from 5.9 (95% CI [5.7-6.1]) to 7.3 (95% CI [7.1-7.5]), with a mean difference of 1.4 (95% CI [1.12-1.68]) (p<0.0001). The mean score for residents improved significantly from 6.5 (95% CI [6.2-6.9]) to 7.8 (95% CI [7.4-8.2]) (p<0.0001). An online interactive module of training improved the ability of medical students and EM residents to correctly recognize the ECG evidence of an acute MI.
PI-RADS version 2: quantitative analysis aids reliable interpretation of diffusion-weighted imaging for prostate cancer

Energy Technology Data Exchange (ETDEWEB)

Park, Sung Yoon; Jung, Dae Chul; Oh, Young Taik [Yonsei University College of Medicine, Department of Radiology and Research Institute of Radiological Science, Severance Hospital, Seoul (Korea, Republic of); Shin, Su-Jin [Yonsei University College of Medicine, Department of Pathology, Seoul (Korea, Republic of); Hanyang University College of Medicine, Department of Pathology, Seoul (Korea, Republic of); Cho, Nam Hoon [Yonsei University College of Medicine, Department of Pathology, Seoul (Korea, Republic of); Choi, Young Deuk; Rha, Koon Ho; Hong, Sung Joon [Yonsei University College of Medicine, Department of Urology, Seoul (Korea, Republic of)

2017-07-15

To determine whether apparent diffusion coefficient (ADC) ratio aids reliable interpretation of diffusion-weighted imaging (DWI) for prostate cancer (PCa). Seventy-six consecutive patients with PCa who underwent DWI and surgery were included. Based on pathologic tumour location, two readers independently performed DWI scoring according to the revised Prostate Imaging Reporting and Data System (PI-RADSv2). ADC ratios of benign to cancerous prostatic tissue were then measured independently and compared between cases showing concordant and discordant DWI scores ≥4. Area under the curve (AUC) and threshold of ADC ratio were analyzed for DWI scores ≥4. The rate of inter-reader disagreement for DWI score ≥4 was 11.8% (9/76). ADC ratios were higher in concordant vs. discordant DWI scores ≥4 (median, 1.7 vs. 1.1-1.2; p < 0.001). For DWI scores ≥4, the AUCs of ADC ratios were 0.970 for reader 1 and 0.959 for reader 2. In patients with an ADC ratio >1.3, the rate of inter-reader disagreement for DWI score ≥4 decreased to 5.9-6.0%. An ADC ratio >1.3 yielded 100% (reader 1, 54/54; reader 2, 51/51) positive predictive value for clinically significant cancer. ADC ratios may be useful for reliable interpretation of DWI score ≥4 in PI-RADSv2. (orig.)
Cardiopulmonary exercise testing (CPET) in the United Kingdom-a national survey of the structure, conduct, interpretation and funding.

Science.gov (United States)

Reeves, T; Bates, S; Sharp, T; Richardson, K; Bali, S; Plumb, J; Anderson, H; Prentis, J; Swart, M; Levett, D Z H

2018-01-01

Cardiopulmonary exercise testing (CPET) is an exercise stress test with concomitant expired gas analysis that provides an objective, non-invasive measure of functional capacity under stress. CPET-derived variables predict postoperative morbidity and mortality after major abdominal and thoracic surgery. Two previous surveys have reported increasing utilisation of CPET preoperatively in England. We aimed to evaluate current CPET practice in the UK, to identify who performs CPET, how it is performed, how the data generated are used and the funding models. All anaesthetic departments in trusts with adult elective surgery in the UK were contacted by telephone to obtain contacts for their pre-assessment and CPET service leads. An online survey was sent to all leads between November 2016 and March 2017. The response rate to the online survey was 73.1% (144/197) with 68.1% (98/144) reporting an established clinical service and 3.5% (5/144) setting up a service. Approximately 30,000 tests are performed a year with 93.0% (80/86) using cycle ergometry. Colorectal surgical patients are the most frequently tested (89.5%, 77/86). The majority of tests are performed and interpreted by anaesthetists. There is variability in the methods of interpretation and reporting of CPET and limited external validation of results. This survey has identified the continued expansion of perioperative CPET services in the UK which have doubled since 2011. The vast majority of CPET tests are performed and reported by anaesthetists. It has highlighted variation in practice and a lack of standardised reporting implying a need for practice guidelines and standardised training to ensure high-quality data to inform perioperative decision making.
Meaningful Change Scores in the Knee Injury and Osteoarthritis Outcome Score in Patients Undergoing Anterior Cruciate Ligament Reconstruction

DEFF Research Database (Denmark)

Ingelsrud, Lina Holm; Terwee, Caroline B; Terluin, Berend

2018-01-01

BACKGROUND: Meaningful change scores in the Knee injury and Osteoarthritis Outcome Score (KOOS) in patients undergoing anterior cruciate ligament (ACL) reconstruction have not yet been established. PURPOSE: To define the minimal important change (MIC) for the KOOS after ACL reconstruction. STUDY...... data for at least one of the KOOS subscales were obtained from 542 (45.3%) participants. Predictive modeling MIC values were 12.1 for the KOOS subscales of Sport and Recreational Function and 18.3 for Knee-Related Quality of Life. These values aid in interpreting within-group improvement over time...... and can be used as responder criteria when comparing groups. The corresponding and much lower values for the subscales of Pain (2.5), Symptoms (-1.2), and Activities of Daily Living (2.4) are the results from patients reporting, on average, only mild problems with these domains preoperatively. Although 4...
Challenges in interpretation of thyroid function tests in pregnant women with autoimmune thyroid disease

DEFF Research Database (Denmark)

Feldt-Rasmussen, Ulla; Bliddal, Sofie; Rasmussen, Åse Krogh

2011-01-01

Physiological changes during gestation are important to be aware of in measurement and interpretation of thyroid function tests in women with autoimmune thyroid diseases. Thyroid autoimmune activity is decreasing in pregnancy. Measurement of serum TSH is the first-line screening variable....... Measurement of antithyroperoxidase and/or TSH receptor antibodies adds to the differential diagnosis of autoimmune and nonautoimmune thyroid diseases....... for thyroid dysfunction also in pregnancy. However, using serum TSH for control of treatment of maternal thyroid autoimmunity infers a risk for compromised foetal development. Peripheral thyroid hormone values are highly different among laboratories, and there is a need for laboratory-specific gestational age...
External Validation of the Simple Clinical Score and the HOTEL Score, Two Scores for Predicting Short-Term Mortality after Admission to an Acute Medical Unit

DEFF Research Database (Denmark)

Stræde, Mia; Brabrand, Mikkel

2014-01-01

with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. METHODS: Pre-planned prospective observational cohort study. SETTING: Danish 460.......932 to 0.988) for 24-hours mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ2 = 2.68 (10 degrees of freedom), P = 0.998 and χ2 = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95......% CI, 0.901-0.962) for 24-hours mortality and goodness-of-fit test, χ2 = 5.56 (10 degrees of freedom), P = 0.234. CONCLUSION: We find that both the SCS and HOTEL scores showed an excellent to outstanding ability in identifying patients at high risk of dying with good or acceptable precision....
Psychometric properties for the Balanced Inventory of Desirable Responding: dichotomous versus polytomous conventional and IRT scoring.

Science.gov (United States)

Vispoel, Walter P; Kim, Han Yi

2014-09-01

[Correction Notice: An Erratum for this article was reported in Vol 26(3) of Psychological Assessment (see record 2014-16017-001). The mean, standard deviation and alpha coefficient originally reported in Table 1 should be 74.317, 10.214 and .802, respectively. The validity coefficients in the last column of Table 4 are affected as well. Correcting this error did not change the substantive interpretations of the results, but did increase the mean, standard deviation, alpha coefficient, and validity coefficients reported for the Honesty subscale in the text and in Tables 1 and 4. The corrected versions of Tables 1 and Table 4 are shown in the erratum.] Item response theory (IRT) models were applied to dichotomous and polytomous scoring of the Self-Deceptive Enhancement and Impression Management subscales of the Balanced Inventory of Desirable Responding (Paulhus, 1991, 1999). Two dichotomous scoring methods reflecting exaggerated endorsement and exaggerated denial of socially desirable behaviors were examined. The 1- and 2-parameter logistic models (1PLM, 2PLM, respectively) were applied to dichotomous responses, and the partial credit model (PCM) and graded response model (GRM) were applied to polytomous responses. For both subscales, the 2PLM fit dichotomous responses better than did the 1PLM, and the GRM fit polytomous responses better than did the PCM. Polytomous GRM and raw scores for both subscales yielded higher test-retest and convergent validity coefficients than did PCM, 1PLM, 2PLM, and dichotomous raw scores. Information plots showed that the GRM provided consistently high measurement precision that was superior to that of all other IRT models over the full range of both construct continuums. Dichotomous scores reflecting exaggerated endorsement of socially desirable behaviors provided noticeably weak precision at low levels of the construct continuums, calling into question the use of such scores for detecting instances of "faking bad." Dichotomous
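For readers unfamiliar with the dichotomous models named in the abstract above, the following Python sketch shows the standard textbook form of the 1- and 2-parameter logistic item response functions and the 2PL item information function. It is not taken from the article: the parameter names (theta for the trait level, a for discrimination, b for difficulty) follow common IRT notation, and fixing the shared 1PL discrimination at 1.0 is an assumption made purely for illustration.

```python
import math

# Textbook forms of the 1PL/2PL models; not the article's estimation code.


def p_2pl(theta: float, a: float, b: float) -> float:
    """2PL: probability of endorsing an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_1pl(theta: float, b: float) -> float:
    """1PL: same curve with one discrimination shared by all items (assumed 1.0 here)."""
    return p_2pl(theta, 1.0, b)


def item_information_2pl(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


if __name__ == "__main__":
    # A more discriminating item separates respondents near its difficulty more sharply.
    for a in (0.5, 1.0, 2.0):
        print(a, round(p_2pl(1.0, a, 0.0), 3), round(item_information_2pl(0.0, a, 0.0), 3))
```

Information peaks where the trait level equals the item difficulty, which is why items that are all hard or all easy to endorse give weak precision at the opposite end of the continuum.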
A Case for Adjusting Subjectively Rated Scores in the Advanced Placement Tests. Program Statistics Research. Technical Report No. 94-5.

Science.gov (United States)

Longford, Nicholas T.

A case is presented for adjusting the scores for free response items in the Advanced Placement (AP) tests. Using information about the rating process from the reliability studies, administrations of the AP test for three subject areas, psychology, computer science, and English language and composition, are analyzed. In the reliability studies, 299…
Use of modeling and simulation in the planning, analysis and interpretation of ultrasonic testing

International Nuclear Information System (INIS)

Algernon, Daniel; Grosse, Christian U.

2016-01-01

Acoustic testing methods such as ultrasound and impact echo are important tools in building diagnostics. Applications include thickness measurements, imaging of the internal component geometry, detection of voids (gravel pockets) and delaminations, and possibly locating grouting faults inside the metallic cladding tubes of tendon ducts. Acoustic methods for non-destructive testing (NDT) are based on the excitation of elastic waves that interact with the target object (e.g. a discontinuity in the component) at the acoustic interface. This interaction is to be detected in the signal received at the component surface and interpreted to draw conclusions about the presence of the target object and, optionally, to determine its approximate size and position. Although the basic physical principles underlying the application of elastic waves in NDT are known, testing can be complicated by restricted access, component geometries, or the type and form of reflectors. Even estimating the chances of success of a test is often not trivial. These circumstances highlight the importance of simulations, which provide a theoretically sound basis for testing and allow test systems to be optimized easily. The applicable simulation methods are varied. Common ones are, in particular, the finite element method, the Elastodynamic Finite Integration Technique, and semi-analytical calculation methods.
Pre-season adductor squeeze test and HAGOS function sport and recreation subscale scores predict groin injury in Gaelic football players.

Science.gov (United States)

Delahunt, Eamonn; Fitzpatrick, Helen; Blake, Catherine

2017-01-01

To determine if pre-season adductor squeeze test and HAGOS function, sport and recreation subscale scores can identify Gaelic football players at risk of developing groin injury. Prospective study. Senior inter-county Gaelic football team. Fifty-five male elite Gaelic football players (age = 24.0 ± 2.8 years, body mass = 84.48 ± 7.67 kg, height = 1.85 ± 0.06 m, BMI = 24.70 ± 1.77 kg/m²) from a single senior inter-county Gaelic football team. Occurrence of groin injury during the season. Ten time-loss groin injuries were registered representing 13% of all injuries. The odds ratio for sustaining a groin injury if pre-season adductor squeeze test score was below 225 mmHg, was 7.78. The odds ratio for sustaining a groin injury if pre-season HAGOS function, sport and recreation subscale score was football players at risk of developing groin injury. Copyright © 2016 Elsevier Ltd. All rights reserved.
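Illustrative note (not part of the cited record): an odds ratio such as the one reported above is computed from a 2x2 table of exposure (score below the cutoff or not) against outcome (injured or not). The sketch below uses invented counts, since the abstract does not give the underlying table; the function name and the numbers are assumptions for illustration only.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from a 2x2 table: (a * d) / (b * c)."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

# Hypothetical counts, NOT the study's data:
# below cutoff: 6 injured, 9 uninjured; at or above cutoff: 4 injured, 36 uninjured
example_or = odds_ratio(6, 9, 4, 36)
print(example_or)
```

An odds ratio above 1 means the odds of injury are higher in the group below the cutoff; with small cell counts the confidence interval around such an estimate is typically wide.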
The physical interpretation of the parameters measured during the tensile testing of materials at elevated temperatures

International Nuclear Information System (INIS)

Burton, B.

1984-01-01

Hot tensile (or compression) testing, where the stress developed in a material is measured under an imposed strain rate, is often used as an alternative to conventional creep testing. The advantages of the hot tensile test are that its duration can be more closely controlled by the experimenter and also that the technique is more convenient, since high precision testing machines are available. The main disadvantage is that the interpretation of results is more complex. The present paper relates the parameters which are measured in hot tensile tests, to physical processes which occur in materials deforming by a variety of mechanisms. For cases where no significant structural changes occur, as in viscous or superplastic flow, analytical expressions are derived which relate the stresses measured in these tests to material constants. When deformation is controlled by recovery processes, account has to be taken of the structural changes which occur concurrently. A wide variety of behaviour may then be exhibited which depends on the initial dislocation density, the presence of second-phase particles and the relative values of the recovery rate parameters and the velocity imposed by the testing machine. Numerical examples are provided for simple recovery models. (author)
Performance on large-scale science tests: Item attributes that may impact achievement scores

Science.gov (United States)

Gordon, Janet Victoria

Significant differences in achievement among ethnic groups persist on the eighth-grade science Washington Assessment of Student Learning (WASL). The WASL measures academic performance in science using both scenario and stand-alone question types. Previous research suggests that presenting target items connected to an authentic context, like scenario question types, can increase science achievement scores especially in underrepresented groups and thus help to close the achievement gap. The purpose of this study was to identify significant differences in performance between gender and ethnic subgroups by question type on the 2005 eighth-grade science WASL. MANOVA and ANOVA were used to examine relationships between gender and ethnic subgroups as independent variables with achievement scores on scenario and stand-alone question types as dependent variables. MANOVA revealed no significant effects for gender, suggesting that the 2005 eighth-grade science WASL was gender neutral. However, there were significant effects for ethnicity. ANOVA revealed significant effects for ethnicity and ethnicity by gender interaction in both question types. Effect sizes were negligible for the ethnicity by gender interaction. Large effect sizes between ethnicities on scenario question types became moderate to small effect sizes on stand-alone question types. This indicates the score advantage the higher performing subgroups had over the lower performing subgroups was not as large on stand-alone question types compared to scenario question types. A further comparison examined performance on multiple-choice items only within both question types. Similar achievement patterns between ethnicities emerged; however, achievement patterns between genders changed in boys' favor. Scenario question types appeared to register differences between ethnic groups to a greater degree than stand-alone question types. These differences may be attributable to individual differences in cognition
Do medical students’ scores using different assessment instruments predict their scores in clinical reasoning using a computer-based simulation?

Directory of Open Access Journals (Sweden)

Fida M

2015-02-01

Mariam Fida (1), Salah Eldin Kassab (2). (1) Department of Molecular Medicine, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Bahrain; (2) Department of Medical Education, Faculty of Medicine, Suez Canal University, Ismailia, Egypt. Purpose: The development of clinical problem-solving skills evolves over time and requires structured training and background knowledge. Computer-based case simulations (CCS) have been used for teaching and assessment of clinical reasoning skills. However, previous studies examining the psychometric properties of CCS as an assessment tool have been controversial. Furthermore, studies reporting the integration of CCS into problem-based medical curricula have been limited. Methods: This study examined the psychometric properties of using CCS software (DxR Clinician) for assessment of medical students (n=130) studying in a problem-based, integrated multisystem module (Unit IX) during the academic year 2011–2012. Internal consistency reliability of CCS scores was calculated using Cronbach's alpha statistics. The relationships between students' scores in CCS components (clinical reasoning, diagnostic performance, and patient management) and their scores in other examination tools at the end of the unit, including multiple-choice questions, short-answer questions, objective structured clinical examination (OSCE), and real patient encounters, were analyzed using stepwise hierarchical linear regression. Results: Internal consistency reliability of CCS scores was high (α=0.862). Inter-item correlations between students' scores in different CCS components and their scores in CCS and other test items were statistically significant. Regression analysis indicated that OSCE scores predicted 32.7% and 35.1% of the variance in clinical reasoning and patient management scores, respectively (P<0.01). Multiple-choice question scores, however, predicted only 15.4% of the variance in diagnostic performance scores (P<0.01), while
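Illustrative note (not part of the cited record): Cronbach's alpha, the internal-consistency statistic named in the abstract above, compares the sum of the per-component variances with the variance of the total score. The following is a minimal sketch using only the Python standard library; the function name and the demo data are assumptions, not the study's data or software.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one row per student, each row holding k component scores."""
    k = len(scores[0])
    # Variance of each component across students
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each student's total score
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 4 students x 3 components that rise and fall together
demo = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

Because the three demo components are perfectly correlated with equal variances, alpha reaches its upper bound of 1; real data give lower values.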
The Veterans Affairs Cardiac Risk Score: Recalibrating the Atherosclerotic Cardiovascular Disease Score for Applied Use.

Science.gov (United States)

Sussman, Jeremy B; Wiitala, Wyndy L; Zawistowski, Matthew; Hofer, Timothy P; Bentley, Douglas; Hayward, Rodney A

2017-09-01

Accurately estimating cardiovascular risk is fundamental to good decision-making in cardiovascular disease (CVD) prevention, but risk scores developed in one population often perform poorly in dissimilar populations. We sought to examine whether a large integrated health system can use their electronic health data to better predict individual patients' risk of developing CVD. We created a cohort using all patients ages 45-80 who used Department of Veterans Affairs (VA) ambulatory care services in 2006 with no history of CVD, heart failure, or loop diuretics. Our outcome variable was new-onset CVD in 2007-2011. We then developed a series of recalibrated scores, including a fully refit "VA Risk Score-CVD (VARS-CVD)." We tested the different scores using standard measures of prediction quality. For the 1,512,092 patients in the study, the Atherosclerotic cardiovascular disease risk score had similar discrimination as the VARS-CVD (c-statistic of 0.66 in men and 0.73 in women), but the Atherosclerotic cardiovascular disease model had poor calibration, predicting 63% more events than observed. Calibration was excellent in the fully recalibrated VARS-CVD tool, but simpler techniques tested proved less reliable. We found that local electronic health record data can be used to estimate CVD better than an established risk score based on research populations. Recalibration improved estimates dramatically, and the type of recalibration was important. Such tools can also easily be integrated into health system's electronic health record and can be more readily updated.
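Illustrative note (not part of the cited record): one simple way to express the calibration problem described above is the ratio of expected (predicted) events to observed events, where a ratio of 1.63 corresponds to "predicting 63% more events than observed". The sketch below assumes per-patient predicted risks and 0/1 outcomes; the function name and the numbers are hypothetical, and the paper's actual calibration measures are not stated in the abstract.

```python
def expected_to_observed(predicted_risks, outcomes):
    """Ratio of the sum of predicted risks to the number of observed events."""
    expected = sum(predicted_risks)
    observed = sum(outcomes)
    return expected / observed

# Hypothetical cohort: 100 patients, each with a predicted risk of 16.3%,
# of whom 10 actually developed the outcome
risks = [0.163] * 100
events = [1] * 10 + [0] * 90
ratio = expected_to_observed(risks, events)
print(round(ratio, 2))
```

A ratio near 1 indicates good calibration in the large; it says nothing about discrimination, which the abstract reports separately as the c-statistic.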

The Machine Scoring of Writing

Science.gov (United States)

McCurry, Doug

2010-01-01

This article provides an introduction to the kind of computer software that is used to score student writing in some high stakes testing programs, and that is being promoted as a teaching and learning tool to schools. It sketches the state of play with machines for the scoring of writing, and describes how these machines work and what they do.…
Interpreting Mini-Mental State Examination Performance in Highly Proficient Bilingual Spanish-English and Asian Indian-English Speakers: Demographic Adjustments, Item Analyses, and Supplemental Measures.

Science.gov (United States)

Milman, Lisa H; Faroqi-Shah, Yasmeen; Corcoran, Chris D; Damele, Deanna M

2018-04-17

Performance on the Mini-Mental State Examination (MMSE), among the most widely used global screens of adult cognitive status, is affected by demographic variables including age, education, and ethnicity. This study extends prior research by examining the specific effects of bilingualism on MMSE performance. Sixty independent community-dwelling monolingual and bilingual adults were recruited from eastern and western regions of the United States in this cross-sectional group study. Independent sample t tests were used to compare 2 bilingual groups (Spanish-English and Asian Indian-English) with matched monolingual speakers on the MMSE, demographically adjusted MMSE scores, MMSE item scores, and a nonverbal cognitive measure. Regression analyses were also performed to determine whether language proficiency predicted MMSE performance in both groups of bilingual speakers. Group differences were evident on the MMSE, on demographically adjusted MMSE scores, and on a small subset of individual MMSE items. Scores on a standardized screen of language proficiency predicted a significant proportion of the variance in the MMSE scores of both bilingual groups. Bilingual speakers demonstrated distinct performance profiles on the MMSE. Results suggest that supplementing the MMSE with a language screen, administering a nonverbal measure, and/or evaluating item-based patterns of performance may assist with test interpretation for this population.
Development and validation of a composite scoring system for robot-assisted surgical training--the Robotic Skills Assessment Score.

Science.gov (United States)

Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A

2013-12-01

A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on the previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety-determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum, and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop the Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in the operative field, critical error, economy, bimanual dexterity, and time. The RSA-Score was then further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
A score for measuring health risk perception in environmental surveys.

Science.gov (United States)

Marcon, Alessandro; Nguyen, Giang; Rava, Marta; Braggion, Marco; Grassi, Mario; Zanolin, Maria Elisabetta

2015-09-15

In environmental surveys, risk perception may be a source of bias when information on health outcomes is reported using questionnaires. Using the data from a survey carried out in the largest chipboard industrial district in Italy (Viadana, Mantova), we devised a score of health risk perception and described its determinants in an adult population. In 2006, 3697 parents of children were administered a questionnaire that included ratings on 7 environmental issues. Item dimensionality was studied by factor analysis. After testing equidistance across response options by homogeneity analysis, a risk perception score was devised by summing up item ratings. Factor analysis identified one latent factor, which we interpreted as health risk perception, that explained 65.4% of the variance of five items retained after scaling. The scale (range 0-10, mean ± SD 9.3 ± 1.9) had a good internal consistency (Cronbach's alpha 0.87). Most subjects (80.6%) expressed maximum risk perception (score = 10). Italian mothers showed significantly higher risk perception than foreign fathers. Risk perception was higher for parents of young children, and for older parents with a higher education, than for their counterparts. Actual distance to major roads was not associated with the score, while self-reported intense traffic and frequent air refreshing at home predicted higher risk perception. When investigating health effects of environmental hazards using questionnaires, care should be taken to reduce the possibility of awareness bias at the stage of study planning and data analysis. Including appropriate items in study questionnaires can be useful to derive a measure of health risk perception, which can help to identify confounding of association estimates by risk perception. Copyright © 2015 Elsevier B.V. All rights reserved.
Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

Energy Technology Data Exchange (ETDEWEB)

Zain, Zakiyah, E-mail: zac@uum.edu.my; Ahmad, Yuhaniz, E-mail: yuhaniz@uum.edu.my [School of Quantitative Sciences, Universiti Utara Malaysia, UUM Sintok 06010, Kedah (Malaysia)]; Azwan, Zairul, E-mail: zairulazwan@gmail.com; Raduan, Farhana, E-mail: farhanaraduan@gmail.com; Sagap, Ismail, E-mail: drisagap@yahoo.com [Surgery Department, Universiti Kebangsaan Malaysia Medical Centre, Jalan Yaacob Latif, 56000 Bandar Tun Razak, Kuala Lumpur (Malaysia)]; Aziz, Nazrina, E-mail: nazrina@uum.edu.my

2014-12-04

Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated with the overall survival. In this study, global score test methodology is used to combine the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data on tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.
Saudi normative data for the Wisconsin Card Sorting test, Stroop test, Test of Non-verbal Intelligence-3, Picture Completion and Vocabulary (subtest of the Wechsler Adult Intelligence Scale-Revised).

Science.gov (United States)

Al-Ghatani, Ali M; Obonsawin, Marc C; Binshaig, Basmah A; Al-Moutaery, Khalaf R

2011-01-01

There are 2 aims for this study: first, to collect normative data for the Wisconsin Card Sorting Test (WCST), Stroop test, Test of Non-verbal Intelligence (TONI-3), Picture Completion (PC) and Vocabulary (VOC) sub-test of the Wechsler Adult Intelligence Scale-Revised for use in a Saudi Arabian culture, and second, to use the normative data provided to generate the regression equations. To collect the normative data and generate the regression equations, 198 healthy individuals were selected to provide a representative distribution for age, gender, years of education, and socioeconomic class. The WCST, Stroop test, TONI-3, PC, and VOC were administered to the healthy individuals. This study was carried out at the Department of Clinical Neurosciences, Riyadh Military Hospital, Riyadh, Kingdom of Saudi Arabia from January 2000 to July 2002. Normative data were obtained for all tests, and tables were constructed to interpret scores for different age groups. Regression equations to predict performance on the 3 tests of frontal function from scores on tests of fluid (TONI-3) and premorbid intelligence were generated from the data from the healthy individuals. The data collected in this study provide normative tables for 3 tests of frontal lobe function and for tests of general intellectual ability for use in Saudi Arabia. The data also provide a method to estimate pre-injury ability without the use of verbally based tests.
Antiretroviral neuropenetration scores better correlate with cognitive performance of HIV-infected patients after accounting for drug susceptibility.

Science.gov (United States)

Fabbiani, Massimiliano; Grima, Pierfrancesco; Milanini, Benedetta; Mondi, Annalisa; Baldonero, Eleonora; Ciccarelli, Nicoletta; Cauda, Roberto; Silveri, Maria C; De Luca, Andrea; Di Giambenedetto, Simona

2015-01-01

The aim of the study was to explore how viral resistance and antiretroviral central nervous system (CNS) penetration could impact on cognitive performance of HIV-infected patients. We performed a multicentre cross-sectional study enrolling HIV-infected patients undergoing neuropsychological testing, with a previous genotypic resistance test on plasma samples. CNS penetration-effectiveness (CPE) scores and genotypic susceptibility scores (GSS) were calculated for each regimen. A composite score (CPE-GSS) was then constructed. Factors associated with cognitive impairment were investigated by logistic regression analysis. A total of 215 patients were included. Mean CPE was 7.1 (95% CI 6.9, 7.3) with 206 (95.8%) patients showing a CPE≥6. GSS correction decreased the CPE value in 21.4% (mean 6.5, 95% CI 6.3, 6.7), 26.5% (mean 6.4, 95% CI 6.1, 6.6) and 24.2% (mean 6.4, 95% CI 6.2, 6.6) of subjects using ANRS, HIVDB and REGA rules, respectively. Overall, 66 (30.7%) patients were considered cognitively impaired. No significant association could be demonstrated between CPE and cognitive impairment. However, higher CPE-GSS was associated with a lower risk of cognitive impairment (CPE-GSS [ANRS] odds ratio 0.75, P=0.022; CPE-GSS [HIVDB] odds ratio 0.77, P=0.038; CPE-GSS [REGA] odds ratio 0.78, P=0.038). Overall, a cutoff of CPE-GSS≥5 seemed the most discriminatory according to each different interpretation system. GSS-corrected CPE score showed a better correlation with neurocognitive performance than the standard CPE score. These results suggest that antiretroviral drug susceptibility, besides drug CNS penetration, can play a role in the control of HIV-associated neurocognitive disorders.
Mutagenicity in drug development: interpretation and significance of test results.

Science.gov (United States)

Clive, D

1985-03-01

The use of mutagenicity data has been proposed and widely accepted as a relatively fast and inexpensive means of predicting long-term risk to man (i.e., cancer in somatic cells, heritable mutations in germ cells). This view is based on the universal nature of the genetic material, the somatic mutation model of carcinogenesis, and a number of studies showing correlations between mutagenicity and carcinogenicity. An uncritical acceptance of this approach by some regulatory and industrial concerns is over-conservative, naive, and scientifically unjustifiable on a number of grounds: Human cancers are largely life-style related (e.g., cigarettes, diet, tanning). Mutagens (both natural and man-made) are far more prevalent in the environment than was originally assumed (e.g., the natural bases and nucleosides, protein pyrolysates, fluorescent lights, typewriter ribbon, red wine, diesel fuel exhausts, viruses, our own leukocytes). "False-positive" (relative to carcinogenicity) and "false-negative" mutagenicity results occur, often with rational explanations (e.g., high threshold, inappropriate metabolism, inadequate genetic endpoint), and thereby confound any straightforward interpretation of mutagenicity test results. Test battery composition affects both the proper identification of mutagens and, in many instances, the ability to make preliminary risk assessments. In vitro mutagenicity assays ignore whole animal protective mechanisms, may provide unphysiological metabolism, and may be either too sensitive (e.g., testing at orders-of-magnitude higher doses than can be ingested) or not sensitive enough (e.g., short-term treatments inadequately model chronic exposure in bioassay). Bacterial systems, particularly the Ames assay, cannot in principle detect chromosomal events which are involved in both carcinogenesis and germ line mutations in man. Some compounds induce only chromosomal events and little or no detectable single-gene events (e.g., acyclovir, caffeine
An Interpreter's Interpretation: Sign Language Interpreters' View of Musculoskeletal Disorders

National Research Council Canada - National Science Library

Johnson, William L

2003-01-01

Sign language interpreters are at increased risk for musculoskeletal disorders. This study used content analysis to obtain detailed information about these disorders from the interpreters' point of view...
Scoring Strategies for the TOEFL iBT A Complete Guide

CERN Document Server

Stirling, Bruce

2012-01-01

TOEFL students all ask: How can I get a high TOEFL iBT score? Answer: Learn argument scoring strategies. Why? Because the TOEFL iBT recycles opinion-based and fact-based arguments for testing purposes from start to finish. In other words, the TOEFL iBT is all arguments. That's right, all arguments. If you want a high score, you need essential argument scoring strategies. That is what Scoring Strategies for the TOEFL iBT gives you, and more! TEST-PROVEN STRATEGIES. Learn essential TOEFL iBT scoring strategies developed in American university classrooms and proven successful on the TOEFL iBT. R
How to calculate an MMSE score from a MODA score (and vice versa) in patients with Alzheimer's disease.

Science.gov (United States)

Cazzaniga, R; Francescani, A; Saetti, C; Spinnler, H

2003-11-01

The aim of the present study was to provide a statistically sound way of reciprocally converting scores of the mini-mental state examination (MMSE) and the Milan overall dementia assessment (MODA). A consecutive series of 182 patients with "probable" Alzheimer's disease was examined with both tests. MODA and MMSE scores proved to be highly correlated. A formula for converting between MODA and MMSE scores was generated.
Psychometric properties of the Neck OutcOme Score, Neck Disability Index, and Short Form-36 were evaluated in patients with neck pain.

Science.gov (United States)

Juul, Tina; Søgaard, Karen; Davis, Aileen M; Roos, Ewa M

2016-11-01

To assess reliability, construct validity, responsiveness, and interpretability for Neck OutcOme Score (NOOS), Neck Disability Index (NDI), and Short Form-36 (SF-36) in neck pain patients. Internal consistency was assessed by Cronbach alpha. Test-retest reliability was evaluated by intraclass correlation coefficient (ICC), and measurement error was estimated from the standard error of measurement. Responsiveness was assessed as standardized response mean (SRM) and interpretability from the minimal important difference (MID). Construct validity was tested correlating subscale scores from NOOS and SF-36 and NDI items. At baseline, 196 neck pain patients were included. Cronbach α was adequate for most NOOS subscales, NDI, and SF-36 with few exceptions. Good to excellent reliability was found for NOOS subscales (ICC 0.88-0.95), for NDI, and for SF-36 with few exceptions. For NOOS, minimal detectable changes varied between 1.1 and 1.9, and construct validity was supported. SRMs were higher for NOOS subscales (0.19-0.42), compared to SF-36 and NDI. MID values varied between 15.0 and 24.1 for NOOS subscales. In conclusion, the NOOS is a reliable, valid, and responsive measure of self-reported disability in neck pain patients, performing at least as well as or better than the commonly used SF-36 and NDI. Copyright © 2016 Elsevier Inc. All rights reserved.
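Illustrative note (not part of the cited record): the abstract above does not state which formulas were used, but the standard error of measurement (SEM) and the minimal detectable change at the 95% level (MDC95) are commonly derived from the ICC and the between-subject standard deviation (SD) as follows. These are the conventional textbook definitions, given here as an assumption about the general approach rather than the authors' exact method.

```latex
\mathrm{SEM} = \mathrm{SD}\,\sqrt{1 - \mathrm{ICC}},
\qquad
\mathrm{MDC}_{95} = 1.96\,\sqrt{2}\;\mathrm{SEM}
```

The factor $\sqrt{2}$ reflects that a change score combines the measurement error of two administrations, and 1.96 is the two-sided 95% normal quantile.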
Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model

Science.gov (United States)

Kim, Kyung Yong; Lee, Won-Chan

2018-01-01

Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Automated Scoring for the "TOEFL Junior"® Comprehensive Writing and Speaking Test. Research Report. ETS RR-15-09

Science.gov (United States)

Evanini, Keelan; Heilman, Michael; Wang, Xinhao; Blanchard, Daniel

2015-01-01

This report describes the initial automated scoring results that were obtained using the constructed responses from the Writing and Speaking sections of the pilot forms of the "TOEFL Junior"® Comprehensive test administered in late 2011. For all of the items except one (the edit item in the Writing section), existing automated scoring…
A diagnostic scoring system for myxedema coma.

Science.gov (United States)

Popoveniuc, Geanina; Chandra, Tanu; Sud, Anchal; Sharma, Meeta; Blackman, Marc R; Burman, Kenneth D; Mete, Mihriye; Desale, Sameer; Wartofsky, Leonard

2014-08-01

To develop diagnostic criteria for myxedema coma (MC), a decompensated state of extreme hypothyroidism with a high mortality rate if untreated, in order to facilitate its early recognition and treatment. The frequencies of characteristics associated with MC were assessed retrospectively in patients from our institutions in order to derive a semiquantitative diagnostic point scale that was further applied on selected patients whose data were retrieved from the literature. Logistic regression analysis was used to test the predictive power of the score. Receiver operating characteristic (ROC) curve analysis was performed to test the discriminative power of the score. Of the 21 patients examined, 7 were reclassified as not having MC (non-MC), and they were used as controls. The scoring system included a composite of alterations of thermoregulatory, central nervous, cardiovascular, gastrointestinal, and metabolic systems, and presence or absence of a precipitating event. All 14 of our MC patients had a score of ≥60, whereas 6 of 7 non-MC patients had scores of 25 to 50. A total of 16 of 22 MC patients whose data were retrieved from the literature had a score ≥60, and 6 of 22 of these patients scored between 45 and 55. The odds ratio per each score unit increase as a continuum was 1.09 (95% confidence interval [CI], 1.01 to 1.16; P = .019); a score of 60 identified coma, with an odds ratio of 1.22. The area under the ROC curve was 0.88 (95% CI, 0.65 to 1.00), and the score of 60 had 100% sensitivity and 85.71% specificity. A score ≥60 in the proposed scoring system is potentially diagnostic for MC, whereas scores between 45 and 59 could classify patients at risk for MC.
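Illustrative note (not part of the cited record): the sensitivity and specificity reported above for the cutoff of 60 can be reproduced from the counts in the abstract. All 14 MC patients scored at or above 60, and 6 of the 7 non-MC controls scored 25 to 50; the sketch assumes the seventh control scored at or above the cutoff, which is consistent with the reported 85.71% specificity but is not stated explicitly. The function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Counts taken from the abstract; fp=1 is an assumption (see note above)
sens, spec = sensitivity_specificity(tp=14, fn=0, tn=6, fp=1)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```

With only 7 controls, a single reclassified patient would shift specificity by about 14 percentage points, which helps explain the wide confidence interval reported for the area under the ROC curve.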
Interpreting results of cluster surveys in emergency settings: is the LQAS test the best option?

Science.gov (United States)

Bilukha, Oleg O; Blanton, Curtis

2008-12-09

Cluster surveys are commonly used in humanitarian emergencies to measure health and nutrition indicators. Deitchler et al. have proposed to use Lot Quality Assurance Sampling (LQAS) hypothesis testing in cluster surveys to classify the prevalence of global acute malnutrition as exceeding or not exceeding the pre-established thresholds. Field practitioners and decision-makers must clearly understand the meaning and implications of using this test in interpreting survey results to make programmatic decisions. We demonstrate that the LQAS test--as proposed by Deitchler et al.--is prone to producing false-positive results and thus is likely to suggest interventions in situations where interventions may not be needed. As an alternative, to provide more useful information for decision-making, we suggest reporting the probability of an indicator's exceeding the threshold as a direct measure of "risk". Such probability can be easily determined in field settings by using a simple spreadsheet calculator. The "risk" of exceeding the threshold can then be considered in the context of other aggravating and protective factors to make informed programmatic decisions.
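Illustrative note (not part of the cited record): the abstract above does not describe how its spreadsheet calculator works. One plausible way to obtain the probability that true prevalence exceeds a threshold is a normal approximation around the survey estimate, with the standard error inflated by the cluster design effect. The function name, the design-effect value, and the example figures below are all assumptions.

```python
from math import sqrt
from statistics import NormalDist

def prob_exceeds_threshold(p_hat, n, threshold, design_effect=1.0):
    """Approximate P(true prevalence > threshold) given a survey estimate.

    Uses a normal approximation with the standard error inflated by the
    design effect to account for cluster sampling.
    """
    se = sqrt(design_effect * p_hat * (1 - p_hat) / n)
    return 1 - NormalDist().cdf((threshold - p_hat) / se)

# Hypothetical survey: 12% estimated prevalence, 900 children,
# design effect 1.5, decision threshold 10%
risk = prob_exceeds_threshold(0.12, 900, 0.10, design_effect=1.5)
print(round(risk, 3))
```

Unlike a pass/fail hypothesis test, this returns a continuous probability that decision-makers can weigh alongside other aggravating and protective factors.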
External validation of the simple clinical score and the HOTEL score, two scores for predicting short-term mortality after admission to an acute medical unit.

Science.gov (United States)

Stræde, Mia; Brabrand, Mikkel

2014-01-01

Clinical scores can help predict early mortality after admission to a medical admission unit. A newly developed scoring system needs to be externally validated to minimise the risk that its discriminatory power and calibration are overestimated. We performed the present study with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. Pre-planned prospective observational cohort study. Danish 460-bed regional teaching hospital. We included 3046 consecutive patients from 2 October 2008 until 19 February 2009. 26 (0.9%) died within one calendar day and 196 (6.4%) died within 30 days. We calculated SCS for 1080 patients. We found an AUROC of 0.960 (95% confidence interval [CI], 0.932 to 0.988) for 24-hour mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ² = 2.68 (10 degrees of freedom), P = 0.998 and χ² = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95% CI, 0.901-0.962) for 24-hour mortality and goodness-of-fit test, χ² = 5.56 (10 degrees of freedom), P = 0.234. We found that both the SCS and HOTEL scores showed an excellent to outstanding ability to identify patients at high risk of dying, with good or acceptable precision.
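Illustrative note (not part of the cited record): the AUROC reported above can be read as the probability that a randomly chosen patient who died had a higher risk score than a randomly chosen survivor, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the scores are invented for illustration and are not the study's data.

```python
def auroc(scores_died, scores_survived):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for d in scores_died:
        for s in scores_survived:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(scores_died) * len(scores_survived))

# Hypothetical risk scores
died = [8, 10, 12]
survived = [2, 4, 8, 9]
auc = auroc(died, survived)
print(auc)
```

The pairwise form is quadratic in the number of patients, so for cohorts of thousands a rank-based computation is usually preferred; the result is the same.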
An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

Science.gov (United States)

Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

2014-05-01

Recent guidelines advocate that sports medicine professionals use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores was determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
[Interpretation and use of routine pulmonary function tests: Spirometry, static lung volumes, lung diffusion, arterial blood gas, methacholine challenge test and 6-minute walk test].

Science.gov (United States)

Bokov, P; Delclaux, C

2016-02-01

Resting pulmonary function tests (PFT) include the assessment of ventilatory capacity: spirometry (forced expiratory flows and mobilisable volumes) and static volume assessment, notably using body plethysmography. Spirometry allows the potential definition of obstructive defect, while static volume assessment allows the potential definition of restrictive defect (decrease in total lung capacity) and thoracic hyperinflation (increase in static volumes). It must be kept in mind that this evaluation is incomplete and that an assessment of ventilatory demand is often warranted, especially when facing dyspnoea: evaluation of arterial blood gas (searching for respiratory insufficiency) and measurement of the transfer coefficient of the lung, allowing with the measurement of alveolar volume to calculate the diffusing capacity of the lung for CO (DLCO: assessment of alveolar-capillary wall and capillary blood volume). All these pulmonary function tests have been the subject of an Americano-European Task force (standardisation of lung function testing) published in 2005, and translated in French in 2007. Interpretative strategies for lung function tests have been recommended, which define abnormal lung function tests using the 5th and 95th percentiles of predicted values (lower and upper limits of normal values). Thus, these recommendations need to be implemented in all pulmonary function test units. A methacholine challenge test will only be performed in the presence of an intermediate pre-test probability for asthma (diagnostic uncertainty), which is an infrequent setting. The most convenient exertional test is the 6-minute walk test that allows the assessment of walking performance, the search for arterial desaturation and the quantification of dyspnoea complaint. Copyright © 2015 Société nationale française de médecine interne (SNFMI). Published by Elsevier SAS. All rights reserved.
Micronucleus test for radiation biodosimetry in mass casualty events: Evaluation of visual and automated scoring

Energy Technology Data Exchange (ETDEWEB)

Bolognesi, Claudia, E-mail: claudia.bolognesi@istge.i [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Balia, Cristina; Roggieri, Paola [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Cardinale, Francesco [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Department of Health Sciences, University of Genoa, Genoa (Italy); Bruzzi, Paolo [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Sorcinelli, Francesca [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy); Lista, Florigio [Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy); D' Amelio, Raffaele [Sapienza, Universita di Roma II Facolta di Medicina e Chirurgia and Ministero della Difesa, Direzione Generale Sanita Militare (Italy); Righi, Enzo [Frascati National Laboratories, National Institute of Nuclear Physics, Via Enrico Fermi 40, 00044 Frascati, Rome (Italy)

2011-02-15

In the case of a large-scale nuclear or radiological incident, a reliable estimate of dose is an essential tool for providing timely assessment of radiation exposure and for making life-saving medical decisions. Cytogenetics is considered the 'gold standard' for biodosimetry. The dicentric analysis (DA) represents the most specific cytogenetic bioassay. The micronucleus test (MN) applied in interphase in peripheral lymphocytes is an alternative and simpler approach. A dose-effect calibration curve for the MN frequency in peripheral lymphocytes from 27 adult donors was established after in vitro irradiation at a dose range 0.15-8 Gy of ¹³⁷Cs gamma rays (dose rate 6 Gy min⁻¹). Dose prediction by visual scoring in a dose-blinded study (0.15-4.0 Gy) revealed a high level of accuracy (R = 0.89). The scoring of MN is time consuming and requires adequate skills and expertise. Automated image analysis is a feasible approach that reduces the time and increases the accuracy of the dose estimation by decreasing the variability due to subjective evaluation. A good correlation (R = 0.705) between visual and automated scoring with visual correction was observed over the dose range 0-2 Gy. Almost perfect discrimination power for exposure to 1-2 Gy, and a satisfactory power for 0.6 Gy were detected. This threshold level can be considered sufficient for identification of sublethally exposed individuals by automated CBMN assay.

Interpretation of Ersec tests on the backup cooling of pressurized water reactors, by using the FLIRA code

International Nuclear Information System (INIS)

Reviglio, Christiane

1977-01-01

This research thesis addresses the study of the most severe accident, or reference accident, which might occur in nuclear reactors, a clean break of a cold branch of the primary circuit, which may put the integrity of barriers against radioactive products dispersion outside of the reactor into question again. More particularly, the thesis addresses the study of the backup cooling system, and the fact that fluid flow during re-flooding must be predicted, and that heat exchange coefficients must be known in order to assess the evolution of sheath temperatures. The research comprised an experimental part which aimed at reproducing as faithfully as possible the re-flooding sequence on a tube with internal flow or on a cluster for a better core simulation. These are the ERSEC tests which are to be interpreted. It also comprised a theoretical part based on the use of computational codes which simulate the different phases of the accident and of backup fluid injection. These codes are based on physical models which describe two-phase flows and heat exchanges, and are adjusted to experimental results. The FLIRA code is used which simulates the re-flooding of a reactor duct, and determines the evolution of different values (pressure, temperatures, flow rate, and so on) during the re-flooding process. Thus, the author presents the reference accident, reports studies performed in the USA and in France (ERSEC tests), indicates the various flow regimes and describes heat exchange mechanisms during re-flooding, presents ERSEC test results, presents the FLIRA code, reports the elaboration of governing equations, indicates the various models introduced in the FLIRA code, and describes the numerical processing of equations. He finally gives a first interpretation of ERSEC tests based on the use of the FLIRA code
Interpreting the Relationships between TOEFL iBT Scores and GPA: Language Proficiency, Policy, and Profiles

Science.gov (United States)

Ginther, April; Yan, Xun

2018-01-01

This study examines the predictive validity of the TOEFL iBT with respect to academic achievement as measured by the first-year grade point average (GPA) of Chinese students at Purdue University, a large, public, Research I institution in Indiana, USA. Correlations between GPA, TOEFL iBT total and subsection scores were examined on 1990 mainland…
Linkage between company scores and stock returns

Directory of Open Access Journals (Sweden)

Saban Celik

2017-12-01

Previous studies on company scores conducted at firm level generally concluded that there exists a positive relation between company scores and stock returns. Motivated by these studies, this study examines the relationship between company scores (Corporate Governance Score, Economic Score, Environmental Score, and Social Score) and stock returns, both at portfolio-level analysis and firm-level cross-sectional regressions. In portfolio-level analysis, stocks are sorted based on each company score and quintile portfolios are formed with different levels of company scores. Then, the existence and significance of raw return and risk-adjusted return differences between portfolios with the extreme company scores (portfolio 10 and portfolio 1) are tested. In addition, firm-level cross-sectional regression is performed to examine the significance of company score effects with control variables. While portfolio-level analysis results indicate that there is no significant relation between company scores and stock returns, firm-level analysis indicates that economic, environmental, and social scores have an effect on stock returns; however, the significance and direction of these effects change depending on the control variables included in the cross-sectional regression.
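The portfolio-level sort described in this abstract can be sketched as follows. This is only an illustration of sorting stocks by a score into five equal-sized groups and comparing the mean raw return of the extreme groups; it does not reproduce the study's risk adjustment or significance tests, and the function name and data are hypothetical.

```python
def quintile_spread(scores, returns):
    """Sort stocks by score into quintiles and return (mean return per
    quintile, high-minus-low spread between the top and bottom quintiles)."""
    pairs = sorted(zip(scores, returns), key=lambda pair: pair[0])
    n = len(pairs)
    if n < 5:
        raise ValueError("need at least five stocks to form quintiles")
    means = []
    for q in range(5):
        lo, hi = q * n // 5, (q + 1) * n // 5   # integer cut points
        group = [r for _, r in pairs[lo:hi]]
        means.append(sum(group) / len(group))
    return means, means[-1] - means[0]


# Hypothetical scores and monthly returns for ten stocks.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
returns = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
means, spread = quintile_spread(scores, returns)
print(means, spread)
```

In practice the spread would be tested against zero over many periods and after adjusting for risk factors, which is where the abstract reports no significant relation.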
Interpretation of geophysical well-log measurements in drill hole UE25a-1, Nevada Test Site, Radioactive Waste Program

International Nuclear Information System (INIS)

Hagstrum, J.T.; Daniels, J.J.; Scott, J.H.

1980-01-01

An exploratory hole (UE25a-1) was drilled at Nevada Test Site (NTS) to determine the suitability of pyroclastic deposits as storage sites for radioactive waste. Studies have been conducted to investigate the stratigraphy, structure, mineralogy, petrology, and physical properties of the tuff units encountered in the drill hole. This report deals with the interpretation of physical properties for the tuff units from geophysical well-log measurements. The ash-flow and bedded tuff sequences at NTS comprise complex lithologies of variously welded tuffs with superimposed crystallization and altered zones. To characterize these units, resistivity, density, neutron, gamma-ray, induced polarization, and magnetic susceptibility geophysical well-log measurements were made. Although inherently subjective, a consistent interpretation of the well-log measurements was facilitated by a computer program designed to interpret well logs either individually or simultaneously. The broad features of the welded tuff units are readily distinguished by the geophysical well-log measurements. However, many details revealed by the logs indicate that more work is necessary to clarify the causal elements of well-log response in welded tuffs.
Socioeconomic Status and MMPI-2 Interpretation.

Science.gov (United States)

Long, Kathleen A.; And Others

1994-01-01

Examined differences in Minnesota Multiphasic Personality Inventory-2 (MMPI-2) scores between persons of differing educational levels and family income in the MMPI-2 normative sample to determine if MMPI-2 scores are differentially accurate in predicting relevant extra-test characteristics of persons of differing socioeconomic levels. MMPI-2…
ISSUE PAPER: What Do Test Scores in Texas Tell Us?

National Research Council Canada - National Science Library

Klein, Stephen

2000-01-01

...) about possible unintended consequences of these programs. We conducted several analyses to examine the issue of whether TAAS scores can be trusted to provide an accurate index of student skills and abilities...
Ifuzzer : An evolutionary interpreter fuzzer using genetic programming

NARCIS (Netherlands)

Veggalam, Spandan; Rawat, Sanjay; Haller, Istvan; Bos, Herbert

We present an automated evolutionary fuzzing technique to find bugs in JavaScript interpreters. Fuzzing is an automated black box testing technique used for finding security vulnerabilities in the software by providing random data as input. However, in the case of an interpreter, fuzzing is
Hematoma Shape, Hematoma Size, Glasgow Coma Scale Score and ICH Score: Which Predicts the 30-Day Mortality Better for Intracerebral Hematoma?

Science.gov (United States)

Wang, Chih-Wei; Liu, Yi-Jui; Lee, Yi-Hsiung; Hueng, Dueng-Yuan; Fan, Hueng-Chuen; Yang, Fu-Chi; Hsueh, Chun-Jen; Kao, Hung-Wen; Juan, Chun-Jung; Hsu, Hsian-He

2014-01-01

Purpose: To investigate the performance of hematoma shape, hematoma size, Glasgow coma scale (GCS) score, and intracerebral hematoma (ICH) score in predicting the 30-day mortality for ICH patients. To examine the influence of the estimation error of hematoma size on the prediction of 30-day mortality. Materials and Methods: This retrospective study, approved by a local institutional review board with written informed consent waived, recruited 106 patients diagnosed with ICH by non-enhanced computed tomography study. The hemorrhagic shape, hematoma size measured by computer-assisted volumetric analysis (CAVA) and estimated by ABC/2 formula, ICH score and GCS score were examined. The performance of the aforementioned variables in predicting 30-day mortality was evaluated. Statistical analysis was performed using Kolmogorov-Smirnov tests, paired t test, nonparametric test, linear regression analysis, and binary logistic regression. The receiver operating characteristic curves were plotted and areas under curve (AUC) were calculated for 30-day mortality. A P value less than 0.05 was considered statistically significant. Results: The overall 30-day mortality rate was 15.1% of ICH patients. The hematoma shape, hematoma size, ICH score, and GCS score all significantly predict the 30-day mortality for ICH patients, with an AUC of 0.692 (P = 0.0018), 0.715 (P = 0.0008) (by ABC/2) to 0.738 (P = 0.0002) (by CAVA), 0.877 (P … hematoma shape, hematoma size, ICH scores and GCS score all significantly predict the 30-day mortality in an increasing order of AUC. The effect of overestimation of hematoma size by ABC/2 formula in predicting the 30-day mortality could be remedied by using ICH score. PMID:25029592
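For readers unfamiliar with the ABC/2 formula mentioned above: it estimates hematoma volume as half the product of three roughly perpendicular extents, which approximates an ellipsoid with π taken as about 3. The sketch below shows the arithmetic only; the function names and the example measurements are hypothetical, and the exact measurement protocol used in the study is not given in the abstract.

```python
import math


def abc2_volume_ml(a_cm, b_cm, c_cm):
    """ABC/2 estimate: a = largest diameter, b = diameter perpendicular to a,
    c = vertical extent (all in cm). Result is in cm^3, i.e. mL."""
    return a_cm * b_cm * c_cm / 2.0


def ellipsoid_volume_ml(a_cm, b_cm, c_cm):
    """Exact volume of an ellipsoid with full axis lengths a, b, c (cm)."""
    return math.pi * a_cm * b_cm * c_cm / 6.0


# Hypothetical measurements of 4 cm x 3 cm x 2 cm.
print(abc2_volume_ml(4, 3, 2))       # 12.0
print(ellipsoid_volume_ml(4, 3, 2))  # about 12.57
```

Real hematomas are often irregular rather than ellipsoidal, which is why the abstract compares ABC/2 against computer-assisted volumetric analysis.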
Challenges in interpretation of thyroid function tests in pregnant women with autoimmune thyroid disease

DEFF Research Database (Denmark)

Feldt-Rasmussen, Ulla; Bliddal, Sofie; Rasmussen, Åse Krogh

2011-01-01

Physiological changes during gestation are important to be aware of in measurement and interpretation of thyroid function tests in women with autoimmune thyroid diseases. Thyroid autoimmune activity is decreasing in pregnancy. Measurement of serum TSH is the first-line screening variable...... for thyroid dysfunction also in pregnancy. However, using serum TSH for control of treatment of maternal thyroid autoimmunity infers a risk for compromised foetal development. Peripheral thyroid hormone values are highly different among laboratories, and there is a need for laboratory-specific gestational age......-related reference ranges. Equally important, the intraindividual variability of the thyroid hormone measurements is much narrower than the interindividual variation (reflecting the reference interval). The best laboratory assessment of thyroid function is a free thyroid hormone estimate combined with TSH...
Correlation Between Screening Mammography Interpretive Performance on a Test Set and Performance in Clinical Practice.

Science.gov (United States)

Miglioretti, Diana L; Ichikawa, Laura; Smith, Robert A; Buist, Diana S M; Carney, Patricia A; Geller, Berta; Monsees, Barbara; Onega, Tracy; Rosenberg, Robert; Sickles, Edward A; Yankaskas, Bonnie C; Kerlikowske, Karla

2017-10-01

Evidence is inconsistent about whether radiologists' interpretive performance on a screening mammography test set reflects their performance in clinical practice. This study aimed to estimate the correlation between test set and clinical performance and determine if the correlation is influenced by cancer prevalence or lesion difficulty in the test set. This institutional review board-approved study randomized 83 radiologists from six Breast Cancer Surveillance Consortium registries to assess one of four test sets of 109 screening mammograms each; 48 radiologists completed a fifth test set of 110 mammograms 2 years later. Test sets differed in number of cancer cases and difficulty of lesion detection. Test set sensitivity and specificity were estimated using woman-level and breast-level recall with cancer status and expert opinion as gold standards. Clinical performance was estimated using woman-level recall with cancer status as the gold standard. Spearman rank correlations between test set and clinical performance with 95% confidence intervals (CI) were estimated. For test sets with fewer cancers (N = 15) that were more difficult to detect, correlations were weak to moderate for sensitivity (woman level = 0.46, 95% CI = 0.16, 0.69; breast level = 0.35, 95% CI = 0.03, 0.61) and weak for specificity (0.24, 95% CI = 0.01, 0.45) relative to expert recall. Correlations for test sets with more cancers (N = 30) were close to 0 and not statistically significant. Correlations between screening performance on a test set and performance in clinical practice are not strong. Test set performance more accurately reflects performance in clinical practice if cancer prevalence is low and lesions are challenging to detect. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
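The Spearman rank correlation used in this study is the Pearson correlation computed on ranks. A minimal sketch is given below, assuming tied values receive the average of their ranks; the function names and the example sensitivities are hypothetical, and the confidence intervals reported in the abstract are not reproduced here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical test-set vs. clinical sensitivities for four radiologists.
test_set = [0.6, 0.7, 0.8, 0.9]
clinical = [0.5, 0.9, 0.6, 0.8]
print(round(spearman(test_set, clinical), 3))  # 0.4
```

Because only ranks matter, the statistic is insensitive to the different scales on which test-set and clinical performance are measured.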
More Issues in Observed-Score Equating

Science.gov (United States)

van der Linden, Wim J.

2013-01-01

This article is a response to the commentaries on the position paper on observed-score equating by van der Linden (this issue). The response focuses on the more general issues in these commentaries, such as the nature of the observed scores that are equated, the importance of test-theory assumptions in equating, the necessity to use multiple…
An Integrated Model of Academic Self-Concept Development: Academic Self-Concept, Grades, Test Scores, and Tracking over 6 Years

Science.gov (United States)

Marsh, Herbert W.; Pekrun, Reinhard; Murayama, Kou; Arens, A. Katrin; Parker, Philip D.; Guo, Jiesi; Dicke, Theresa

2018-01-01

Our newly proposed integrated academic self-concept model integrates 3 major theories of academic self-concept formation and developmental perspectives into a unified conceptual and methodological framework. Relations among math self-concept (MSC), school grades, test scores, and school-level contextual effects over 6 years, from the end of…
Metabolic interpretation of ventilatory parameters during maximal effort test and their applicability to sports

Directory of Open Access Journals (Sweden)

Luis Eduardo Barreto Martins

2007-09-01

One important tool for producing specific and individualized training intensities is to determine ventilatory threshold (VT), respiratory compensation point (RCP) and maximal oxygen uptake (VO2max) by means of maximum effort testing. However, in order to be able to interpret these data in a wide-ranging manner, it is also important to understand the metabolic responses that occur during the test as the systems transporting and utilizing O2 and producing CO2 adjust. This review article presents an overview of the metabolic responses that take place during a hypothetical maximum effort test, and the applicability of the figures thus obtained to the training of athletes.
¿Exito en California? A Validity Critique of Language Program Evaluations and Analysis of English Learner Test Scores

Directory of Open Access Journals (Sweden)

Marilyn S. Thompson

2002-01-01

Several states have recently faced ballot initiatives that propose to functionally eliminate bilingual education in favor of English-only approaches. Proponents of these initiatives have argued that an overall rise in standardized achievement scores of California's limited English proficient (LEP) students is largely due to the implementation of English immersion programs mandated by Proposition 227 in 1998; hence, they claim Exito en California (Success in California). However, many such arguments presented in the media were based on flawed summaries of these data. We first discuss the background, media coverage, and previous research associated with California's Proposition 227. We then present a series of validity concerns regarding use of Stanford-9 achievement data to address policy for educating LEP students; these concerns include the language of the test, alternative explanations, sample selection, and data analysis decisions. Finally, we present a comprehensive summary of scaled-score achievement means and trajectories for California's LEP and non-LEP students for 1998-2000. Our analyses indicate that although scores have risen overall, the achievement gap between LEP and EP students does not appear to be narrowing.
The Bayesian Score Statistic

NARCIS (Netherlands)

Kleibergen, F.R.; Kleijn, R.; Paap, R.

2000-01-01

We propose a novel Bayesian test under a (noninformative) Jeffreys' prior specification. We check whether the fixed scalar value of the so-called Bayesian Score Statistic (BSS) under the null hypothesis is a plausible realization from its known and standardized distribution under the alternative. Unlike
A Novel Scoring System Approach to Assess Patients with Lyme Disease (Nutech Functional Score)

OpenAIRE

Geeta Shroff; Petra Hopf-Seidel

2018-01-01

Introduction: A bacterial infection by Borrelia burgdorferi referred to as Lyme disease (LD) or borreliosis is transmitted mostly by a bite of the tick Ixodes scapularis in the USA and Ixodes ricinus in Europe. Various tests are used for the diagnosis of LD, but their results are often unreliable. We compiled a list of clinically visible and patient-reported symptoms that are associated with LD. Based on this list, we developed a novel scoring system. Methodology: Nutech functional Score (NF...
A componential analysis of proverb interpretation in patients with frontal lobe epilepsy and temporal lobe epilepsy: relationships with disease-related factors.

Science.gov (United States)

McDonald, Carrie R; Delis, Dean C; Kramer, Joel H; Tecoma, Evelyn S; Iragui, Vicente J

2008-05-01

The ability to interpret nonliteral, metaphoric language was explored in patients with frontal lobe epilepsy (FLE) and temporal lobe epilepsy (TLE), and matched control participants, to determine (1) if patients with FLE were impaired in their interpretations relative to those with TLE and controls, and (2) if disease-related variables (e.g., age of seizure onset) predicted performances in either patient group. A total of 22 patients with FLE, 20 patients with TLE, and 23 controls were administered a test of proverb interpretation to assess their ability to grasp the abstract meaning of nonliteral language. Participants were presented with a series of proverbs and asked to provide an oral interpretation of each. Responses to each proverb were scored according to their accuracy and level of abstractness. Patients with FLE, but not TLE, were impaired relative to controls in their overall interpretation of proverbs. However, a subgroup analysis revealed that only patients with left FLE showed impaired interpretation accuracy relative to the other groups, whereas patients with both left FLE and left TLE showed impaired abstraction. Patients with FLE were also impaired when they were asked to select the best interpretation of the proverb from response alternatives. In patients with FLE, only a left-sided seizure focus was associated with poorer performance. In patients with TLE, both an early age of onset and a left-sided seizure focus predicted poorer performance. Overall, FLE patients exhibit greater impairment than TLE patients in interpreting proverbs. However, the nature and disease-specific correlates of impaired performances in proverb interpretation differ between the groups.
Impairments in proverb interpretation following focal frontal lobe lesions

Science.gov (United States)

Murphy, Patrick; Shallice, Tim; Robinson, Gail; MacPherson, Sarah E.; Turner, Martha; Woollett, Katherine; Bozzali, Marco; Cipolotti, Lisa

2013-01-01

The proverb interpretation task (PIT) is often used in clinical settings to evaluate frontal “executive” dysfunction. However, only a relatively small number of studies have investigated the relationship between frontal lobe lesions and performance on the PIT. We compared 52 patients with unselected focal frontal lobe lesions with 52 closely matched healthy controls on a proverb interpretation task. Participants also completed a battery of neuropsychological tests, including a fluid intelligence task (Raven’s Advanced Progressive Matrices). Lesions were firstly analysed according to a standard left/right sub-division. Secondly, a finer-grained analysis compared the performance of patients with medial, left lateral and right lateral lesions with healthy controls. Thirdly, a contrast of specific frontal subgroups compared the performance of patients with medial lesions with patients with lateral frontal lesions. The results showed that patients with left frontal lesions were significantly impaired on the PIT, while in patients with right frontal lesions the impairments approached significance. Medial frontal patients were the only frontal subgroup impaired on the PIT, relative to healthy controls and lateral frontal patients. Interestingly, an error analysis indicated that a significantly higher number of concrete responses were found in the left lateral subgroup compared to healthy controls. We found no correlation between scores on the PIT and on the fluid intelligence task. Overall our results suggest that specific regions of the frontal lobes contribute to the performance on the PIT. PMID:23850600
Impairments in proverb interpretation following focal frontal lobe lesions.

Science.gov (United States)

Murphy, Patrick; Shallice, Tim; Robinson, Gail; MacPherson, Sarah E; Turner, Martha; Woollett, Katherine; Bozzali, Marco; Cipolotti, Lisa

2013-09-01

The proverb interpretation task (PIT) is often used in clinical settings to evaluate frontal "executive" dysfunction. However, only a relatively small number of studies have investigated the relationship between frontal lobe lesions and performance on the PIT. We compared 52 patients with unselected focal frontal lobe lesions with 52 closely matched healthy controls on a proverb interpretation task. Participants also completed a battery of neuropsychological tests, including a fluid intelligence task (Raven's Advanced Progressive Matrices). Lesions were firstly analysed according to a standard left/right sub-division. Secondly, a finer-grained analysis compared the performance of patients with medial, left lateral and right lateral lesions with healthy controls. Thirdly, a contrast of specific frontal subgroups compared the performance of patients with medial lesions with patients with lateral frontal lesions. The results showed that patients with left frontal lesions were significantly impaired on the PIT, while in patients with right frontal lesions the impairments approached significance. Medial frontal patients were the only frontal subgroup impaired on the PIT, relative to healthy controls and lateral frontal patients. Interestingly, an error analysis indicated that a significantly higher number of concrete responses were found in the left lateral subgroup compared to healthy controls. We found no correlation between scores on the PIT and on the fluid intelligence task. Overall our results suggest that specific regions of the frontal lobes contribute to the performance on the PIT. © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.
A figurative proverb test for dementia: rapid detection of disinhibition, excuse and confabulation, causing discommunication.

Science.gov (United States)

Yamaguchi, Haruyasu; Maki, Yohko; Yamaguchi, Tomoharu

2011-12-01

Communicative disability is regarded as a prominent symptom of demented patients, and many studies have been devoted to analyzing deficits of lexical-semantic operations in demented patients. However, it is often observed that even patients with preserved lexical-semantic skills might fail in interactive social communication. Whereas social interaction requires pragmatic language skills, pragmatic language competencies in demented subjects have not been well understood. We propose here a brief stress-free test to detect pragmatic language deficits, focusing on non-literal understanding of figurative expression. We hypothesized that suppression of the literal interpretation was required for figurative language interpretation. We examined 69 demented subjects, 13 subjects with mild cognitive impairment and 61 healthy controls aged 65 years or more. The subjects were asked the meaning of a familiar proverb categorized as a figurative expression. The answers were analyzed based on five factors, and scored from 0 to 5. To consider the influence of cognitive inhibition on proverb comprehension, the scores of the Stroop Colour-Word Test were compared concerning correct and incorrect answers for each factor, respectively. Furthermore, the characteristics of answers were considered qualitatively in the light of excuse and confabulation. The proverb comprehension scores decreased gradually and significantly as dementia progressed. The literal interpretation of the proverb, which showed difficulties in figurative language comprehension, was related to disinhibition. The qualitative analysis showed that excuse and confabulation increased as the dementia stage progressed. Deficits in cognitive inhibition partly explain the difficulties in interactive social communication in dementia. With qualitative analysis, asking the meaning of a proverb can be a brief test applied in a clinical setting to evaluate the stage of dementia, and to illustrate disinhibition, confabulation and
