WorldWideScience

Sample records for two-sample test scores

  1. The Bender-Gestalt test: Koppitz's Developmental Scoring System administered to two samples of Italian preschool and primary school children.

    Science.gov (United States)

    Mazzeschi, C; Lis, A

    1999-06-01

    The purpose of this paper was to extend research on Koppitz's Developmental Scoring System to Italian samples. Specific attention has been given to the study of errors for the single designs, to assess the relationship of these errors with total errors and to assess the designs' varying difficulty. A second purpose was to study possible cultural influences between different Italian regions. According to Koppitz (1975), research findings support the view that the rate of development in visuomotor perception differs among children of various ethnic groups. Subjects were 538 boys and 527 girls enrolled in regular kindergartens and elementary schools in Italy. Detailed analyses were carried out on total mean errors and mean errors for each design. Mean errors decrease across age groups; that is, perceptuomotor integration is improved for older children. No significant differences were found between Northern and Southern Italy.

  2. GOODNESS-OF-FIT TEST ON TWO SAMPLES

    Institute of Scientific and Technical Information of China (English)

    WANG Lixin; YANG Zhenhai; PANG Wankai

    2000-01-01

    In this paper, a new statistic for testing whether two samples come from the same population is derived from a simple linear model with an artificial parameter. Its limit distribution is a chi-squared distribution with 2 degrees of freedom under the null hypothesis, and a noncentral chi-squared distribution with 2 degrees of freedom under a certain sequence of alternative hypotheses. Finally, we compare its power with that of other two-sample tests, especially the Smirnov statistic.

  3. Optimal tests for the two-sample spherical location problem

    CERN Document Server

    Ley, Christophe; Verdebout, Thomas

    2012-01-01

    We tackle the classical two-sample spherical location problem for directional data by having recourse to the Le Cam methodology, habitually used in classical "linear" multivariate analysis. More precisely we construct locally and asymptotically optimal (in the maximin sense) parametric tests, which we then turn into semi-parametric ones in two distinct ways. First, by using a studentization argument; this leads to so-called pseudo-FvML tests. Second, by resorting to the invariance principle; this leads to efficient rank-based tests. Within each construction, the semi-parametric tests inherit optimality under a given distribution (the FvML in the first case, any rotationally symmetric one in the second) from their parametric counterparts and also improve on the latter by being valid under the whole class of rotationally symmetric distributions. Asymptotic relative efficiencies are calculated and the finite-sample behavior of the proposed tests is investigated by means of a Monte Carlo simulation.

  4. Testing Homogeneity in a Semiparametric Two-Sample Problem

    Directory of Open Access Journals (Sweden)

    Yukun Liu

    2012-01-01

    We study a two-sample homogeneity testing problem, in which one sample comes from a population with density f(x) and the other is from a mixture population with mixture density (1−λ)f(x)+λg(x). This problem arises naturally from many statistical applications such as tests for partial differential gene expression in microarray studies or genetic studies of gene mutation. Under the semiparametric assumption g(x)=f(x)e^{α+βx}, a penalized empirical likelihood ratio test could be constructed, but its implementation is hindered by the fact that there is neither a feasible algorithm for computing the test statistic nor available research results on its theoretical properties. To circumvent these difficulties, we propose an EM test based on the penalized empirical likelihood. We prove that the EM test has a simple chi-square limiting distribution, and we also demonstrate its competitive testing performance by simulations. A real-data example is used to illustrate the proposed methodology.

  5. Test Scoring [book review].

    Science.gov (United States)

    Meijer, Rob R.

    2003-01-01

    This book discusses how to obtain test scores and, in particular, how to obtain test scores from tests that consist of a combination of multiple choice and open-ended questions. The strength of the book is that scoring solutions are presented for a diversity of real world scoring problems. (SLD)

  6. On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests

    Directory of Open Access Journals (Sweden)

    Aaditya Ramdas

    2017-01-01

    Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having been designed and analyzed, both for the unidimensional and the multivariate setting. In this short survey, we focus on test statistics that involve the Wasserstein distance. Using an entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ) plots and receiver operating characteristic or ordinal dominance (ROC/ODC) curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
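
    As a rough illustration of the univariate case mentioned in this abstract, the sketch below computes the empirical 1-Wasserstein distance between two samples of equal size by matching order statistics. The function name `wasserstein_1d` and the equal-size restriction are assumptions made for brevity; this is not the entropic-smoothed statistic studied in the survey.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size univariate samples.

    Illustrative sketch only: assumes both samples are non-empty and have the
    same number of observations.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("this sketch assumes two non-empty samples of equal size")
    # In one dimension the optimal coupling pairs the i-th smallest value of
    # one sample with the i-th smallest value of the other.
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)
```

    Shifting every observation of one sample by a constant c changes this distance by exactly |c|, which is one reason it is a natural measure of location difference.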

  7. Modified likelihood ratio test for homogeneity in normal mixtures with two samples

    Institute of Scientific and Technical Information of China (English)

    QIN Yong-song; LEI Qing-zhu

    2008-01-01

    This paper investigates the modified likelihood ratio test (LRT) for homogeneity in normal mixtures of two samples with mixing proportions unknown. It is proved that the limit distribution of the modified likelihood ratio test is $\chi^2(1)$.

  8. A C++ Program for the Cramér-Von Mises Two-Sample Test

    Directory of Open Access Journals (Sweden)

    Yuanhui Xiao

    2006-12-01

    As larger sets of high-throughput data in genomics and proteomics become more readily available, there is a growing need for fast algorithms designed to compute exact p values of distribution-free statistical tests. We present a program for computing the exact distribution of the two-sample Cramér-von Mises test statistic under the null hypothesis that the two samples are drawn from the same continuous distribution. The program makes it possible to handle substantially larger sample sizes than earlier proposed computational tools. The C++ source code for the program is published with this paper, and an R package is under development.
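
    To make the quantity in this abstract concrete, the sketch below computes the two-sample Cramér-von Mises statistic from the two empirical distribution functions, and an exact null p value by brute-force enumeration of every split of the pooled data. Both function names are hypothetical, the enumeration is a naive stand-in rather than the fast algorithm of the published C++ program, and it is only feasible for very small samples without ties.

```python
from bisect import bisect_right
from itertools import combinations


def cvm_two_sample_statistic(xs, ys):
    """Two-sample Cramér-von Mises statistic T = nm/(n+m)^2 * sum (F_n - G_m)^2,
    with the sum taken over all pooled observations."""
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    sx, sy = sorted(xs), sorted(ys)
    total = 0.0
    for z in sx + sy:
        f = bisect_right(sx, z) / n  # empirical CDF of the first sample at z
        g = bisect_right(sy, z) / m  # empirical CDF of the second sample at z
        total += (f - g) ** 2
    return n * m / (n + m) ** 2 * total


def exact_p_value(xs, ys):
    """Exact null p value by enumerating all C(n+m, n) relabelings (tiny samples only)."""
    pooled = list(xs) + list(ys)
    n = len(xs)
    observed = cvm_two_sample_statistic(xs, ys)
    indices = range(len(pooled))
    hits = splits = 0
    for chosen in combinations(indices, n):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        splits += 1
        # Small tolerance guards against floating-point noise in the comparison.
        if cvm_two_sample_statistic(a, b) >= observed - 1e-12:
            hits += 1
    return hits / splits
```

    The number of splits grows combinatorially with the sample sizes, which is why dedicated algorithms for the exact distribution are needed in practice.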

  9. A MODIFIED LIKELIHOOD RATIO TEST FOR HOMOGENEITY IN BIVARIATE NORMAL MIXTURES OF TWO SAMPLES

    Institute of Scientific and Technical Information of China (English)

    Qingzhu LEI; Yongsong QIN

    2009-01-01

    This paper investigates the asymptotic properties of a modified likelihood ratio statistic for testing homogeneity in bivariate normal mixture models of two samples. The asymptotic null distribution of the modified likelihood ratio statistic is found to be $\chi^2_2$, where $\chi^2_2$ is a chi-squared distribution with 2 degrees of freedom.

  10. A simple powerful bivariate test for two sample location problems in experimental and observational studies

    Directory of Open Access Journals (Sweden)

    Ayatollahi S MT

    2010-05-01

    Background: In many areas of medical research, a bivariate analysis is desirable because it simultaneously tests two response variables that are of equal interest and importance in two populations. Several parametric and nonparametric bivariate procedures are available for the location problem but each of them requires a series of stringent assumptions such as specific distribution, affine-invariance or elliptical symmetry. The aim of this study is to propose a powerful test statistic that requires none of the aforementioned assumptions. We have reduced the bivariate problem to the univariate problem of sum or subtraction of measurements. A simple bivariate test for the difference in location between two populations is proposed. Method: In this study the proposed test is compared with Hotelling's $T^2$ test, the two-sample Rank test, the Cramer test for the multivariate two-sample problem and Mathur's test using Monte Carlo simulation techniques. The power study shows that the proposed test performs better than any of its competitors for most of the populations considered and is equivalent to the Rank test in specific distributions. Conclusions: Using simulation studies, we show that the proposed test will perform much better under different conditions of underlying population distribution such as normality or non-normality, skewed or symmetric, medium tailed or heavy tailed. The test is therefore recommended for practical applications because it is more powerful than any of the alternatives compared in this paper for almost all the shifts in location and in any direction.

  11. A two-sample Bayesian t-test for microarray data

    Directory of Open Access Journals (Sweden)

    Dimmic Matthew W

    2006-03-01

    Background: Determining whether a gene is differentially expressed in two different samples remains an important statistical problem. Prior work in this area has featured the use of t-tests with pooled estimates of the sample variance based on similarly expressed genes. These methods do not display consistent behavior across the entire range of pooling and can be biased when the prior hyperparameters are specified heuristically. Results: A two-sample Bayesian t-test is proposed for use in determining whether a gene is differentially expressed in two different samples. The test method is an extension of earlier work that made use of point estimates for the variance. The method proposed here explicitly calculates in analytic form the marginal distribution for the difference in the mean expression of two samples, obviating the need for point estimates of the variance without recourse to posterior simulation. The prior distribution involves a single hyperparameter that can be calculated in a statistically rigorous manner, making clear the connection between the prior degrees of freedom and prior variance. Conclusion: The test is easy to understand and implement, and application to both real and simulated data shows that the method has equal or greater power compared to the previous method and demonstrates consistent Type I error rates. The test is generally applicable outside the microarray field to any situation where prior information about the variance is available and is not limited to cases where estimates of the variance are based on many similar observations.

  12. A two-sample test for high-dimensional data with applications to gene-set testing

    CERN Document Server

    Chen, Song Xi; 10.1214/09-AOS716

    2010-01-01

    We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling's classical $T^2$ test does not work for this "large $p$, small $n$" situation. The proposed test does not require explicit conditions in the relationship between the data dimension and sample size. This offers much flexibility in analyzing high-dimensional data. An application of the proposed test is in testing significance for sets of genes which we demonstrate in an empirical study on a leukemia data set.

  13. Do Test Scores Buy Happiness?

    Science.gov (United States)

    McCluskey, Neal

    2017-01-01

    Since at least the enactment of No Child Left Behind in 2002, standardized test scores have served as the primary measures of public school effectiveness. Yet, such scores fail to measure the ultimate goal of education: maximizing happiness. This exploratory analysis assesses nation level associations between test scores and happiness, controlling…

  14. Beyond the Test Scores.

    Science.gov (United States)

    Thibodeau, Janice J.

    1985-01-01

    A diagnostic-prescriptive scheme is illustrated using subtests of the Slingerland Screening Tests for Identifying Children with Specific Language Disability and the Detroit Tests of Learning Aptitude. The scheme is intended to focus on the child's learning style by examining the task and the strategies employed. (CL)

  15. Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests

    Energy Technology Data Exchange (ETDEWEB)

    Friedman, J.H.; Rafsky, L.C.

    1979-01-01

    Multivariate generalizations of the Wald-Wolfowitz runs statistic and the Smirnov maximum deviation statistic for the two-sample problem are presented. They are based on the minimal spanning tree of the pooled sample points. Some null distribution results are derived and a simulation study of power is reported. 5 figures, 2 tables.

  16. A simple powerful bivariate test for two sample location problems in experimental and observational studies

    OpenAIRE

    2010-01-01

    Background: In many areas of medical research, a bivariate analysis is desirable because it simultaneously tests two response variables that are of equal interest and importance in two populations. Several parametric and nonparametric bivariate procedures are available for the location problem but each of them requires a series of stringent assumptions such as specific distribution, affine-invariance or elliptical symmetry. The aim of this study is to propose a powerful test statistic...

  17. The Effects of Sample Size on Expected Value, Variance and Fraser Efficiency for Nonparametric Independent Two Sample Tests

    Directory of Open Access Journals (Sweden)

    Ismet DOGAN

    2015-10-01

    Objective: Choosing the most efficient statistical test is one of the essential problems of statistics. Asymptotic relative efficiency is a notion that enables the quantitative comparison, in large samples, of two different tests of the same statistical hypothesis. The notion of the asymptotic efficiency of tests is more complicated than that of the asymptotic efficiency of estimates. This paper discusses the effect of sample size on the expected values and variances of non-parametric tests for two independent samples and determines the most efficient test for different sample sizes using the Fraser efficiency value. Material and Methods: Since calculating the power value when comparing tests is often impractical, using the asymptotic relative efficiency value is favorable. Asymptotic relative efficiency is an indispensable technique for comparing and ordering statistical tests in large samples. It is especially useful in nonparametric statistics, where there exist numerous heuristic tests such as the linear rank tests. In this study, the sample size is set to 2 ≤ n ≤ 50. Results: In both balanced and unbalanced cases, it is found that as the sample size increases, the expected values and variances of all the tests discussed in this paper increase as well. Additionally, according to the Fraser efficiency, the Mann-Whitney U test is found to be the most efficient of the non-parametric tests used to compare two independent samples, regardless of their sizes. Conclusion: According to the Fraser efficiency, the Mann-Whitney U test is found to be the most efficient test.

  18. Some new methods for testing randomness of a binomial sequence and its applications in two sample problems

    Directory of Open Access Journals (Sweden)

    P.V. Krishna Iyer

    1957-01-01

    The t-test commonly used for testing two samples is based on the assumption that the samples are random and belong to the same normal population. These assumptions may or may not be valid for different types of experimental data. In cases where these assumptions do not hold, it would be preferable to use tests which are independent of the nature of the distribution of the parent population. A number of such tests, some developed in the Defence Science Laboratory, are given in this paper. The tests depend on a sequence of A's and B's obtained by pooling together the two samples {Xm} and {Yn}, arranging them in ascending or descending order, and treating the observations belonging to {Xm} and {Yn} as A's and B's respectively. For this sequence the number of AB's, or of AB's and BA's, is noted for the following cases: (1) between any two observations of the sequence separated by (k−1) observations or less; (2) between any two observations in blocks of (k+1) consecutive observations moving from one end to the other. It has been found that the standardized deviates of these statistics serve as more reliable tests than any of the other existing tests. Further work is in progress to confirm these findings.
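
    The construction of the A/B sequence and the count for case (1) can be sketched as follows. This is one reading of the abstract, not the author's own procedure: the function names are hypothetical, ties between the two samples are assumed absent, "separated by (k−1) observations or less" is taken to mean positions at most k apart, and the block-based case (2) and the standardization of the counts are not covered.

```python
def label_sequence(xs, ys):
    """Pool both samples, sort ascending, and return the labels ('A' for xs, 'B' for ys)."""
    pooled = [(value, "A") for value in xs] + [(value, "B") for value in ys]
    pooled.sort(key=lambda pair: pair[0])  # assumes no ties across samples
    return [label for _, label in pooled]


def count_joins(labels, k, both_orders=True):
    """Count AB pairs (and BA pairs if both_orders) among positions at most k apart."""
    count = 0
    last = len(labels) - 1
    for i in range(len(labels)):
        for j in range(i + 1, min(i + k, last) + 1):
            if labels[i] == "A" and labels[j] == "B":
                count += 1
            elif both_orders and labels[i] == "B" and labels[j] == "A":
                count += 1
    return count
```

    With k = 1 and both orders counted, the count equals the number of label changes in the sequence, that is, the number of runs minus one.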

  19. Distribution of the two-sample t-test statistic following blinded sample size re-estimation.

    Science.gov (United States)

    Lu, Kaifeng

    2016-05-01

    We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd.

  20. Conditional Reliability Coefficients for Test Scores.

    Science.gov (United States)

    Nicewander, W Alan

    2017-04-06

    The most widely used, general index of measurement precision for psychological and educational test scores is the reliability coefficient: a ratio of true variance for a test score to the true-plus-error variance of the score. In item response theory (IRT) models for test scores, the information function is the central, conditional index of measurement precision. In this inquiry, conditional reliability coefficients for a variety of score types are derived as simple transformations of information functions. It is shown, for example, that the conditional reliability coefficient for an ordinary, number-correct score, X, is equal to ρ(X,X'|θ) = I(X,θ)/[I(X,θ)+1], where θ is a latent variable measured by an observed test score, X; ρ(X,X'|θ) is the conditional reliability of X at a fixed value of θ; and I(X,θ) is the score information function. This is a surprisingly simple relationship between the two basic indices of measurement precision from IRT and classical test theory (CTT). This relationship holds for item scores as well as test scores based on sums of item scores, and it holds for dichotomous as well as polytomous items, or a mix of both item types. Also, conditional reliabilities are derived for computerized adaptive test scores, and for θ-estimates used as alternatives to number-correct scores. These conditional reliabilities are all related to information in a manner similar or identical to the one given above for the number-correct (NC) score. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
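
    The relationship stated in this abstract is simple enough to evaluate directly. The sketch below applies ρ = I/(I+1) to a given value of the score information function; the function name `conditional_reliability` is hypothetical, and the information value is assumed to have been computed elsewhere for the θ of interest.

```python
def conditional_reliability(information):
    """Conditional reliability rho = I / (I + 1) for a score information value I >= 0."""
    if information < 0:
        raise ValueError("score information must be non-negative")
    return information / (information + 1.0)
```

    For example, an information value of 1 gives a conditional reliability of 0.5, and the reliability approaches 1 as the information grows without bound.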

  1. What do educational test scores really measure?

    DEFF Research Database (Denmark)

    McIntosh, James; D. Munk, Martin

    measure of pure cognitive ability. We find that variables which are not closely associated with traditional notions of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture, attitudes...

  2. What Do Test Scores Really Mean? A Latent Class Analysis of Danish Test Score Performance

    DEFF Research Database (Denmark)

    Munk, Martin D.; McIntosh, James

    2014-01-01

    Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55, tested in 1968, and followed until 2011. The procedure takes account of unobservable effects as well as excessive zeros in the data. We show that the test scores...... of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture and possible incentive problems make it more difficult to understand what the tests measure....

  3. The two-sample problem for Poisson processes: adaptive tests with a non-asymptotic wild bootstrap approach

    CERN Document Server

    Reynaud-Bouret, Patricia; Laurent, Béatrice

    2012-01-01

    Considering two independent Poisson processes, we address the question of testing equality of their respective intensities. We construct multiple testing procedures from the aggregation of single tests whose testing statistics come from model selection, thresholding and/or kernel estimation methods. The corresponding critical values are computed through a non-asymptotic wild bootstrap approach. The obtained tests are proved to be exactly of level $\alpha$, and to satisfy non-asymptotic oracle type inequalities. From these oracle type inequalities, we deduce that our tests are adaptive in the minimax sense over a large variety of classes of alternatives based on classical and weak Besov bodies in the univariate case, but also Sobolev and anisotropic Nikol'skii-Besov balls in the multivariate case. A simulation study furthermore shows that they perform strongly in practice.

  4. What do educational test scores really measure?

    DEFF Research Database (Denmark)

    McIntosh, James; D. Munk, Martin

    Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelate...

  5. Critical Thinking: More than Test Scores

    Science.gov (United States)

    Smith, Vernon G.; Szymanski, Antonia

    2013-01-01

    This article is for practicing or aspiring school administrators. The demand for excellence in public education has led to an emphasis on standardized test scores. This article explores the development of a professional enhancement program designed to prepare teachers to teach higher order thinking skills. Higher order thinking is the primary…

  6. Two-sample density-based empirical likelihood tests for incomplete data in application to a pneumonia study.

    Science.gov (United States)

    Vexler, Albert; Yu, Jihnhee

    2011-07-01

    In clinical trials examining the incidence of pneumonia it is common practice to measure infection via both invasive and non-invasive procedures. In the context of a recently completed randomized trial comparing two treatments, the invasive procedure was only utilized in certain scenarios due to the added risk involved, and given that the level of the non-invasive procedure surpassed a given threshold. Hence, what was observed was bivariate data with a pattern of missingness in the invasive variable dependent upon the value of the observed non-invasive observation within a given pair. In order to compare two treatments with bivariate observed data exhibiting this pattern of missingness, we developed a semi-parametric methodology utilizing the density-based empirical likelihood approach in order to provide a non-parametric approximation to Neyman-Pearson-type test statistics. This novel empirical likelihood approach has both parametric and non-parametric components. The non-parametric component utilizes the observations for the non-missing cases, while the parametric component is utilized to tackle the case where observations are missing with respect to the invasive variable. The method is illustrated through its application to the actual data obtained in the pneumonia study and is shown to be an efficient and practical method. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Statistical energy as a tool for binning-free, multivariate goodness-of-fit tests, two-sample comparison and unfolding

    Energy Technology Data Exchange (ETDEWEB)

    Aslan, B. [Universitaet Siegen, Holderlinstrasse 3, D-57068 Siegen (Germany)]. E-mail: aslan@physik.uni-siegen.de; Zech, G. [Universitaet Siegen, Holderlinstrasse 3, D-57068 Siegen (Germany)]. E-mail: zech@physik.uni-siegen.de

    2005-02-01

    We introduce the novel concept of statistical energy as a statistical tool. We define statistical energy of statistical distributions in a similar way as for electric charge distributions. Charges of opposite sign are in a state of minimum energy if they are equally distributed. This property is used to check whether two samples belong to the same parent distribution, to define goodness-of-fit tests and to unfold distributions distorted by measurement. The approach is binning-free and especially powerful in multidimensional applications.

  8. ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores

    Science.gov (United States)

    Allalouf, Avi

    2014-01-01

    The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…

  10. Validating the Interpretations and Uses of Test Scores

    Science.gov (United States)

    Kane, Michael T.

    2013-01-01

    To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…

  11. Facilitating the Interpretation of English Language Proficiency Scores: Combining Scale Anchoring and Test Score Mapping Methodologies

    Science.gov (United States)

    Powers, Donald; Schedl, Mary; Papageorgiou, Spiros

    2017-01-01

    The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…

  12. A Procedure for Linear Polychotomous Scoring of Test Items

    Science.gov (United States)

    1993-10-01

    associated with the response categories of test items . When tests are scored using these scoring weights, test reliability increases. The new procedure is...program POLY. The example demonstrates how polyweighting can be used to calibrate and score test items drawn from an item bank that is too large to

  13. The Test Score Decline: A Review and Annotated Bibliography

    Science.gov (United States)

    1981-08-01

    J.R., The Test Score Decline: Are the Public Schools the Scapegoat? Part Two =129. Kapfer, P., Kapfer, M., & Woodruff, A., Declining Test Scores...Michigan State University, August 1976. 129. Kapfer, P.F., Kapfer, M.B., & Woodruff, A.D., Declining test scores: Interpretations, issues, and relationship

  14. Nontraditional Scoring of C-tests

    CERN Document Server

    Tamara, Tretjakova

    2007-01-01

    In C-tests the hypothesis of local independence of items is violated, which does not permit treating them as real tests. It is suggested to determine the distances between separate C-test items (blanks) and to combine items into clusters. Weights, inversely proportional to the number of items in the corresponding clusters, are assigned to items. As a result, the C-test structure becomes similar to the structure of classical tests, without violating the local independence hypothesis.

  15. A NOTE ON INCONSISTENCY OF THE SCORE TEST

    Directory of Open Access Journals (Sweden)

    Sumathi K

    2010-12-01

    The score test proposed by Rao (1947) has been widely used in recent years for data analysis and model building because of its simplicity. However, at the time of its computation, it has been found that the value of the score test statistic becomes negative. Freedman (2007) discussed some of the theoretical reasons for this inconsistency of the score test and observed that the test was inconsistent when the observed Fisher information matrix was used rather than the expected Fisher information matrix. The present paper is an attempt to demonstrate the inconsistency of the score test in terms of the power function.

  16. Pictures Speak Louder than Test Scores.

    Science.gov (United States)

    McCabe, Deborah; Hilmo, Joellen

    1985-01-01

    The Goodenough-Harris Draw-a-Person Test, if given at regular intervals during periods of remediation, may show clear evidence of improvement in behavior and attitude of learning disabled students. (CL)

  17. Improving Scores on the IELTS Speaking Test

    Science.gov (United States)

    Issitt, Steve

    2008-01-01

    This article presents three strategies for teaching students who are taking the IELTS speaking test. The first strategy is aimed at improving confidence and uses a variety of self-help materials from the field of popular psychology. The second encourages students to think critically and invokes a range of academic perspectives. The third strategy…

  19. The Trait Structure of Cloze Test Scores.

    Science.gov (United States)

    Bachman, Lyle F.

    1982-01-01

    Presents study designed to examine trait structure of a cloze test using confirmatory factor analysis. Results suggest that a modified cloze passage, using rational deletions, is capable of measuring syntactic- and discourse-level relationships in a text, and this advantage may outweigh considerations of reduced redundancy which underlie random…

  20. Fuzzy Math: A Meditation on Test Scoring

    Science.gov (United States)

    Jacks, Meredith

    2011-01-01

    As a public school English teacher, the author observes standardized testing season each year with a sort of grim fascination. "So this is it," she thinks as she paces around her silent classroom, peering over kids' shoulders at articles about parasailing. Line graphs tracking the rainfall in Tulsa. Parts of speech. Functions of "x." "These are…

  1. A Human Capital Model of Educational Test Scores

    DEFF Research Database (Denmark)

    McIntosh, James; D. Munk, Martin

    measure of pure cognitive ability. We find that variables which are not closely associated with traditional notions of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture, attitudes...

  2. Improving personality facet scores with multidimensional computer adaptive testing

    DEFF Research Database (Denmark)

    Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A W

    2013-01-01

    Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated facets. This article investigates the possibility of increasing the precision of the NEO PI-R facet scores by scoring items with multidimensional item response theory and by efficiently administering and scoring items with multidimensional computer adaptive testing (MCAT). The increase in the precision of personality facet scores is obtained from exploiting the correlations between the facets. Results indicate that the NEO PI-R could be substantially shorter without attenuating precision when the MCAT methodology is used. Furthermore, the study shows...

  3. Development and testing of a portfolio evaluation scoring tool.

    Science.gov (United States)

    Karlowicz, Karen A

    2010-02-01

    This study focused on development of a portfolio evaluation tool to guide the assignment of valid and reliable scores. Tool development was facilitated by a literature review, guidance of a faculty committee, and validation by content experts. Testing involved a faculty team that evaluated 60 portfolios. Calculation of interrater reliability and a paired-samples t test were used to judge effectiveness. Interrater reliability was 0.78 for overall scores, 0.81 for the seven program outcomes criteria scores, and more than 0.65 for scores assigned by 11 of 13 pairs of raters. There were no significant differences between raters' scores in 10 of 13 pairs. The portfolio evaluation tool demonstrated high reliability and should be tested by other schools using portfolio evaluation.

  4. Evaluating the Predictive Validity of Graduate Management Admission Test Scores

    Science.gov (United States)

    Sireci, Stephen G.; Talento-Miller, Eileen

    2006-01-01

    Admissions data and first-year grade point average (GPA) data from 11 graduate management schools were analyzed to evaluate the predictive validity of Graduate Management Admission Test[R] (GMAT[R]) scores and the extent to which predictive validity held across sex and race/ethnicity. The results indicated GMAT verbal and quantitative scores had…

  6. Group differences in the heritability of items and test scores

    NARCIS (Netherlands)

    Wicherts, J.M.; Johnson, W.

    2009-01-01

    It is important to understand potential sources of group differences in the heritability of intelligence test scores. On the basis of a basic item response model we argue that heritabilities which are based on dichotomous item scores normally do not generalize from one sample to the next. If groups

  7. Counselor Simulation by Film in Test Score Reporting Interviews

    Science.gov (United States)

    Collins, Tom

    1972-01-01

    The responsible and innovative utilization of media, not only in test score reporting but also in other guidance functions, may assist the counselor in permitting him more time to function with clients in counseling relationships. (Author)

  8. Does parental physical violence reduce children's standardized test score performance?

    Science.gov (United States)

    Peek-Asa, Corinne; Maxwell, Leah; Stromquist, Ann; Whitten, Paul; Limbos, Mary Ann; Merchant, James

    2007-11-01

    Many negative cognitive and behavioral outcomes have been identified among children living in households with parental violence, but few studies have examined academic performance. In a rural population-based cohort, we examine the role of parental violence on standardized test score performance. The cohort included 306 children ages 6 through 17. Parents responded to a health interview that included questions about physical violence. Children's standardized test scores were collected prospectively for 5 years after the parent interview. Hierarchical multivariate models clustering on school, household, and repeated individual test scores and controlling for children's and parent's characteristics were run to predict test score performance. One in five children lived in a household in which parents reported at least one act of physical violence. Children whose parents reported intimate partner violence (IPV) performed an average of 12.2 percentile points lower than children whose parents reported no IPV (95% CI, -19.2 to -5.2; p …). Parent-reported IPV led to larger test score reductions for girls than for boys and for children less than 12 years old than for older children. Parental physical violence was common, and children in homes with violence had significantly poorer performance on standardized test scores.

  9. A Monte Carlo Comparison of Three Optimal Test Scoring Procedures,

    Science.gov (United States)

    theory, with conventional scoring. The effect of the COWS procedures is to differentially weight test items as a function of examinee ability and the...item characteristics. Low-ability examinees are given very low weights on difficult test items, lowering the effects of guessing, and decreasing test error.

  10. Grades and Test Scores: Accounting for Observed Differences.

    Science.gov (United States)

    Willingham, Warren W.; Pollack, Judith M.; Lewis, Charles

    2002-01-01

    Proposed a framework of possible differences between grades and test scores and tested the framework with data on 8,454 high school seniors from the National Education Longitudinal Study. Identified differences and correlations among achievement factors. Differences between grades and tests give these measures complementary strengths in…

  11. Prediction of true test scores from observed item scores and ancillary data.

    Science.gov (United States)

    Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

    2015-05-01

    In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE(®) General Analytical Writing and until 2009 in the case of TOEFL(®) iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater(®). In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability.
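
    The simplest classical-test-theory instance of a best linear predictor of a true score is Kelley's formula, which shrinks an observed score toward the group mean in proportion to the test's unreliability. The sketch below shows only that single-score case; it is not the authors' method, which also draws on ancillary data, and the function name and numbers are hypothetical.

```python
def kelley_true_score(observed, reliability, group_mean):
    """Kelley's regressed estimate of a true score.

    observed    : the examinee's observed test score
    reliability : reliability coefficient of the test, between 0 and 1
    group_mean  : mean observed score in the reference group
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    # Weighted average of the observed score and the group mean.
    return reliability * observed + (1.0 - reliability) * group_mean


# Hypothetical values: observed score 80, reliability 0.75, group mean 60.
print(kelley_true_score(80, 0.75, 60))
```

    A reliability of 1 returns the observed score unchanged, and a reliability of 0 returns the group mean.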

  12. High Test Scores: The Wrong Road to National Economic Success

    Science.gov (United States)

    Baker, Keith

    2011-01-01

    A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…

  13. Effort Analysis: Individual Score Validation of Achievement Test Data

    Science.gov (United States)

    Wise, Steven L.

    2015-01-01

    Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…

  14. High Test Scores: The Wrong Road to National Economic Success

    Science.gov (United States)

    Baker, Keith

    2011-01-01

    A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…

  15. A Human Capital Model of Educational Test Scores

    DEFF Research Database (Denmark)

    McIntosh, James; D. Munk, Martin

    Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelate...

  16. Accountancy, teaching methods, sex, and American College Test scores.

    Science.gov (United States)

    Heritage, J; Harper, B S; Harper, J P

    1990-10-01

    This study examines the significance of sex, methodology, academic preparation, and age as related to development of judgmental and problem-solving skills. Sex, American College Test (ACT) Mathematics scores, Composite ACT scores, grades in course work, grade point average (GPA), and age were used in studying the effects of teaching method on 96 students' ability to analyze data in financial statements. Results reflect positively on accounting students compared to the general college population and the women students in particular.

  17. What Do Test Scores in Texas Tell Us?

    Directory of Open Access Journals (Sweden)

    Stephen P. Klein et al

    2000-10-01

    Full Text Available We examine the results on the Texas Assessment of Academic Skills (TAAS), the highest-profile state testing program and one that has recorded extraordinary recent gains in math and reading scores. To investigate whether the dramatic math and reading gains on the TAAS represent actual academic progress, we have compared these gains to score changes in Texas on another test, the National Assessment of Educational Progress (NAEP). Texas students did improve significantly more on a fourth-grade NAEP math test than their counterparts nationally. But the size of this gain was smaller than their gains on TAAS and was not present on the eighth-grade math test. The stark differences between the stories told by NAEP and TAAS are especially striking when it comes to the gap in average scores between whites and students of color. According to the NAEP results, that gap in Texas is not only very large but increasing slightly. According to TAAS scores, the gap is much smaller and decreasing greatly. Many schools are devoting a great deal of class time to highly specific TAAS preparation. While this preparation may improve TAAS scores, it may not help students develop necessary reading and math skills. Schools with relatively large percentages of minority and poor students may be doing this more than other schools. We raise serious questions about the validity of those gains, and caution against the danger of making decisions to sanction or reward students, teachers and schools on the basis of test scores that may be inflated or misleading. Finally, we suggest some steps that states can take to increase the likelihood that their test results merit public confidence and provide a sound basis for educational policy.

  18. America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

    Science.gov (United States)

    Petrilli, Michael J.; Wright, Brandon L.

    2016-01-01

    At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…

  19. A Latent Class Approach to Estimating Test-Score Reliability

    Science.gov (United States)

    van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

    2011-01-01

    This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
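
    Of the single-administration methods named in this abstract, Cronbach's alpha is the most familiar. The sketch below computes only the classical alpha from a persons-by-items score matrix; it does not implement the latent class approach the study proposes, and the function name and data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, one row of item scores per person."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two persons")
    # Sample variance of each item across persons.
    item_variances = [
        variance([row[j] for row in scores]) for j in range(n_items)
    ]
    # Sample variance of the total (sum) score across persons.
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three persons, two items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

    When every item ranks the persons identically, the item variances account for as little of the total variance as possible and alpha reaches 1.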

  20. Commentary on "Validating the Interpretations and Uses of Test Scores"

    Science.gov (United States)

    Brennan, Robert L.

    2013-01-01

    Kane's paper "Validating the Interpretations and Uses of Test Scores" is the most complete and clearest discussion yet available of the argument-based approach to validation. At its most basic level, validation as formulated by Kane is fundamentally a simply-stated two-step enterprise: (1) specify the claims inherent in a particular interpretation…

  1. The Correlational Relationship between Homeschooling Demographics and High Test Scores.

    Science.gov (United States)

    Burns, Johnna

    Homeschooling, one of the fastest growing educational alternatives, is enjoying increasing respect from educators and parents alike. This is partly because homeschooling children score as well and often better on standardized tests than their publicly schooled counterparts. However, the vast majority of homeschooled students come from the…

  2. America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

    Science.gov (United States)

    Petrilli, Michael J.; Wright, Brandon L.

    2016-01-01

    At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…

  3. Scoring Rod-and-Frame Tests: Quantitative and Qualitative Considerations.

    Science.gov (United States)

    Haller, Otto; Edgington, Eugene S.

    1982-01-01

    Current scoring procedures depend on unrealistic assumptions about subjects' performance on the rod-and-frame test. A procedure is presented which corrects for constant error, is sensitive to response strategy and consistency, and examines qualitative and quantitative aspects of performance and individual differences in laterality bias as defined…

  4. The Weighted Airman Promotion System: Standardizing Test Scores

    Science.gov (United States)

    2008-01-01

    42 Approaches to Standardizing PFE/SKT Scores...NPS non–prior service OSD Office of the Secretary of Defense OSI Office of Special Investigations PFE Promotion Fitness Exam RAW Retrieval Application...both the Promotion Fitness Exam (PFE) and Specialty Knowledge Test (SKT). Using a percentile ranking was one way to standardize because every AFSC

  5. Simplifying multivariate survival analysis using global score test methodology

    Science.gov (United States)

    Zain, Zakiyah; Aziz, Nazrina; Ahmad, Yuhaniz

    2015-12-01

    In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve multiple endpoints, and this situation further complicates the analysis of survival data. In the case of tumor patients, endpoints concerning survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For each patient, these endpoints are correlated, and the estimation of the correlation between two score statistics is fundamental in derivation of overall treatment advantage. In this paper, the bivariate survival analysis method using the global score test methodology is extended to multivariate setting.

  6. Spinal appearance questionnaire: factor analysis, scoring, reliability, and validity testing.

    Science.gov (United States)

    Carreon, Leah Y; Sanders, James O; Polly, David W; Sucato, Daniel J; Parent, Stefan; Roy-Beaudry, Marjolaine; Hopkins, Jeffrey; McClung, Anna; Bratcher, Kelly R; Diamond, Beverly E

    2011-08-15

    Cross sectional. This study presents the factor analysis of the Spinal Appearance Questionnaire (SAQ) and its psychometric properties. Although the SAQ has been administered to a large sample of patients with adolescent idiopathic scoliosis (AIS) treated surgically, its psychometric properties have not been fully evaluated. This study presents the factor analysis and scoring of the SAQ and evaluates its psychometric properties. The SAQ and the Scoliosis Research Society-22 (SRS-22) were administered to AIS patients who were being observed, braced or scheduled for surgery. Standard demographic data and radiographic measures including Lenke type and curve magnitude were also collected. Of the 1802 patients, 83% were female; with a mean age of 14.8 years and mean initial Cobb angle of 55.8° (range, 0°-123°). From the 32 items of the SAQ, 15 loaded on two factors with consistent and significant correlations across all Lenke types. There is an Appearance (items 1-10) and an Expectations factor (items 12-15). Responses are summed giving a range of 5 to 50 for the Appearance domain and 5 to 20 for the Expectations domain. The Cronbach's α was 0.88 for both domains and Total score with a test-retest reliability of 0.81 for Appearance and 0.91 for Expectations. Correlations with major curve magnitude were higher for the SAQ Appearance and SAQ Total scores compared to correlations between the SRS Appearance and SRS Total scores. The SAQ and SRS-22 Scores were statistically significantly different in patients who were scheduled for surgery compared to those who were observed or braced. The SAQ is a valid measure of self-image in patients with AIS with greater correlation to curve magnitude than SRS Appearance and Total score. It also discriminates between patients who require surgery from those who do not.

  7. Developing a nanoparticle test for prostate cancer scoring

    Directory of Open Access Journals (Sweden)

    Huo Qun

    2012-03-01

    Full Text Available Background: Over-diagnosis and treatment of prostate cancer has been a major problem in prostate cancer care and management. Currently the most relevant prognostic factor to predict a patient's risk of death due to prostate cancer is the Gleason score of the biopsied tissue samples. However, pathological analysis is subjective, and the Gleason score is only a qualitative estimate of the cancer malignancy. Molecular biomarkers and diagnostic tests that can accurately predict prostate tumor aggressiveness are rather limited. Method: We report here for the first time the development of a nanoparticle test that not only can distinguish prostate cancer from normal and benign conditions, but also has the potential to predict the aggressiveness of prostate cancer quantitatively. To conduct the test, a prostate tissue lysate sample is spiked into a blood serum or human IgG solution and the spiked sample is incubated with a citrate-protected gold nanoparticle solution. IgG is known to adsorb to citrate-protected gold nanoparticles to form a "protein corona" on the nanoparticle surface. From this study, we discovered that certain tumor-specific molecules can interact with IgG and change the adsorption behavior of IgG to the gold nanoparticles. This change is reflected in the nanoparticle size of the assay solution and detected by a dynamic light scattering technique. Assay data were analyzed by one-way ANOVA for multiple variant analysis, and using the Student t-test or nonparametric Mann-Whitney U-tests for pairwise analyses. Results: An inverse, quantitative correlation of the average nanoparticle size of the assay solution with tumor status and histological diagnostic grading was observed from the nanoparticle test. IgG solutions spiked with prostate tumor tissue exhibit significantly smaller nanoparticle size than the solutions spiked with normal and benign tissues. The higher grade the tumor is, the smaller the nanoparticle size is. The test

  8. The Effect of Logical Choice Weight and Corrected Scoring Methods on Multiple Choice Agricultural Science Test Scores

    Directory of Open Access Journals (Sweden)

    B. K. Ajayi

    2012-12-01

    Full Text Available The study focused on the effect of logical choice weight and corrected scoring methods on multiple choice agricultural science test scores. The study also investigated the interaction effect of logical choice weight and corrected scoring methods in schools, and types of school, in multiple choice agricultural science tests. The researcher used a combination of survey type and one-shot experimental design. The sample for the study consisted of 600 students selected by stratified random sampling techniques in south western Nigeria. Overall performance of students in percentage, and correlation, were analyzed. The hypotheses were generated and tested at the 0.05 level of significance. The study revealed that there was a significant difference in the academic performance of students between the logical choice weight and corrected scoring methods in multiple choice agricultural science test scores. The results also showed that there was no interaction effect of the two scoring methods with the type of school or the location of schools in multiple choice agricultural science tests. The study revealed that the logical choice weight scoring method was the best method, favouring the scoring of the students’ scripts in multiple choice agricultural science tests. On the basis of these findings, logical choice weight should be introduced to teachers for use in the classroom as a new method of scoring multiple choice agricultural science tests. The logical choice weight method is recommended to the ministry of education, the Examination Division, and junior secondary schools for scoring JSS 3 multiple choice tests. Examination bodies such as the West Africa Examination Council (WAEC), the National Examination Council (NECO), and the Joint Admission and Matriculation Board (JAMB) should adopt the use of the logical choice weight method in scoring multiple choice tests. The method could be used in tertiary institutions for the post ‘JAMB’ Unify Matriculation Examination (UME) test. It is also

  9. Correlation of the Scores on Barron's Ego Strength Scale with the Scores on the Bender-Gestalt Test.

    Science.gov (United States)

    Martin, John D.; And Others

    1979-01-01

    The degree of relationship between scores on the Barron Ego Strength Scale and the scores on the Bender-Gestalt Test was investigated on a sample of college students. Correlations were moderate to low. Racial differences were observed on the Bender-Gestalt Test. (Author/JKS)

  10. Detection of Invalid Test Scores on Admission Tests : A Simulation Study Using Person-Fit Statistics

    NARCIS (Netherlands)

    Tendeiro, Jorge N.; Meijer, Rob R.; Albers, Casper J.

    While an admission test may strongly predict success in university or law school programs for most test takers, there may be some test takers who are mismeasured. To address this issue, a class of statistics called person-fit statistics is used to check the validity of individual test scores.

  11. The Visual Aural Digit Span Test and Bender Gestalt Test as Predictors of Wide Range Achievement Test-Revised Scores.

    Science.gov (United States)

    Smith, Teresa C.; Smith, Billy L.

    1988-01-01

    Examined Visual Aural Digit Span Test (VADS) and Bender-Gestalt (BG) scores as predictors of Wide Range Achievement Test-Revised (WRAT-R) scores among 115 elementary school students referred for low academic achievement. Divided children into three age groups. Results suggest BG and VADS Test can be effective screening devices for young children…

  12. The Graduate Management Admission Test: Technical Report on Test Development and Score Interpretation for GMAT Users.

    Science.gov (United States)

    Schrader, William B.

    This report provides information on test development, test administration, and score interpretation for the Graduate Management Admission Test (GMAT). The GMAT, first administered in 1954, provides objective measures of an applicant's abilities for use in admissions decisions by graduate management schools. It is currently composed of five…

  13. Your move: The effect of chess on mathematics test scores.

    Science.gov (United States)

    Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla

    2017-01-01

    We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.

  14. Empirical Bayes Estimates of Domain Scores under Binomial and Hypergeometric Distributions for Test Scores.

    Science.gov (United States)

    Lin, Miao-Hsiang; Hsiung, Chao A.

    1994-01-01

    Two simple empirical approximate Bayes estimators are introduced for estimating domain scores under binomial and hypergeometric distributions respectively. Criteria are established regarding use of these functions over maximum likelihood estimation counterparts. (SLD)

  15. Volatility in School Test Scores: Implications for Test-Based Accountability Systems

    Science.gov (United States)

    Kane, Thomas J.; Staiger, Douglas O.

    2002-01-01

    By the spring of 2000, forty states had begun using student test scores to rate school performance. Twenty states have gone a step further and are attaching explicit monetary rewards or sanctions to a school's test performance. In this paper, the authors focus on accountability programs in which states measure the effectiveness of individual…

  16. The Relationship of Scores on Elizur's Hostility System on the Rorschach to the Acting-Out Score on the Hand Test.

    Science.gov (United States)

    Martin, John D.; And Others

    1978-01-01

    The relationship between Elizur's Hostility Scoring on the Rorschach Test and the Acting-Out Score on the Hand Test was examined. Correlations between the two measures (using several scoring procedures) ranged from .40 to .64. (JKS)

  17. Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

    Science.gov (United States)

    Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.

    2010-01-01

    Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…

  18. A Procedure for Linear Polychotomous Scoring of Test Items (Computer Diskette).

    Science.gov (United States)

    weights that are then associated with the response categories of test items. When tests are scored using these scoring weights, test reliability...program poly. The example demonstrates how polyweighting can be used to calibrate and score test items drawn from an item bank that is too large to

  19. Two sampling techniques for game meat.

    Science.gov (United States)

    van der Merwe, Maretha; Jooste, Piet J; Hoffman, Louw C; Calitz, Frikkie J

    2013-03-20

    A study was conducted to compare the excision sampling technique used by the export market and the sampling technique preferred by European countries, namely the biotrace cattle and swine test. The measuring unit for the excision sampling was grams (g) and square centimetres (cm²) for the swabbing technique. The two techniques were compared after a pilot test was conducted on spiked approved beef carcasses (n = 12) that statistically proved the two measuring units correlated. The two sampling techniques were conducted on the same game carcasses (n = 13) and analyses performed for aerobic plate count (APC), Escherichia coli and Staphylococcus aureus, for both techniques. A more representative result was obtained by swabbing and no damage was caused to the carcass. Conversely, the excision technique yielded fewer organisms and caused minor damage to the carcass. The recovery ratio from the sampling technique improved 5.4 times for APC, 108.0 times for E. coli and 3.4 times for S. aureus over the results obtained from the excision technique. It was concluded that the sampling methods of excision and swabbing can be used to obtain bacterial profiles from both export and local carcasses and could be used to indicate whether game carcasses intended for the local market are possibly on par with game carcasses intended for the export market and therefore safe for human consumption.

  20. Developing Test Score Reports that Work: The Process and Best Practices for Effective Communication

    Science.gov (United States)

    Zenisky, April L.; Hambleton, Ronald K.

    2012-01-01

    Test scores matter these days. Test-takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees' information or usability needs, but this is clearly changing…

  1. Discrepancies between modified Medical Research Council dyspnea score and COPD assessment test score in patients with COPD

    Directory of Open Access Journals (Sweden)

    Rhee CK

    2015-08-01

    Chin Kook Rhee,1 Jin Woo Kim,2 Yong Il Hwang,3 Jin Hwa Lee,4 Ki-Suck Jung,3 Myung Goo Lee,5 Kwang Ha Yoo,6 Sang Haak Lee,7 Kyeong-Cheol Shin,8 Hyoung Kyu Yoon9 1Division of Pulmonary, Allergy and Critical Care Medicine, Department of Internal Medicine, Seoul St Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, 2Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Uijeongbu St Mary’s Hospital, College of Medicine, The Catholic University of Korea, Uijeongbu, 3Division of Pulmonary, Allergy and Critical Care Medicine, Department of Internal Medicine, Hallym University Medical Center, Hallym University College of Medicine, Anyang, 4Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, School of Medicine, Ewha Womans University, Seoul, 5Division of Pulmonary, Allergy and Critical Care Medicine, Department of Internal Medicine, Hallym University Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, 6Division of Pulmonary, Allergy and Critical Care Medicine, Department of Internal Medicine, Konkuk University School of Medicine, Seoul, 7Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, St Paul’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, 8Regional Center for Respiratory Disease, Yeungnam University Medical Center, Yeungnam University College of Medicine, Daegu, 9Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Yeouido St Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea. Background and objective: According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines, either a modified Medical Research Council (mMRC) dyspnea score of ≥2 or a chronic obstructive pulmonary disease (COPD) assessment test (CAT) score of ≥10 is considered to represent COPD patients who are

  2. Improving Test Score Reporting: Perspectives from the ETS Score Reporting Conference. Research Report. ETS RR-11-45

    Science.gov (United States)

    Zapata-Rivera, Diego, Ed.; Zwick, Rebecca, Ed.

    2011-01-01

    This volume includes 3 papers based on presentations at a workshop on communicating assessment information to particular audiences, held at Educational Testing Service (ETS) on November 4th, 2010, to explore some issues that influence score reports and new advances that contribute to the effectiveness of these reports. Jessica Hullman, Rebecca…

  3. Effect of Examinee Certainty on Probabilistic Test Scores and a Comparison of Scoring Methods for Probabilistic Responses.

    Science.gov (United States)

    1983-07-01

    equal to the maximum value for this index is due to the dependence of this index upon the magnitude and sign of the factor loadings. Gorsuch (1974, p...Measurement, 1972, 9, 205-207. Gorsuch, R. L. Factor analysis. Philadelphia: W. B. Saunders Company, 1974. Guilford, J. P. A simple scoring weight for test

  4. Effect of Self-Assessment on Test Scores: Student Perceptions

    Science.gov (United States)

    Ramirez, Beatriz U.

    2010-01-01

    After a sudden increase in most of the individual grades in a multiple-choice test, students were asked to rank the three most relevant factors responsible for this outcome. Among eight others, the availability of a test for self-assessment before the final test was by far the most frequently mentioned (82.4% of the students). Questions applied…

  5. Neuropsychological test scores, academic performance, and developmental disorders in Spanish-speaking children.

    Science.gov (United States)

    Rosselli, M; Ardila, A; Bateman, J R; Guzmán, M

    2001-01-01

    Limited information is currently available about performance of Spanish-speaking children on different neuropsychological tests. This study was designed to (a) analyze the effects of age and sex on different neuropsychological test scores of a randomly selected sample of Spanish-speaking children, (b) analyze the value of neuropsychological test scores for predicting school performance, and (c) describe the neuropsychological profile of Spanish-speaking children with learning disabilities (LD). Two hundred ninety (141 boys, 149 girls) 6- to 11-year-old children were selected from a school in Bogotá, Colombia. Three age groups were distinguished: 6- to 7-, 8- to 9-, and 10- to 11-year-olds. Performance was measured utilizing the following neuropsychological tests: Seashore Rhythm Test, Finger Tapping Test (FTT), Grooved Pegboard Test, Children's Category Test (CCT), California Verbal Learning Test-Children's Version (CVLT-C), Benton Visual Retention Test (BVRT), and Bateria Woodcock Psicoeducativa en Español (Woodcock, 1982). Normative scores were calculated. Age effect was significant for most of the test scores. A significant sex effect was observed for 3 test scores. Intercorrelations were performed between neuropsychological test scores and academic areas (science, mathematics, Spanish, social studies, and music). In a post hoc analysis, children presenting very low scores on the reading, writing, and arithmetic achievement scales of the Woodcock battery were identified in the sample, and their neuropsychological test scores were compared with a matched normal group. Finally, a comparison was made between Colombian and American norms.

  6. Construct Validity and Test Re-Test Reliability of the Forgotten Joint Score.

    Science.gov (United States)

    Thompson, Simon M; Salmon, Lucy J; Webb, Justin M; Pinczewski, Leo A; Roe, Justin P

    2015-11-01

    Consecutive patients undergoing knee arthroplasty completed questionnaires: FJS, Knee Injury and Osteoarthritis Outcome Score (KOOS) and WOMAC Score (mean 39 months after surgery), and were mailed a repeat questionnaire after 4 to 6 weeks. The test-retest reliability was almost perfect for the FJS (ICC = 0.97), and the FJS subdomains (ICC > 0.8). Convergent construct validity of the FJS was correlated with the KOOS Subscores of Quality of Life (0.63, P = 0.001), Symptom (0.33, P = 0.001), Pain (0.68, P = 0.001) and ADL (0.66, P = 0.001) and the Total WOMAC (0.70, P = 0.001). The FJS demonstrates high test-retest reliability and construct validity compared to the Normalised WOMAC and KOOS Subscales. The FJS does not demonstrate the ceiling effect of the WOMAC or KOOS pain scores so may have greater discriminatory ability following TKR. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Further Validation of the Qualitative Scoring System for the Modified Bender-Gestalt Test.

    Science.gov (United States)

    Brannigan, Gary G.; And Others

    1995-01-01

    Compares the Qualitative Scoring System and the Developmental Scoring Systems, both Bender-Gestalt tests, in predicting achievement on the Metropolitan Achievement Test (MAT). In this study, first through fourth graders (n=409) from regular elementary schools were subjected to both tests; both systems correlated significantly with school…

  8. Are Score Comparisons across Language Proficiency Test Batteries Justified?: An IELTS-TOEFL Comparability Study.

    Science.gov (United States)

    Geranpayeh, Ardeshir

    1994-01-01

    This paper reports on a study conducted to determine if comparisons between scores on the Test of English as a Foreign Language (TOEFL) and the International English Language Testing Service (IELTS) are justifiable. The test scores of 216 Iranian graduate students who took the TOEFL and IELTS, as well as the Iranian Ministry of Culture and Higher…

  9. The rank product method with two samples.

    Science.gov (United States)

    Koziol, James A

    2010-11-05

    Breitling et al. (2004) introduced a statistical technique, the rank product method, for detecting differentially regulated genes in replicated microarray experiments. The technique has achieved widespread acceptance and is now used more broadly, in such diverse fields as RNAi analysis, proteomics, and machine learning. In this note, we extend the rank product method to the two sample setting, provide distribution theory attending the rank product method in this setting, and give numerical details for implementing the method.

  10. Comparing Graphical and Verbal Representations of Measurement Error in Test Score Reports

    Science.gov (United States)

    Zwick, Rebecca; Zapata-Rivera, Diego; Hegarty, Mary

    2014-01-01

    Research has shown that many educators do not understand the terminology or displays used in test score reports and that measurement error is a particularly challenging concept. We investigated graphical and verbal methods of representing measurement error associated with individual student scores. We created four alternative score reports, each…

  11. Detection of aberrant item score patterns in computerized adaptive testing : An empirical example using the CUSUM

    NARCIS (Netherlands)

    Egberink, Iris J. L.; Meijer, Rob R.; Veldkamp, Bernard P.; Schakel, Lolle; Smid, Nico G.

    2010-01-01

    The scalability of individual trait scores on a computerized adaptive test (CAT) was assessed through investigating the consistency of individual item score patterns. A sample of N = 428 persons completed a personality CAT as part of a career development procedure. To detect inconsistent item score

  13. Effects of Test Media on Different EFL Test-Takers in Writing Scores and in the Cognitive Writing Process

    Science.gov (United States)

    Zou, Xiao-Ling; Chen, Yan-Min

    2016-01-01

    The effects of computer and paper test media on EFL test-takers with different computer familiarity in writing scores and in the cognitive writing process have been comprehensively explored from the learners' aspect as well as on the basis of related theories and practice. The results indicate significant differences in test scores among the…

  14. Prediction of WAIS Scores from Group Ability Tests

    Science.gov (United States)

    Watson, Charles G.; Klett, William G.

    1973-01-01

    In a search for an adequate but efficient substitute, the authors have instituted three evaluations of the relationships between potential WAIS-substitutes and the WAIS itself. The present report describes the first of these researches-- a study of the relationships between the four group ability tests and the WAIS in a mental hospital setting.…

  15. Maintaining Equivalent Cut Scores for Small Sample Test Forms

    Science.gov (United States)

    Dwyer, Andrew C.

    2016-01-01

    This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…

  16. Generation of GHS Scores from TEST and online sources

    Science.gov (United States)

    Alternatives assessment frameworks such as DfE (Design for the Environment) evaluate chemical alternatives in terms of human health effects, ecotoxicity, and fate. T.E.S.T. (Toxicity Estimation Software Tool) can be utilized to evaluate human health in terms of acute oral rat tox...

  17. Adolescent Psychopathy and the Big Five: Results from Two Samples

    Science.gov (United States)

    Lynam, Donald R.; Caspi, Avshalom; Moffitt, Terrie E.; Raine, Adrian; Loeber, Rolf; Stouthamer-Loeber, Magda

    2005-01-01

    The present study examines the relation between psychopathy and the Big Five dimensions of personality in two samples of adolescents. Specifically, the study tests the hypothesis that the aspect of psychopathy representing selfishness, callousness, and interpersonal manipulation (Factor 1) is most strongly associated with low Agreeableness,…

  18. Freudenfreude and Schadenfreude Test (FAST) scores of depressed and non-depressed undergraduates.

    Science.gov (United States)

    Chambliss, Catherine; Cattai, Ashley; Benton, Peter; Elghawy, Ahmed; Fan, Madde; Thompson, Kayleigh; Scavicchio, Daniel; Tanenbaum, Joshua

    2012-08-01

    The Freudenfreude and Schadenfreude Test (FAST) had moderate test-retest reliability in an undergraduate sample. Freudenfreude scores were lower and Schadenfreude scores were higher among mildly depressed than nondepressed students. Distinctive reactions to personal success and failure were associated with depression. Responses to others' success and failure may also be related to depression.

  19. Using Raters from India to Score a Large-Scale Speaking Test

    Science.gov (United States)

    Xi, Xiaoming; Mollaun, Pam

    2011-01-01

    We investigated the scoring of the Speaking section of the Test of English as a Foreign Language[TM] Internet-based (TOEFL iBT[R]) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…

  20. Comparison of Two Scoring Systems for the Modified Version of the Bender-Gestalt Test.

    Science.gov (United States)

    Schachter, Steven; And Others

    1991-01-01

    Examined relative utility of two scoring systems for Modified Version of Bender-Gestalt Test in predicting performance on Developmental Test of Visual-Motor Integration. Findings from 53 kindergarten and 47 first grade students indicated that Qualitative Scoring System was significantly better predictor of visual-motor integration skills than…

  1. Many Children Left Behind? Textbooks and Test Scores in Kenya. NBER Working Paper No. 13300

    Science.gov (United States)

    Glewwe, Paul; Kremer, Michael; Moulin, Sylvie

    2007-01-01

    A randomized evaluation suggests that a program which provided official textbooks to randomly selected rural Kenyan primary schools did not increase test scores for the average student. In contrast, the previous literature suggests that textbook provision has a large impact on test scores. Disaggregating the results by students' initial academic…

  2. Beyond Correlations: Usefulness of High School GPA and Test Scores in Making College Admissions Decisions

    Science.gov (United States)

    Sawyer, Richard

    2013-01-01

    Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…

  4. The Effects of Developmental Placement and Early Retention on Children's Later Scores on Standardized Tests.

    Science.gov (United States)

    May, Deborah C.; Welch, Edward L.

    1984-01-01

    Examined the relationship between early school retention as a result of preschool and kindergarten developmental testing and children's later academic achievement (N=223). Results showed children who scored as immature on the Gesell Screening Test and who were retained a year had the lowest scores on all measures. (JAC)

  5. Comparing the Effects of Elementary Music and Visual Arts Lessons on Standardized Mathematics Test Scores

    Science.gov (United States)

    King, Molly Elizabeth

    2016-01-01

    The purpose of this quantitative, causal-comparative study was to compare the effect elementary music and visual arts lessons had on third through sixth grade standardized mathematics test scores. Inferential statistics were used to compare the differences between test scores of students who took in-school, elementary, music instruction during the…

  6. Life Stress and Reading Comprehension Test Scores in the Middle School Student.

    Science.gov (United States)

    Jones, Maryann Clementi

    A study determined the relationship between life stress and reading comprehension test scores on the IOWA Tests of Basic Skills. Subjects, 41 middle-school students attending Lincoln School in Garwood, New Jersey, were surveyed as to the amount of life stress prevalent in their lives. In addition, the Iowa scores for reading comprehension were…

  7. Scoring Yes-No Vocabulary Tests: Reaction Time vs. Nonword Approaches

    Science.gov (United States)

    Pellicer-Sanchez, Ana; Schmitt, Norbert

    2012-01-01

    Despite a number of research studies investigating the Yes-No vocabulary test format, one main question remains unanswered: What is the best scoring procedure to adjust for testee overestimation of vocabulary knowledge? Different scoring methodologies have been proposed based on the inclusion and selection of nonwords in the test. However, there…

  9. Correcting for Test Score Measurement Error in ANCOVA Models for Estimating Treatment Effects

    Science.gov (United States)

    Lockwood, J. R.; McCaffrey, Daniel F.

    2014-01-01

    A common strategy for estimating treatment effects in observational studies using individual student-level data is analysis of covariance (ANCOVA) or hierarchical variants of it, in which outcomes (often standardized test scores) are regressed on pretreatment test scores, other student characteristics, and treatment group indicators. Measurement…

  10. An Item Analysis and Validity Investigation of Bender Visual Motor Gestalt Test Score Items

    Science.gov (United States)

    Lambert, Nadine M.

    1971-01-01

    This investigation attempted to demonstrate the utility of standard item analysis procedures for selecting the most reliable and valid items for scoring Bender Visual Motor Gestalt Test test records. (Author)

  11. Are Increasing Test Scores in Texas Really a Myth?

    Directory of Open Access Journals (Sweden)

    A. Toenjes

    2002-03-01

    Pass rates by Texas tenth-graders on the high school exit exam improved from 52 percent in 1994 to 72 percent in 1998. In his article "The Myth of the Texas Miracle in Education" (EPAA, August 2000), Professor Walt Haney argued that some part of this increased pass rate was, as he put it, an illusion. Haney contended that the combined effects of students dropping out of school prior to taking the 10th grade TAAS and special education exemptions accounted for much of the increase in TAAS pass rates. Relying on the same methodology and data that Haney used, we demonstrate that his conclusion is incorrect. None of the 20 percent improvement in the TAAS exit test pass rate between 1994 and 1998 is explained by combined increases in dropout rates or special education exemptions.

  12. The non-credible score of the Rey Auditory Verbal Learning Test: is it better at predicting non-credible neuropsychological test performance than the RAVLT recognition score?

    Science.gov (United States)

    Whitney, Kriscinda A; Davis, Jeremy J

    2015-03-01

    The ability of both the non-credible score of the Rey Auditory Verbal Learning Test (RAVLT NC) and the recognition score of the RAVLT (RAVLT Recog) to predict credible versus non-credible neuropsychological test performance was examined. Credible versus non-credible group membership was determined according to diagnostic criteria with consideration of performance on two stand-alone performance validity tests. Findings from this retrospective data analysis of outpatients seen for neuropsychological testing within a Veterans Affairs Medical Center (N = 175) showed that RAVLT Recog demonstrated better classification accuracy than RAVLT NC in predicting credible versus non-credible neuropsychological test performance. Specifically, an RAVLT Recog cutoff of ≤9 resulted in reasonable sensitivity (48%) and acceptable specificity (91%) in predicting non-credible neuropsychological test performance. Implications for clinical practice are discussed. Note: The views contained here within are those of the authors and not representative of the institutions with which they are associated.

  13. The Relationship between Scores on the Graduate Management Admission Test and the Test of English as a Foreign Language.

    Science.gov (United States)

    Powers, Donald E.

    In addition to scores on the Graduate Management Admission Test (GMAT), which are required of applicants to a substantial number of graduate management schools, foreign candidates may also be required to submit scores on the Test of English as a Foreign Language (TOEFL) as an indication of English language proficiency. The present study provides…

  15. Personnel Test Battery and Scoring Procedures. Memorandum No. L.S. 15.

    Science.gov (United States)

    Berson, Barry L.

    The purpose of this memo is to present tests that comprise the test battery used to select Navy personnel to train marine mammals, and to describe the scoring procedures of the tests. The test battery consists of: Biosystems General Information Test (BGIT), Personnel History Questionnaire (PHQ), Gordon Personal Inventory, Gordon Personal Profile,…

  16. Effects of Targeted Test Preparation on Scores of Two Tests of Oral English as a Second Language

    Science.gov (United States)

    Farnsworth, Tim

    2013-01-01

    This study investigated the effect of targeted test preparation, or coaching, on oral English as a second language test scores. The tests in question were the Basic English Skills Test Plus (BEST Plus), a scripted oral interview published by the Center for Applied Linguistics, and the Versant English Test (VET), a computer-administered and…

  17. Wisconsin card sorting test: a new global score, with Italian norms, and its relationship with the Weigl sorting test.

    Science.gov (United States)

    Laiacona, M; Inzaghi, M G; De Tanti, A; Capitani, E

    2000-10-01

    The Wisconsin card sorting test and the Weigl test are two neuropsychological tools widely used in clinical practice to assess frontal lobe functions. In this study we present norms useful for Italian subjects aged from 15 to 85 years, with 5-17 years of education. Concerning the Wisconsin card sorting test, a new measure of global efficiency (global score) is proposed as well as norms for some well known qualitative aspects of the performance, i.e. perseverative responses, failure to maintain the set and non-perseverative errors. In setting normative values, we followed a statistical methodology (equivalent scores) employed in Italy for other neuropsychological tests, in order to favour the possibility of comparison among these tests. A correlation study between the global score of the Wisconsin card sorting test and the score on the Weigl test was carried out and it emerges that some cognitive aspects are not overlapping in these two measures.

  18. Estimating Achievement Gaps from Test Scores Reported in Ordinal "Proficiency" Categories

    Science.gov (United States)

    Ho, Andrew D.; Reardon, Sean F.

    2012-01-01

    Test scores are commonly reported in a small number of ordered categories. Examples of such reporting include state accountability testing, Advanced Placement tests, and English proficiency tests. This paper introduces and evaluates methods for estimating achievement gaps on a familiar standard-deviation-unit metric using data from these ordered…

  19. TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

    Science.gov (United States)

    Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

    2012-01-01

    Scores assigned by trained raters and by an automated scoring system (SpeechRater[TM]) on the speaking section of the TOEFL iBT[TM] were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…

  20. Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

    Science.gov (United States)

    Kolen, Michael J.; Lee, Won-Chan

    2011-01-01

    This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…

  1. Optimal Scoring Methods of Hand-Strength Tests in Patients with Stroke

    Science.gov (United States)

    Huang, Sheau-Ling; Hsieh, Ching-Lin; Lin, Jau-Hong; Chen, Hui-Mei

    2011-01-01

    The purpose of this study was to determine the optimal scoring methods for measuring strength of the more-affected hand in patients with stroke by examining the effect of reducing measurement errors. Three hand-strength tests of grip, palmar pinch, and lateral pinch were administered at two sessions in 56 patients with stroke. Five scoring methods…

  2. The Impact of the 2004 Hurricanes on Florida Comprehensive Assessment Test Scores: Implications for School Counselors

    Science.gov (United States)

    Baggerly, Jennifer; Ferretti, Larissa K.

    2008-01-01

    What is the impact of natural disasters on students' statewide assessment scores? To answer this question, Florida Comprehensive Assessment Test (FCAT) scores of 55,881 students in grades 4 through 10 were analyzed to determine if there were significant decreases after the 2004 hurricanes. Results reveal that there was statistical but no practical…

  3. See It, Be It, Write It: Using Performing Arts to Improve Writing Skills and Test Scores

    Science.gov (United States)

    Blecher-Sass, Hope Sara; Moffitt, Maryellen

    2010-01-01

    Improve students' writing skills and boost their assessment scores while adding arts education, creativity, and fun to your writing curriculum. With this vibrant resource, improving writing skills goes hand-in-hand with improving test scores. Students learn how to use acting and visualization as prewriting activities to help them connect writing…

  5. Comparison of the Koppitz and Watkins Scoring Systems for the Bender Gestalt Test.

    Science.gov (United States)

    Johnston, Cris W.; Lanak, Brenda

    1985-01-01

    The Bender Gestalt Test was administered to 25 children (7-10 years old) referred for neuropsychological assessment and scored using the Koppitz system and the Watkins system. Although the scores obtained using the two different sets of criteria were highly correlated, the Watkins rules produced generally better performance. (Author/CL)

  6. Use of Standardized Test Scores to Predict Success in a Computer Applications Course

    Science.gov (United States)

    Harris, Robert V.; King, Stephanie B.

    2016-01-01

    The purpose of this study was to see if a relationship existed between American College Testing (ACT) scores (i.e., English, reading, mathematics, science reasoning, and composite) and student success in a computer applications course at a Mississippi community college. The study showed that while the ACT scores were excellent predictors of…

  8. Assessing the Relationship among Defining Issues Test Scores and Crystallised and Fluid Intellectual Indices

    Science.gov (United States)

    Derryberry, W. Pitt; Jones, Kristy L.; Grieve, Frederick G.; Barger, Brian

    2007-01-01

    Differing findings exist on how Defining Issues Test (DIT) scores relate to intelligence. Further study is needed in order to address aspects of intellect not previously considered and to address how these relationships rival studies that have compared indices of intellect with constructs similar to DIT scores. In the present study, a sample of…

  9. A comparison of likelihood ratio tests and Rao's score test for three separable covariance matrix structures.

    Science.gov (United States)

    Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha

    2017-01-01

    The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Estimating the Relationship between Use of Test-Preparation Methods and Scores on the Graduate Management Admission Test.

    Science.gov (United States)

    Leary, Linda F.; Wightman, Lawrence E.

    This study sought to examine the relationship between five methods of test preparation and test performance as measured by Graduate Management Admission Test (GMAT) Verbal, Quantitative and Total scores. Data on method of test preparation were obtained through voluntary examinee response to five questions which appeared on the answer sheets. One…

  12. Testing measurement invariance of the schizotypal personality questionnaire-brief scores across Spanish and Swiss adolescents.

    Directory of Open Access Journals (Sweden)

    Javier Ortuño-Sierra

    BACKGROUND: Schizotypy is a complex construct intimately related to psychosis. Empirical evidence indicates that participants with high scores on schizotypal self-report are at a heightened risk for the later development of psychotic disorders. Schizotypal experiences represent the behavioural expression of liability for psychotic disorders. Previous factorial studies have shown that schizotypy is a multidimensional construct similar to that found in patients with schizophrenia. Specifically, using the Schizotypal Personality Questionnaire-Brief (SPQ-B), the three-dimensional model has been widely replicated. However, there has been no in-depth investigation of whether the dimensional structure underlying the SPQ-B scores is invariant across countries. METHODS: The main goal of this study was to examine the measurement invariance of the SPQ-B scores across Spanish and Swiss adolescents. The final sample was made up of 261 Spanish participants (51.7% men; M = 16.04 years) and 241 Swiss participants (52.3% men; M = 15.94 years). RESULTS: The results indicated that Raine et al.'s three-factor model presented adequate goodness-of-fit indices. Moreover, the results supported the measurement invariance (configural and partial strong invariance) of the SPQ-B scores across the two samples. Spanish participants scored higher on the Interpersonal dimension than Swiss participants when latent means were compared. DISCUSSION: The study of measurement equivalence across countries provides preliminary evidence for Raine et al.'s three-factor model and for the cross-cultural validity of the SPQ-B scores in the adolescent population. Future studies should continue to examine the measurement invariance of schizotypy and psychosis-risk syndromes across cultures.

  13. Scoring Divergent Thinking Tests by Computer With a Semantics-Based Algorithm

    Directory of Open Access Journals (Sweden)

    Kenes Beketayev

    2016-05-01

    Divergent thinking (DT) tests are useful for the assessment of creative potential. This article reports the semantics-based algorithmic (SBA) method for assessing DT. This algorithm is fully automated: Examinees receive DT questions on a computer or mobile device and their ideas are immediately compared with norms and semantic networks. This investigation compared the scores generated by the SBA method with the traditional methods of scoring DT (i.e., fluency, originality, and flexibility). Data were collected from 250 examinees using the “Many Uses Test” of DT. The most important finding involved the flexibility scores from both scoring methods. This was critical because semantic networks are based on conceptual structures, and thus a high SBA score should be highly correlated with the traditional flexibility score from DT tests. Results confirmed this correlation (r = .74). This supports the use of algorithmic scoring of DT. The nearly immediate computation time required by the SBA method may make it the method of choice, especially when it comes to moderate- and large-scale DT assessment investigations. Correlations between SBA scores and GPA were not significant, providing evidence of the discriminant and construct validity of SBA scores. Limitations of the present study and directions for future research are offered.

  14. A Maturing Global Testing Regime Meets the World Economy: Test Scores and Economic Growth, 1960-2012

    Science.gov (United States)

    Kamens, David H.

    2015-01-01

    This article considers the growth of the international testing regime. It discusses sources of growth and empirically examines two related sets of issues: (1) the stability of countries' achievement scores, and (2) the influence of those national scores on subsequent economic development over different time lags. The article suggests that…

  15. The Sinonasal Outcome Test 22 score in persons without chronic rhinosinusitis

    DEFF Research Database (Denmark)

    Lange, Bibi; Thilsing, T; Baelum, J

    2016-01-01

    OBJECTIVES: To determine the Sino Nasal Outcome Test 22 (SNOT 22) score in persons without chronic rhinosinusitis. DESIGN AND SETTING: As part of a trans-European study selected respondents to a survey questionnaire were invited for a clinical visit. Subjective symptoms and rhinoscopy were used for the clinical diagnosis of chronic rhinosinusitis according to EPOS. PARTICIPANTS: A total of 366 persons participated at the clinical visit and of these 268 did not have chronic rhinosinusitis. All participants completed the SNOT 22. MAIN OUTCOME MEASURES: The SNOT 22. RESULTS: The SNOT 22 score ranged from 0-67 with a mean score of 10.5 (CI: 9.1 - 11.9) and the median score was 7. Persons with allergic rhinitis and blue collar workers had a significantly higher score. CONCLUSION: The median value of 7 is taken as the normal SNOT 22 score in persons without CRS and can be used as a reference in clinical settings...

  16. Development of Scoring Procedures for the Performance Based Measurement (PBM) Test: Psychometric and Criterion Validity Investigation

    Science.gov (United States)

    2011-11-29

    Vertical Tracking Test (VTT): Scoring Strategies and...Validities... Item-level CTT and IRT Analyses and Results for the VTT... IRT Calibration of the 9 VTT Items

  17. Validity of Alternative Cut-Off Scores for the Back-Saver Sit and Reach Test

    Science.gov (United States)

    Looney, Marilyn A.; Gilbert, Jennie

    2012-01-01

    The purpose of the study was to determine if currently used FITNESSGRAM® cut-off scores for the Back Saver Sit and Reach Test had the best criterion-referenced validity evidence for 6-12 year old children. Secondary analyses of an existing data set focused on the passive straight leg raise and Back Saver Sit and Reach Test flexibility scores of…

  18. Demands on Users for Interpretation of Achievement Test Scores: Implications for the Evaluation Profession

    Science.gov (United States)

    Della-Piana, Gabriel Mario; Gardner, Michael

    2011-01-01

    Background: Professional standards for validity of achievement tests have long reflected a consensus that validity is the degree to which evidence and theory support interpretations of test scores entailed by the intended uses of tests. Yet there are convincing lines of evidence that the standards are not adequately followed in practice, that…

  19. Preliminary test of effects of cognitive ability, experience, and teaching methods on Verbal Analogy Test scores.

    Science.gov (United States)

    Rosenberg, D; Willson-Quayle, A; Pasnak, R

    2000-06-01

    The methods from which one can choose when preparing for the GRE Verbal Analogies include books, software, audiotapes, and formal classroom instruction. What teaching method will work best for a given individual? To begin the search for an answer, Gray's test of reasoning ability was given to 28 undergraduates who also answered a questionnaire detailing their experience with analogies. They were randomly assigned to teaching conditions ranging from self-directed workbook study to intensive interactive assistance. No teaching method was superior overall, but interactions showed that (1) students who scored worst on the pretest improved the most, (2) those higher in cognitive functioning and experience performed better after intensive interactive assistance, and (3) those lower in both cognitive functioning and experience did significantly better with self-paced workbooks. This preliminary work suggests that it may be profitable to assess the prior experience and reasoning of potential students and adopt the methods for teaching formal operational thought found empirically to be most suitable.

  20. An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

    Science.gov (United States)

    Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

    2013-01-01

    Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…

  1. Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

    Science.gov (United States)

    Kim, Seonghoon

    2013-01-01

    With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
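
    As a rough illustration of the recursion this abstract refers to, the sketch below implements only the classical case for dichotomous (0/1) items, not the generalized real-number scoring the article presents. The function name and the example probabilities are hypothetical, and the per-item probabilities of a correct response are assumed to have been computed already from an IRT model at one fixed proficiency value.

```python
from typing import List


def lord_wingersky(p_correct: List[float]) -> List[float]:
    """Return P(number-correct = k | proficiency) for k = 0..n.

    p_correct[i] is the probability of answering item i correctly at a
    fixed proficiency level (assumed to come from some IRT model).
    """
    dist = [1.0]  # before any item is added, score 0 has probability 1
    for p in p_correct:
        new = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            new[k] += prob * (1.0 - p)  # item answered incorrectly
            new[k + 1] += prob * p      # item answered correctly
        dist = new
    return dist


if __name__ == "__main__":
    # Hypothetical item probabilities at one proficiency value.
    print(lord_wingersky([0.9, 0.7, 0.5]))
```

    Each pass through the outer loop adds one item, so the cost grows with the square of the number of items rather than with the number of possible response patterns.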

  2. Improvement in Intelligence Test Scores from 6 to 10 years in Children of Teenage Mothers

    Science.gov (United States)

    Cornelius, Marie D.; Goldschmidt, Lidush; De Genna, Natacha M.; Richardson, Gale A.; Leech, Sharon L.; Day, Richard

    2010-01-01

    Objective: This study investigates change in IQ scores among 290 children born to teenage mothers and identifies social, economic, and environmental variables that may be associated with change in intelligence test performance. Methods: The children of 290 teenage mothers (72% African American and 28% European American) were assessed with the Stanford-Binet Intelligence Scale-4th Edition (SBIS) at ages 6 and 10. Results: The mean composite score at age 6 was 84.8 and was 91.2 at age 10, an improvement of 6.4 points. Significant cross-sectional predictors at both ages 6 and 10 of higher SBIS scores were maternal cognitive ability, school grade, Caucasian ethnicity, and caregiver education. Having more children in the household significantly predicted lower SBIS scores at age 6. Higher satisfaction with maternal social support predicted higher SBIS scores at age 10. Change in IQ scores was not related to maternal socioeconomic status, social support, home environment, ethnicity, or family interactions. Custodial stability was associated with an improvement in IQ scores, while increase in caregiver depression was related to decline in IQ scores. Conclusions: Our findings suggest that improvement in IQ scores of offspring of teenage mothers may be related to stability of maternal custody. More research is needed to determine the impact of the maturation of adolescent mothers' parenting and the role of early education on improvement in cognitive abilities. PMID: 20495472

  3. Multiple Constructs and Effects of Accommodations on Accommodated Test Scores for Students with Disabilities

    Directory of Open Access Journals (Sweden)

    Stephanie W. Cawthon

    2009-10-01

    Students with disabilities frequently use accommodations to participate in large-scale, standardized assessments. Accommodations can include changes to the administration of the test, such as extended time, changes to the test items, such as read aloud, or changes to the student's response, such as the use of a scribe. Some accommodations or modifications risk changing the difficulty of the test items or decreasing the validity of how test scores are interpreted. Questions regarding the validity of accommodated tests are heightened when scores are used in high-stakes decisions such as grade promotion, graduation, teacher merit pay, or other accountability initiatives. The purpose of this article is to review existing literature on multiple constructs that affect validity of interpretations of accommodated assessment scores. Research on assessment accommodations continues to grow but offers few conclusive findings on whether they facilitate fair and accurate measurement of student knowledge and skill. The validity of an accommodated score appears to vary depending on several factors such as student characteristics, test characteristics, and the accommodations themselves. A multiple construct approach may facilitate more accurate evaluations of the effects of accommodated test scores.

  4. The Dental Hygiene Aptitude Tests and the American College Testing Program Tests as Predictors of Scores on the National Board Dental Hygiene Examination.

    Science.gov (United States)

    Longenbecker, Sueann; Wood, Peter H.

    1984-01-01

    Scores from the National Board Dental Hygiene Examination (NBDHE) served as the criterion variable in a comparison of the predictive validity of the Dental Hygiene Aptitude Tests (DHAT) and the ACT Assessment tests. The DHAT-Science and Verbal tests combined to produce the highest multiple correlation with NBDHE scores. (Author/DWH)

  5. Exploring discrepancies in the TOEFL iBT scores of repeat test takers

    OpenAIRE

    2011-01-01

    Students choosing to study abroad either independently or as a component of a local degree must first demonstrate their language proficiency by achieving the required score of their destination institution in a TOEFL iBT or IELTS examination. With such high stakes, many candidates opt to repeat the test a number of times before the submission date in the hope that one of the tests will yield a sufficiently high score. This study analyzed the test results of twenty-five students who, toward th...

  6. Examining alternative scoring rubrics on a statewide test: The impact of different scoring methods on science and social studies performance assessments

    Science.gov (United States)

    Creighton, Susan Dabney

    There is no consensus regarding the most reliable and valid scoring methods for the assessment of higher order thinking skills. Most of the research on alternative formats has focused on the scoring of writing ability. This study examined the value of different types of performance assessment scoring guides on state mandated science and social studies tests. A proportional stratified sample of raters were randomly assigned to one of four scoring groups: checklist, analytic rubric, holistic rubric, and generic rubrics. A fifth method, the weighted analytic rubric, was included by applying an algorithmic formula to the scores assigned by raters using the analytic rubric. A comparison of the mean scores for the five scoring groups suggests that there may be a difference in the way raters applied the rubric for each group. Although the literature suggests that it is possible to achieve high levels of inter-rater reliability across forms of scoring, phi coefficients of moderate strength were obtained for three of the four constructed-response items. Results for each scoring group were compared, indicating that item complexity may impact the level of inter-rater reliability and the selection of the most reliable rubric for each discipline. Analytic rubrics appear to achieve more reliable results with less complex items. A multitrait-multimethod approach was utilized to investigate the external validity of the social studies and science tasks. As expected, there tended to be a stronger association between the PACT science constructed-response scores and scores based on science multiple-choice items than between the science constructed-response scores and the writing ability subtest scores. A similar pattern was seen with social studies items. These results provide some evidence for the validity of the performance assessments. A post study survey completed by raters provided qualitative information regarding their thought processes and their primary focus during the

  7. An electrophysiological correlate of Eating Attitudes Test scores in female college students.

    Science.gov (United States)

    Wilson, J F; Mercer, J C

    1990-11-01

    Eating Attitudes Test (EAT) scores of forty female college students were compared to their electrodermal activity (EDA) responses when offered a plate of chocolate chip cookies. A significant positive correlation was detected between the EAT scores and the skin conductivity measures associated with the presentation of food. Women with the highest EAT scores also exhibited the greatest sympathetic nervous system responses to a plate of cookies. This finding supports the conclusion that the EAT is capable of identifying individuals who are preoccupied with food or anxious about eating.

  8. Passing score and length of a mastery test: An old problem approached anew

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1980-01-01

    A classical problem in mastery testing is the choice of passing score and test length so that the mastery decisions are optimal. This problem has been addressed several times from a variety of viewpoints. In this paper, the usual indifference zone approach is adopted, with a new criterion for optimi

  9. An Investigation of the Effectiveness of Vocabulary Learning Strategies on Iranian EFL Learners' Vocabulary Test Score

    Science.gov (United States)

    Rahimy, Ramin; Shams, Kiana

    2012-01-01

    This study aims to investigate the effectiveness of vocabulary learning strategies on Iranian EFL learners' vocabulary test score. To achieve this aim, fifty Intermediate level students from Kish English Institute were randomly selected from among fifteen classes after administering the Oxford Placement Test (OPT). Then, an intermediate level…

  10. The Disaggregation of Value-Added Test Scores to Assess Learning Outcomes in Economics Courses

    Science.gov (United States)

    Walstad, William B.; Wagner, Jamie

    2016-01-01

    This study disaggregates posttest, pretest, and value-added or difference scores in economics into four types of economic learning: positive, retained, negative, and zero. The types are derived from patterns of student responses to individual items on a multiple-choice test. The micro and macro data from the "Test of Understanding in College…
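
    A minimal sketch of how the four learning types could be tallied from item-level responses. The mapping used here is an assumption, since the truncated abstract does not spell it out: positive (wrong on pretest, right on posttest), retained (right on both), negative (right on pretest, wrong on posttest), and zero (wrong on both). Function names and the sample responses are hypothetical.

```python
from collections import Counter
from typing import Sequence


def learning_type(pre_correct: bool, post_correct: bool) -> str:
    """Classify one item by its pretest/posttest response pattern.

    The four labels follow the abstract; the mapping to response
    patterns is an assumption made for this illustration.
    """
    if not pre_correct and post_correct:
        return "positive"
    if pre_correct and post_correct:
        return "retained"
    if pre_correct and not post_correct:
        return "negative"
    return "zero"


def disaggregate(pre: Sequence[int], post: Sequence[int]) -> Counter:
    """Count learning types over paired 0/1 item responses."""
    return Counter(learning_type(bool(a), bool(b)) for a, b in zip(pre, post))


if __name__ == "__main__":
    # Hypothetical responses of one student to five items (1 = correct).
    pre = [0, 1, 1, 0, 0]
    post = [1, 1, 0, 0, 1]
    counts = disaggregate(pre, post)
    print(counts)
    # The difference score equals positive minus negative learning.
    print(sum(post) - sum(pre), counts["positive"] - counts["negative"])
```

    Under this mapping the pretest score is retained plus negative, the posttest score is positive plus retained, and the difference score is positive minus negative.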

  11. Two for One: Using QAR to Increase Reading Comprehension and Improve Test Scores

    Science.gov (United States)

    Green, Susan

    2016-01-01

    This teaching tip describes an intervention used in a third-grade classroom implemented to help students pass an end-of-grade reading comprehension test. Low scores on a practice end-of-grade comprehension test prompted a re-examination of classroom reading instruction and a plan for intervention. This teaching tip describes the phases implemented…

  12. Zertifikat Deutsch als Fremdsprache and the Oral Proficiency Interview: A Comparison of Test Scores and Examinations.

    Science.gov (United States)

    Lalande, John F.; Schweckendiek, Jurgen

    1986-01-01

    Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)

  13. A review of methods for evaluating the fit of item score patterns on a test

    NARCIS (Netherlands)

    Meijer, R.R.; Sijtsma, Klaas

    1999-01-01

    Methods are discussed that can be used to investigate the fit of an item score pattern to a test model. Model-based tests and personality inventories are administered to more than 100 million people a year and, as a result, individual fit is of great concern. Item Response Theory (IRT) modeling and

  14. The Effects of Using Selected Metacognitive Strategies on ACT Mathematics Sub-Test Scores

    Science.gov (United States)

    LeMay, Jeffrey W.

    2016-01-01

    This quasi-experimental post-test only control group designed quantitative study examined whether or not members of an experimental group of participants who utilized two metacognitive strategy training regimens experienced a significant increase in their ACT mathematics sub-test scores compared to a group of students who did not utilize either of…

  16. Beating the Odds: A Low Equalized Assessed Valuation Elementary School with High Standardized Test Scores

    Science.gov (United States)

    Levin, Brian

    2011-01-01

    This mixed methods study examines what makes Bluffview Elementary School a success as measured by the ISAT, the mandated state test of Illinois. Despite national reports of achievement gaps and low test scores, Bluffview Elementary has shown sustained success in educating children. This paper reviews how Bluffview Elementary students are achieving…

  17. A review of methods for evaluating the fit of item score patterns on a test

    NARCIS (Netherlands)

    Meijer, Rob R.; Sijtsma, Klaas

    1999-01-01

    Methods are discussed that can be used to investigate the fit of an item score pattern to a test model. Model-based tests and personality inventories are administered to more than 100 million people a year and, as a result, individual fit is of great concern. Item Response Theory (IRT) modeling and

  19. Predicting Teacher Performance with Test Scores and Grade Point Average: A Meta-Analysis

    Science.gov (United States)

    D'Agostino, Jerome V.; Powers, Sonya J.

    2009-01-01

    A meta-analysis was conducted to examine the degree to which teachers' test scores and their performance in preparation programs as measured by their collegiate grade point average (GPA) predicted their teaching competence. Results from 123 studies that yielded 715 effect sizes were analyzed, and the mediating effects of test and GPA type,…

  20. The Effects of Testing Circumstance and Education Level on MMPI-2 Correction Scale Scores

    Science.gov (United States)

    2010-02-01

    Report: The Effects of Testing Circumstance and Education Level on MMPI-2 Correction Scale Scores. DOT/FAA/AM-10/3, Office of Aerospace Medicine, Washington... ...Inventory-2 (MMPI-2) is used by the Federal Aviation Administration to assess psychopathology in air traffic control specialist applicants after a

  1. Sex differences on Purpose-In-Life Test total and factorial scores among Spanish undergraduates

    OpenAIRE

    2011-01-01

    The aim of this paper is to analyze the differences in Purpose-In-Life Test [PIL] (Crumbaugh & Maholic, 1969) total and factorial scores associated with sex among 309 Spanish undergraduates (207 women, 102 men), aged 18 to 45 years. The Spanish version of the PIL is used (Noblejas de la Flor, 1994). The PIL evaluates life meaning achievement vs. existential vacuum. Women achieve higher means on PIL total and factorial scores, and statistical analyses show that sex is significantly associated with total PIL sc...

  2. Correlation of Simulation Examination to Written Test Scores for Advanced Cardiac Life Support Testing: Prospective Cohort Study

    Directory of Open Access Journals (Sweden)

    Suzanne L. Strom

    2015-11-01

    Introduction: Traditional Advanced Cardiac Life Support (ACLS) courses are evaluated using written multiple-choice tests. High-fidelity simulation is a widely used adjunct to didactic content, and has been used in many specialties as a training resource as well as an evaluative tool. There are no data to our knowledge that compare simulation examination scores with written test scores for ACLS courses. Objective: To compare and correlate a novel high-fidelity simulation-based evaluation with traditional written testing for senior medical students in an ACLS course. Methods: We performed a prospective cohort study to determine the correlation between simulation-based evaluation and traditional written testing in a medical school simulation center. Students were tested on a standard acute coronary syndrome/ventricular fibrillation cardiac arrest scenario. Our primary outcome measure was correlation of exam results for 19 volunteer fourth-year medical students after a 32-hour ACLS-based Resuscitation Boot Camp course. Our secondary outcome was comparison of simulation-based vs. written outcome scores. Results: The composite average score on the written evaluation was substantially higher (93.6%) than the simulation performance score (81.3%; absolute difference 12.3%, 95% CI [10.6-14.0%], p<0.00005). We found a statistically significant moderate correlation between simulation scenario test performance and traditional written testing (Pearson r=0.48, p=0.04), validating the new evaluation method. Conclusion: Simulation-based ACLS evaluation methods correlate with traditional written testing and demonstrate resuscitation knowledge and skills. Simulation may be a more discriminating and challenging testing method, as students scored higher on written evaluation methods compared to simulation.

  3. The Impact of Correction for Guessing Formula on MC and Yes/No Vocabulary Tests' Scores

    Directory of Open Access Journals (Sweden)

    Abdollah Baradaran

    2009-10-01

    A standard correction for random guessing (cfg) formula on multiple-choice and Yes/No examinations was examined retrospectively in the scores of intermediate female EFL learners in an English language school. The correction was a weighting formula for points awarded for correct answers, incorrect answers, and unanswered questions so that the expected value of the increase in test score due to guessing was zero. The researcher compared uncorrected and corrected scores on examinations using multiple-choice and Yes/No formats. These short-answer formats eliminated or at least greatly reduced the potential for guessing the correct answer. The expectation for students to improve their grade by guessing on multiple-choice and Yes/No format examinations is well known. The researcher examined a method for correcting for random "no knowledge" guessing (cfg) on multiple-choice and Yes/No vocabulary examinations by comparing application and non-application of the cfg formula to scores on these examinations. This was done to determine whether the test takers really knew the correct answer or had resorted to a kind of guessing. This study represented a unique opportunity to compare scores from multiple-choice and Yes/No examinations in a setting in which students were given the same number of questions in each of the two format types testing their knowledge of the same subject matter. The results of this study indicated significant differences between the subjects' scores when the cfg formula was applied and when it was not.
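
    The abstract does not state the exact weighting formula used. As an illustration only, the sketch below assumes the conventional formula score, right minus wrong divided by (options minus 1), with unanswered items scoring zero; it has the property the abstract describes, namely that blind guessing has an expected gain of zero. Function names and example counts are hypothetical.

```python
def corrected_score(right: int, wrong: int, options: int) -> float:
    """Conventional correction-for-guessing score (assumed formula).

    right:   number of correct answers (+1 each)
    wrong:   number of incorrect answers (-1/(options - 1) each)
    options: answer choices per item (2 for a Yes/No item)
    Unanswered items contribute nothing.
    """
    if options < 2:
        raise ValueError("an item needs at least two answer options")
    return right - wrong / (options - 1)


def expected_gain_from_guess(options: int) -> float:
    """Expected change in corrected score from one blind guess."""
    p_right = 1.0 / options
    p_wrong = 1.0 - p_right
    return p_right * 1.0 - p_wrong / (options - 1)


if __name__ == "__main__":
    # Hypothetical examinee: 40 items, 30 right, 8 wrong, 2 omitted.
    print(corrected_score(30, 8, options=4))  # four-option multiple choice
    print(corrected_score(30, 8, options=2))  # Yes/No format
    print(expected_gain_from_guess(4))
```

    With two options the penalty for a wrong answer equals the reward for a right one, so the same raw counts are corrected more heavily on a Yes/No test than on a four-option test.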

  4. Deriving utility scores for co-morbid conditions: a test of the multiplicative model for combining individual condition scores

    Directory of Open Access Journals (Sweden)

    Le Petit Christel

    2006-10-01

    Background: The co-morbidity of health conditions is becoming a significant health issue, particularly as populations age, and presents important methodological challenges for population health research. For example, the calculation of summary measures of population health (SMPH) can be compromised if co-morbidity is not taken into account. One popular co-morbidity adjustment used in SMPH computations relies on a straightforward multiplicative combination of the severity weights for the individual conditions involved. While the convenience and simplicity of the multiplicative model are attractive, its appropriateness has yet to be formally tested. The primary objective of the current study was therefore to examine the empirical evidence in support of this approach. Methods: The present study drew on information on the prevalence of chronic conditions and a utility-based measure of health-related quality of life (HRQoL), namely the Health Utilities Index Mark 3 (HUI3), available from Cycle 1.1 of the Canadian Community Health Survey (CCHS; 2000–01). Average HUI3 scores were computed for both single and co-morbid conditions, and were also purified by statistically removing the loss of functional health due to health problems other than the chronic conditions reported. The co-morbidity rule was specified as a multiplicative combination of the purified average observed HUI3 utility scores for the individual conditions involved, with the addition of a synergy coefficient s for capturing any interaction between the conditions not explained by the product of their utilities. The fit of the model to the purified average observed utilities for the co-morbid conditions was optimized using ordinary least squares regression to estimate s. Replicability of the results was assessed by applying the method to triple co-morbidities from the CCHS cycle 1.1 database, as well as to double and triple co-morbidities from cycle 2.1 of the CCHS (2003–04). Results
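
    The abstract does not give the functional form of the synergy term, so the following is a sketch under one stated assumption: the predicted utility of a co-morbid pair is s times the product of the two single-condition utilities, and s is estimated by least squares through the origin. The function name and the utility values are hypothetical, not survey data.

```python
from typing import Sequence


def fit_synergy(
    single_a: Sequence[float],
    single_b: Sequence[float],
    observed_ab: Sequence[float],
) -> float:
    """Least-squares estimate of s in U_AB = s * U_A * U_B.

    Assumed model form (not taken from the abstract): regression of the
    observed co-morbid utility on the product of single-condition
    utilities, with no intercept. s = 1 is the pure multiplicative model.
    """
    products = [a * b for a, b in zip(single_a, single_b)]
    numerator = sum(x * y for x, y in zip(products, observed_ab))
    denominator = sum(x * x for x in products)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical purified average utilities for three condition pairs.
    u_a = [0.90, 0.85, 0.95]
    u_b = [0.80, 0.90, 0.88]
    u_ab = [0.70, 0.75, 0.82]
    print(fit_synergy(u_a, u_b, u_ab))
```

    An estimate close to 1 would mean the plain product of the single-condition utilities already describes the co-morbid utilities well under this assumed form.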

  5. A score based on screening tests to differentiate mild cognitive impairment from subjective memory complaints

    Directory of Open Access Journals (Sweden)

    Fábio Henrique de Gobbi Porto

    2013-09-01

    It is not easy to differentiate patients with mild cognitive impairment (MCI) from subjective memory complainers (SMC). Assessments with screening cognitive tools are essential, particularly in primary care where most patients are seen. The objective of this study was to evaluate the diagnostic accuracy of screening cognitive tests and to propose a score derived from screening tests. Elderly subjects with memory complaints were evaluated using the Mini Mental State Examination (MMSE) and the Brief Cognitive Battery (BCB). We added two delayed recalls in the MMSE (a delayed recall and a late-delayed recall, LDR), and also a phonemic fluency test of letter P fluency (LPF). A score was created based on these tests. The diagnoses were made on the basis of clinical consensus and neuropsychological testing. Receiver operating characteristic curve analyses were used to determine the area under the curve (AUC), the sensitivity and the specificity for each test separately and for the final proposed score. MMSE, LDR, LPF and delayed recall of BCB scores reached statistically significant differences between groups (P=0.000, 0.03, 0.001 and 0.01, respectively). Sensitivity, specificity and AUC were MMSE: 64%, 79% and 0.75 (cut off <29); LDR: 56%, 62% and 0.62 (cut off <3); LPF: 71%, 71% and 0.71 (cut off <14); delayed recall of BCB: 56%, 82% and 0.68 (cut off <9). The proposed score reached a sensitivity of 88% and 76% and specificity of 62% and 75% for cut off over 1 and over 2, respectively. The AUC was 0.81. In conclusion, a score created from screening tests is capable of discriminating MCI from SMC with moderate to good accuracy.
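
    To make the reported quantities concrete, the sketch below computes sensitivity and specificity at a "score below cut-off" rule and an AUC from pairwise comparisons (the Mann-Whitney form of the ROC area). The scores are invented for illustration and are not the study's data; the function names are hypothetical.

```python
from typing import Sequence, Tuple


def sens_spec(
    cases: Sequence[float], controls: Sequence[float], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when a score < cutoff flags a case."""
    true_pos = sum(1 for s in cases if s < cutoff)
    true_neg = sum(1 for s in controls if s >= cutoff)
    return true_pos / len(cases), true_neg / len(controls)


def auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Probability that a random case scores lower than a random control.

    Ties count as one half. This equals the area under the ROC curve
    for a test on which lower scores indicate impairment.
    """
    wins = 0.0
    for c in cases:
        for h in controls:
            if c < h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Invented screening scores: impaired group vs. complaint-only group.
    impaired = [25, 27, 28, 29]
    complaint_only = [28, 29, 30, 30]
    print(sens_spec(impaired, complaint_only, cutoff=29))
    print(auc(impaired, complaint_only))
```

    Raising the cut-off flags more people, which increases sensitivity at the cost of specificity; the AUC summarizes that trade-off over all possible cut-offs.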

  6. The Effect of School Poverty on Racial Gaps in Tests Scores: The Case of the Minnesota Basic Standards Tests

    Science.gov (United States)

    Myers, Samuel L.; Kim, Hyeoneui; Mandala, Cheryl

    2004-01-01

    Data from the 1996, 1998 and 1999 Minnesota comprehensive statewide testing of eighth graders are used to analyze whether African American students perform worse than the white students who attend poverty schools. The analyses conclude that the African American-White test score gap is attributable more to racial discrimination and racial treatment…

  7. The Effects of Group Members' Personalities on a Test Taker's L2 Group Oral Discussion Test Scores

    Science.gov (United States)

    Ockey, Gary J.

    2009-01-01

    The second language group oral is a test of second language speaking proficiency, in which a group of three or more English language learners discuss an assigned topic without interaction with interlocutors. Concerns expressed about the extent to which test takers' personal characteristics affect the scores of others in the group have limited its…

  9. Impact of a standardized test package on exit examination scores and NCLEX-RN outcomes.

    Science.gov (United States)

    Homard, Catherine M

    2013-03-01

    The purpose of this ex post facto correlational study was to compare exit examination scores and NCLEX-RN(®) pass rates of baccalaureate nursing students who differed in level of participation in a standardized test package. Three cohort groups emerged as a standardized test package was introduced: (a) students who did not participate in a standardized test package; (b) students with two semesters of a standardized test package; and (c) students with four semesters of a standardized test package. Benner's novice-to-expert theory framed the study in the belief that students best acquire knowledge and skills through practice and reflection. Students participating in four semesters of a standardized test package demonstrated higher exit examination scores and NCLEX-RN pass rates compared with students who did not participate in this package. This study's results could inform nurse educators about strategies to facilitate nursing student success on exit examinations and the NCLEX-RN.

  10. Do We Really Become Smarter When Our Fluid-Intelligence Test Scores Improve?

    Science.gov (United States)

    Hayes, Taylor R; Petrov, Alexander A; Sederberg, Per B

    2015-01-01

    Recent reports of training-induced gains on fluid intelligence tests have fueled an explosion of interest in cognitive training, now a billion-dollar industry. The interpretation of these results is questionable because score gains can be dominated by factors that play marginal roles in the scores themselves, and because intelligence gain is not the only possible explanation for the observed control-adjusted far transfer across tasks. Here we present novel evidence that the test score gains used to measure the efficacy of cognitive training may reflect strategy refinement instead of intelligence gains. A novel scanpath analysis of eye movement data from 35 participants solving Raven's Advanced Progressive Matrices in two separate sessions indicated that one-third of the variance of score gains could be attributed to test-taking strategy alone, as revealed by characteristic changes in eye-fixation patterns. When the strategic contaminant was partialled out, the residual score gains were no longer significant. These results are compatible with established theories of skill acquisition suggesting that procedural knowledge tacitly acquired during training can later be utilized at posttest. Our novel method and result both underline a reason to be wary of purported intelligence gains, but also provide a way forward for testing for them in the future.
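
    The idea of attributing a share of gain variance to strategy and then partialling it out can be illustrated with a simple linear regression. This is a generic sketch and not the authors' scanpath method: it assumes a single hypothetical per-participant measure of strategy change, and it reads the regression intercept as the gain expected when strategy does not change.

```python
from statistics import mean


def partial_out(gains, strategy_change):
    """Regress score gains on a strategy-change measure (simple OLS).

    Returns (slope, intercept, r_squared). r_squared is the share of gain
    variance explained by strategy; the intercept is the expected gain at
    zero strategy change (an interpretive assumption of this sketch).
    """
    mx, my = mean(strategy_change), mean(gains)
    sxx = sum((x - mx) ** 2 for x in strategy_change)
    syy = sum((y - my) ** 2 for y in gains)
    sxy = sum((x - mx) * (y - my) for x, y in zip(strategy_change, gains))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data in which gains depend exactly linearly on strategy change.
slope, intercept, r2 = partial_out(gains=[1, 3, 5, 7],
                                   strategy_change=[0, 1, 2, 3])
```

    In a real analysis the intercept would be tested against zero with its standard error; the abstract's finding corresponds to that residual gain no longer being significant.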

  11. The two-sample problem with induced dependent censorship.

    Science.gov (United States)

    Huang, Y

    1999-12-01

    Induced dependent censorship is a general phenomenon in health service evaluation studies in which a measure such as quality-adjusted survival time or lifetime medical cost is of interest. We investigate the two-sample problem and propose two classes of nonparametric tests. Based on consistent estimation of the survival function for each sample, the two classes of test statistics examine the cumulative weighted difference in hazard functions and in survival functions. We derive a unified asymptotic null distribution theory and inference procedure. The tests are applied to trial V of the International Breast Cancer Study Group and show that long duration chemotherapy significantly improves time without symptoms of disease and toxicity of treatment as compared with the short duration treatment. Simulation studies demonstrate that the proposed tests, with a wide range of weight choices, perform well under moderate sample sizes.

  12. Grades--Scores--Predictions: A Study of the Efficiency of High School Grades and American College Test Scores in Predicting Academic Achievement at Montgomery College.

    Science.gov (United States)

    Gell, Robert L.; Bleil, David F.

    This report analyzes the relationship between high school grades, American College Test (ACT) scores, and first-semester college grades. Based on the Standard Research Service of the ACT program, 1,379 students in the fall 1969 freshman class of Montgomery College (Maryland) were studied. Measures of academic background used ACT scores in English,…

  13. Individual differences in left parietal white matter predict math scores on the Preliminary Scholastic Aptitude Test.

    Science.gov (United States)

    Matejko, Anna A; Price, Gavin R; Mazzocco, Michèle M M; Ansari, Daniel

    2013-02-01

    Mathematical skills are of critical importance, both academically and in everyday life. Neuroimaging research has primarily focused on the relationship between mathematical skills and functional brain activity. Comparatively few studies have examined which white matter regions support mathematical abilities. The current study uses diffusion tensor imaging (DTI) to test whether individual differences in white matter predict performance on the math subtest of the Preliminary Scholastic Aptitude Test (PSAT). Grades 10 and 11 PSAT scores were obtained from 30 young adults (ages 17-18) with wide-ranging math achievement levels. Tract based spatial statistics was used to examine the correlation between PSAT math scores, fractional anisotropy (FA), radial diffusivity (RD) and axial diffusivity (AD). FA in left parietal white matter was positively correlated with math PSAT scores (specifically in the left superior longitudinal fasciculus, left superior corona radiata, and left corticospinal tract) after controlling for chronological age and same grade PSAT critical reading scores. Furthermore, RD, but not AD, was correlated with PSAT math scores in these white matter microstructures. The negative correlation with RD further suggests that participants with higher PSAT math scores have greater white matter integrity in this region. Individual differences in FA and RD may reflect variability in experience dependent plasticity over the course of learning and development. These results are the first to demonstrate that individual differences in white matter are associated with mathematical abilities on a nationally administered scholastic aptitude measure.

  14. A Score Type Test for General Autoregressive Models in Time Series

    Institute of Scientific and Technical Information of China (English)

    Jian-hong Wu; Li-xing Zhu

    2007-01-01

    This paper is devoted to the goodness-of-fit test for the general autoregressive models in time series. By averaging the weighted residuals, we construct a score type test which is asymptotically standard chi-squared under the null and has some desirable power properties under the alternatives. Specifically, the test is sensitive to alternatives and can detect the alternatives approaching, along a direction, the null at a rate that is arbitrarily close to $n^{-1/2}$. Furthermore, when the alternatives are not directional, we construct asymptotically distribution-free maximin tests for a large class of alternatives. The performance of the tests is evaluated through simulation studies.

  15. The Fight's Not Always Fixed: Using Literary Response to Transcend Standardized Test Scores

    Science.gov (United States)

    Avila, JuliAnna

    2012-01-01

    In 2004, the National Endowment for the Arts (NEA) concluded that "literature reading is fading as a meaningful activity, especially among younger people." How can educators continue to teach students about the power of literary response when the priority is for them to achieve proficiency on standardized tests, whose scores can only be narrowly…

  16. The Effects of Family Background, Test Scores, Personality Traits and Education on Economic Success.

    Science.gov (United States)

    Jencks, Christopher; Rainwater, Lee

    Ten surveys of American men aged 25-64 were analyzed to determine the effects of family background, adolescent personality traits, cognitive test scores, and years of schooling on occupational status and earnings in maturity. Some of the findings follow: Data on brothers indicated that prior research has underestimated the effect of family…

  17. Integrating GIS in the Middle School Curriculum: Impacts on Diverse Students' Standardized Test Scores

    Science.gov (United States)

    Goldstein, Donna; Alibrandi, Marsha

    2013-01-01

    This case study conducted with 1,425 middle school students in Palm Beach County, Florida, included a treatment group receiving GIS instruction (256) and a control group without GIS instruction (1,169). Quantitative analyses on standardized test scores indicated that inclusion of GIS in middle school curriculum had a significant effect on student…

  18. End of Course Grades and Standardized Test Scores: Are Grades Predictive of Student Achievement?

    Science.gov (United States)

    Ricketts, Christine R.

    2010-01-01

    This study examined the extent to which end-of-course grades are predictive of Virginia Standards of Learning test scores in nine high school content areas. It also analyzed the impact of the variables school cluster attended, gender, ethnicity, disability status, Limited English Proficiency status, and socioeconomic status on the relationship…

  19. Score test for familial aggregation in probands studies: application to Alzheimer's disease

    NARCIS (Netherlands)

    D. Commenges; H. Jacqmin; L. Letenneur; C.M. van Duijn (Cock)

    1995-01-01

    When studying familial aggregation of a disease, the following two-stage design is often used: first select index subjects (cases and controls); then record data on their relatives. The likelihood corresponding to this design is derived and a score test of homogeneity is proposed for tes

  1. Relationship of Friends, Physical Education, and State Test Scores: Implications for School Counselors

    Science.gov (United States)

    Hollingsworth, Mary Ann

    2010-01-01

    This study examined the relationship between dimensions of wellness and academic performance for 634 third through fifth grade students in Title One schools in rural Mississippi, using composites of the Five Factor Wellness Inventory for Elementary Children and Reading, Language, and Math Scores of the Mississippi Curriculum Test (a state level…

  2. Permanent Income and the Black-White Test Score Gap. NBER Working Paper No. 17610

    Science.gov (United States)

    Rothstein, Jesse; Wozny, Nathan

    2011-01-01

    Analysts often examine the black-white test score gap conditional on family income. Typically only a current income measure is available. We argue that the gap conditional on permanent income is of greater interest, and we describe a method for identifying this gap using an auxiliary data set to estimate the relationship between current and…

  3. Using College Admission Test Scores to Clarify High School Placement. Leading Indicator Spotlight

    Science.gov (United States)

    Flug, Susanna

    2010-01-01

    In "Beyond Test Scores: Leading Indicators for Education," Foley and colleagues (2008) define leading indicators as those that "provide early signals of progress toward academic achievement" (p. 1) and stress that educators "need leading indicators to help them see the direction their efforts are going in and to take…

  4. Raise Test Scores without Selling Your Soul: An Interview with Scott Mandel

    Science.gov (United States)

    Curriculum Review, 2006

    2006-01-01

    With his 10th book, Improving Test Scores: A Practical Approach for Teachers and Administrators, Scott Mandel outlines steps educators can take to boost achievement on standardized exams while maintaining the integrity of their day-to-day teaching. Mandel, who holds a Ph.D. in curriculum and instruction from USC, teaches history and English at…

  5. Detecting Dissimulation in Personality Test Scores: A Comparison between Person-Fit Indices and Detection Scales.

    Science.gov (United States)

    Ferrando, Pere J.; Chico, Eliseo

    2001-01-01

    Examined whether a procedure based on item response theory (IRT) for assessing the scalability of response patterns could detect deliberate dissimulation (faking good) on scores from three tests of the Eysenck Personality Questionnaire Revised. Results for 489 and 140 undergraduates show that IRT measures were not powerful enough to detect…

  6. Testing Vegetation Flammability: The Problem of Extremely Low Ignition Frequency and Overall Flammability Score

    Directory of Open Access Journals (Sweden)

    Zorica Kauf

    2014-01-01

    In recent decades, changes in fire regimes have led to higher vulnerability of fire-prone ecosystems, with vegetation being the only component influencing fire regime that can be managed in order to reduce the probability of extreme fire events. For these management practices to be effective, reliable information on vegetation flammability is crucial. Epiradiator-based testing methods are among the methods commonly used to investigate vegetation flammability, and a decrease in ignition frequency is always interpreted as a decrease in flammability. Furthermore, the gathered information is often combined into a single flammability score. Here we present results of leaf litter testing which, together with previously conducted research on similar materials, show that material with very low ignition frequency under certain testing conditions can be extremely flammable if testing conditions are slightly changed. Additionally, our results indicate that combining measured information into one single flammability score, even though sometimes useful, is not always meaningful and should be performed with caution.

  7. Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease

    Directory of Open Access Journals (Sweden)

    Elaheh Moradi

    2017-01-01

    Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory, which is widely used for the cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that an impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), thus making RAVLT an effective early marker to detect AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to comprehensively study to what extent the RAVLT scores are predictable based on structural magnetic resonance imaging (MRI) data using machine learning approaches as well as to find the most important brain regions for the estimation of RAVLT scores. For this, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) in a dataset consisting of 806 AD, mild cognitive impairment (MCI) or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for the estimation of RAVLT based on MRI data. The top predictors were medial temporal lobe structures and amygdala for the estimation of RAVLT Immediate and angular gyrus, hippocampus and amygdala for the estimation of RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted based on either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.

  8. The reliability and validity of qualitative scores for the Controlled Oral Word Association Test.

    Science.gov (United States)

    Ross, Thomas P; Calhoun, Emily; Cox, Tara; Wenner, Carolyn; Kono, Whitney; Pleasant, Morgan

    2007-05-01

    The reliability and validity of two qualitative scoring systems for the Controlled Oral Word Association Test [Benton, A. L., Hamsher, de S. K., & Sivan, A. B. (1983). Multilingual aphasia examination (2nd ed.). Iowa City, IA: AJA Associates] were examined in 108 healthy young adults. The scoring systems developed by Troyer et al. [Troyer, A. K., Moscovitch, M., & Winocur, G. (1997). Clustering and switching as two components of verbal fluency: Evidence from younger and older healthy adults. Neuropsychology, 11, 138-146] and by Abwender et al. [Abwender, D. A., Swan, J. G., Bowerman, J. T., & Connolly, S. W. (2001a). Qualitative analysis of verbal fluency output: Review and comparison of several scoring methods. Assessment, 8, 323-336] each demonstrated excellent interrater reliability (all indices at or above r(icc)=.9). Consistent with previous research [e.g., Ross, T. P. (2003). The reliability of cluster and switch scores for the COWAT. Archives of Clinical Psychology, 18, 153-164], test-retest reliability coefficients (N=53; M interval 44.6 days) for the qualitative scores were modest to poor (r(icc)=.6 to .4 range). Correlations among COWAT scores, measures of executive functioning, verbal learning, working memory, and vocabulary were examined. The idea that qualitative scores represent distinct executive functions such as cognitive flexibility or strategy utilization was not supported. We offer the interpretation that COWAT performance may require the ability to retrieve words in a non-routine manner while suppressing habitual responses and associated processing interference, presumably due to a spread of activation across semantic or lexical networks. This interpretation, though speculative at present, implies that clustering and switching on the COWAT may not be entirely deliberate, but rather an artifact of a passive (i.e., state-dependent) process. Ideas for future research, most noticeably experimental studies using cognitive methods (e.g., priming), are

  9. Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease.

    Science.gov (United States)

    Moradi, Elaheh; Hallikainen, Ilona; Hänninen, Tuomo; Tohka, Jussi

    2017-01-01

    Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory, which is widely used for the cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that an impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), thus making RAVLT an effective early marker to detect AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to comprehensively study to what extent the RAVLT scores are predictable based on structural magnetic resonance imaging (MRI) data using machine learning approaches as well as to find the most important brain regions for the estimation of RAVLT scores. For this, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) in a dataset consisting of 806 AD, mild cognitive impairment (MCI) or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for the estimation of RAVLT based on MRI data. The top predictors were medial temporal lobe structures and amygdala for the estimation of RAVLT Immediate and angular gyrus, hippocampus and amygdala for the estimation of RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted based on either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.

  10. Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

    Directory of Open Access Journals (Sweden)

    Ulla Haverinen-Shaughnessy

    Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.

  11. The Relationships between Social Class, Listening Test Anxiety and Test Scores

    OpenAIRE

    Omid Talebi Rezaabadi

    2016-01-01

    This study investigated the relationships between the social anxiety, social class and listening-test anxiety of students learning English as a foreign language. The aims of the study were to examine the relationship between listening-test anxiety and listening-test performance. The data were collected using an adapted Foreign Language Listening Anxiety Scale and a newly developed Foreign Language Social Anxiety Scale. The potential correlation between social anxiety and listening-test perfor...

  13. Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test

    KAUST Repository

    Cai, T.

    2012-06-25

    In recent years, genome-wide association studies (GWAS) and gene-expression profiling have generated a large number of valuable datasets for assessing how genetic variations are related to disease outcomes. With such datasets, it is often of interest to assess the overall effect of a set of genetic markers, assembled based on biological knowledge. Genetic marker-set analyses have been advocated as more reliable and powerful approaches compared with the traditional marginal approaches (Curtis and others, 2005. Pathways to the analysis of microarray data. TRENDS in Biotechnology 23, 429-435; Efroni and others, 2007. Identification of key processes underlying cancer phenotypes using biologic pathway analysis. PLoS One 2, 425). Procedures for testing the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an Empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models. Biometrics 63, 1079-1088; Liu and others, 2008. Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models. BMC bioinformatics 9, 292-2; Wu and others, 2010. Powerful SNP-set analysis for case-control genome-wide association studies. American Journal of Human Genetics 86, 929) have been proposed as powerful alternatives to the standard Rao score test (Rao, 1948. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Mathematical Proceedings of the Cambridge Philosophical Society, 44, 50-57). The advantages of these EB-based tests are most apparent when the markers are correlated, due to the reduction in the degrees of freedom. In this paper, we propose an adaptive score test which up- or down-weights the contributions from each member of the marker-set based on the Z-scores of

  14. The Relationships between Social Class, Listening Test Anxiety and Test Scores

    Science.gov (United States)

    Rezaabadi, Omid Talebi

    2016-01-01

    This study investigated the relationships between the social anxiety, social class and listening-test anxiety of students learning English as a foreign language. The aims of the study were to examine the relationship between listening-test anxiety and listening-test performance. The data were collected using an adapted Foreign Language Listening…

  15. Admissions Testing at Career College and Trade School Training Programs. Test Score Guidelines, Norms, and Student Demographics.

    Science.gov (United States)

    Wonderlic, Charles F.; And Others

    This report provides a method for determining minimum score by vocational program based on the use of the Wonderlic Scholastic Level Exam (SLE). The SLE has been demonstrated to be a highly accurate and reliable measure of adult cognitive ability. It is currently in use as an admissions test at many career colleges and trade schools. The SLE test…

  16. Bayesian and Empirical Bayes Approaches to Setting Passing Scores on Mastery Tests. Publication Series in Mastery Testing.

    Science.gov (United States)

    Huynh, Huynh; Saunders, Joseph C., III

    The Bayesian approach to setting passing scores, as proposed by Swaminathan, Hambleton, and Algina, is compared with the empirical Bayes approach to the same problem that is derived from Huynh's decision-theoretic framework. Comparisons are based on simulated data which follow an approximate beta-binomial distribution and on real test results from…

  17. School Readiness and the Draw-a-Man Test: An Empirically Derived Alternative to Harris' Scoring System.

    Science.gov (United States)

    Simner, Marvin L.

    1985-01-01

    An abbreviated scoring system for the Goodenough-Harris Draw-A-Man Test found that three items had the same overall potential for correctly identifying at-risk kindergarteners as more time-consuming scoring methods. (CL)

  18. Spatial and verbal memory test scores following yoga and fine arts camps for school children.

    Science.gov (United States)

    Manjunath, N K; Telles, Shirley

    2004-07-01

    The performance scores of children (aged 11 to 16 years) in verbal and spatial memory tests were compared for two groups (n = 30, each), one attending a yoga camp and the other a fine arts camp. Both groups were assessed on the memory tasks initially and after ten days of their respective interventions. A control group (n = 30) was similarly studied to assess the test-retest effect. At the final assessment the yoga group showed a significant increase of 43% in spatial memory scores (Multivariate analysis, Tukey test), while the fine arts and control groups showed no change. The results suggest that yoga practice, including physical postures, yoga breathing, meditation and guided relaxation improved delayed recall of spatial information.

  19. The relationship between the ability to identify evaluation criteria and integrity test scores

    Directory of Open Access Journals (Sweden)

    CORNELIUS J. KÖNIG

    2006-09-01

    It has been argued that applicants who have the ability to identify what kind of behavior is evaluated positively in a personnel selection situation can use this information to adapt their behavior accordingly. Although this idea has been tested for assessment centers and structured interviews, it has not been studied with regard to integrity tests (or other personality tests). Therefore, this study tested whether candidates’ ability to identify evaluation criteria (ATIC) correlates with their integrity test scores. Candidates were tested in an application training setting (N = 92). The results supported the idea that ATIC also plays an important role for integrity tests. New directions for future research are suggested based on this finding.

  20. Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

    Science.gov (United States)

    Jacob, Brian A.

    2016-01-01

    Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…

  1. Pathology update to the Manchester Scoring System based on testing in over 4000 families.

    Science.gov (United States)

    Evans, D Gareth; Harkness, Elaine F; Plaskocinska, Inga; Wallace, Andrew J; Clancy, Tara; Woodward, Emma R; Howell, Tony A; Tischkowitz, Marc; Lalloo, Fiona

    2017-05-10

    While the requirement for thresholds for testing for mutations in BRCA1/2 is being questioned, they are likely to remain for individuals unaffected by a relevant cancer. It is still useful to provide pretesting likelihoods, but models need to take into account tumour pathology. The Manchester Scoring System (MSS) is a well-used, simple, paper-based model for assessing carrier probability that already incorporates pathology data. We have used mutation testing data from 4115 unrelated samples from affected non-Jewish individuals alongside tumour pathology to further refine the scoring system. Adding additional points for high-grade serous ovarian cancer <60 (HGSOC=+2) and adding grade score to those with triple-negative breast cancer, while reducing the score for those with HER2+ breast cancer (-6), resulted in significantly improved sensitivity and minor improvements in specificity to the MSS. Sporadic HGSOC <60 years thus reached a score of 15-19 points within the 10% grouping consistent with the 15/113-13.2% that were identified with a BRCA1/2 pathogenic variant. Validation in a population series of ovarian cancer from Cambridge showed high sensitivity at the 10% threshold 15/17 (88.2%). The new pathology-adjusted Manchester score MSS3 appears to provide an effective and simple-to-use estimate of the 10% and 20% thresholds for BRCA1/2 likelihood. For unaffected individuals, the 20-point (20%) threshold in their affected first-degree relative can be used to determine eligibility at the 10% threshold. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  2. Pediatric residents' learning styles and temperaments and their relationships to standardized test scores.

    Science.gov (United States)

    Tuli, Sanjeev Y; Thompson, Lindsay A; Saliba, Heidi; Black, Erik W; Ryan, Kathleen A; Kelly, Maria N; Novak, Maureen; Mellott, Jane; Tuli, Sonal S

    2011-12-01

    Board certification is an important professional qualification and a prerequisite for credentialing, and the Accreditation Council for Graduate Medical Education (ACGME) assesses board certification rates as a component of residency program effectiveness. To date, research has shown that preresidency measures, including National Board of Medical Examiners scores, Alpha Omega Alpha Honor Medical Society membership, or medical school grades poorly predict postresidency board examination scores. However, learning styles and temperament have been identified as factors that affect test-taking performance. The purpose of this study is to characterize the learning styles and temperaments of pediatric residents and to evaluate their relationships to yearly in-service and postresidency board examination scores. This cross-sectional study analyzed the learning styles and temperaments of current and past pediatric residents by administration of 3 validated tools: the Kolb Learning Style Inventory, the Keirsey Temperament Sorter, and the Felder-Silverman Learning Style test. These results were compared with known, normative, general and medical population data and evaluated for correlation to in-service examination and postresidency board examination scores. The predominant learning style for pediatric residents was converging 44% (33 of 75 residents) and the predominant temperament was guardian 61% (34 of 56 residents). The learning style and temperament distribution of the residents was significantly different from published population data (P = .002 and .04, respectively). Learning styles, with one exception, were found to be unrelated to standardized test scores. The predominant learning style and temperament of pediatric residents is significantly different than that of the populations of general and medical trainees. However, learning styles and temperament do not predict outcomes on standardized in-service and board examinations in pediatric residents.

  3. The effect of instructional methodology on high school students natural sciences standardized tests scores

    Science.gov (United States)

    Powell, P. E.

    Educators have recently come to consider inquiry based instruction as a more effective method of instruction than didactic instruction. Experience based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness on preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify teaching methods of a convenience sample of 57 teacher participants and determined level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when taught using inquiry based instruction. Subpopulation analyses indicated all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.

  4. Adults with poor reading skills: How lexical knowledge interacts with scores on standardized reading comprehension tests.

    Science.gov (United States)

    McKoon, Gail; Ratcliff, Roger

    2016-01-01

    Millions of adults in the United States lack the necessary literacy skills for most living wage jobs. For students from adult learning classes, we used a lexical decision task to measure their knowledge of words and we used a decision-making model (Ratcliff's, 1978, diffusion model) to abstract the mechanisms underlying their performance from their RTs and accuracy. We also collected scores for each participant on standardized IQ tests and standardized reading tests used commonly in the education literature. We found significant correlations between the model's estimates of the strengths with which words are represented in memory and scores for some of the standardized tests but not others. The findings point to the feasibility and utility of combining a test of word knowledge, lexical decision, that is well-established in psycholinguistic research, a decision-making model that supplies information about underlying mechanisms, and standardized tests. The goal for future research is to use this combination of approaches to understand better how basic processes relate to standardized tests with the eventual aim of understanding what these tests are measuring and what the specific difficulties are for individual, low-literacy adults. Copyright © 2015. Published by Elsevier B.V.

  5. [Internal structure and standardised scores of the Torrance Test of Creative Thinking].

    Science.gov (United States)

    Ferrando, Mercedes; Ferrándiz, Carmen; Bermejo, María R; Sánchez, Cristina; Parra, Joaquín; Prieto, María D

    2007-08-01

    The present work sets out to study the internal structure of the Torrance Test of Creative Thinking (TTCT) and to establish standardised scores that will enable the test to be used in both a diagnostic and educational context. 649 students (319 girls and 330 boys), aged 5 to 12 years from various schools in Murcia and Alicante (SE Spain), took part in the study. The findings suggest that the psychometric characteristics of TTCT are satisfactory, and its internal structure can be attributed to three factors that are responsible for a high percentage of the variance (73.8%). The standardised score tables, which are provided for the first time in this context, will be useful in the evaluation of creativity and the identification of students with high intellectual abilities.

  6. Psychometrics of Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) scores.

    Science.gov (United States)

    Brannick, Michael T; Wahi, Monika M; Goldin, Steven B

    2011-08-01

    A sample of 183 medical students completed the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT V2.0). Scores on the test were examined for evidence of reliability and factorial validity. Although Cronbach's alpha for the total scores was adequate (.79), many of the scales had low internal consistency (scale alphas ranged from .34 to .77; median = .48). Previous factor analyses of the MSCEIT are critiqued and the rationale for the current analysis is presented. Both confirmatory and exploratory factor analyses of the MSCEIT item parcels are reported. Pictures and faces items formed separate factors rather than loading on a Perception factor. Emotional Management appeared as a factor, but items from Blends and Facilitation failed to load consistently on any factor, rendering factors for Emotional Understanding and Emotional Facilitation problematic.
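
    The internal-consistency figures quoted above are Cronbach's alpha values. As a minimal sketch of how such a coefficient is commonly computed (the function name and the tiny score matrices are hypothetical illustrations, not data or code from the study), alpha compares the sum of the per-item variances with the variance of the total score:

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(item_scores[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses: three respondents, three perfectly consistent items.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
# Made-up responses: two items that only partly agree.
mixed = [[1, 2], [2, 1], [3, 3]]

alpha_consistent = cronbach_alpha(consistent)
alpha_mixed = cronbach_alpha(mixed)
```

    Items that move together push alpha toward 1, while items that disagree pull it down, which is the pattern behind the low scale alphas reported in the abstract.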

  7. Predicting scores of the Halstead Category Test with the WAIS-III.

    Science.gov (United States)

    Titus, Jeffrey B; Retzlaff, Paul D; Dean, Raymond S

    2002-09-01

    The Halstead Category Test (HCT) and the Wechsler Adult Intelligence Scale (WAIS) are two of the most widely used neuropsychological tests. Often assessment conclusions are dependent upon the comparison of these measures. Therefore, it is crucial for clinicians to know how they relate to one another. This study examined the relationship between the HCT and the WAIS-III with undergraduate psychology students. Correlational analyses were conducted between HCT scores and WAIS-III subtests, Verbal and Performance IQ, and Full Scale IQ scores. Additionally, the new WAIS-III scales (Letter-Number Sequencing, Matrix Reasoning, and Symbol Search) were further examined. Regression analyses were run to develop predictor equations for the HCT using VIQ, PIQ, and FSIQ. Finally, predictor tables were generated between the HCT and VIQ, PIQ, and FSIQ to provide assessment of brain dysfunction for clinical use.

  8. Generalizing Terwilliger's likelihood approach: a new score statistic to test for genetic association

    OpenAIRE

    Hsu Li; Helmer Quinta; de Visser Marieke CH; Uitte de Willige Shirley; el Galta Rachid; Houwing-Duistermaat Jeanine J

    2007-01-01

    Abstract Background: In this paper, we propose a one degree of freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. Results: By means of a simulation study...

  9. Testing Students with Special Educational Needs in Large-Scale Assessments - Psychometric Properties of Test Scores and Associations with Test Taking Behavior.

    Science.gov (United States)

    Pohl, Steffi; Südkamp, Anna; Hardt, Katinka; Carstensen, Claus H; Weinert, Sabine

    2016-01-01

    Assessing competencies of students with special educational needs in learning (SEN-L) poses a challenge for large-scale assessments (LSAs). For students with SEN-L, the available competence tests may fail to yield test scores of high psychometric quality, which are, at the same time, measurement invariant to test scores of general education students. We investigated whether we can identify a subgroup of students with SEN-L, for which measurement invariant competence measures of adequate psychometric quality may be obtained with tests available in LSAs. We furthermore investigated whether differences in test-taking behavior may explain dissatisfying psychometric properties and measurement non-invariance of test scores within LSAs. We relied on person fit indices and mixture distribution models to identify students with SEN-L for whom test scores with satisfactory psychometric properties and measurement invariance may be obtained. We also captured differences in test-taking behavior related to guessing and missing responses. As a result we identified a subgroup of students with SEN-L for whom competence scores of adequate psychometric quality that are measurement invariant to those of general education students were obtained. Concerning test-taking behavior, there was a small number of students who unsystematically picked response options. Removing these students from the sample slightly improved item fit. Furthermore, two different patterns of missing responses were identified that explain to some extent problems in the assessments of students with SEN-L.

  10. Open-book tests : Search behaviour, time used and test scores

    NARCIS (Netherlands)

    Westerkamp, Andrie C.; Heijne-Penninga, Marjolein; Kuks, Jan B. M.; Cohen-Schotanus, Janke

    2013-01-01

    Background: Because of the increasing medical knowledge and the focus of medical education on acquiring competencies, the use of open-book tests seems inevitable. Dealing with a large body of information, indicating which kind of information is needed to solve a problem, and finding and understandin

  11. Comparison of the Qualitative and Developmental Scoring Systems for the Modified Version of the Bender-Gestalt Test.

    Science.gov (United States)

    Brannigan, Gary G.; Brunner, Nancy A.

    1993-01-01

    Examined two scoring systems for Modified Version of the Bender-Gestalt Test. Administered Bender-Gestalt and Otis-Lennon School Ability Test to 75 first-grade and 84 second-grade students. Both systems were significantly correlated with school ability. Results of tests for differences between correlations indicated that Qualitative Scoring System…

  12. Classroom attributes and achievement test scores for deaf and hard of hearing students.

    Science.gov (United States)

    Holt, J

    1994-10-01

    This study examined reading comprehension and mathematics computation achievement of deaf and hard-of-hearing students in a variety of school settings. Data were collected by Gallaudet University's Center for Assessment and Demographic Studies during its 1990 standardization of the 8th Edition Stanford Achievement Test. Descriptive and inferential statistical methods were used to analyze the relationships among achievement scores, classroom attributes, and demographic factors associated with achievement. Based on the results of this study, inclusion with hearing students in regular classrooms is related to a variety of demographic factors. When reading comprehension and mathematics computation scores are adjusted for these factors, they are higher for the deaf and hard-of-hearing students in regular classrooms. However, it is not known whether the higher achievement is due to inclusion or whether students were selected for inclusion due to their higher achievement levels.

  13. Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success

    Science.gov (United States)

    Niu, Sunny X.; Tienda, Marta

    2012-01-01

    Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe. PMID:23788828

  14. COMPARISON BETWEEN WOOD DRYING DEFECT SCORES: SPECIMEN TESTING X ANALYSIS OF KILN-DRIED BOARDS

    Directory of Open Access Journals (Sweden)

    Djeison Cesar Batista

    2015-04-01

    It is important to develop drying technologies for Eucalyptus grandis lumber, which is one of the most planted species of this genus in Brazil and plays an important role as raw material for the wood industry. The general aim of this work was to assess the conventional kiln drying of juvenile wood of three clones of Eucalyptus grandis. The specific aims were to compare the behavior between: (i) drying defects indicated by tests with wood specimens and conventional kiln-dried boards; and (ii) physical properties and the drying quality. Five 11-year-old trees of each clone were felled, and only flatsawn boards of the first log were used. Basic density and total shrinkage were determined, and the drying test with wood specimens at 100 °C was carried out. Kiln drying of boards was performed, and initial and final moisture content, moisture gradient in thickness, drying stresses and drying defects were assessed. The defect scoring method was used to verify the behavior between the defects detected by specimen testing and the defects detected in kiln-dried boards. As main results, the drying schedule was too severe for the wood, resulting in a high level of boards with defects. According to the defect scoring method, the defects in the drying test with specimens and the defects of kiln-dried boards behaved differently, with no correspondence between them.

  15. Can the external masculinization score predict the success of genetic testing in 46,XY DSD?

    Directory of Open Access Journals (Sweden)

    Ruthie Su

    2015-05-01

    Genetic testing is judiciously applied to individuals with Disorders of Sex Development (DSD) and so it is necessary to identify those most likely to benefit from such testing. We hypothesized that the external masculinization score (EMS) is inversely associated with the likelihood of finding a pathogenic genetic variant. Patients with 46,XY DSD from a single institution evaluated from 1994-2014 were included. Results of advanced cytogenetic and gene sequencing tests were recorded. An EMS score (range 0-12) was assigned to each patient according to the team's initial external genitalia physical examination. During 1994-2011, 44 (40%) patients with 46,XY DSD were evaluated and underwent genetic testing beyond initial karyotype; 23% (10/44) had a genetic diagnosis made by gene sequencing or array. The median EMS score of those with an identified pathogenic variant was significantly different from those in whom no confirmed genetic cause was identified [median 3 (95% CI, 2-6) versus 6 (95% CI, 5-7), respectively (p = 0.02)], but limited to diagnoses of complete or partial androgen insensitivity (8/10) or 5α-reductase deficiency (2/10). In the modern cohort (2012-2014), the difference in median EMS in whom a genetic cause was or was not identified approached significance (p = 0.05; median 3 (95% CI, 0-7) versus 7 (95% CI, 6-9), respectively). When all patients from 1994-2014 are pooled, the EMS is significantly different amongst those with compared to those without a genetic cause (median EMS 3 vs. 6, p < 0.02). We conclude that an EMS of 3 or less may indicate a higher likelihood of identifying a genetic cause of 46,XY DSD and justify genetic screening, especially when androgen insensitivity is suspected.

  16. Apgar score

    Science.gov (United States)

    //medlineplus.gov/ency/article/003402.htm ... birth. Virginia Apgar, MD (1909-1974) introduced the Apgar score in 1952. How the Test is Performed The ...

  17. Test and Score Data Summary for TOEFL[R] Internet-Based and Paper-Based Tests. January 2008-December 2008 Test Data

    Science.gov (United States)

    Educational Testing Service, 2008

    2008-01-01

    The Test of English as a Foreign Language[TM], better known as TOEFL[R], is designed to measure the English-language proficiency of people whose native language is not English. TOEFL scores are accepted by more than 6,000 colleges, universities, and licensing agencies in 130 countries. The test is also used by governments, and scholarship and…

  18. Comparison of the Bender Gestalt Test for Both Black and White Brain-Damaged Patients Using Two Scoring Systems

    Science.gov (United States)

    Butler, Oliver T.; And Others

    1976-01-01

    This study tested for cultural bias in the Bender Visual Motor Gestalt Test. Subjects were 72 black and white patients diagnosed as either brain damaged or psychiatric. Bender protocols were scored by Pascal-Suttell and Hain systems. No race effect appeared except for the Pascal-Suttell system for which blacks scored significantly better. (Author)

  19. Interpreting the "g" Loadings of Intelligence Test Composite Scores in Light of Spearman's Law of Diminishing Returns

    Science.gov (United States)

    Reynolds, Matthew R.

    2013-01-01

    The linear loadings of intelligence test composite scores on a general factor ("g") have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the "g" loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of this study was to (a)…

  1. Guided-Inquiry Lessons Raise Scores on the Sixth Grade Georgia Science Test

    Science.gov (United States)

    Page, Purlie M.

    At the local level, G Middle School has the highest district-wide percentage of 6th grade science students who are not meeting standards. It is imperative that G Middle School take corrective action to reduce the number of students failing to meet state science standards. Dewey's theory of conceptual framework, which involves knowledge constructed on a person's personal experience and mind activity through active forms of learning, guided this study. The goal of the study was to determine whether inquiry-based science modules produce greater 6th grade science achievement, as measured by an equivalent instrument of the science section of the Georgia Criterion-Referenced Competency Test, when compared to traditional instruction among eastern Georgia 6th graders. The sample consisted of 230 students in the nonintervention group and 119 students in the intervention group. All students were from intact classes. At the end of the intervention, an independent t test was conducted to analyze the scores. According to the study t test, (t = 12.33, df = 304.56, p ... motivation towards, comprehension of, and interest in science concepts. At the local level, these inquiry lessons can be shared with science teachers across grade levels and within the district to improve county-wide science scores. An increase in student interest and comprehension of science concepts could ultimately lead to the United States producing more students in the fields of science, technology, engineering, and mathematics (STEM) education.
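
    The fractional degrees of freedom reported above (df = 304.56) are what an unequal-variance (Welch) t test typically produces; the abstract itself only says "independent t test", so treating it as Welch's test is an assumption. A minimal sketch of that statistic and its Welch-Satterthwaite degrees of freedom, using a hypothetical function name and made-up scores rather than the study's data:

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite df for two independent samples.

    Hypothetical helper for illustration only.
    """
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means.
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up scores for two groups of unequal size and spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12, 14]
t_stat, df = welch_t(group_a, group_b)
```

    With unequal group sizes and variances the degrees of freedom fall between the smaller group's n - 1 and the pooled n1 + n2 - 2, which is why a non-integer value below 347 is plausible for groups of 230 and 119.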

  2. Level and change in cognitive test scores predict risk of first stroke.

    Science.gov (United States)

    DeFries, Triveni; Avendaño, Mauricio; Glymour, M Maria

    2009-03-01

    To determine whether cognitive test scores and cognitive decline predict incidence of first diagnosed stroke. Stroke-free Health and Retirement Study participants were followed on average 7.6 years for self- or proxy-reported first stroke (1,483 events). Predictors included baseline performance on a modified Telephone Interview for Cognitive Status (Mental Status) and Word Recall test and decline between baseline and second assessment in either measure. Hazard ratios (HRs) were estimated using Cox proportional hazards models for the whole sample and stratified according to five major cardiovascular risk factors. National cohort study of noninstitutionalized adults with a mean baseline age of 64±9.9. Health and Retirement Study participants (n=19,699) aged 50 and older. Word Recall (HR for 1 standard deviation difference=0.92, 95% confidence interval (CI)=0.86-0.97) and Mental Status (HR=0.89, 95% CI=0.84-0.95) predicted incident stroke. Mental Status predicted stroke risk in those with (HR=0.93, 95% CI=0.87-0.99) and without (HR=0.81, 95% CI=0.72-0.91) one or more vascular risk factors. Word Recall declines predicted a 16% elevation in subsequent stroke risk (95% CI=1.01-1.34). Declines in Mental Status predicted a 37% elevation in stroke risk (95% CI=1.11-1.70). Cognitive test scores predict future stroke risk, independent of other major vascular risk factors.

  3. Micronucleus test for radiation biodosimetry in mass casualty events: Evaluation of visual and automated scoring

    Energy Technology Data Exchange (ETDEWEB)

    Bolognesi, Claudia, E-mail: claudia.bolognesi@istge.i [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Balia, Cristina; Roggieri, Paola [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Cardinale, Francesco [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Department of Health Sciences, University of Genoa, Genoa (Italy); Bruzzi, Paolo [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Sorcinelli, Francesca [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy); Lista, Florigio [Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy); D' Amelio, Raffaele [Sapienza, Universita di Roma II Facolta di Medicina e Chirurgia and Ministero della Difesa, Direzione Generale Sanita Militare (Italy); Righi, Enzo [Frascati National Laboratories, National Institute of Nuclear Physics, Via Enrico Fermi 40, 00044 Frascati, Rome (Italy)

    2011-02-15

    In the case of a large-scale nuclear or radiological incident, a reliable estimate of dose is an essential tool for providing timely assessment of radiation exposure and for making life-saving medical decisions. Cytogenetics is considered the 'gold standard' for biodosimetry. The dicentric analysis (DA) represents the most specific cytogenetic bioassay. The micronucleus test (MN) applied in interphase in peripheral lymphocytes is an alternative and simpler approach. A dose-effect calibration curve for the MN frequency in peripheral lymphocytes from 27 adult donors was established after in vitro irradiation at a dose range 0.15-8 Gy of ¹³⁷Cs gamma rays (dose rate 6 Gy min⁻¹). Dose prediction by visual scoring in a dose-blinded study (0.15-4.0 Gy) revealed a high level of accuracy (R = 0.89). The scoring of MN is time consuming and requires adequate skills and expertise. Automated image analysis is a feasible approach that reduces the time and increases the accuracy of the dose estimation by decreasing the variability due to subjective evaluation. A good correlation (R = 0.705) between visual and automated scoring with visual correction was observed over the dose range 0-2 Gy. Almost perfect discrimination power for exposure to 1-2 Gy, and a satisfactory power for 0.6 Gy were detected. This threshold level can be considered sufficient for identification of sublethally exposed individuals by automated CBMN assay.

  4. The National Early Warning Score: Translation, testing and prediction in a Swedish setting.

    Science.gov (United States)

    Spångfors, Martin; Arvidsson, Lisa; Karlsson, Victoria; Samuelson, Karin

    2016-12-01

    The National Early Warning Score (NEWS) is a "track and trigger" scale designed to assess in-hospital patients' vital signs and detect clinical deterioration. In this study the NEWS was translated into Swedish and its association with the need of intensive care was investigated. A total of 868 patient charts, recorded by the medical emergency team at a university hospital, containing the parameters needed to calculate the NEWS were audited. The NEWS was translated into Swedish and tested for inter-rater reliability with a perfect agreement (weighted κ=1.0) among the raters. The median score for patients admitted to the ICU was higher than for those who were not (10 vs. 8, p<0.0001). AUROC for discriminating admittance to the ICU was 0.68 (95% CI: 0.622-0.739, p<0.0001). A regression analysis showed that lower oxygen saturation and a lower level of consciousness were significantly associated with ICU admission (OR 1.27 [1.06-1.52], p=0.01 and OR 1.77 [1.12-2.82], p=0.02) and may predict admission to the ICU better than the other parameters. The Swedish translated NEWS seems to have excellent inter-rater reliability and can be used without risk of linguistic misinterpretation. High scores for the parameters oxygen saturation and level of consciousness in the NEWS may predict admission to the ICU. Copyright © 2016 Elsevier Ltd. All rights reserved.
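
    The AUROC quoted above can be read as the probability that a randomly chosen admitted patient has a higher score than a randomly chosen non-admitted patient, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the scores are hypothetical and are not taken from the study.

```python
def auroc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    Hypothetical helper for illustration; O(n * m), fine for small samples.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up early-warning scores for admitted and non-admitted patients.
admitted = [10, 12, 8, 9]
not_admitted = [8, 7, 9, 6]
auc = auroc(admitted, not_admitted)
```

    A value of 0.5 means the score separates the groups no better than chance and 1.0 means perfect separation, so 0.68 indicates modest discrimination.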

  5. Increased correlation coefficient between the written test score and tutors' performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia.

    Science.gov (United States)

    Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

    2016-03-01

    This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.

  6. Increased correlation coefficient between the written test score and tutors’ performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia

    Directory of Open Access Journals (Sweden)

    Heethal Jaiprakash

    2016-03-01

    This paper is aimed at finding if there was a change of correlation between the written test score and tutors’ performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group’s tutors did not receive tutor training, while the second group’s tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors’ performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors’ scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.

  7. The Bender Gestalt Test with the Human Figure Drawing Test for Young School Children. A Manual for Use with the Koppitz Scoring System.

    Science.gov (United States)

    Koppitz, Elizabeth Munsterberg

    Presented is a manual for scoring the Bender Gestalt Test and the Human Figure Drawing Test for screening and diagnostic uses with emotionally disturbed, brain damaged, or perceptually handicapped 5- to 11-year-old children. Given are suggestions for administering and scoring the Bender test which examines distortion of shape, rotation,…

  8. The relationship between selected standardized test scores and performance in advanced placement math and science exams: Analyzing the differential effectiveness of scores for course identification and placement

    Science.gov (United States)

    Urbina, Josue N.

    There is a national need to increase the STEM-related workforce. Among factors leading towards STEM careers include the number of advanced high school mathematics and science courses students complete. Florida's enrollment patterns in STEM-related Advanced Placement (AP) courses, however, reveal that only a small percentage of students enroll into these classes. Therefore, screening tools are needed to find more students for these courses, who are academically ready, yet have not been identified. The purpose of this study was to investigate the extent to which scores from a national standardized test, Preliminary Scholastic Assessment Test/ National Merit Qualifying Test (PSAT/NMSQT), in conjunction with and compared to a state-mandated standardized test, Florida Comprehensive Assessment Test (FCAT), are related to selected AP exam performance in Seminole County Public Schools. An ex post facto correlational study was conducted using 6,189 student records from the 2010 - 2012 academic years. Multiple regression analyses using simultaneous Full Model testing showed differential moderate to strong relationships between scores in eight of the nine AP courses (i.e., Biology, Environmental Science, Chemistry, Physics B, Physics C Electrical, Physics C Mechanical, Statistics, Calculus AB and BC) examined. For example, the significant unique contribution to overall variance in AP scores was a linear combination of PSAT Math (M), Critical Reading (CR) and FCAT Reading (R) for Biology and Environmental Science. Moderate relationships for Chemistry included a linear combination of PSAT M, W (Writing) and FCAT M; a combination of FCAT M and PSAT M was most significantly associated with Calculus AB performance. These findings have implications for both research and practice. FCAT scores, in conjunction with PSAT scores, can potentially be used for specific STEM-related AP courses, as part of a systematic approach towards AP course identification and placement. 
For courses with

  9. [Development and clinical testing of the Russian version of the Acute Cystitis Symptom Score - ACSS].

    Science.gov (United States)

    Alidjanov, J F; Abdufattaev, U A; Makhmudov, D Kh; Mirkhamidov, D Kh; Khadzhikhanov, F A; Azgamov, A V; Pilatz, A; Naber, K G; Wagenlehner, F M; Akilov, F A

    2014-01-01

    The Acute Cystitis Symptom Score (ACSS) was originally developed in the Uzbek language and has demonstrated high reliability and validity. The study aimed to develop a Russian version of the ACSS questionnaire and evaluate its psychometric properties. Translation and adaptation of the ACSS questionnaire, which contains 18 questions (6 for the typical symptoms of acute cystitis (AC), 4 for the differential diagnosis, 3 for the quality of life, and 5 for the conditions that may affect the choice of treatment), were performed according to the recommendations developed by the Mapi Research Institute. The study involved 83 Russian-speaking women (mean age, 35.6 ±13.7 years); 38 (45.8%) patients were in the main group (patients with AC), and 45 (54.2%) in the control group (without AC). Medical examination and appropriate treatment of the respondents were conducted in accordance with approved standards. After completing the course of therapy, 19 (50%) patients of the main group came for the control examination. There was a statistically significant difference in the scores obtained in the two groups. Score profiles positively correlated with the results of laboratory tests (rho = 0.26-0.48). Cronbach's alpha for the Russian version of the questionnaire was 0.86 (95% CI, 0.81-0.91), and the area under the curve in the ROC analysis was 0.96. The results of testing the Russian version correspond to those of the original version. The Russian version of the ACSS questionnaire has high reliability and validity, and can be recommended for clinical research and diagnosis of primary AC, and for dynamic monitoring of the effectiveness of treatment in the Russian-speaking population of patients.

  10. CT densitovolumetry in children with obliterative bronchiolitis: correlation with clinical scores and pulmonary function test results

    Directory of Open Access Journals (Sweden)

    Helena Mocelin

    2013-12-01

    OBJECTIVE: To determine whether air trapping (expressed as the percentage of air trapping relative to total lung volume [AT%]) correlates with clinical and functional parameters in children with obliterative bronchiolitis (OB). METHODS: CT scans of 19 children with OB were post-processed for AT% quantification with the use of a fixed threshold of −950 HU (AT%950) and of thresholds selected with the aid of density masks (AT%DM). Patients were divided into three groups by AT% severity. We examined AT% correlations with oxygen saturation (SO2) at rest, six-minute walk distance (6MWD), minimum SO2 during the six-minute walk test (6MWT_SO2), FVC, FEV1, FEV1/FVC, and clinical parameters. RESULTS: The 6MWD was longer in the patients with larger normal lung volumes (r = 0.53). We found that AT%950 showed significant correlations (before and after the exclusion of outliers, respectively) with the clinical score (r = 0.72; 0.80), FVC (r = 0.24; 0.59), FEV1 (r = −0.58; −0.67), and FEV1/FVC (r = −0.53; r = −0.62), as did AT%DM with the clinical score (r = 0.58; r = 0.63), SO2 at rest (r = −0.40; r = −0.61), 6MWT_SO2 (r = −0.24; r = −0.55), FVC (r = −0.44; r = −0.80), FEV1 (r = −0.65; r = −0.71), and FEV1/FVC (r = −0.41; r = −0.52). CONCLUSIONS: Our results show that AT% correlates significantly with clinical scores and pulmonary function test results in children with OB.

  11. The Dutch version of the Forgotten Joint Score: test-retesting reliability and validation.

    Science.gov (United States)

    Shadid, Marvan B; Vinken, Nick S; Marting, Louis N; Wolterbeek, Nienke

    2016-03-01

    The aim of this study was to translate the Forgotten Joint Score (FJS) into the Dutch language. This questionnaire was tested for internal consistency (Cronbach's alpha) and test-retest reliability (intraclass correlation coefficients (ICC)). A total of 159 patients were included in this study: 74 with a total hip arthroplasty (THA) and 85 with a total knee arthroplasty (TKA). The FJS showed high internal consistency and test-retest reliability (Cronbach's alpha=0.957; ICC=0.943). The FJS showed a significant correlation (r=0.751) with the WOMAC and low ceiling effects (3.1%). This study proved the Dutch FJS to be highly discriminative in patients treated with a THA or TKA. This makes the FJS a reliable patient-related outcome measurement, measuring a new dimension in arthroplasty: the ability to forget an artificial joint in everyday life.
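    As an illustration of the internal-consistency statistic reported above, the following is a minimal sketch of Cronbach's alpha computed from a respondents-by-items score matrix. The function name and the example scores are hypothetical and are not taken from the study; the formula is the standard one based on sample variances.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: three respondents answering two items.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))
```

    Values near 1, such as the 0.957 reported for the Dutch FJS, indicate that the items vary together closely across respondents.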

  12. Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

    Directory of Open Access Journals (Sweden)

    Yota Uno

    OBJECTIVE: The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. METHODS: The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. RESULTS: The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. CONCLUSION: The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
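    The post-test probability mentioned above follows from the stratum-specific likelihood ratios by the usual odds form of Bayes' rule. The sketch below uses the two likelihood ratios reported in the abstract (13.8 and 0.1); the pre-test probability of 0.20 is a hypothetical value chosen only for illustration, and the function name is likewise an assumption.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Convert a pre-test probability and a likelihood ratio to a post-test probability.

    post-test odds = pre-test odds * likelihood ratio
    """
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability of intellectual disability (not from the study).
PRE_TEST = 0.20

# Likelihood ratios reported in the abstract for the two outer BIQ strata.
for label, lr in [("BIQ<=65", 13.8), ("BIQ>=76", 0.1)]:
    print(label, round(post_test_probability(PRE_TEST, lr), 3))
```

    Under that assumed pre-test probability, a score in the lowest stratum raises the probability to roughly 0.78, while a score in the highest stratum lowers it to roughly 0.02, which is the sense in which the condition can be "determined or ruled out".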

  13. Multiple tests for wind turbine fault detection and score fusion using two- level multidimensional scaling (MDS)

    Science.gov (United States)

    Ye, Xiang; Gao, Weihua; Yan, Yanjun; Osadciw, Lisa A.

    2010-04-01

    Wind is an important renewable energy source. The energy and economic return from building wind farms justify the expensive investments in doing so. However, without an effective monitoring system, underperforming or faulty turbines will cause a huge loss in revenue. Early detection of such failures helps prevent these undesired working conditions. We develop three tests on the power curve, rotor speed curve, and pitch angle curve of individual turbines. In each test, multiple states are defined to distinguish different working conditions, including complete shut-downs, under-performing states, abnormally frequent default states, as well as normal working states. These three tests are combined to reach a final conclusion, which is more effective than any single test. Through extensive data mining of historical data and verification from farm operators, some state combinations are discovered to be strong indicators of spindle failures, lightning strikes, anemometer faults, etc., for fault detection. In each individual test, and in the score fusion of these tests, we apply multidimensional scaling (MDS) to reduce the high-dimensional feature space into a 3-dimensional visualization, from which it is easier to discover turbine working information. This approach gains a qualitative understanding of turbine performance status to detect faults, and also provides explanations of what has happened for detailed diagnostics. The state-of-the-art SCADA (Supervisory Control And Data Acquisition) system in industry can only answer the question of whether there are abnormal working states, and our evaluation of multiple states in multiple tests is also promising for diagnostics. In the future, these tests can be readily incorporated in a Bayesian network for intelligent analysis and decision support.

  14. [Information Concerning Mean Test Scores for the Graduate Management Admission Test (GMAT), Graduate Record Examinations (GRE), Law School Admission Test (LSAT), Preliminary Scholastic Aptitude Test (PSAT), and Scholastic Aptitude Test (SAT) for the National Commission on Excellence in Education.]

    Science.gov (United States)

    Solomon, Robert J.

    Data are provided to the National Commission on Excellence in Education on the Graduate Management Admission Test (GMAT), Graduate Record Examinations (GRE), Law School Admission Test (LSAT), Preliminary Scholastic Aptitude Test (PSAT), and Scholastic Aptitude Test (SAT). Statistics are provided on the following: yearly GMAT mean scores 1965-1966…

  15. Performance on large-scale science tests: Item attributes that may impact achievement scores

    Science.gov (United States)

    Gordon, Janet Victoria

    Significant differences in achievement among ethnic groups persist on the eighth-grade science Washington Assessment of Student Learning (WASL). The WASL measures academic performance in science using both scenario and stand-alone question types. Previous research suggests that presenting target items connected to an authentic context, like scenario question types, can increase science achievement scores especially in underrepresented groups and thus help to close the achievement gap. The purpose of this study was to identify significant differences in performance between gender and ethnic subgroups by question type on the 2005 eighth-grade science WASL. MANOVA and ANOVA were used to examine relationships between gender and ethnic subgroups as independent variables with achievement scores on scenario and stand-alone question types as dependent variables. MANOVA revealed no significant effects for gender, suggesting that the 2005 eighth-grade science WASL was gender neutral. However, there were significant effects for ethnicity. ANOVA revealed significant effects for ethnicity and ethnicity by gender interaction in both question types. Effect sizes were negligible for the ethnicity by gender interaction. Large effect sizes between ethnicities on scenario question types became moderate to small effect sizes on stand-alone question types. This indicates the score advantage the higher performing subgroups had over the lower performing subgroups was not as large on stand-alone question types compared to scenario question types. A further comparison examined performance on multiple-choice items only within both question types. Similar achievement patterns between ethnicities emerged; however, achievement patterns between genders changed in boys' favor. Scenario question types appeared to register differences between ethnic groups to a greater degree than stand-alone question types. These differences may be attributable to individual differences in cognition

  16. Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

    Energy Technology Data Exchange (ETDEWEB)

    Zain, Zakiyah, E-mail: zac@uum.edu.my; Ahmad, Yuhaniz, E-mail: yuhaniz@uum.edu.my [School of Quantitative Sciences, Universiti Utara Malaysia, UUM Sintok 06010, Kedah (Malaysia)]; Azwan, Zairul, E-mail: zairulazwan@gmail.com; Raduan, Farhana, E-mail: farhanaraduan@gmail.com; Sagap, Ismail, E-mail: drisagap@yahoo.com [Surgery Department, Universiti Kebangsaan Malaysia Medical Centre, Jalan Yaacob Latif, 56000 Bandar Tun Razak, Kuala Lumpur (Malaysia)]; Aziz, Nazrina, E-mail: nazrina@uum.edu.my

    2014-12-04

    Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated with the overall survival. In this study, global score test methodology is used to combine the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data on tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.

  17. CORRELATION BETWEEN SYMPTOM SCORE, WHEEZE, REVERSIBILITY OF PULMONARY FUNCTION TESTS AND TREATMENT RESPONSE IN ASTHMA

    Directory of Open Access Journals (Sweden)

    M.H. Boskabady

    2003-06-01

    Asthma management is a major concern because some asthmatic patients either do not respond or else hardly respond to treatment. Therefore, in the present study, an attempt has been made to determine the predictors of treatment response in asthmatic patients. Thirty-six asthmatic adults, including 13 males and 23 females, were studied during a 3-month treatment period. Asthma symptom score (SS) and wheezing were recorded before and after treatment. Pulmonary function tests (PFTs), including forced vital capacity (FVC), forced expiratory volume in one second (FEV1), peak expiratory flow (PEF), maximal expiratory … measured at the beginning and the end of the study. The increase in PFT values 10 min after 200 µg inhaled salbutamol (in percentage) was considered as reversibility in airway constriction. There were significant improvements in SS (p<0.001) and PFT variables (p<…). The results of this study showed that a well-conducted therapeutic program could lead to improvement in symptoms, wheeze, and PFT values. In addition, symptom score, wheeze, and reversibility in FEV1 and PEF could be good indicators of response to treatment in asthma.

  18. The Mote In Thy Brother's Eye, and The Beam in Thine Own: Predicting One's Own and Others' Personality Test Scores.

    Science.gov (United States)

    Furnham, Adrian; Henderson, Monika

    1983-01-01

    Examined the similarity between subjects' (N=63) ratings of themselves and others, on various tests of personality. Results revealed that subjects correctly estimated several of their own scores, but only two scores of another person. They believed themselves to be similar to their friend, thereby showing attributional errors. (JAC)

  19. Improving Personality Facet Scores with Multidimensional Computer Adaptive Testing: An Illustration with the Neo Pi-R

    Science.gov (United States)

    Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A. W.

    2013-01-01

    Narrowly defined personality facet scores are commonly reported and used for making decisions in clinical and organizational settings. Although these facets are typically related, scoring is usually carried out for a single facet at a time. This method can be ineffective and time consuming when personality tests contain many highly correlated…

  20. Expanded Koppitz Scoring System of the Bender Gestalt Visual-Motor Test for Adolescents: A Pilot Study.

    Science.gov (United States)

    Bolen, Larry M.; And Others

    1992-01-01

    Examined use of Bender Gestalt Visual-Motor Test with school-age adolescents over age 11. Mean error scores suggest that visual-motor development is not maturationally complete by age 11 years, 11 months. Suggests additional research focusing on extending normative sample or developing new scoring system for adolescents. (Author/NB)

  1. Validation of Automated Scores of TOEFL iBT Tasks against Non-Test Indicators of Writing Ability

    Science.gov (United States)

    Weigle, Sara Cushing

    2010-01-01

    Automated scoring has the potential to dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders. This study approaches validity by comparing human and automated scores on responses to…

  3. Academic self-concept, interest, grades, and standardized test scores: reciprocal effects models of causal ordering.

    Science.gov (United States)

    Marsh, Herbert W; Trautwein, Ulrich; Lüdtke, Oliver; Köller, Olaf; Baumert, Jürgen

    2005-01-01

    Reciprocal effects models of longitudinal data show that academic self-concept is both a cause and an effect of achievement. In this study this model was extended to juxtapose self-concept with academic interest. Based on longitudinal data from 2 nationally representative samples of German 7th-grade students (Study 1: N = 5,649, M age = 13.4; Study 2: N = 2,264, M age = 13.7 years), prior self-concept significantly affected subsequent math interest, school grades, and standardized test scores, whereas prior math interest had only a small effect on subsequent math self-concept. Despite stereotypic gender differences in means, linkages relating these constructs were invariant over gender. These results demonstrate the positive effects of academic self-concept on a variety of academic outcomes and integrate self-concept with the developmental motivation literature.

  4. Genetic analysis of somatic cell score in Norwegian cattle using random regression test-day models.

    Science.gov (United States)

    Odegård, J; Jensen, J; Klemetsdal, G; Madsen, P; Heringstad, B

    2003-12-01

    The dataset used in this analysis contained a total of 341,736 test-day observations of somatic cell scores from 77,110 primiparous daughters of 1965 Norwegian Cattle sires. Initial analyses, using simple random regression models without genetic effects, indicated that use of homogeneous residual variance was appropriate. Further analyses were carried out by use of a repeatability model and 12 random regression sire models. Legendre polynomials of varying order were used to model both permanent environmental and sire effects, as were the Wilmink function, the Lidauer-Mäntysaari function, and the Ali-Schaeffer function. For all these models, heritability estimates were lowest at the beginning (0.05 to 0.07) and higher at the end (0.09 to 0.12) of lactation. Genetic correlations between somatic cell scores early and late in lactation were moderate to high (0.38 to 0.71), whereas genetic correlations for adjacent DIM were near unity. Models were compared based on likelihood ratio tests, Bayesian information criterion, Akaike information criterion, residual variance, and predictive ability. Based on prediction of randomly excluded observations, models with 4 coefficients for permanent environmental effect were preferred over simpler models. More highly parameterized models did not substantially increase predictive ability. Evaluation of the different model selection criteria indicated that a reduced order of fit for sire effects was desirable. Models with zeroth- or first-order of fit for sire effects and higher order of fit for permanent environmental effects probably underestimated sire variance. The chosen model had Legendre polynomials with 3 coefficients for sire, and 4 coefficients for permanent environmental effects. For this model, trajectories of sire variance and heritability were similar assuming either homogeneous or heterogeneous residual variance structure.

  5. REPRODUCIBILITY OF THE MODIFIED STAR EXCURSION BALANCE TEST COMPOSITE AND SPECIFIC REACH DIRECTION SCORES

    Science.gov (United States)

    van Lieshout, Remko; Reijneveld, Elja A.E.; van den Berg, Sandra M.; Haerkens, Gijs M.; Koenders, Niek H.; de Leeuw, Arina J.; van Oorsouw, Roel G.; Paap, Davy; Scheffer, Else; Weterings, Stijn

    2016-01-01

    Background: The mSEBT is a screening tool used to evaluate dynamic balance. Most research investigating measurement properties focused on intrarater reliability and was done in small samples. To know whether the mSEBT is useful to discriminate dynamic balance between persons and to evaluate changes in dynamic balance, more research into intra- and interrater reliability and smallest detectable change (synonymous with minimal detectable change) is needed. Purpose: To estimate intra- and interrater reliability and smallest detectable change of the mSEBT in adults at risk for ankle sprain. Study Design: Cross-sectional, test-retest design. Methods: Fifty-five healthy young adults participating in sports at risk for ankle sprain participated (mean ± SD age, 24.0 ± 2.9 years). Each participant performed three test sessions within one hour and was rated by two physical therapists (session 1, rater 1; session 2, rater 2; session 3, rater 1). Participants and raters were blinded for previous measurements. Normalized composite and reach direction scores for the right and left leg were collected. Analysis of variance was used to calculate intraclass correlation coefficient values for intra- and interrater reliability. Smallest detectable change values were calculated based on the standard error of measurement. Results: Intra- and interrater reliability for both legs was good to excellent (intraclass correlation coefficient ranging from 0.87 to 0.94). The intrarater smallest detectable change for the composite score of the right leg was 7.2% and for the left 6.2%. The interrater smallest detectable change for the composite score of the right leg was 6.9% and for the left 5.0%. Conclusion: The mSEBT is a reliable measurement instrument to discriminate dynamic balance between persons. Most smallest detectable change values of the mSEBT appear to be large. More research is needed to investigate if the mSEBT is usable for evaluative purposes. Level of Evidence: Level 2
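    The abstract states only that smallest detectable change (SDC) values were derived from the standard error of measurement (SEM); it does not give the formulas. The sketch below assumes the commonly used definitions SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM, which may differ from the authors' exact calculation. The standard deviation and ICC passed in are hypothetical values, not data from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (assumed formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """SDC at the individual level: z * sqrt(2) * SEM (assumed formula).

    The sqrt(2) accounts for measurement error in both of the two
    measurements being compared; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs: SD of 5.0 percentage points and ICC of 0.91.
sem = standard_error_of_measurement(sd=5.0, icc=0.91)
sdc = smallest_detectable_change(sem)
print(round(sem, 2), round(sdc, 2))
```

    Under these definitions, even a high ICC can yield a sizeable SDC when between-subject variability is large, which is consistent with the abstract's observation that reliability was good while most SDC values appeared large.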

  6. The achievement impact of the inclusion model on the standardized test scores of general education students

    Science.gov (United States)

    Garrett-Rainey, Syrena

    The purpose of this study was to compare the achievement of general education students within regular education classes to the achievement of general education students in inclusion/co-teach classes to determine whether there was a significant difference in achievement between the two groups. The school district's inclusion/co-teach model included ongoing professional development support for teachers and administrators. General education teachers, special education teachers, and teacher assistants collaborated to develop instructional strategies to provide additional remediation to help students acquire the skills needed to master course content. This quantitative study reviewed the end-of-course test (EoCT) scores of Grade 10 physical science and math students within an urban school district. It is not known whether general education students in an inclusive/co-teach science or math course will demonstrate higher achievement on the EoCT in math or science than students not in an inclusive/co-teach classroom setting. In addition, this study sought to determine if students classified as low socioeconomic status benefited from participating in co-teaching classrooms as evidenced by standardized tests. Inferential statistics were used to determine whether there was a significant difference between the achievement of the treatment group (inclusion/co-teach) and the control group (non-inclusion/co-teach). The findings can be used to provide school districts with optional instructional strategies to implement in diverse modern classroom settings to increase academic performance on state standardized tests.

  7. Poisson Approximation-Based Score Test for Detecting Association of Rare Variants.

    Science.gov (United States)

    Fang, Hongyan; Zhang, Hong; Yang, Yaning

    2016-07-01

    Genome-wide association study (GWAS) has achieved great success in identifying genetic variants, but the nature of GWAS has determined its inherent limitations. Under the common disease rare variants (CDRV) hypothesis, the traditional association analysis methods commonly used in GWAS for common variants do not have enough power for detecting rare variants with a limited sample size. As a solution to this problem, pooling rare variants by their functions provides an efficient way of identifying susceptible genes. Rare variants typically have low frequencies of minor alleles, and the distribution of the total number of minor alleles of the rare variants can be approximated by a Poisson distribution. Based on this fact, we propose a new test method, the Poisson Approximation-based Score Test (PAST), for association analysis of rare variants. Two testing methods, namely, ePAST and mPAST, are proposed based on different strategies of pooling rare variants. Simulation results and application to the CRESCENDO cohort data show that our methods are more powerful than the existing methods.

  8. Should We Stop Looking for a Better Scoring Algorithm for Handling Implicit Association Test Data? Test of the Role of Errors, Extreme Latencies Treatment, Scoring Formula, and Practice Trials on Reliability and Validity.

    Directory of Open Access Journals (Sweden)

    Juliette Richetin

    Since the development of D scores for the Implicit Association Test, few studies have examined whether there is a better scoring method. In this contribution, we tested the effect of four relevant parameters for IAT data: the treatment of extreme latencies, the error treatment, the method for computing the IAT difference, and the distinction between practice and test critical trials. For some options of these different parameters, we included robust statistical methods that can provide viable alternative metrics to existing scoring algorithms, especially given the specificity of reaction time data. We thus elaborated 420 algorithms that result from the combination of all the different options and tested the main effect of the four parameters with robust statistical analyses as well as their interaction with the type of IAT (i.e., with or without built-in penalty included in the IAT procedure). From the results, we can elaborate some recommendations. A treatment of extreme latencies is preferable but only if it consists in replacing rather than eliminating them. Errors contain important information and should not be discarded. The D score still seems to be a good way to compute the difference, although the G score could be a good alternative, and finally it seems better not to compute the IAT difference separately for practice and test critical trials. From these recommendations, we propose to improve the traditional D scores with small yet effective modifications.

  9. The TSCA interagency testing committee's approaches to screening and scoring chemicals and chemical groups: 1977-1983

    Energy Technology Data Exchange (ETDEWEB)

    Walker, J.D. [Environmental Protection Agency, Washington, DC (United States)]

    1990-12-31

    This paper describes the TSCA interagency testing committee's (ITC) approaches to screening and scoring chemicals and chemical groups between 1977 and 1983. During this time the ITC conducted five scoring exercises to select chemicals and chemical groups for detailed review and to determine which of these chemicals and chemical groups should be added to the TSCA Section 4(e) Priority Testing List. 29 refs., 1 fig., 2 tabs.

  10. The Effect of Transient Students' Scores on the Norm of One High School's Standardized Basic Skills Test Battery.

    Science.gov (United States)

    Hill, Carolyn Stevens

    The effect of the standardized test scores of transient students on the mean of the 9th, 10th, and 11th grade standardized test scores of a school was studied. The groups used in the study were grades 9, 10, and 11 at a new high school in Clarksburg (West Virginia). The study was conducted in the spring of 1998. Groups consisted of 204 9th…

  11. Rugby versus Soccer in South Africa: Content Familiarity Contributes to Cross-Cultural Differences in Cognitive Test Scores

    Science.gov (United States)

    Malda, Maike; van de Vijver, Fons J. R.; Temane, Q. Michael

    2010-01-01

    In this study, cross-cultural differences in cognitive test scores are hypothesized to depend on a test's cultural complexity (Cultural Complexity Hypothesis: CCH), here conceptualized as its content familiarity, rather than on its cognitive complexity (Spearman's Hypothesis: SH). The content familiarity of tests assessing short-term memory,…

  13. Relationship between California Mastitis Test score and ultrasonographic teat measurements in dairy cows.

    Science.gov (United States)

    Seker, I; Risvanli, A; Yuksel, M; Saat, N; Ozmen, O

    2009-12-01

    The majority of published studies about mastitis are related to the control and prevention of mastitis, with particular emphasis on eliminating predisposition factors. The objective of the current study was to determine the role of teat morphology as an important factor in the aetiology of mastitis. Ultrasonographic measurements were taken from 190 teats from 100 dairy cows of different breeds. Mastitis in cows was diagnosed by the California Mastitis Test (CMT) and microbiological tests. The data were evaluated in the light of the clinical history of the animals. Main effects of breed on teat diameter at the position of the Furstenberg rosette (FTD) and teat cistern diameter (CD), that of age on FTD and overall teat diameter (OTD), and that of CMT score on CD and OTD were significant (P CMT-positive udder lobes than that in CMT-negative lobes. No difference was detected in canal length, CD, teat wall thickness, OTD or FTD between the CMT-positive and -negative lobes. The occurrence of mastitis could be related to specific ultrasonographic teat measurements (e.g. CD, OTD and FTD) and these may be important in the breeding of cows with a predisposition to mastitis, as well in the evaluation of in-herd cows in terms of udder/teat deformities.

  14. The effects of extended time on algebra test scores for college students with and without learning disabilities.

    Science.gov (United States)

    Alster, E H

    1997-01-01

    The purpose of this study was to assess the effects of extended time on the algebra test performance of community college students with and without learning disabilities. Forty-four students with learning disabilities and 44 students without learning disabilities attending five California community colleges participated in the study. The students each took an algebra test under timed conditions and a comparable test under extended-time conditions. The main results were that the students with learning disabilities scored significantly lower than the students without learning disabilities under timed conditions, the scores of the students with learning disabilities increased significantly with extended time, and the scores of the students with learning disabilities under extended-time conditions did not differ significantly from the timed or extended-time scores of the students without learning disabilities.

  15. Evaluation of Factors Affecting Continuous Performance Test Identical Pairs Version Score of Schizophrenic Patients in a Japanese Clinical Sample

    Directory of Open Access Journals (Sweden)

    Takayoshi Koide

    2012-01-01

    Aim. Cognitive impairment in schizophrenia strongly relates to social outcome and is a good candidate for endophenotypes. When we accurately measure drug efficacy or effects of genes or variants relevant to schizophrenia on cognitive impairment, clinical factors that can affect scores on cognitive tests, such as age and severity of symptoms, should be considered. To elucidate the effect of clinical factors, we conducted multiple regression analysis using scores of the Continuous Performance Test Identical Pairs Version (CPT-IP), which is often used to measure attention/vigilance in schizophrenia. Methods. We conducted the CPT-IP (4-4 digit) and examined clinical information (sex, age, education years, onset age, duration of illness, chlorpromazine-equivalent dose, and Positive and Negative Symptom Scale (PANSS) scores) in 126 schizophrenia patients in a Japanese population. Multiple regression analysis was used to evaluate the effect of clinical factors. Results. Age, chlorpromazine-equivalent dose, and PANSS-negative symptom score were associated with mean d′ score in patients. These three clinical factors explained about 28% of the variance in mean d′ score. Conclusions. In conclusion, the CPT-IP score in schizophrenia patients is influenced by age, chlorpromazine-equivalent dose, and PANSS-negative symptom score.

  16. Evaluation of factors affecting continuous performance test identical pairs version score of schizophrenic patients in a Japanese clinical sample.

    Science.gov (United States)

    Koide, Takayoshi; Aleksic, Branko; Kikuchi, Tsutomu; Banno, Masahiro; Kohmura, Kunihiro; Adachi, Yasunori; Kawano, Naoko; Iidaka, Tetsuya; Ozaki, Norio

    2012-01-01

    Aim. Cognitive impairment in schizophrenia strongly relates to social outcome and is a good candidate for endophenotypes. When we accurately measure drug efficacy or effects of genes or variants relevant to schizophrenia on cognitive impairment, clinical factors that can affect scores on cognitive tests, such as age and severity of symptoms, should be considered. To elucidate the effect of clinical factors, we conducted multiple regression analysis using scores of the Continuous Performance Test Identical Pairs Version (CPT-IP), which is often used to measure attention/vigilance in schizophrenia. Methods. We conducted the CPT-IP (4-4 digit) and examined clinical information (sex, age, education years, onset age, duration of illness, chlorpromazine-equivalent dose, and Positive and Negative Symptom Scale (PANSS) scores) in 126 schizophrenia patients in Japanese population. Multiple regression analysis was used to evaluate the effect of clinical factors. Results. Age, chlorpromazine-equivalent dose, and PANSS-negative symptom score were associated with mean d' score in patients. These three clinical factors explained about 28% of the variance in mean d' score. Conclusions. As conclusion, CPT-IP score in schizophrenia patients is influenced by age, chlorpromazine-equivalent dose and PANSS negative symptom score.

  17. The Effect of Computer-Based Self-Access Learning on Weekly Vocabulary Test Scores

    Directory of Open Access Journals (Sweden)

    Jordan Dreyer

    2014-09-01

    This study sets out to clarify the effectiveness of using an online vocabulary study tool, Quizlet, in an urban high school language arts class. Previous similar studies have mostly dealt with English Language Learners in college settings (Chui, 2013), and were therefore not directed at the issue of self-efficacy that is at the heart of the problem of urban high school students in America entering remedial writing programs (Rose, 1989). The study involves 95 students over the course of 14 weeks. Students were tested weekly and were asked to use the Quizlet program in their own free time. The result of this optional involvement was that many students did not participate in the treatment and therefore acted as an elective control group. The resultant data show a strong correlation between the use of an online vocabulary review program and short-term vocabulary retention. The study also showed that students who paced themselves and spread out their study sessions outperformed those students who used the program only for last-minute “cram sessions.” The implications of the study are that students who take advantage of tools outside of the classroom are able to outperform their peers. The results are also in line with the call to include technology in the Basic Writing classroom not simply as a tool, but as a “form of discourse” (Jonaitis, 2012). Weekly vocabulary tests, combined with the daily online activity as reported by Quizlet, show that: 1) utilizing the review software improved the scores of most students, 2) those students who used Quizlet to review more than a single time (i.e., several days before the test) outperformed those who only used the product once, and 3) students who professed proficiency with the “notebook” system of vocabulary learning appeared not to need the treatment.

  18. Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography.

    Science.gov (United States)

    Mallett, Susan; Halligan, Steve; Collins, Gary S; Altman, Doug G

    2014-01-01

    Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
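
    The contrast drawn in this abstract between the two kinds of measure can be illustrated on a toy example. The sketch below is not the study's analysis: the ten cases, the reader's binary calls, and the confidence scores are all made up, and the AUC is the simple empirical (Mann-Whitney) estimate rather than any fitted ROC model. It shows how zero scores for cases with no reported polyp create ties, and how a single confident false positive pulls the AUC down.

```python
def sensitivity_specificity(truth, calls):
    """Sensitivity and specificity of binary calls against a reference standard."""
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Empirical ROC AUC: P(score_pos > score_neg) + 0.5 * P(tie)."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties (e.g. two zero scores) count as half
    return wins / (len(pos) * len(neg))


# Made-up data: 4 cases with polyps, 6 without (1 = polyp present).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical reader: reports a polyp (1) or not (0) ...
calls = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
# ... and gives a confidence score, recorded as 0 when nothing was reported.
scores = [90, 80, 60, 0, 0, 0, 0, 0, 0, 95]

sens, spec = sensitivity_specificity(truth, calls)
auc = empirical_auc(truth, scores)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}, AUC = {auc:.3f}")
```

    In this toy case the reader misses one polyp and makes one false positive call, yet the AUC depends heavily on how confident that single false positive was and on how the tied zero scores are handled.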

  19. Precision Gains from Publically Available School Proficiency Measures Compared to Study-Collected Test Scores in Education Cluster-Randomized Trials. NCEE 2010-4003

    Science.gov (United States)

    Deke, John; Dragoset, Lisa; Moore, Ravaris

    2010-01-01

    In randomized controlled trials (RCTs) where the outcome is a student-level, study-collected test score, a particularly valuable piece of information is a study-collected baseline score from the same or similar test (a pre-test). Pre-test scores can be used to increase the precision of impact estimates, conduct subgroup analysis, and reduce bias…

  20. A Comparison of Special Education Teacher and Psychologist Scoring of the Bender Visual Motor Gestalt Test

    Science.gov (United States)

    Foster, Glen G.; And Others

    1976-01-01

    Ten special education teachers and two school psychologists scored the Bender-Gestalt protocols of elementary school children using the Koppitz scoring system. The reported correlations between teachers and school psychologists compared favorably to correlations between school psychologists as well as to interrater reliabilities reported in the…

  1. Standardised test protocol (Constant Score) for evaluation of functionality in patients with shoulder disorders

    DEFF Research Database (Denmark)

    Ban, Ilija; Troelsen, Anders; Christiansen, David Høyrup;

    2013-01-01

    INTRODUCTION: The Constant Score (CS), developed as a scoring system to evaluate overall functionality of patients with shoulder disorders, is widely used but has been criticised for relying on an imprecise terminology and for lack of a standardised methodology. A modified guideline was therefore...

  2. Effect of locomotion score on sows' performances in a feed reward collection test.

    Science.gov (United States)

    Bos, E-J; Nalon, E; Maes, D; Ampe, B; Buijs, S; van Riet, M M J; Millet, S; Janssens, G P J; Tuyttens, F A M

    2015-10-01

    Sows housed in groups have to move through their pen to fulfil their behavioural and physiological needs such as feeding and resting. In addition to causing pain and discomfort, lameness may restrict the ability of sows to fulfil such needs. The aim of our study was to investigate the extent to which the mobility of sows is affected by different degrees of lameness. Mobility was measured as the sow's willingness or capability to cover distances. Feed-restricted hybrid sows with different gait scores were subjected to a feed reward collection test in which they had to walk distances to obtain subsequent rewards. In all, 29 group-housed sows at similar gestation stage (day 96.6 ± 7 s.d.) were visually recorded for gait and classified as non-lame, mildly lame, moderately lame or severely lame. All sows received 2.6 kg of standard commercial gestation feed per day. The test arena consisted of two feeding locations separated from each other by a Y-shaped middle barrier. Feed rewards were presented at the two feeders in turn, using both light and sound cues to signal the availability of a new feed reward. Sows were individually trained during 5 non-consecutive days for 10 min/day with increasing barrier length (range: 0 to 3.5 m) each day. After training, sows were individually tested once per day on 3 non-consecutive days with the maximum barrier length such that they had to cover 9.3 m to walk from one feeder to the other. The outcome variable was the number of rewards collected in a 15-min time span. Non-lame and mildly lame sows obtained more rewards than moderately lame and severely lame sows (Psows (P=0.69), nor between moderately lame and severely lame sows (P=1.00). This feed reward collection test indicates that both moderately lame and severely lame sows are limited in their combined ability and willingness to walk, but did not reveal an effect of mild lameness on mobility. 
These findings suggest that moderately and more severely lame sows, but not mildly lame

  3. Test-retest reliability of the Advanced Psychodiagnostic Interpretation (API) scoring system for the Bender Gestalt in chronic schizophrenics.

    Science.gov (United States)

    Aucone, E J; Wagner, E E; Raphael, A J; Golden, C J; Espe-Pfeifer, P; Dornheim, L; Seldon, J; Pospisil, T; Proctor-Weber, Z; Calabria, M

    2001-09-01

    This study assesses the test-retest reliability of the revised Advanced Psychodiagnostic Interpretation (API) scoring system for the Bender Gestalt Test (BGT). The API system identifies 207 possible distortions in a BGT protocol. Test-retest reliability for 40 schizophrenic patients tested twice with a mean interval of 6.4 years (SD=3.8 years) was good, ranging from .71 to .80. Further reliability and validity studies are needed to further demonstrate the effectiveness of the system.

  4. Age-correction of test scores reduces the validity of mild cognitive impairment in predicting progression to dementia.

    Directory of Open Access Journals (Sweden)

    Johannes Hessler

    A phase of mild cognitive impairment (MCI) precedes most forms of neurodegenerative dementia. Many definitions of MCI recommend the use of test norms to diagnose cognitive impairment. It is, however, unclear whether the use of norms actually improves the detection of individuals at risk of dementia. Therefore, the effects of age- and education-norms on the validity of test scores in predicting progression to dementia were investigated. Baseline cognitive test scores (Syndrome Short Test) of dementia-free participants aged ≥65 were used to predict progression to dementia within three years. Participants were comprehensively examined one, two, and three years after baseline. Test scores were calculated with correction for (1) age and education, (2) education only, (3) age only and (4) without correction. Predictive validity was estimated with Cox proportional hazard regressions. Areas under the curve (AUCs) were calculated for the one-, two-, and three-year intervals. 82 (15.3%) of initially 537 participants developed dementia. Model coefficients, hazard ratios, and AUCs of all scores were significant (p<0.001). Predictive validity was the lowest with age-corrected scores (-2 log likelihood = 840.90, model fit χ2 (1) = 144.27, HR = 1.33, AUCs between 0.73 and 0.87) and the highest with education-corrected scores (-2 log likelihood = 815.80, model fit χ2 (1) = 171.16, HR = 1.34, AUCs between 0.85 and 0.88). The predictive validity of test scores is markedly reduced by age-correction. Therefore, definitions of MCI should not recommend the use of age-norms in order to improve the detection of individuals at risk of dementia.

  5. The Validity of Graduate Management Admission Test Scores: A Summary of Studies Conducted from 1997 to 2004

    Science.gov (United States)

    Talento-Miller, Eileen; Rudner, Lawrence M.

    2008-01-01

    The validity of Graduate Management Admission Test (GMAT) scores is examined by summarizing 273 studies conducted between 1997 and 2004. Each of the studies was conducted through the Validity Study Service of the test sponsor and contained identical variables and statistical methods. Validity coefficients from each of the studies were corrected…

  7. Language Learner Strategies and Linguistic Competence as Factors Affecting Achievement Test Scores in English for Specific Purposes

    Science.gov (United States)

    Jurkovic, Violeta

    2010-01-01

    The article examines the effect of two factors on achievement test scores in English as a foreign language for specific purposes in higher education: preexisting linguistic competence and frequency of use of language learner strategies. The rationale for the analysis of language learner strategies as a factor affecting achievement test outcomes is…

  8. Child Abuse: Its Relationship to Birthweight, Apgar Score, and Developmental Testing.

    Science.gov (United States)

    Goldson, Edward; And Others

    1978-01-01

    The relationship of child abuse to birthweight, five-minute Apgar score, and performance on the Bayley Scales of Infant Development was studied in 75 low socioeconomic infants (ages 2-30 months). Journal availability: see EC 111 042. (Author)

  9. Consistency and Raw Scores Survive Another Test: A Last Response to Prediger and His Colleagues

    Science.gov (United States)

    Holland, John L.

    1976-01-01

    Prediger confuses observations about the data with Holland's theoretical statement, performs some uninterpretable analyses, omits much relevant data, and provides an incomplete account of what psychometric authorities have said about raw scores in interest inventories. (Author)

  10. A physical function test for use in the intensive care unit: validity, responsiveness, and predictive utility of the physical function ICU test (scored).

    Science.gov (United States)

    Denehy, Linda; de Morton, Natalie A; Skinner, Elizabeth H; Edbrooke, Lara; Haines, Kimberley; Warrillow, Stephen; Berney, Sue

    2013-12-01

    Several tests have recently been developed to measure changes in patient strength and functional outcomes in the intensive care unit (ICU). The original Physical Function ICU Test (PFIT) demonstrates reliability and sensitivity. The aims of this study were to further develop the original PFIT, to derive an interval score (the PFIT-s), and to test the clinimetric properties of the PFIT-s. A nested cohort study was conducted. One hundred forty-four and 116 participants performed the PFIT at ICU admission and discharge, respectively. Original test components were modified using principal component analysis. Rasch analysis examined the unidimensionality of the PFIT, and an interval score was derived. Correlations tested validity, and multiple regression analyses investigated predictive ability. Responsiveness was assessed using the effect size index (ESI), and the minimal clinically important difference (MCID) was calculated. The shoulder lift component was removed. Unidimensionality of combined admission and discharge PFIT-s scores was confirmed. The PFIT-s displayed moderate convergent validity with the Timed "Up & Go" Test (r=-.60), the Six-Minute Walk Test (r=.41), and the Medical Research Council (MRC) sum score (rho=.49). The ESI of the PFIT-s was 0.82, and the MCID was 1.5 points (interval scale range=0-10). A higher admission PFIT-s score was predictive of: an MRC score of ≥48, increased likelihood of discharge home, reduced likelihood of discharge to inpatient rehabilitation, and reduced acute care hospital length of stay. Scoring of sit-to-stand assistance required is subjective, and cadence cutpoints used may not be generalizable. The PFIT-s is a safe and inexpensive test of physical function with high clinical utility. It is valid, responsive to change, and predictive of key outcomes. It is recommended that the PFIT-s be adopted to test physical function in the ICU.

  11. ETS Psychometric Contributions: Focus on Test Scores. Research Report. ETS RR-13-15. ETS R&D Scientific and Policy Contributions Series. ETS SPC-13-03

    Science.gov (United States)

    Moses, Tim

    2013-01-01

    The purpose of this report is to review ETS psychometric contributions that focus on test scores. Two major sections review contributions based on assessing test scores' measurement characteristics and other contributions about using test scores as predictors in correlational and regression relationships. An additional section reviews additional…

  12. Utility of a scoring balloon for a severely calcified lesion: bench test and finite element analysis.

    Science.gov (United States)

    Kawase, Yoshiaki; Saito, Naritatsu; Watanabe, Shin; Bao, Bingyuan; Yamamoto, Erika; Watanabe, Hiroki; Higami, Hirooki; Matsuo, Hitoshi; Ueno, Katsumi; Kimura, Takeshi

    2014-04-01

    We aimed to investigate the effectiveness of a scoring balloon catheter in expanding a circumferentially calcified lesion compared to a conventional balloon catheter using an in vitro experiment setting and elucidate the underlying mechanisms of this ability using a finite element analysis. True efficacy of the scoring device and the underlying mechanisms for heavily calcified coronary lesions are unclear. We employed a Scoreflex scoring balloon catheter (OrbusNeich, Hong Kong, China). The ability of Scoreflex to dilate a calcified lesion was compared with a conventional balloon catheter using 3 different sized calcium tubes. The thickness of the calcium tubes were 2.0, 2.25, and 2.5 mm. The primary endpoints were the successful induction of cracks in the calcium tubes and the inflation pressures required for inducing cracks. The inflation pressure required for cracking the calcium tubes were consistently lower with Scoreflex (p finite element analysis revealed that the first principal stress applied to the calcified plaque was higher by at least threefold when applying the balloon catheter with scoring elements. A scoring balloon catheter can expand a calcified lesion with lower pressure than that of a conventional balloon. The finite element analysis revealed that the concentration of the stress observed in the outside of the calcified plaque just opposite to the scoring element is the underlying mechanism of the increased ability of Scoreflex to dilate the calcified lesion.

  13. Longitudinal analysis of standardized test scores of students in the Science Writing Heuristic approach

    Science.gov (United States)

    Chanlen, Niphon

    The purpose of this study was to examine the longitudinal impacts of the Science Writing Heuristic (SWH) approach on student science achievement measured by the Iowa Test of Basic Skills (ITBS). A number of studies have reported positive impact of an inquiry-based instruction on student achievement, critical thinking skills, reasoning skills, attitude toward science, etc. So far, studies have focused on exploring how an intervention affects student achievement using teacher/researcher-generated measurement. Only a few studies have attempted to explore the long-term impacts of an intervention on student science achievement measured by standardized tests. The students' science and reading ITBS data was collected from 2000 to 2011 from a school district which had adopted the SWH approach as the main approach in science classrooms since 2002. The data consisted of 12,350 data points from 3,039 students. The multilevel model for change with discontinuity in elevation and slope technique was used to analyze changes in student science achievement growth trajectories prior and after adopting the SWH approach. The results showed that the SWH approach positively impacted students by initially raising science achievement scores. The initial impact was maintained and gradually increased when students were continuously exposed to the SWH approach. Disadvantaged students who were at risk of having low science achievement had bigger benefits from experience with the SWH approach. As a result, existing problematic achievement gaps were narrowed down. Moreover, students who started experience with the SWH approach as early as elementary school seemed to have better science achievement growth compared to students who started experiencing with the SWH approach only in high school. The results found in this study not only confirmed the positive impacts of the SWH approach on student achievement, but also demonstrated additive impacts found when students had longitudinal experiences

  14. Lead exposure and the 2010 achievement test scores of children in New York counties

    Directory of Open Access Journals (Sweden)

    Strayhorn Jillian C

    2012-01-01

    Background. Lead is toxic to cognitive and behavioral functioning in children even at levels well below those producing physical symptoms. Continuing efforts in the U.S. since about the 1970s to reduce lead exposure in children have dramatically reduced the incidence of elevated blood lead levels (with elevated levels defined by the current U.S. Centers for Disease Control threshold of 10 μg/dl). The current study examines how much lead toxicity continues to impair the academic achievement of children of New York State, using 2010 test data. Methods. This study relies on three sets of data published for the 57 New York counties outside New York City: school achievement data from the New York State Department of Education, data on incidence of elevated blood lead levels from the New York State Department of Health, and data on income from the U.S. Census Bureau. We studied third grade and eighth grade test scores in English Language Arts and mathematics. Using the county as the unit of analysis, we computed bivariate correlations and regression coefficients, with percent of children achieving at the lowest reported level as the dependent variable and the percent of preschoolers in the county with elevated blood lead levels as the independent variable. Then we repeated those analyses using partial correlations to control for possible confounding effects of family income, and using multiple regressions with income included. Results. The bivariate correlations between incidence of elevated lead and number of children in the lowest achievement group ranged between 0.38 and 0.47. The partial correlations ranged from 0.29 to 0.40. The regression coefficients, both bivariate and partial (both estimating the increase in percent of children in the lowest achievement group for every percent increase in the children with elevated blood lead levels), ranged from 0.52 to 1.31. All regression coefficients, when rounded to the nearest integer, were
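
    The step from bivariate to partial correlation described in this abstract can be sketched with the standard first-order formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The code below is only an illustration under stated assumptions: the four "counties" and all their values are made up, and the variable names are hypothetical. The toy data are constructed so that income fully accounts for the association between the other two variables, which makes the partial correlation vanish; the study itself reports partial correlations that remain positive.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Made-up values for four hypothetical counties (not data from the study).
income = [8, 7, 6, 5]               # e.g. median income, arbitrary units
lead_pct = [2, 1, 2, 5]             # % of preschoolers with elevated lead
low_score_pct = [12, 9, 16, 13]     # % of children in lowest achievement group

r_bivariate = pearson(lead_pct, low_score_pct)
r_partial = partial_corr(lead_pct, low_score_pct, income)
print(f"bivariate r = {r_bivariate:.3f}, partial r (income held fixed) = {r_partial:.3f}")
```

    Comparing the two numbers shows how much of a raw association is attributable to the controlled variable; a partial correlation that stays well above zero, as in the abstract, indicates an association that income alone does not explain.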

  15. A Generalized Approach to the Two Sample Problem: The Quantile Approach.

    Science.gov (United States)

    1981-04-01

    Tests for the Two Sample Problem and Their Power," I, II, III, Indagationes Math., 14, 453-458, 15, 303-310, 15, 80. Wald, A. and Wolfowitz, J. (1940...where 0 < p < q < 1 or use p,q an inner product based on the censored observations. Other directions to go include the Wald and Wolfowitz (1940) runs

  16. Likelihood ratio and score tests to test the non-inferiority (or equivalence) of the odds ratio in a crossover study with binary outcomes.

    Science.gov (United States)

    Li, Xiaochun; Li, Huilin; Jin, Man; Goldberg, Judith D

    2016-09-10

    We consider the non-inferiority (or equivalence) test of the odds ratio (OR) in a crossover study with binary outcomes to evaluate the treatment effects of two drugs. To solve this problem, Lui and Chang (2011) proposed both an asymptotic method and a conditional method based on a random effects logit model. Kenward and Jones (1987) proposed a likelihood ratio test (LRTM) based on a log linear model. These existing methods are all subject to model misspecification. In this paper, we propose a likelihood ratio test (LRT) and a score test that are independent of model specification. Monte Carlo simulation studies show that, in scenarios considered in this paper, both the LRT and the score test have higher power than the asymptotic and conditional methods for the non-inferiority test; the LRT, score, and asymptotic methods have similar power, and they all have higher power than the conditional method for the equivalence test. When data can be well described by a log linear model, the LRTM has the highest power among all the five methods (LRTM, LRT, score, asymptotic, and conditional) for both non-inferiority and equivalence tests. However, in scenarios for which a log linear model does not describe the data well, the LRTM has the lowest power for the non-inferiority test and has inflated type I error rates for the equivalence test. We provide an example from a clinical trial that illustrates our methods. Copyright © 2016 John Wiley & Sons, Ltd.

  17. The Health Professions Admission Test (HPAT) score and leaving certificate results can independently predict academic performance in medical school: do we need both tests?

    LENUS (Irish Health Repository)

    Halpenny, D

    2010-11-01

    A recent study raised concerns regarding the ability of the health professions admission test (HPAT) Ireland to improve the selection process in Irish medical schools. We aimed to establish whether performance in a mock HPAT correlated with academic success in medicine. A modified HPAT examination and a questionnaire were administered to a group of doctors and medical students. There was a significant correlation between HPAT score and college results (r2: 0.314, P = 0.018, Spearman Rank) and between leaving cert score and college results (r2: 0.306, P = 0.049, Spearman Rank). There was no correlation between leaving cert points score and HPAT score. There was no difference in HPAT score across a number of other variables including gender, age and medical speciality. Our results suggest that both the HPAT Ireland and the leaving certificate examination could act as independent predictors of academic achievement in medicine.

  18. Predictive validity of the UPDRS postural stability score and the Functional Reach Test, when compared with ecologically valid reaching tasks.

    Science.gov (United States)

    Jenkins, M E; Johnson, A M; Holmes, J D; Stephenson, F F; Spaulding, S J

    2010-07-01

    Balance problems and falls are a common concern among individuals with Parkinson's disease (PD). Falls frequently occur during daily activities such as reaching into cupboards in the kitchen or bathroom. This study compared the correlation among two standard postural stability tests - the postural stability score on the Unified Parkinson's Disease Rating Scale (UPDRS) and the Functional Reach Test (FRT) - and ecologically valid reaching tasks that correspond to reaching at different cupboard heights among 20 individuals with PD and 20 age-matched controls. Both the FRT and the UPDRS postural stability tests are quick measures that can be performed during the clinical examination. The FRT, but not the postural stability score, demonstrated a significant correlation with the ecologically valid reaching tasks, among individuals with PD. Furthermore the FRT scores did not correlate with the UPDRS postural stability scores, indicating that these are measuring different aspects of balance. This study suggests that the FRT score may better predict the risk of postural instability encountered during daily activities among individuals with PD.

  19. Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography.

    Directory of Open Access Journals (Sweden)

    Susan Mallett

    BACKGROUND: Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC), for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. METHODS: In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. RESULTS: Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. CONCLUSIONS: The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
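
    The two approaches compared in the record above can be illustrated with a minimal sketch. It assumes a binary reference standard, a binary reader call, and a 0-100 confidence score per case; the data values are invented for illustration and are not taken from the study. The AUC here is the plain empirical (Mann-Whitney) estimate, not any of the fitted ROC methods the authors examined.

```python
def sensitivity_specificity(truth, calls):
    """Sensitivity and specificity of binary reader calls against a reference standard."""
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    fn = sum(1 for t, c in zip(truth, calls) if t and not c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    fp = sum(1 for t, c in zip(truth, calls) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(truth, scores):
    """Empirical ROC AUC: share of (positive, negative) pairs ranked correctly, ties count 0.5."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = polyp present according to the reference standard.
truth = [1, 1, 1, 0, 0, 0]
calls = [1, 1, 0, 0, 0, 1]        # reader's present/absent report
scores = [90, 60, 0, 0, 0, 30]    # reader's confidence; zero when nothing was seen

sens, spec = sensitivity_specificity(truth, calls)
auc = roc_auc(truth, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

    In this toy data the missed polyp and two of the true negatives all carry a score of zero, so those pairs are ties and the AUC depends on how ties are counted. That is one of the difficulties with zero scores that the abstract lists.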

  20. Classifying and scoring of molecules with the NGN: new datasets, significance tests, and generalization

    Directory of Open Access Journals (Sweden)

    Cameron Christopher JF

    2010-10-01

    Abstract: This paper demonstrates how a Neural Grammar Network learns to classify and score molecules for a variety of tasks in chemistry and toxicology. In addition to a more detailed analysis on datasets previously studied, we introduce three new datasets (BBB, FXa, and toxicology) to show the generality of the approach. A new experimental methodology is developed and applied to both the new datasets as well as previously studied datasets. This methodology is rigorous and statistically grounded, and ultimately culminates in a Wilcoxon significance test that proves the effectiveness of the system. We further include a complete generalization of the specific technique to arbitrary grammars and datasets using a mathematical abstraction that allows researchers in different domains to apply the method to their own work. Background: Our work can be viewed as an alternative to existing methods to solve the quantitative structure-activity relationship (QSAR) problem. To this end, we review a number of approaches both from a methodological and also a performance perspective. In addition to these approaches, we also examined a number of chemical properties that can be used by generic classifier systems, such as feed-forward artificial neural networks. In studying these approaches, we identified a set of interesting benchmark problem sets to which many of the above approaches had been applied. These included: ACE, AChE, AR, BBB, BZR, Cox2, DHFR, ER, FXa, GPB, Therm, and Thr. Finally, we developed our own benchmark set by collecting data on toxicology. Results: Our results show that our system performs better than, or comparatively to, the existing methods over a broad range of problem types. Our method does not require the expert knowledge that is necessary to apply the other methods to novel problems. Conclusions: We conclude that our success is due to the ability of our system to: (1) encode molecules losslessly before presentation to the learning system, and (2)

  1. Comparative testing of reliability and audit utility of ordinal objective calculus complexity scores. Can we make an informed choice yet?

    Science.gov (United States)

    Jaipuria, Jiten; Suryavanshi, Manav; Sen, Tridib K

    2016-12-01

    To assess the reliability of the Guy's Stone Score, the Seoul National University Renal Stone Complexity (S-ReSC) score and the S.T.O.N.E. scores in percutaneous nephrolithotomy (PCNL), and assess their utility in discriminating outcomes [stone free rate (SFR), complications, need for multiple PCNL sessions, and auxiliary procedures] valid across parameters of experience of surgeon, independence from surgical approach, and variations in institution-specific instrumentation. A prospectively maintained database of two tertiary institutions was analysed (606 cases). Institutes differed in instrumentation, while the overall surgical team comprised: two trainees (experience 1000 cases). Scores were assigned and re-assigned after 4 months by one trainee and an expert surgeon. Inter-rater and test-retest agreement were analysed by Cohen's κ and intraclass correlation coefficient. Multivariate logistic regression models were created adjusting outcomes for the institution, comorbidity, Amplatz size, access tract location, the number of punctures, the experience level of the surgeon, and individual scoring system, and receiver operating curves were analysed for comparison. Despite some areas of inconsistencies, individually all scores had excellent inter-rater and test-retest concordance. On multivariable analyses, while the experience of the surgeon and surgical approach characteristics (such as access tract location, Amplatz size, and number of punctures) remained independently associated with different outcomes in varying combinations, calculus complexity scores were found consistently to be independently associated with all outcomes. The S-ReSC score had a superior association with SFR, the need for multiple PCNL sessions, and auxiliary procedures. Individually all scoring systems performed well. 
    On cross comparison, the S-ReSC score consistently emerged as more strongly associated with all outcomes, signifying the importance of the distributional complexity of the
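
    The inter-rater agreement statistic named in the record above, Cohen's κ, can be sketched as follows. This is the standard unweighted formula for two raters assigning categorical grades; the rater data are hypothetical, and the study may have used a weighted variant for ordinal scores.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over grades.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a.keys() | count_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical grade throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical stone-complexity grades from a trainee and an expert.
trainee = [1, 1, 2, 2]
expert = [1, 1, 2, 1]
print(cohens_kappa(trainee, expert))
```

    Here the raters agree on 3 of 4 cases (0.75) against a chance agreement of 0.5, which gives κ = 0.5.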

  2. Association testing for next-generation sequencing data using score statistics

    DEFF Research Database (Denmark)

    Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders

    2012-01-01

    of genotype calls into account have been proposed; most require numerical optimization which for large-scale data is not always computationally feasible. We show that using a score statistic for the joint likelihood of observed phenotypes and observed sequencing data provides an attractive approach...... computationally feasible due to the use of score statistics. As part of the joint likelihood, we model the distribution of the phenotypes using a generalized linear model framework, which works for both quantitative and discrete phenotypes. Thus, the method presented here is applicable to case-control studies...

  3. Adjustment of cognitive scores with a co-normed estimate of premorbid intelligence: implementation using mindstreams computerized testing.

    Science.gov (United States)

    Doniger, Glen M; Simon, Ely S; Schweiger, Avraham

    2008-01-01

    Neuropsychological assessment is critically dependent upon comparison to a standard normative database. While generally appropriate for individuals of near-average intelligence, high-intelligence individuals may be erroneously scored as unimpaired and low-intelligence individuals as impaired on cognitive measures. The current paper describes an approach for minimizing such misclassifications that is standardized and practical for clinical use. A computerized test of nonverbal reasoning co-normed with cognitive measures is used for automatic adjustment of normalized cognitive scores. This premorbid estimate showed good construct validity, and adjustment raised cognitive scores for low-intelligence individuals, and lowered cognitive scores for high-intelligence individuals similarly across demographic (age, education, computer experience) and clinical (cognitively healthy, mild cognitive impairment, dementia) subgroups. Adjustment was typically up to three normalized units for scores on the premorbid estimate of +/-1 SD and 6 normalized units for scores of +/-2 SD. The present approach shows promise as a practical solution for assessment of high- and low-intelligence individuals.

  4. Predictive validity of the classroom strategies scale-observer form on statewide testing scores: an initial investigation.

    Science.gov (United States)

    Reddy, Linda A; Fabiano, Gregory A; Dudek, Christopher M; Hsu, Louis

    2013-12-01

    The present study examined the validity of a teacher observation measure, the Classroom Strategies Scale--Observer Form (CSS), as a predictor of student performance on statewide tests of mathematics and English language arts. The CSS is a teacher practice observational measure that assesses evidence-based instructional and behavioral management practices in elementary school. A series of two-level hierarchical generalized linear models were fitted to data of a sample of 662 third- through fifth-grade students to assess whether CSS Part 2 Instructional Strategy and Behavioral Management Strategy scale discrepancy scores (i.e., ∑ |recommended frequency--frequency ratings|) predicted statewide mathematics and English language arts proficiency scores when percentage of minority students in schools was controlled. Results indicated that the Instructional Strategy scale discrepancy scores significantly predicted mathematics and English language arts proficiency scores: Relatively larger discrepancies on observer ratings of what teachers did versus what should have been done were associated with lower proficiency scores. Results offer initial evidence of the predictive validity of the CSS Part 2 Instructional Strategy discrepancy scores on student academic outcomes.
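
    The discrepancy score defined in the record above, ∑ |recommended frequency - frequency rating|, is a sum of absolute differences over the rated strategies. A minimal sketch follows; the function name, the rating scale and the example ratings are assumptions for illustration, not the CSS instrument itself.

```python
def discrepancy_score(recommended, observed):
    """Sum of absolute gaps between recommended and observed frequency ratings."""
    if len(recommended) != len(observed):
        raise ValueError("each strategy needs both a recommended and an observed rating")
    return sum(abs(r - o) for r, o in zip(recommended, observed))


# Hypothetical ratings for three strategies on an arbitrary frequency scale.
recommended = [7, 6, 5]
observed = [5, 6, 7]
print(discrepancy_score(recommended, observed))  # |7-5| + |6-6| + |5-7| = 4
```

    Because absolute values are used, doing a strategy too often and too rarely both raise the score, consistent with the abstract's reading that larger discrepancies were associated with lower proficiency scores.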

  5. Contributions of Selected Perinatal Variables to Seven-Year Psychological and Achievement Test Scores.

    Science.gov (United States)

    Henderson, N. B.; And Others

    Perinatal variables were used to predict 7-year outcome for 538 children, 32% Negro and 68% white. Mother's age, birthplace, education, occupation, marital status, neuropsychiatric status, family income, number supported, birth weight, one- and five-minute Apgar scores were regressed on 7-year Verbal, Performance and Full Scale IQ, Bender, Wide…

  6. Resolving Differences among Methods of Establishing Confidence Limits for Test Scores.

    Science.gov (United States)

    Glutting, Joseph J.; And Others

    1987-01-01

    This paper discusses the basic theory underlying confidence limits and presents reasons why psychologists should incorporate confidence ranges in their psychodiagnostic reports. Four methods for establishing confidence limits are compared. Three of the methods involve estimated true scores, and the fourth is the standard error of measurement…
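
    The record above does not spell out its four methods, so the following is only a sketch of two textbook quantities from classical test theory that it names: the standard error of measurement, SEM = SD × sqrt(1 - reliability), and the regression-based estimated true score. The function names and the example values (SD 15, reliability 0.91) are assumptions for illustration.

```python
import math


def sem_confidence_limits(observed, sd, reliability, z=1.96):
    """Confidence limits centred on the observed score, using the standard error of measurement."""
    sem = sd * math.sqrt(1.0 - reliability)
    return observed - z * sem, observed + z * sem


def estimated_true_score(observed, mean, reliability):
    """Estimated true score: the observed score regressed toward the group mean."""
    return mean + reliability * (observed - mean)


# Hypothetical IQ-style scale: mean 100, SD 15, reliability 0.91, so SEM = 15 * 0.3 = 4.5.
low, high = sem_confidence_limits(observed=100, sd=15, reliability=0.91)
print(f"95% limits: {low:.2f} to {high:.2f}")
print(f"estimated true score for 120: {estimated_true_score(120, 100, 0.91):.1f}")
```

    With these values the limits fall about 8.8 points either side of the observed score, and an observed score of 120 regresses to an estimated true score of 118.2.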

  7. Effects of Public Preschool Expenditures on the Test Scores of Fourth Graders: Evidence from TIMSS

    Science.gov (United States)

    Waldfogel, Jane; Zhai, Fuhua

    2008-01-01

    This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in 7 Organisation for Economic Co-operation and Development (OECD) countries--Australia, Japan, the…

  8. Out-of-School Time Program Test Score Impact for Black Children of Single-Parents

    Science.gov (United States)

    Nagle, Barry T.

    2013-01-01

    Out-of-School Time programs and their impact on standardized college entrance exam scores for black or African-American children of single parents who have applied for a competitive college scholarship program is the study focus. Study importance is supported by the large percentage of black children raised by single parents, the large percentage…

  9. Teacher Empathy and Its Relationship to the Standardized Test Scores of Diverse Secondary English Students

    Science.gov (United States)

    Bostic, Timothy B.

    2014-01-01

    The purpose of this research study was to ascertain whether there is a relationship between teachers' cognitive role taking aspect of empathy and the Virginia Standards of Learning (VSOL), English/Reading scores of their students. A correlational research design using hierarchical multiple regression was used to look for this relationship. In…

  10. 76 FR 16350 - Medical Devices; Ovarian Adnexal Mass Assessment Score Test System; Labeling; Black Box Restrictions

    Science.gov (United States)

    2011-03-23

    ... received may be posted without change to http://www.regulations.gov , including any personal information... combines the values into a single score that is then used to determine the likelihood that the pre-surgical... benefits (including potential economic, environmental, public health and safety, and other advantages...

  11. Reliability and validity test of a Scoring Rubric for Information Literacy

    NARCIS (Netherlands)

    A.A.J. (Jos) van Helvoort; Frank Huysmans; Saskia Brand-Gruwel; Ellen Sjoer

    2017-01-01

    Purpose: The main purpose of the research was to measure reliability and validity of the Scoring Rubric for Information Literacy (Van Helvoort, 2010). Design/methodology/approach: Percentages of agreement and Intraclass Correlation were used to describe interrater reliability. For the determination

  14. Automated Scoring for the "TOEFL Junior"® Comprehensive Writing and Speaking Test. Research Report. ETS RR-15-09

    Science.gov (United States)

    Evanini, Keelan; Heilman, Michael; Wang, Xinhao; Blanchard, Daniel

    2015-01-01

    This report describes the initial automated scoring results that were obtained using the constructed responses from the Writing and Speaking sections of the pilot forms of the "TOEFL Junior"® Comprehensive test administered in late 2011. For all of the items except one (the edit item in the Writing section), existing automated scoring…

  15. The Franck Test for Gender Identity: Correlation with Occupation and Long-Term Stability of Score in Normal Men.

    Science.gov (United States)

    Berg, Roland

    1985-01-01

    A correlation was found between high (independent) occupational positions and masculine scores on the Franck Drawing Completion Test (FDCT). Acceptable individual long-term stability was also evident. The FDCT appears to be useful in assessing gender identity. (Author/ABB)

  16. Differential Predictive Validity of High School GPA and College Entrance Test Scores for University Students in Yemen

    Science.gov (United States)

    Al-Hattami, Abdulghani Ali Dawod

    2012-01-01

    High school grade point average and college entrance test scores are two admission criteria that are currently used by most colleges in Yemen to select their prospective students. Given their widespread use, it is important to investigate their predictive validity to ensure the accuracy of the admission decisions in these institutions. This study…

  18. Legal Issues in the Use of Student Test Scores and Value-Added Models (VAM) to Determine Educational Quality

    Science.gov (United States)

    Pullin, Diana

    2013-01-01

    A growing number of states and local schools across the country have adopted educator evaluation and accountability programs based on the use of student test scores and value-added models (VAM). A wide array of potential legal issues could arise from the implementation of these programs. This article uses legal analysis and social science evidence…

  19. Predicting Pre-Service Classroom Teachers' Civil Servant Recruitment Examination's Educational Sciences Test Scores Using Artificial Neural Networks

    Science.gov (United States)

    Demir, Metin

    2015-01-01

    This study predicts the number of correct answers given by pre-service classroom teachers in Civil Servant Recruitment Examination's (CSRE) educational sciences test based on their high school grade point averages, university entrance scores, and grades (mid-term and final exams) from their undergraduate educational courses. This study was…

  20. The Score Reliability of Draw-a-Person Intellectual Ability Test (DAP: IQ) for Rural Malawi Students

    Science.gov (United States)

    Khasu, Denis S.; Williams, Thomas O., Jr.

    2016-01-01

    In this brief article, the reliability of scores for the Draw-A-Person Intellectual Ability Test for Children, Adolescents, and Adults (DAP: IQ; Reynolds & Hickman, 2004) was examined through several analyses with a sample of 147 children from rural Malawi, Africa using a Chichewa translation of instructions. Cronbach alpha coefficients for…

  1. Quality Control for Scoring Tests Administered in Continuous Mode: An NCME Instructional Module

    Science.gov (United States)

    Allalouf, Avi; Gutentag, Tony; Baumer, Michal

    2017-01-01

    Quality control (QC) in testing is paramount. QC procedures for tests can be divided into two types. The first type, one that has been well researched, is QC for tests administered to large population groups on few administration dates using a small set of test forms (e.g., large-scale assessment). The second type is QC for tests, usually…

  2. Emotional intelligence as ability : assessing the construct validity of scores from the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT)

    OpenAIRE

    Føllesdal, Hallvard

    2008-01-01

    This thesis presents the results from three papers assessing the validity of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT; Mayer, Salovey, & Caruso, 2002). The MSCEIT is the only performance test measuring the entire four-branch ability model of EI (Mayer & Salovey, 1997; Salovey & Mayer, 1990). Previous studies have reported low reliability coefficients for the branch scores for MSCEIT and reliability estimates vary greatly from study to study. The reported reliability coe...

  3. A comparison of Mallampati scoring, upper lip bite test and sternomental distance in predicting difficult intubation

    Directory of Open Access Journals (Sweden)

    Arun Varghese

    2016-07-01

    Conclusions: The high specificity, NPV, PPV and accuracy of sternomental distance compared to other tests make it the single best test in predicting difficult intubation. However, a combination of all three tests was found to be more sensitive and had higher discriminative power compared to any single test alone. [Int J Res Med Sci 2016; 4(7): 2645-2648]

  4. Correlates of Children's Eating Attitude Test scores among primary school children.

    Science.gov (United States)

    Shariff, Zalilah Mohd; Yasin, Zaidah Mohamed

    2005-04-01

    A total of 107 Malay primary school girls (8-9 yr. old) completed a set of measurements on eating behavior (ChEAT, food neophobia scales, and dieting experience), the Rosenberg Self-Esteem Scale, body shape satisfaction, dietary intake, weight, and height. About 38% of the girls scored 20 and more on the ChEAT, and 46% of them reported dieting by reducing sugar and sweets (73%), skipping meals (67%), reducing fat foods (60%) and snacks (53%) as the most frequent methods practiced. In general, those girls with higher ChEAT scores tended to have lower self-esteem (r=.39), indicating they were more unwilling to try new foods (food neophobic) (r=.29), chose a smaller figure for desired body size (r=-.25), and were more dissatisfied with their body size (r=.31).

  5. A multiparametric clinical and echocardiographic score to risk stratify patients with chronic systolic heart failure: derivation and testing.

    Science.gov (United States)

    Fontanive, Paolo; Miccoli, Mario; Simioniuc, Anca; Angelillis, Marco; Di Bello, Vitantonio; Baggiani, Angelo; Bongiorni, Maria Grazia; Marzilli, Mario; Dini, Frank Lloyd

    2013-11-01

    Although echo Doppler and biomarkers are the most common examinations performed worldwide in heart failure (HF), they are rarely considered in risk scores. In outpatients with chronic HF and left ventricular ejection fraction (LVEF) ≤45%, data on clinical status, echo Doppler variables, aminoterminal pro-type B natriuretic peptide (NT-proBNP), estimated glomerular filtration rate (eGFR), and drug therapies were combined to build up a multiparametric score. We randomly selected 250 patients to produce a derivation cohort and 388 patients were used as a testing cohort. Follow-up lasted 29 ± 23 months. The univariable predictors that entered into the multivariable Cox model were as follows: furosemide daily dose >25 mg, inability to tolerate angiotensin converting enzyme (ACE) inhibitors, inability to tolerate β-blockers, age >75 years, New York Heart Association (NYHA) >2, eGFR96 mL/m(2) , moderate-to-severe mitral regurgitation (MR) and LVEF derivation cohort (68.4% sensitivity, 79.5% specificity, area under the curve [AUC] 78.7%) or in the testing cohort (73.7% sensitivity, 71.3% specificity, AUC 77.2%). All-cause mortality significantly increased with increasing score both in the derivation and in the testing cohort (P < 0.0001). In conclusion, this multiparametric score is able to predict mortality in chronic systolic HF.

  6. Effect of seat height and turning direction on the timed up and go test scores of people after stroke.

    Science.gov (United States)

    Heung, Thomas H M; Ng, Shamay S M

    2009-09-01

    To identify the effect of chair seat height and turning direction on the Timed Up and Go scores of patients after stroke. A cross-sectional study. A geriatric day hospital in Hong Kong. Twenty-five patients with sub-acute stroke. The time taken to complete the Timed Up and Go test with various chair seat heights (65%, 90% and 115% of each subject's leg length - distance from lateral knee joint line to ground in sitting) and turning directions (toward the affected and unaffected side) was recorded using a stopwatch with randomized test order. There were significant differences in Timed Up and Go scores between the 3 levels of chair seat height (p Timed Up and Go scores recorded when the seat height was 115% of the subject's leg length and the highest at a seat height of 65% of the subject's leg length. Turning toward the affected side was found to be significantly quicker than turning toward the unaffected side (p Timed Up and Go scores of patients after sub-acute stroke. Optimizing chair seat height with reference to subject's leg length and turning direction is essential when using the Timed Up and Go test as an outcome measure in stroke rehabilitation.

  7. Zero Calcium Score as a Filter for Further Testing in Patients Admitted to the Coronary Care Unit with Chest Pain.

    Science.gov (United States)

    Correia, Luis Cláudio Lemos; Esteves, Fábio P; Carvalhal, Manuela; Souza, Thiago Menezes Barbosa de; Sá, Nicole de; Correia, Vitor Calixto de Almeida; Alexandre, Felipe Kalil Beirão; Lopes, Fernanda; Ferreira, Felipe; Noya-Rabelo, Márcia

    2017-06-12

    The accuracy of zero coronary calcium score as a filter in patients with chest pain has been demonstrated at the emergency room and outpatient clinics, populations with low prevalence of coronary artery disease (CAD). To test the gatekeeping role of zero calcium score in patients with chest pain admitted to the coronary care unit (CCU), where the pretest probability of CAD is higher than that of other populations. Patients underwent computed tomography for calcium scoring, and obstructive CAD was defined by a minimum 70% stenosis on invasive angiography. In 146 patients studied, the prevalence of CAD was 41%. A zero calcium score was present in 35% of the patients. The sensitivity and specificity of zero calcium score yielded a negative likelihood ratio of 0.16. After logistic regression adjustment for pretest probability, zero calcium score was independently associated with lower odds of CAD (OR = 0.12, 95%CI = 0.04-0.36), increasing the area under the ROC curve of the clinical model from 0.76 to 0.82 (p = 0.006). Zero calcium score provided a net reclassification improvement of 0.20 (p = 0.0018) over the clinical model when using a pretest probability threshold of 10% for discharging without further testing. In patients with pretest probability...
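
    The figures reported in the record above (41% prevalence, negative likelihood ratio 0.16) can be combined into a post-test probability with the standard odds form of Bayes' rule. The sketch below uses the overall prevalence as the pretest probability, which is an assumption made for illustration; the study itself adjusted for individual pretest probability by logistic regression.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds times likelihood ratio."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Values taken from the abstract: prevalence of CAD 41%, negative LR of a zero calcium score 0.16.
p = post_test_probability(0.41, 0.16)
print(f"probability of CAD after a zero calcium score: {p:.3f}")
```

    Under that assumption the probability of obstructive CAD after a zero score comes out at about 0.10, that is, a negative predictive value of roughly 90% at the cohort's prevalence.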

  8. Analysis of WISC-III, Stanford-Binet:IV, and academic achievement test scores in children with autism.

    Science.gov (United States)

    Mayes, Susan Dickerson; Calhoun, Susan L

    2003-06-01

    Nonverbal IQs were greater than verbal IQs for young children (3-7 years of age) on the Stanford-Binet:IV (n = 53). However, WISC-III verbal and nonverbal IQs were similar for older children, 6-15 years of age (n = 63). Stanford-Binet:IV profiles were generally consistent for the low-IQ (<80) and high-IQ (≥80) groups, with high scores on visual matching tests (Bead Memory and Quantitative Reasoning). The low- and high-WISC-III IQ groups both performed well relative to IQ on tests of lexical knowledge (Similarities, Information, and Vocabulary), but not on language comprehension and social reasoning (Comprehension). The low-IQ group did best on visuo-motor subtests (Object Assembly and Block Design), but the high-IQ group did not. The high-IQ group had significantly low scores on the Digit Span, Arithmetic, Coding, VMI, and WIAT Written Expression tests, suggesting attention and writing weaknesses.

  9. Timed up & go test score in patients with hip fracture is related to the type of walking aid

    DEFF Research Database (Denmark)

    Kristensen, Morten T; Bandholm, Thomas; Holm, Bente

    2009-01-01

    Kristensen MT, Bandholm T, Holm B, Ekdahl C, Kehlet H. Timed Up & Go test score in patients with hip fracture is related to the type of walking aid. OBJECTIVE: To determine the relationship between Timed Up & Go (TUG) test scores and type of walking aid used during the test, and to determine...... the feasibility of using the rollator as a standardized walking aid during the TUG in patients with hip fracture who were allowed full weight-bearing (FWB). DESIGN: Prospective methodological study. SETTING: An acute orthopedic hip fracture unit at a university hospital. PARTICIPANTS: Patients (N=126; 90 women......, 36 men) with hip fracture with a mean age +/- SD of 74.8+/-12.7 years performed the TUG the day before discharge from the orthopedic ward. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: The TUG was performed with the walking aid the patient was to be discharged with: a walker (n=88) or elbow...

  10. A Note on the Effect on Power of Score Tests via Dimension Reduction by Penalized Regression under the Null*

    Science.gov (United States)

    Martinez, Josue G.; Carroll, Raymond J; Muller, Samuel; Sampson, Joshua N.; Chatterjee, Nilanjan

    2010-01-01

    We consider the problem of score testing for certain low dimensional parameters of interest in a model that could include finite but high dimensional secondary covariates and associated nuisance parameters. We investigate the possibility of the potential gain in power by reducing the dimensionality of the secondary variables via oracle estimators such as the Adaptive Lasso. As an application, we use a recently developed framework for score tests of association of a disease outcome with an exposure of interest in the presence of a possible interaction of the exposure with other co-factors of the model. We derive the local power of such tests and show that if the primary and secondary predictors are independent, then having an oracle estimator does not improve the local power of the score test. Conversely, if they are dependent, there is the potential for power gain. Simulations are used to validate the theoretical results and explore the extent of correlation needed between the primary and secondary covariates to observe an improvement of the power of the test by using the oracle estimator. Our conclusions are likely to hold more generally beyond the model of interactions considered here. PMID:20405045

  11. Testing an OMERACT MRI scoring system for peripheral psoriatic arthritis in cross-sectional and longitudinal settings

    DEFF Research Database (Denmark)

    McQueen, Fiona; Lassere, Marissa; Duer-Jensen, Anne

    2009-01-01

    OBJECTIVE: Magnetic resonance imaging (MRI) is increasingly used to measure articular inflammation and damage in patients with psoriatic arthritis (PsA). We evaluated the reliability of a new OMERACT PsA MRI scoring system, PsAMRIS, in PsA fingers. METHODS: In 2 separate studies, MRI scans were obtained from patients with clinical evidence of synovitis or dactylitis of the fingers. For the first cross-sectional study, images were obtained at one timepoint. For the second longitudinal study, images were obtained at 2 timepoints, 6 weeks apart. Scans were scored using PsAMRIS in an international..., reliability for change scores was acceptable only for synovitis and tenosynovitis. CONCLUSION: Further development and testing of the PsAMRIS is planned to improve its performance as a clinical and research tool to identify and measure pathology in peripheral joint PsA....

  12. Will Teacher Value-Added Scores Change When Accountability Tests Change? What We Know Series: Value-Added Methods and Applications. Knowledge Brief 8

    Science.gov (United States)

    McCaffrey, Daniel F.

    2013-01-01

    Value-added evaluations use student test scores to assess teacher effectiveness. How student achievement is judged can depend on which test is used to measure it. Thus it is reasonable to ask whether a teacher's value-added score depends on which test is used to calculate it. Would it change if a different test was used? Specifically, might a…

  13. Interpreting the g loadings of intelligence test composite scores in light of Spearman's law of diminishing returns.

    Science.gov (United States)

    Reynolds, Matthew R

    2013-03-01

    The linear loadings of intelligence test composite scores on a general factor (g) have been investigated recently in factor analytic studies. Spearman's law of diminishing returns (SLODR), however, implies that the g loadings of test scores likely decrease in magnitude as g increases, or they are nonlinear. The purpose of this study was to (a) investigate whether the g loadings of composite scores from the Differential Ability Scales (2nd ed.) (DAS-II, C. D. Elliott, 2007a, Differential Ability Scales (2nd ed.). San Antonio, TX: Pearson) were nonlinear and (b) if they were nonlinear, to compare them with linear g loadings to demonstrate how SLODR alters the interpretation of these loadings. Linear and nonlinear confirmatory factor analysis (CFA) models were used to model Nonverbal Reasoning, Verbal Ability, Visual Spatial Ability, Working Memory, and Processing Speed composite scores in four age groups (5-6, 7-8, 9-13, and 14-17) from the DAS-II norming sample. The nonlinear CFA models provided better fit to the data than did the linear models. In support of SLODR, estimates obtained from the nonlinear CFAs indicated that g loadings decreased as g level increased. The nonlinear portion for the nonverbal reasoning loading, however, was not statistically significant across the age groups. Knowledge of general ability level informs composite score interpretation because g is less likely to produce differences, or is measured less, in those scores at higher g levels. One implication is that it may be more important to examine the pattern of specific abilities at higher general ability levels.

  14. Comparison of Physical Therapy Anatomy Performance and Anxiety Scores in Timed and Untimed Practical Tests

    Science.gov (United States)

    Schwartz, Sarah M.; Evans, Cathy; Agur, Anne M.R.

    2015-01-01

    Students in health care professional programs face many stressful tests that determine successful completion of their program. Test anxiety during these high stakes examinations can affect working memory and lead to poor outcomes. Methods of decreasing test anxiety include lengthening the time available to complete examinations or evaluating…

  15. Comparison of Test Scores Obtained by Eighth Graders on Illustrated and Abstract Content Questions: A Quantitative

    Science.gov (United States)

    Aksakalli, Ayhan; Turgut, Umit; Salar, Riza

    2016-01-01

    The purpose of this study is to investigate whether students are more successful on abstract or illustrated test questions. To this end, the questions on an abstract test were changed into a visual format, and these tests were administered every three days to a total of 240 students at six middle schools located in the Erzurum city center and…

  16. Measuring English Language Workplace Proficiency across Subgroups: Using CFA Models to Validate Test Score Interpretation

    Science.gov (United States)

    Yoo, Hanwook; Manna, Venessa F.

    2017-01-01

    This study assessed the factor structure of the Test of English for International Communication (TOEIC®) Listening and Reading test, and its invariance across subgroups of test-takers. The subgroups were defined by (a) gender, (b) age, (c) employment status, (d) time spent studying English, and (e) having lived in a country where English is the…

  17. The Search for the Holy Grail: Content-Referenced Score Interpretations from Large-Scale Tests

    Science.gov (United States)

    Marion, Scott F.

    2015-01-01

    The measurement industry is in crisis. The public outcry against "over testing" and the opt-out movement are symptoms of a larger sociopolitical battle being fought over Common Core, teacher evaluation, federal intrusion, and a host of other issues, but much of the vitriol is directed at the tests and the testing industry. If we, as…

  18. GMAT and GRE Aptitude Test Performance in Relation to Primary Language and Scores on TOEFL.

    Science.gov (United States)

    Wilson, Kenneth M.

    This study was designed to describe and analyze (1) the performance of foreign candidates taking the Graduate Management Admission Test (GMAT) or the Graduate Record Examinations (GRE) Aptitude Test in relation to self-reported primary language (English vs. other), and (2) relationships between performance on the respective admissions tests and…

  19. The Standard Error of a Proportion for Different Scores and Test Length.

    Directory of Open Access Journals (Sweden)

    David A. Walker

    2005-06-01

    This paper examines Smith's (2003) proposed standard error of a proportion index associated with the idea of reliability as sufficiency of information. A detailed table indexing all of the standard error values affiliated with assessments that range from 5 to 100 items, where students scored as low as 50% correct and 50% incorrect to as high as 95% correct and 5% incorrect, calculated in increments of 1 percentage point, is presented, along with distributional qualities. Examples using this measure for classroom teachers and higher education instructors of assessment are provided.
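As a rough illustration of the kind of table this record describes, the sketch below tabulates a standard error of a proportion for several test lengths and percent-correct scores. It assumes the usual binomial form, sqrt(p(1 - p)/n); the abstract does not state the exact formula of the index, and the function names `se_proportion` and `se_table` are hypothetical.

```python
import math


def se_proportion(p_correct: float, n_items: int) -> float:
    """Binomial standard error of a proportion (assumed form, not taken from the paper)."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return math.sqrt(p_correct * (1.0 - p_correct) / n_items)


def se_table(lengths, percents):
    """Map (test length, percent correct) to the assumed standard error."""
    return {
        (n, pct): se_proportion(pct / 100.0, n)
        for n in lengths
        for pct in percents
    }


if __name__ == "__main__":
    # Ranges named in the abstract: 5 to 100 items, 50% to 95% correct in 1-point steps.
    table = se_table(range(5, 101), range(50, 96))
    for key in [(5, 50), (25, 50), (100, 50), (100, 95)]:
        print(key, round(table[key], 4))
```

Under this assumed form the standard error shrinks as the test gets longer and as the score moves away from 50% correct, which is the pattern such a table would be expected to show.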

  1. Reliability and Practicality of the Core Score: Four Dynamic Core Stability Tests Performed in a Physician Office Setting.

    Science.gov (United States)

    Friedrich, Jason; Brakke, Rachel; Akuthota, Venu; Sullivan, William

    2017-07-01

    Pilot study to determine the practicality and inter-rater reliability of the "Core Score," a composite measure of 4 clinical core stability tests. Repeated measures. Academic hospital physician clinic. 23 healthy volunteers with mean age of 32 years (12 females, 11 males). All subjects performed 4 core stability maneuvers under direct observation from 3 independent physicians in sequence. Inter-rater reliability and time necessary to perform examination. The Core Score scale is 0 to 12, with 12 reflecting the best core stability. The mean composite score of all 4 tests for all subjects was 9.54 (SD, 1.897; range, 4-12). The intraclass correlation coefficients (ICC 1,1) for inter-rater reliability for the composite Core Score and 4 individual tests were 0.68 (Core Score), 0.14 (single-leg squat), 0.40 (supine bridge), 0.69 (side bridge), and 0.46 (prone bridge). The time required for a single examiner to assess a given subject's core stability in all 4 maneuvers averaged 4 minutes (range, 2-6 minutes). Even without specialized equipment, a clinically practical and moderately reliable measure of core stability may be possible. Further research is necessary to optimize this measure for clinical application. Despite the known value of core stability to athletes and patients with low back pain, there is currently no reliable and practical means for rating core stability in a typical office-based practice. This pilot study provides a starting point for future reliability research on clinical core stability assessments.

  2. Standardised test protocol (Constant Score) for evaluation of functionality in patients with shoulder disorders

    DEFF Research Database (Denmark)

    Ban, Ilija; Troelsen, Anders; Christiansen, David Høyrup;

    2013-01-01

    -culturally adapt this version into Danish. MATERIAL AND METHODS: An English test protocol was developed and translated into Danish at two independent centres according to international recommendations. Consensus on a preliminary version was achieved. The subjective part was tested on six patients, while two...... published in 2008 with several new recommendations, but a standardised test protocol was not included. Also, this new version has not been translated into Danish. The aims of the present study were to develop a standardised English test protocol for the newly modified CS, and to translate and cross...... differences. One of the authors of the modified CS approved both the English and the Danish test protocol. CONCLUSION: A simple test protocol of the modified CS was developed in both English and Danish. With precise terminology and definitions, the test protocol is the first of its kind. We suggest its use...

  3. Nose biopsy: a comparison between two sampling techniques.

    Science.gov (United States)

    Segal, Nili; Osyntsov, Lidia; Olchowski, Judith; Kordeluk, Sofia; Plakht, Ygal

    2016-06-01

    Preoperative biopsy is important in obtaining preliminary information that may help in tailoring the optimal treatment. The aim of this study was to compare two sampling techniques of obtaining nasal biopsy, nasal forceps and nasal scissors, in terms of pathological results. Biopsies of nasal lesions were taken from patients undergoing nasal surgery by two techniques: with nasal forceps and with nasal scissors. Each sample was examined by a senior pathologist who was blinded to the sampling method. A grading system was used to rate the crush artifact in every sample (none, mild, moderate, severe). A comparison was made between the severity of the crush artifact and the pathological results of the two techniques. One hundred and forty-four samples were taken from 46 patients. Thirty-one were males and the mean age was 49.6 years. Samples taken by forceps had significantly higher grades of crush artifacts compared to those taken by scissors. The degree of crush artifacts had a significant influence on the accuracy of the preoperative biopsy. Forceps cause a significant amount of crush artifacts compared to scissors. The degree of crush artifact in the tissue sample influences the accuracy of the biopsy.

  4. Assessment of the reliability in two groups of age, using the Qualitative Scoring System for the Bender Gestalt Test - Modified

    Directory of Open Access Journals (Sweden)

    César Ayax Merino Soto

    2011-01-01

    This study seeks evidence of reliability for the Qualitative Scoring System (Brannigan and Brunner, 2002) applied to the Bender Gestalt Test – Modified. The participants were 86 children, divided into two groups, preschool and school, and three students who scored the designs in both groups. The analysis was carried out on the final score and on the items. The results pointed to good levels of external reliability and internal consistency in the preschool group, while these levels were scored in the school group. These differences establish the relation between these two aspects of measurement error and the emphasis on appropriate training for measurements that require the examiner's judgment. We discuss our results considering the potential utility of this version of the Bender Gestalt Test for clinical practice and research.

  5. The effect of an intervention program on functional movement screen test scores in mixed martial arts athletes.

    Science.gov (United States)

    Bodden, Jamie G; Needham, Robert A; Chockalingam, Nachiappan

    2015-01-01

    This study assessed the basic fundamental movements of mixed martial arts (MMA) athletes using the functional movement screen (FMS) assessment and determined if an intervention program was successful at improving results. Participants were placed into 1 of 2 groups: intervention and control. The intervention group was required to complete a corrective exercise program 4 times per week, and all participants were asked to continue their usual MMA training routine. A mid-intervention FMS test was included to examine if successful results were noticed sooner than the 8-week period. Results highlighted differences in FMS test scores between the control group and intervention group (p = 0.006). Post hoc testing revealed a significant increase in the FMS score of the intervention group between weeks 0 and 8 (p = 0.00) and weeks 0 and 4 (p = 0.00) and no significant increase between weeks 4 and 8 (p = 1.00). A χ² analysis revealed that the intervention group participants were more likely to have an FMS score >14 than participants in the control group at week 4 (χ² = 7.29, p < 0.01) and week 8 (χ² = 5.2, p ≤ 0.05). Finally, a greater number of participants in the intervention group were free from asymmetry at week 4 and week 8 compared with the initial test period. The results of the study suggested that a 4-week intervention program was sufficient to improve FMS scores. Most, if not all, of the movements covered on the FMS relate to many aspects of MMA training. The knowledge that the FMS can identify movement dysfunctions and, furthermore, the fact that the issues can be improved through a standardized intervention program could be advantageous to MMA coaches, providing the opportunity to adapt and implement new additions to training programs.

  6. The Effects of a Translation Bias on the Scores for the "Basic Economics Test"

    Science.gov (United States)

    Hahn, Jinsoo; Jang, Kyungho

    2012-01-01

    International comparisons of economic understanding generally require a translation of a standardized test written in English into another language. Test results can differ based on how researchers translate the English written exam into one in their own language. To confirm this hypothesis, two differently translated versions of the "Basic…

  7. The Relative Effects of Traditional Lectures and Guided Notes Lectures on University Student Test Scores

    Science.gov (United States)

    Williams, W. Larry; Weil, Timothy M.; Porter, James C. K.

    2012-01-01

    Guided notes were employed in two undergraduate Psychology courses involving 71 students. The study design utilized an alternating treatments format to compare Traditional Lectures with Guided Notes lectures. In one of the two courses, tests were administered after each class lecture, whereas the same type of test was administered at the beginning…

  8. Health Behaviors and Standardized Test Scores: The Impact of School Health Climate on Performance

    Science.gov (United States)

    Gunter, Whitney D.; Daly, Kevin

    2013-01-01

    Research has found that many characteristics are related to performance on standardized tests. Many of these are not necessarily "academic" attributes. One area of this research is on the connection between physical health or lifestyles and test performance. The research that exists in this area is often disconnected with each other and…

  9. EAP Study Recommendations and Score Gains on the IELTS Academic Writing Test

    Science.gov (United States)

    Green, Anthony

    2005-01-01

    The IELTS test is widely accepted by university admissions offices as evidence of English language ability. The test is also used to guide decisions about the amount of language study required for students to satisfy admissions requirements. Guidelines currently published by the British Association of Lecturers in English for Academic Purposes…

  10. Wisconsin-Milesky Test of Lip Reading Potential: A Composite of Subtest Scores

    Science.gov (United States)

    Milesky, Samuel D.

    1977-01-01

    It is noted that the Wisconsin-Milesky Battery, which includes such tests as subtests from the Wechsler Intelligence Scale for Children and the Goodenough-Harris Draw-A-Man Test, provides a profile of elements predictive of the young deaf child's lip reading potential. (SBH)

  11. The Clergy Occupational Distress Index (CODI): background and findings from two samples of clergy.

    Science.gov (United States)

    Frenk, Steven M; Mustillo, Sarah A; Hooten, Elizabeth G; Meador, Keith G

    2013-06-01

    This study demonstrates the reliability and validity of the Clergy Occupational Distress Index (CODI). The five-item index allows researchers to measure the frequency with which clergy, who traditionally have not been the subject of occupational health studies, experience occupational distress. We assess the reliability and validity of the index using two samples of clergy: a nationally representative sample of clergy and a sample of clergy from nine Protestant denominations. Exploratory factor analysis and Cronbach's alpha scores are generated. Construct validity is measured by examining the association between CODI scores and depressive symptoms while controlling for demographic, ministerial, and health variables. In both samples, the five items of the CODI load onto a single factor and the Cronbach's alpha scores are robust. The regression model indicates that a high score on the CODI (i.e., more frequent occupational distress) is positively associated with having depressive symptoms within the last 4 weeks. The CODI can be used to identify clergy who frequently experience occupational distress and to understand how occupational distress affects clergy's health, ministerial career, and the functioning of their congregation.

  12. Establishing the Validity of TOEIC Bridge™ Test Scores for Students in Colombia, Chile, and Ecuador. Research Report. ETS RR-08-58

    Science.gov (United States)

    Sinharay, Sandip; Feng, Ying; Saldivia, Luis; Powers, Donald E.; Ginuta, Anthony; Simpson, Annabelle; Weng, Vincent

    2008-01-01

    The validity of TOEIC Bridge™ scores as a measure of English language skill was examined from the standpoint of a unified concept of test validity. In this study, more than 6,000 test takers in 3 Latin American countries (Chile, Colombia, and Ecuador) took 1 form of the TOEIC Bridge test, and their scores were compared to additional information…

  13. Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

    Science.gov (United States)

    Haberman, Shelby J.

    2011-01-01

    Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…

  14. Back extensor muscle endurance test scores in coal miners in Australia

    Energy Technology Data Exchange (ETDEWEB)

    Stewart, M.; Latimer, J.; Jamieson, M. [University of Sydney, Sydney, NSW (Australia). Faculty of Health and Science, School of Physiotherapy

    2003-06-01

    Low back pain is a common complaint among those working in the Australian coal mining industry. One test that may be predictive of first-time episodes of low back pain is the Biering-Sorensen test of back extensor endurance strength. While this test has been evaluated in overseas sedentary populations, normative data and the discriminative ability of the test have not been evaluated with coal miners. Eighty-eight coal miners completed a questionnaire for known risk factors for low back pain, performed the Biering-Sorensen test, and undertook a test of aerobic fitness. Data analysis was performed to describe the groups and to determine whether any significant difference existed between those with a past history of low back pain and those without. Significantly lower than expected holding times were found in this group of coal miners (mean 113 s). This result was significantly lower than demonstrated in previous studies. When holding times for those with a past history of low back pain were compared with times for those with no history of low back pain, the difference was not statistically significant, nor was there a significant difference in fitness between those with a past history of low back pain and those without. It is concluded that coal miners in Australia have lower than normal Biering-Sorensen holding times. This lower back holding time does not differ between coal miners with a past history of low back pain and those without.

  15. Second Language Reading Topic Familiarity and Test Score: Test-Taking Strategies for Multiple-Choice Comprehension Questions

    Science.gov (United States)

    Lee, Jia-Ying

    2011-01-01

    The main purpose of this study was to compare the strategies used by Chinese-speaking students when confronted with familiar versus unfamiliar topics in a multiple-choice format reading comprehension test. The focus was on describing what students do when they are taking reading comprehension tests by asking students to verbalize their thoughts.…

  16. Determination of Streptococcus Group A as the Cause of Pharyngitis in Children Using the McIsaac Score and Rapid Antigen Detection Test (RADT) in an Effort to Use Antibiotics Wisely

    Directory of Open Access Journals (Sweden)

    AA Agustia Sinta Dewi

    2014-03-01

    Pharyngitis can be caused by viruses and bacteria. The bacterium that most commonly causes pharyngitis is Streptococcus Group A. In the treatment of pharyngitis, it is important to establish the cause in order to determine the appropriate treatment, so that unnecessary use of antibiotics can be avoided. Antibiotics should be prescribed for patients with pharyngitis caused by bacteria. Diagnostic tests that can be applied to determine the cause of pharyngitis are the McIsaac score and the Rapid Antigen Detection Test (RADT). The purpose of this study was to investigate the presence of Streptococcus Group A as the cause of pharyngitis by applying McIsaac scores and the RADT. This study was cross-sectional. Patients meeting the inclusion and exclusion criteria were given an initial assessment using the McIsaac score and were subsequently tested with the RADT. The results from the McIsaac scores and the subsequent RADT were compared. A total of 124 patients were suspected of having bacterial pharyngitis. Forty-two of them scored 3, 55 scored 4, and 27 scored 5. Of all patients tested with the RADT, only 18 gave positive results. Of those 18 patients, 6 scored 3, 8 scored 4, and 4 scored 5. It was concluded that the RADT was better than McIsaac scores in determining pharyngitis caused by Streptococcus Group A.

  17. Polytrauma Defined by the New Berlin Definition: A Validation Test Based on Propensity-Score Matching Approach.

    Science.gov (United States)

    Rau, Cheng-Shyuan; Wu, Shao-Chun; Kuo, Pao-Jen; Chen, Yi-Chun; Chien, Peng-Chen; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua

    2017-09-11

    Background: Polytrauma patients are expected to have a higher risk of mortality than that obtained by the summation of expected mortality owing to their individual injuries. This study was designed to investigate the outcome of patients with polytrauma, which was defined using the new Berlin definition, as cases with an Abbreviated Injury Scale (AIS) ≥ 3 for two or more different body regions and one or more additional variables from five physiologic parameters (hypotension [systolic blood pressure ≤ 90 mmHg], unconsciousness [Glasgow Coma Scale score ≤ 8], acidosis [base excess ≤ -6.0], coagulopathy [partial thromboplastin time ≥ 40 s or international normalized ratio ≥ 1.4], and age [≥70 years]). Methods: We retrieved detailed data on 369 polytrauma patients and 1260 non-polytrauma patients with an overall Injury Severity Score (ISS) ≥ 18 who were hospitalized between 1 January 2009 and 31 December 2015 for the treatment of all traumatic injuries, from the Trauma Registry System at a level I trauma center. Patients with burn injury or incomplete registered data were excluded. Categorical data were compared with two-sided Fisher exact or Pearson chi-square tests. The unpaired Student t-test and the Mann-Whitney U-test were used to analyze normally distributed continuous data and non-normally distributed data, respectively. A propensity-score matched cohort in a 1:1 ratio was allocated using the NCSS software with logistic regression to evaluate the effect of polytrauma on patient outcomes. Results: The polytrauma patients had a significantly higher ISS than non-polytrauma patients (median (interquartile range Q1-Q3), 29 (22-36) vs. 24 (20-25), respectively; p … propensity score-matched pairs of polytrauma and non-polytrauma patients who showed no significant difference in sex, age, co-morbidity, AIS ≥ 3, and Injury Severity Score (ISS), the polytrauma patients had a significantly higher mortality rate (OR 17.5, 95% CI 4.21-72.76; p … propensity-score

  18. Relationships Between Strategy Use, Listening Proficiency Level, Task Type, and Scores in an L2 Listening Test

    Directory of Open Access Journals (Sweden)

    Yi-Ching Pan

    2015-12-01

    We examined strategy use in relation to L2 proficiency, types of test task, and test performance in listening assessment. A total of 170 Taiwanese university students completed the Test of English for International Communication (TOEIC®) practice listening test and questionnaires designed to measure cognitive and metacognitive strategies. We found that some strategies (voice and imagery inference and elaboration, approaches, and top-down processing strategies) were used with similar frequency regardless of learners' proficiency, while others (planning, monitoring and evaluation, linguistic inference and elaboration, and bottom-up processing) were more often used by advanced listeners. Additionally, planning, linguistic inference and elaboration, and top-down processing strategies were more often used in easier tasks. Finally, the relationship between reported strategy use and test scores was weak, accounting for 7% of the total score variance and 5% to 10% of the score variance for each task type section.

  19. Detection of acute deterioration in health status visit among COPD patients by monitoring COPD assessment test score

    Directory of Open Access Journals (Sweden)

    Pothirat C

    2015-02-01

    Chaicharn Pothirat, Warawut Chaiwong, Atikun Limsukon, Athavudh Deesomchok, Chalerm Liwsrisakun, Chaiwat Bumroongkit, Theerakorn Theerakittikul, Nittaya Phetsuk. Division of Pulmonary, Critical Care and Allergy, Department of Internal Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand. Background: The Chronic Obstructive Pulmonary Disease Assessment Test (CAT) could play a role in detecting acute deterioration in health status during monitoring visits in routine clinical practice. Objective: To evaluate the discriminative property of a change in CAT score from a stable baseline visit for detecting acute deterioration in health status visits of chronic obstructive pulmonary disease (COPD) patients. Methods: The CAT questionnaire was administered to stable COPD patients routinely attending the chest clinic of Chiang Mai University Hospital who were monitored using the CAT score every 1–3 months for 15 months. Acute deterioration in health status was defined as worsening or exacerbation. CAT scores at baseline and at subsequent visits with acute deterioration in health status were analyzed using the t-test. Receiver operating characteristic curve analysis was performed to evaluate the discriminative property of change in CAT score for detecting acute deterioration during a health status visit. Results: A total of 354 follow-up visits were made by 140 patients, aged 71.1±8.4 years, with a forced expiratory volume in 1 second of 47.49%±18.2% predicted, who were monitored for 15 months. The mean CAT score changes between stable baseline visits, by patients' and physicians' global assessments, were 0.05 (95% confidence interval [CI], -0.37–0.46) and 0.18 (95% CI, -0.23–0.60), respectively. At worsening visits, as assessed by patients, there was a significant increase in CAT score (6.07; 95% CI, 4.95–7.19). There were also significant increases in CAT scores at visits with mild and moderate exacerbation (5.51 [95% CI, 4.39–6

  20. A risk score for predicting coronary artery disease in women with angina pectoris and abnormal stress test finding.

    Science.gov (United States)

    Lo, Monica Y; Bonthala, Nirupama; Holper, Elizabeth M; Banks, Kamakki; Murphy, Sabina A; McGuire, Darren K; de Lemos, James A; Khera, Amit

    2013-03-15

    Women with angina pectoris and abnormal stress test findings commonly have no epicardial coronary artery disease (CAD) at catheterization. The aim of the present study was to develop a risk score to predict obstructive CAD in such patients. Data were analyzed from 337 consecutive women with angina pectoris and abnormal stress test findings who underwent cardiac catheterization at our center from 2003 to 2007. Forward selection multivariate logistic regression analysis was used to identify the independent predictors of CAD, defined by ≥50% diameter stenosis in ≥1 epicardial coronary artery. The independent predictors included age ≥55 years (odds ratio 2.3, 95% confidence interval 1.3 to 4.0), body mass index angina pectoris and abnormal stress test findings. This tool, if validated, could help to guide testing strategies in women with angina pectoris.

  1. Laboratory assessment by combined z score values in proficiency tests: experience gained through the European Union proficiency tests for pesticide residues in fruits and vegetables.

    Science.gov (United States)

    Medina-Pastor, P; Mezcua, M; Rodríguez-Torreblanca, C; Fernández-Alba, A R

    2010-08-01

    The obligation for accredited laboratories to participate in proficiency tests under ISO 17025, performing multiresidue methods (MRMs) for pesticide residues, involves the reporting of a large number of individual z scores making the evaluation of the overall performance of the laboratories difficult. It entails, time and again, the need for ways to summarise the laboratory's overall assessment into a unique combined index. In addition, the need for ways to continually evaluate the performance of the laboratory over the years is equally acknowledged. For these reasons, following 14 years of the European Union Reference Laboratory for Pesticide Residues in Fruits and Vegetables (EUPT-FV), useful formulas have been designed to globally evaluate the assessment of the participating laboratories. The aim is to achieve a formula which is easy to understand, which can be applied and which fits the purposes of long-term evaluation detecting positive and negative trends. Moreover, consideration is needed for a fair compensation of bad results in MRM, taking into account the large number of compounds that are covered. It is therefore important to be aware of the difficulties in getting satisfactory values from a wide range of compounds. This work presents an evaluation of the main well-established combined z score formulas together with those new ones developed here which have been applied to the European proficiency test results (EUPTs) over the years. Previous formulas such as the rescaled sum of z score (RSZ), the sum squared of z score (SSZ) and the relative laboratory performance (RLP) are compared with the newer ones: the sum of weighted z scores (SWZ) and the sum of squared z scores (SZ2). By means of formula comparisons, conclusions on the advantages, drawbacks and the most fit-for-purpose approach are achieved.
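    As a rough illustration of two of the older combined indices named above, the sketch below uses the definitions commonly cited in the proficiency-testing literature: RSZ as the sum of z scores divided by the square root of their number, and SSZ as the plain sum of squared z scores. These definitions are an assumption here, since the abstract does not state the formulas, and the z scores and function names are hypothetical.

```python
import math

def rescaled_sum_z(z_scores):
    """RSZ (assumed definition): sum of z scores divided by sqrt(n).

    Because the sign is kept, positive and negative errors can cancel.
    """
    n = len(z_scores)
    if n == 0:
        raise ValueError("need at least one z score")
    return sum(z_scores) / math.sqrt(n)

def sum_squared_z(z_scores):
    """SSZ (assumed definition): sum of squared z scores.

    The sign is discarded, and the value grows with the number of results.
    """
    return sum(z * z for z in z_scores)

# Hypothetical z scores for one laboratory across four pesticides
example = [0.5, -1.0, 2.0, -1.5]
print(rescaled_sum_z(example))  # 0.0
print(sum_squared_z(example))   # 7.5
```

    In this made-up case the RSZ is zero even though one result is questionable (|z| = 2), which hints at why an index that lets errors cancel may not suit every purpose.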

  2. A score test for the agronomical overlap effect in a two-way classification model

    Directory of Open Access Journals (Sweden)

    Aquiles Darghan

    2014-12-01

    In some agricultural research, a treatment applied to an experimental unit may affect the response in the neighboring experimental units. This phenomenon is known as overlap. In this article, a test to evaluate this effect in the Draper and Guttman model was developed by imposing side conditions on the parameters of a two-way classification model to obtain a re-parameterized model which can be used in different neighboring patterns of experimental units, usually plants within a crop, whenever the nearest neighbor is considered a directly affected experimental unit and the two-way model is used. Three methods, namely maximum likelihood, least squares with side conditions and generalized inverse, were used to estimate the parameters of the original model in order to calculate the value of the test statistics for the null hypothesis associated with the absence of the overlapping effect. The three alternatives were invariant with respect to the use of test. The proposed test is simple to adopt and can be implemented in agronomy since its asymptotic nature is in agreement with the large number of experimental units which generally exist in this type of research, where each plant represents the experimental unit being assessed.

  3. Distributed Leadership and High-Stakes Testing: Examining the Relationship between Distributed Leadership and LEAP Scores

    Science.gov (United States)

    Boudreaux, Wilbert

    2011-01-01

    Educational stakeholders are aware that school administration has become an incredibly intricate dynamic that is too complex for principals to handle alone. Test-driven accountability has made the already daunting task of school administration even more challenging. Distributed leadership presents an opportunity to explore increased leadership…

  4. The College Ambition Program: Indicators of College Plans-Ambitions and Test Scores

    Science.gov (United States)

    Judy, Justina

    2011-01-01

    This study is part of a larger project that will test the effectiveness of the College Ambition Program intervention model beginning with two experimental schools and two control schools. The study will evaluate the effectiveness of the overall intervention, as well as each of the four specific programmatic components. Data will be…

  5. The Impact of Mobile Learning on Student Performance as Gauged by Standardised Test (NAPLAN) Scores

    Science.gov (United States)

    Males, Steven; Bate, Frank; Macnish, Jean

    2017-01-01

    This paper discusses the National Assessment Program for Literacy and Numeracy (NAPLAN) performance of Years Five, Seven and Nine students in standardised tests prior and post the implementation of a mobile learning initiative in a Western Australian school for boys. The school sees the use of ICT as important in enhancing its potential to deliver…

  6. Interpretations of Rod-and-Frame Test Scores: An Application of Pattern Analysis.

    Science.gov (United States)

    Haller, Otto; Edgington, Eugene S.

    1982-01-01

    Rod-and-frame test data of undergraduates were subjected to pattern analysis, which showed that most tilt toward the spatial position of the frame, while some utilize two frame cues, i.e., the nearest to vertical side and corner of the frame. Other interpretations of performance were not supported by results. (Author/RD)

  7. Nasalance Scores of Children with Repaired Cleft Palate Who Exhibit Normal Velopharyngeal Closure during Aerodynamic Testing

    Science.gov (United States)

    Zajac, David J.

    2013-01-01

    Purpose: To determine if children with repaired cleft palate and normal velopharyngeal (VP) closure as determined by aerodynamic testing exhibit greater acoustic nasalance than control children without cleft palate. Method: Pressure-flow procedures were used to identify 2 groups of children based on VP closure during the production of /p/ in the…

  8. Cognitive Ability and Personality Variables as Predictors of School Grades and Test Scores in Adolescents

    Science.gov (United States)

    Hofer, Manfred; Kuhnle, Claudia; Kilian, Britta; Fries, Stefan

    2012-01-01

    The predictive power of cognitive ability and self-control strength for self-reported grades and an achievement test were studied. It was expected that the variables use of time structure, academic procrastination, and motivational interference during learning further aid in predicting students' achievement because they are operative in situations…

  10. Grading as a Reform Effort: Do Standards-Based Grades Converge with Test Scores?

    Science.gov (United States)

    Welsh, Megan E.; D'Agostino, Jerome V.; Kaniskan, Burcu

    2013-01-01

    Standards-based progress reports (SBPRs) require teachers to grade students using the performance levels reported by state tests and are an increasingly popular report card format. They may help to increase teacher familiarity with state standards, encourage teachers to exclude nonacademic factors from grades, and/or improve communication with…

  11. Cervical degenerative index: a new quantitative radiographic scoring system for cervical spondylosis with interobserver and intraobserver reliability testing

    Science.gov (United States)

    Garvey, Timothy A.; Schwender, James D.; Denis, Francis; Perra, Joseph H.; Transfeldt, Ensor E.; Winter, Robert B.; Wroblewski, Jill M.

    2009-01-01

    Background: The lack of a widely available scoring system for cervical degenerative spondylosis encouraged the authors to establish and validate a systematic quantitative radiographic index. Materials and methods: This study included intraobserver and interobserver reliability testing among three reviewers with different years of experience. Each observer independently scored four cervical radiographs of 48 patients at separate intervals, and statistical analysis of the grading was performed. Results: There was high intraobserver and interobserver reliability between the two experienced observers. There was fair reliability between the less experienced observer and the more experienced observers. Conclusions: The cervical degenerative index appears to be a reliable and reproducible radiographic assessment of cervical spondylosis. The index will have direct applicability for longitudinal study of cervical spondylosis and may be clinically relevant as well. PMID: 19384631

  12. Predisposing factors of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy: comparison between CT emphysema score and pulmonary function test

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Chang Ho; Park, Kyung Joo; Park, Dong Won; Jung, Kyung Il; Suh, Jung Ho [Ajou Univ. College of Medicine, Seoul (Korea, Republic of)]

    1997-11-01

    To compare the CT emphysema score with various factors of the pulmonary function test by simple spirometry and to use the result as a predictor of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy. The CT scans of 106 patients who had undergone percutaneous transthoracic fine needle aspiration biopsy of lung lesions within the previous 18 months were retrospectively reviewed. In 75 of these 106 cases, the results of the pulmonary function test were also reviewed. On plain chest radiography, pneumothorax was noted in 20 cases (19%). Emphysema was blindly evaluated. We divided each lung into four segments and determined the severity and involved volume of emphysema, as seen on CT. Severity was classified as one of four grades, as follows: absence of emphysema = 0; low attenuation area of less than 5 mm = 1; low attenuation area of more than 5 mm, and vascular pruning with normal lung intervening = 2; and diffuse low attenuation without intervening normal lung, and larger confluent low attenuation with vascular pruning and distortion of branching pattern occupying all or almost all the involved parenchyma = 3. The involved area was also classified as one of four grades: less than 25% = 1; 25-49% = 2; 51-74% = 3; and more than 75% = 4. The CT emphysema score was defined as the average of the grade of severity multiplied by the grade of involved area. Pulmonary function tests, consisting of simple spirometry and a pulmonologist's interpretation, were evaluated. We also evaluated depth and size of lesion as known predisposing factors in postbioptic pneumothorax. Statistical analysis was performed using the chi-square test, the Wilcoxon rank-sum W test and Student's t test. A comparison between the two groups (with or without pneumothorax) showed the emphysema scores to be 1.69±2.0 and 1.11±2.9, respectively; there was thus no significant difference between the two groups (z = -0.048, p>0.10). Nor were differences revealed by the
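    The scoring rule described in the abstract (severity grade multiplied by area grade, averaged over lung segments) can be sketched as follows. This is only one reading of the abstract: the function name and the example grades are hypothetical, and the sketch assumes the average is taken over all eight segments (four per lung).

```python
def ct_emphysema_score(segments):
    """Average of (severity grade x area grade) over lung segments.

    segments: list of (severity, area) pairs, one pair per segment.
    severity is graded 0-3 and area 1-4, as listed in the abstract.
    A segment without emphysema has severity 0, so it contributes 0.
    """
    if not segments:
        raise ValueError("need at least one segment")
    for severity, area in segments:
        if severity not in (0, 1, 2, 3) or area not in (1, 2, 3, 4):
            raise ValueError(f"grade out of range: {(severity, area)}")
    return sum(s * a for s, a in segments) / len(segments)

# Hypothetical patient: eight segments, four per lung
patient = [(0, 1), (1, 1), (2, 2), (0, 1), (3, 4), (1, 2), (0, 1), (2, 1)]
print(ct_emphysema_score(patient))  # 2.625
```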

  13. Measurement of liver function for patients with cirrhosis by 13C-methacetin breath test compared with Child-Pugh score and routine liver function tests

    Institute of Scientific and Technical Information of China (English)

    LIU Yun-xiang; HUANG Liu-ye; WU Cheng-rong; CUI Jun

    2006-01-01

    The 13C-methacetin breath test has been used to evaluate liver function. Because it yields quantitative data and is safe and repeatable, it has gradually gained recognition throughout the world.1,2 We began using the 13C-methacetin test to assess the liver function of patients with cirrhosis in January 2002. The aim of this study was to explore the characteristics of this test for liver function evaluation and its correlation with some clinical liver biochemical parameters and the Child-Pugh score.

  14. [The effect of a warm-up protocol on the sit-and-reach test score in adolescent students].

    Science.gov (United States)

    Díaz-Soler, María Angeles; Vaquero-Cristóbal, Raquel; Espejo-Antúnez, Luis; López-Miñarro, Pedro Ángel

    2015-06-01

    Sit-and-reach tests are often used in physical education classes to measure hamstring extensibility in students, without a standard protocol for performing them. To analyze the effect of a warm-up protocol based on locomotion activities and stretching on the sit-and-reach scores of adolescent students. A total of 47 teenage students (17 boys and 30 girls) performed the sit-and-reach test before, immediately after, and 5 and 10 minutes after completing a structured warm-up. The warm-up consisted of continuous running, dynamic locomotor and mobility activities as well as static stretching of the lower limbs (quadriceps, hamstrings, adductors, iliopsoas and gastrocnemius), with a total duration of 8 minutes. Between measurements after warm-up, the participants remained standing without performing any exercise and/or stretching. After warm-up there was a significant improvement in the sit-and-reach score (+ 2.15 cm) (p sit-and-reach test, comprised by locomotion, dynamic activities and stretching, improves significantly the distance achieved in this test. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.

  15. Power and related statistical properties of conditional likelihood score tests for association studies in nuclear families with parental genotypes.

    Science.gov (United States)

    Li, Z; Gastwirth, J L; Gail, M H

    2005-05-01

    Both population based and family based case control studies are used to test whether particular genotypes are associated with disease. While population based studies have more power, cryptic population stratification can produce false-positive results. Family-based methods have been introduced to control for this problem. This paper presents the full likelihood function for family-based association studies for nuclear families ascertained on the basis of their number of affected and unaffected children. The likelihood of a family factors into the probability of parental mating type, conditional on offspring phenotypes, times the probability of offspring genotypes given their phenotypes and the parental mating type. The first factor can be influenced by population stratification, whereas the latter factor, called the conditional likelihood, is not. The conditional likelihood is used to obtain score tests with proper size in the presence of population stratification (see also Clayton (1999) and Whittemore & Tu (2000)). Under either the additive or multiplicative model, the TDT is known to be the optimal score test when the family has only one affected child. Thus, the class of score tests explored can be considered as a general family of TDT-like procedures. The relative informativeness of the various mating types is assessed using the Fisher information, which depends on the number of affected and unaffected offspring and the penetrances. When the additive model is true, families with parental mating type Aa x Aa are most informative. Under the dominant (recessive) model, however, a family with mating type Aa x aa(AA x Aa) is more informative than a family with doubly heterozygous (Aa x Aa) parents. Because we derive explicit formulae for all components of the likelihood, we are able to present tables giving required sample sizes for dominant, additive and recessive inheritance models.
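    For orientation, the classical TDT that the abstract takes as its reference point can be sketched in a few lines. The sketch assumes the standard McNemar-type form, in which b and c count how often heterozygous parents transmit one allele or the other to an affected child; it is not the conditional likelihood score test derived in the paper, and the counts are invented.

```python
import math

def tdt_statistic(b, c):
    """Classical TDT: (b - c)^2 / (b + c).

    b: transmissions of allele A from heterozygous (Aa) parents
    c: transmissions of allele a from heterozygous (Aa) parents
    Under no linkage/association the statistic is approximately
    chi-squared with 1 degree of freedom.
    """
    if b < 0 or c < 0 or b + c == 0:
        raise ValueError("need non-negative counts with b + c > 0")
    return (b - c) ** 2 / (b + c)

def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts: 60 transmissions of A versus 40 of a
stat = tdt_statistic(60, 40)
print(stat)                   # 4.0
print(chi2_1df_pvalue(stat))  # about 0.0455
```

    Only heterozygous parents enter the counts, which is one way to see why the abstract treats the informativeness of each parental mating type separately.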

  16. What do Klein et al. tell us about test scores in Texas?

    Directory of Open Access Journals (Sweden)

    Laurence A. Toenjes

    2005-08-01

    A paper appearing in this journal by Klein, Hamilton, McCaffrey and Stecher (2000) attempted to raise serious questions about the validity of the gains in student performance as measured by Texas' standardized test, the Texas Assessment of Academic Skills (TAAS). Part of their analysis was based on the results of three tests which they administered to 2,000 fifth grade students in 20 Texas schools. Although Klein et al. indicated that the 20 schools were not selected in a way which would ensure that they were representative of the nearly 3,000 Texas schools that enrolled fifth graders, generalizations based upon the results for those schools were nonetheless offered. The purpose of this short paper is to demonstrate just how unrepresentative the 20 schools used by Klein et al. actually were, and in so doing to cast doubt on certain of their conclusions.

  17. A TOEFL Paper-Based Test (PBT) Score Conversion System for Politeknik Negeri Cilacap Using the User-Centered Design Method

    Directory of Open Access Journals (Sweden)

    Cahya Vikasari

    2017-06-01

    Interactive computer systems are used by users to support their work. The user is an important object in the development and construction of a system. Users are the individuals directly involved in using the application. The concept of user-centered design (UCD) is that the user is the center of the system development process, and that the goals and characteristics, context, and environment of the system are all based on the user's experience. The scoring system for the TOEFL paper-based test (PBT) at the language unit (UPT Bahasa) of Politeknik Negeri Cilacap was built using the UCD method. With the UCD method, the system can simplify and speed up registration by prospective registrants through a user-friendly interface, simplify the management and summarizing of registrant data, simplify the conversion of TOEFL scores, which is performed automatically, and minimize errors, data duplication, and duplication of activities.

  18. Changes in rod and frame test scores recorded in schoolchildren during development--a longitudinal study.

    Directory of Open Access Journals (Sweden)

    Jeff Bagust

    The Rod and Frame Test has been used to assess the degree to which subjects rely on the visual frame of reference to perceive vertical (visual field dependence-independence perceptual style). Early investigations found children exhibited a wide range of alignment errors, which reduced as they matured. These studies used a mechanical Rod and Frame system, and presented only mean values of grouped data. The current study also considered changes in individual performance. Changes in rod alignment accuracy in 419 school children were measured using a computer-based Rod and Frame test. Each child was tested at school Grade 2 and retested in Grades 4 and 6. The results confirmed that children displayed a wide range of alignment errors, which decreased with age but did not reach the expected adult values. Although most children showed a decrease in frame dependency over the 4 years of the study, almost 20% had increased alignment errors, suggesting that they were becoming more frame-dependent. Plots of individual variation (SD) against mean error allowed the sample to be divided into 4 groups: the majority with small errors and SDs; a group with small SDs, but alignments clustering around the frame angle of 18°; a group showing large errors in the opposite direction to the frame tilt; and a small number with large SDs whose alignment appeared to be random. The errors in the last 3 groups could largely be explained by alignment of the rod to different aspects of the frame. At corresponding ages females exhibited larger alignment errors than males, although this did not reach statistical significance. This study confirms that children rely more heavily on the visual frame of reference for processing spatial orientation cues. Most become less frame-dependent as they mature, but there are considerable individual differences.

  19. MiPS (Mi Prostate Score Urine test) — EDRN Public Portal

    Science.gov (United States)

    The MiPS assay is a multiplex analysis of T2-ERG gene fusion, PCA3, and serum PSA (KLK3). It is commercially available through the University of Michigan MLabs. The MiPS assay tests for the presence of two prostate cancer biomarkers: a piece of RNA made from the PCA3 gene, found to be overactive in 95 percent of all prostate cancers, and another RNA marker that is found only when TMPRSS2 and ERG abnormally fuse. TMPRSS2:ERG, or T2-ERG, is a strong indicator of prostate cancer.

  20. Development of a High-fidelity Experimental Substructure Test Rig for Grid-scored Sandwich Panels in Wind Turbine Blades

    DEFF Research Database (Denmark)

    Laustsen, Steffen; Lund, Erik; Kühlmeier, L.;

    2014-01-01

    This paper outlines high-fidelity experimental substructure testing of sandwich panels which constitute the aerodynamic outer shell of modern wind turbine blades. A full-scale structural experimental and numerical characterisation of a composite wind turbine blade has been conducted. The developm... of substructure tests for composite wind turbine blades. Furthermore, recommendations on the use of grid-scored sandwich structures in wind turbine blades are presented, which outline the sensitivity in terms of quasi-static strength to the established loading conditions.

  1. ¿Exito en California? A Validity Critique of Language Program Evaluations and Analysis of English Learner Test Scores

    Directory of Open Access Journals (Sweden)

    Marilyn S. Thompson

    2002-01-01

    Several states have recently faced ballot initiatives that propose to functionally eliminate bilingual education in favor of English-only approaches. Proponents of these initiatives have argued that an overall rise in standardized achievement scores of California's limited English proficient (LEP) students is largely due to the implementation of English immersion programs mandated by Proposition 227 in 1998; hence, they claim Exito en California (Success in California). However, many such arguments presented in the media were based on flawed summaries of these data. We first discuss the background, media coverage, and previous research associated with California's Proposition 227. We then present a series of validity concerns regarding use of Stanford-9 achievement data to address policy for educating LEP students; these concerns include the language of the test, alternative explanations, sample selection, and data analysis decisions. Finally, we present a comprehensive summary of scaled-score achievement means and trajectories for California's LEP and non-LEP students for 1998-2000. Our analyses indicate that although scores have risen overall, the achievement gap between LEP and EP students does not appear to be narrowing.

  2. Addressing criticisms of existing predictive bias research: cognitive ability test scores still overpredict African Americans' job performance.

    Science.gov (United States)

    Berry, Christopher M; Zhao, Peng

    2015-01-01

    Predictive bias studies have generally suggested that cognitive ability test scores overpredict job performance of African Americans, meaning these tests are not predictively biased against African Americans. However, at least 2 issues call into question existing over-/underprediction evidence: (a) a bias identified by Aguinis, Culpepper, and Pierce (2010) in the intercept test typically used to assess over-/underprediction and (b) a focus on the level of observed validity instead of operational validity. The present study developed and utilized a method of assessing over-/underprediction that draws on the math of subgroup regression intercept differences, does not rely on the biased intercept test, allows for analysis at the level of operational validity, and can use meta-analytic estimates as input values. Therefore, existing meta-analytic estimates of key parameters, corrected for relevant statistical artifacts, were used to determine whether African American job performance remains overpredicted at the level of operational validity. African American job performance was typically overpredicted by cognitive ability tests across levels of job complexity and across conditions wherein African American and White regression slopes did and did not differ. Because the present study does not rely on the biased intercept test and because appropriate statistical artifact corrections were carried out, the present study's results are not affected by the 2 issues mentioned above. The present study represents strong evidence that cognitive ability tests generally overpredict job performance of African Americans.

  3. Understanding and using the brief Implicit Association Test: recommended scoring procedures.

    Directory of Open Access Journals (Sweden)

    Brian A Nosek

    A brief version of the Implicit Association Test (BIAT) has been introduced. The present research identified analytical best practices for overall psychometric performance of the BIAT. In 7 studies and multiple replications, we investigated analytic practices with several evaluation criteria: sensitivity to detecting known effects and group differences, internal consistency, relations with implicit measures of the same topic, relations with explicit measures of the same topic and other criterion variables, and resistance to an extraneous influence of average response time. The D data transformation algorithms outperformed other approaches. This replicates and extends the strong prior performance of D compared to conventional analytic techniques. We conclude with recommended analytic practices for standard use of the BIAT.

  4. Assessing the discriminating power of item and test scores in the linear factor-analysis model

    Directory of Open Access Journals (Sweden)

    Pere J. Ferrando

    2012-01-01

    Rigorous proposals based on a psychometric model for studying the imprecise concept of "discriminating power" are scarce and generally limited to nonlinear models for binary items. This article proposes a general framework for assessing the discriminating power of item and test scores that are calibrated with the common one-factor model. The proposal is organized around three criteria: (a) type of score, (b) range of discrimination, and (c) the specific aspect that is assessed. Within the proposed framework: (a) the relations among 16 measures, 6 of which appear to be new, are discussed, and (b) the relations among them are studied. The usefulness of the proposal in psychometric applications that use the factor model is illustrated with an empirical example.

  5. Associations between Benzodiazepine Use and Neuropsychological Test Scores in Older Adults.

    Science.gov (United States)

    Helmes, Edward; Østbye, Truls

    2015-06-01

    Benzodiazepines are widely prescribed for anxiety, although use of this class of medications has been associated with dependency and cognitive changes. This article describes the study in which we investigated the relationship between the class of benzodiazepine available for use and associated performance on neuropsychological tests in a community sample of 1,754 older Canadians from the Canadian Study of Health and Aging. Benzodiazepines were classified as short-, intermediate-, and long-acting. Associations were calculated between each class of benzodiazepine and eight neuropsychological measures, using multiple regression analysis and controlling for demographic variables. Results showed different effects of the co-variates across the three drug classes, and short half-life benzodiazepines were not associated with any neuropsychological measure. Intermediate half-life and long half-life benzodiazepine use were each associated with two measures. Increased focus on specific domains of cognitive function is needed to improve our understanding of how benzodiazepine use influences cognition.

  6. Correlations between the scores of computerized adaptive testing, paper and pencil tests, and the Korean Medical Licensing Examination

    Directory of Open Access Journals (Sweden)

    Mee Young Kim

    2005-06-01

    To evaluate the usefulness of computerized adaptive testing (CAT) in medical school, the General Examination for senior medical students was administered as a paper and pencil test (P&P) and using CAT. The General Examination is a graduate examination, which is also a preliminary examination for the Korean Medical Licensing Examination (KMLE). The correlations between the results of the CAT and P&P and KMLE were analyzed. The correlation between the CAT and P&P was 0.8013 (p=0.000); that between the CAT and P&P was 0.7861 (p=0.000); and that between the CAT and KMLE was 0.6436 (p=0.000). Six out of 12 students with an ability estimate below -0.52 failed the KMLE. The results showed that CAT could replace P&P in medical school. The ability of CAT to predict whether students would pass the KMLE was 0.5 when the criterion of the theta value was set at -0.52, a value chosen arbitrarily for predicting pass or failure.
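    The reported predictive ability of 0.5 appears to be the share of students under the theta cutoff who went on to fail (6 of 12). A minimal sketch of that calculation, with a hypothetical function name and invented ability estimates and outcomes, might look like this:

```python
def failure_rate_below_cutoff(thetas, passed, cutoff=-0.52):
    """Share of students with theta below the cutoff who failed.

    thetas: ability estimates from the adaptive test
    passed: booleans, True if the student passed the later examination
    cutoff: theta criterion; -0.52 is the value quoted in the abstract
    """
    flagged = [ok for theta, ok in zip(thetas, passed) if theta < cutoff]
    if not flagged:
        raise ValueError("no student falls below the cutoff")
    return sum(1 for ok in flagged if not ok) / len(flagged)

# Invented data: four students fall below the cutoff, two of them fail
thetas = [-1.2, -0.9, -0.6, -0.55, 0.1, 0.8]
passed = [False, True, False, True, True, True]
print(failure_rate_below_cutoff(thetas, passed))  # 0.5
```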

  7. Personality Assessment in the Diagnostic Manuals: On Mindfulness, Multiple Methods, and Test Score Discontinuities.

    Science.gov (United States)

    Bornstein, Robert F

    2015-01-01

    Recent controversies have illuminated the strengths and limitations of different frameworks for conceptualizing personality pathology (e.g., trait perspectives, categorical models), and stimulated debate regarding how best to diagnose personality disorders (PDs) in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.), and in other diagnostic systems (i.e., the International Classification of Diseases, the Psychodynamic Diagnostic Manual). In this article I argue that regardless of how PDs are conceptualized and which diagnostic system is employed, multimethod assessment must play a central role in PD diagnosis. By complementing self-reports with evidence from other domains (e.g., performance-based tests), a broader range of psychological processes are engaged in the patient, and the impact of self-perception and self-presentation biases can be better understood. By providing the assessor with evidence drawn from multiple modalities, some of which provide converging patterns and some of which yield divergent results, a multimethod assessment compels the assessor to engage this evidence more deeply. The mindful processing that ensues can help minimize the deleterious impact of naturally occurring information processing bias and distortion on the part of the clinician (e.g., heuristics, attribution errors), bringing greater clarity to the synthesis and integration of assessment data.

  8. State-Trait Decomposition of Name Letter Test Scores and Relationships With Global Self-Esteem.

    Science.gov (United States)

    Perinelli, Enrico; Alessandri, Guido; Donnellan, M Brent; Łaguna, Mariola

    2017-01-09

    The Name Letter Test (NLT) assesses the degree to which participants show a preference for their own initials. The NLT was often thought to measure implicit self-esteem, but recent literature reviews do not unequivocally support this hypothesis. Several authors have argued that the NLT is most strongly associated with the state component of self-esteem. The current research uses a modified STARTS model to (a) estimate the percentage of stable and transient components of the NLT and (b) estimate the covariances between stable/transient components of the NLT and stable/transient components of self-esteem and positive and negative affect. Two longitudinal studies were conducted with different time lags: In Study 1, participants were assessed daily for 7 consecutive days, whereas in Study 2, participants were assessed weekly for 8 consecutive weeks. Participants also completed a battery of questionnaires including global self-esteem, positive affect, and negative affect. In both studies, the NLT showed (a) high stability across time, (b) a high percentage of stable variance, (c) no significant covariance with stable and transient factors for global self-esteem, and (d) a different pattern of correlations with stable and transient factors of affect than global self-esteem. Collectively, these results further undermine the claim that the NLT is a valid measure of implicit self-esteem. Future work is needed to identify theoretically grounded correlates of the NLT.

  9. The utility of respiratory inductance plethysmography in REM sleep scoring during multiple sleep latency testing.

    Science.gov (United States)

    Drakatos, Panagis; Higgins, Sean; Duncan, Iain; Bridle, Kate; Briscoe, Sam; Leschziner, Guy D; Kent, Brian D; Williams, Adrian J

    2016-08-01

    Rapid eye movement sleep (REM) presents with a characteristic erratic breathing pattern. We investigated the feasibility of using respiration, derived from respiratory inductance plethysmography (RIP), in conjunction with chin electromyography, electrocardiography and pulse oximetry to facilitate the identification of REM sleep (RespREM) during nocturnal polysomnography (NPSG) and Multiple Sleep Latency Testing (MSLT). The Cohen's weighted kappa for the presence of REM and its duration in 20 consecutive NPSGs, using RespREM and compared to the current guidelines, ranged between 0.74-0.93 and 0.68-0.73 respectively for 5 scorers. The respective intraclass correlation coefficients were above 0.89. In 97.7% of the Sleep-Onset-REM-Periods (SOREMPs) during 41 consecutive MSLTs with preserved RIP, the RespREM was present and in 46.6% it coincided with the REM onset, while in the majority of the remainder RespREM preceded conventional REM onset. The erratic breathing pattern during REM, derived from RIP, is present and easily recognisable during SOREMPs in the MSLTs and may serve as a useful adjunctive measurement in identifying REM sleep.

  10. Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing.

    Science.gov (United States)

    Cai, Li

    2015-06-01

    Lord and Wingersky's (Appl Psychol Meas 8:453-461, 1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined on a grid formed by direct products of quadrature points. However, the increase in computational burden remains exponential in the number of dimensions, making the implementation of the recursive algorithm cumbersome for truly high-dimensional models. In this paper, a dimension reduction method that is specific to the Lord-Wingersky recursions is developed. This method can take advantage of the restrictions implied by hierarchical item factor models, e.g., the bifactor model, the testlet model, or the two-tier model, such that a version of the Lord-Wingersky recursive algorithm can operate on a dramatically reduced set of quadrature points. For instance, in a bifactor model, the dimension of integration is always equal to 2, regardless of the number of factors. The new algorithm not only provides an effective mechanism to produce summed score to IRT scaled score translation tables properly adjusted for residual dependence, but leads to new applications in test scoring, linking, and model fit checking as well. Simulated and empirical examples are used to illustrate the new applications.
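
    The basic unidimensional recursion that this abstract builds on can be sketched as follows. This is a minimal illustration for dichotomous items at a single ability value, not the dimension-reduced algorithm the paper develops. The 2PL response function, the function names and the item parameters are assumptions chosen for the example.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed here for illustration)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def summed_score_likelihoods(probs):
    """Lord-Wingersky recursion for dichotomous items.

    probs[i] is the probability of a correct response to item i at a fixed theta.
    Returns a list whose entry s is P(summed score = s | theta), for s = 0..n.
    """
    dist = [1.0]  # before any item is added, the score is 0 with certainty
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)   # item answered incorrectly
            new[score + 1] += prob * p       # item answered correctly
        dist = new
    return dist


# Hypothetical (a, b) item parameters, invented for this sketch.
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.7)]
theta = 0.0
probs = [p_2pl(theta, a, b) for a, b in items]
dist = summed_score_likelihoods(probs)
for score, prob in enumerate(dist):
    print(score, round(prob, 4))
```

    Repeating the recursion at each quadrature point and weighting by the prior gives the summed-score posteriors that the abstract refers to. In the multidimensional case the quadrature grid grows exponentially with the number of dimensions, which is the cost the paper's method aims to reduce.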

  11. Behavioural linear standardized scoring system of the Lidia cattle breed by testing in herd: estimation of genetic parameters.

    Science.gov (United States)

    Pelayo, R; Solé, M; Sánchez, M J; Molina, A; Valera, M

    2016-10-01

    Docility is very important for cattle production, and many behavioural tests to measure this trait have been developed. However, very few objective behavioural tests to measure the opposite trait, 'aggressive behaviour', have been described. Therefore, the aim of this work was to validate in the Lidia cattle breed a behavioural linear standardized scoring system that measures aggressiveness and enables genetic analysis of behavioural traits expressing fearlessness and fighting ability. Reproducibility and repeatability measures were calculated for the 12 linear traits of this scoring system to assess its accuracy, and ranged from 85.3 to 94.2%, and from 66.7 to 97.9%, respectively. Genetic parameters were estimated using an animal model with a Bayesian approach. A total of 1202 behavioural records were used. The pedigree matrix contained 5001 individuals. Heritability values (with standard deviations) ranged between 0.13 (0.04) (Falls of the bull) and 0.41 (0.08) (Speed of approach to horse). Genetic correlations varied from 0.01 (0.07) to 0.90 (0.13). Finally, an exploratory factor analysis using the genetic correlation matrix was calculated. Three main factors were retained to describe the traditional genetic indexes aggressiveness, strength and mobility.

  12. Associations between MMPI-2-RF validity scale scores and extra-test measures of personality and psychopathology.

    Science.gov (United States)

    Forbey, Johnathan D; Lee, Tayla T C; Ben-Porath, Yossef S; Arbisi, Paul A; Gartland, Diane

    2013-08-01

    The current study explored associations between two potentially invalidating self-report styles detected by the Validity scales of the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF), over-reporting and under-reporting, and scores on the MMPI-2-RF substantive, as well as eight collateral self-report measures administered either at the same time or within 1 to 10 days of MMPI-2-RF administration. Analyses were conducted with data provided by college students, male prisoners, and male psychiatric outpatients from a Veterans Administration facility. Results indicated that if either an over- or under-reporting response style was suggested by the MMPI-2-RF Validity scales, scores on the majority of the MMPI-2-RF substantive scales, as well as a number of collateral measures, were significantly affected in all three groups in the expected directions. Test takers who were identified as potentially engaging in an over- or under-reporting response style by the MMPI-2-RF Validity scales appeared to approach extra-test measures similarly regardless of when these measures were administered in relation to the MMPI-2-RF. Limitations and suggestions for future study are discussed.

  13. Associations between Symptom Validity Test failure and scores on the MMPI-2-RF validity and substantive scales.

    Science.gov (United States)

    Gervais, Roger O; Wygant, Dustin B; Sellbom, Martin; Ben-Porath, Yossef S

    2011-01-01

    This study examined the association between Symptom Validity Test (SVT) failure and the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008), in the Forensic Disability Claimant samples described in the MMPI-2-RF Technical Manual (Tellegen & Ben-Porath, 2008a, 2008b). SVTs used included the Word Memory Test (Green, 2003), the Computerized Assessment of Response Bias (Allen, Conder, Green, & Cox, 1997), the Medical Symptom Validity Test (Green, 2004), and the Test of Memory Malingering (Tombaugh, 1996). SVT failure was associated with significant elevations throughout the MMPI-2-RF overreporting validity scales and substantive scales. Pairwise contrasts between groups failing 0 and 3 SVTs revealed predominantly large effect sizes for the overreporting validity scales (d = 0.78-1.11), and many of the substantive scales, including the Cognitive Complaints (COG) scale. Results of this study demonstrate an association between SVT performance and elevated scores on the MMPI-2-RF. These results suggest that exaggeration of cognitive symptoms as demonstrated by SVT failure is also associated with overreported emotional, somatic, and neurocognitive complaints on the MMPI-2-RF.

  14. Determining differential item functioning and its effect on the test scores of selected pib indexes, using item response theory techniques

    Directory of Open Access Journals (Sweden)

    Pieter Schaap

    2001-02-01

    The objective of this article is to present the results of an investigation into the item and test characteristics of two tests of the Potential Index Batteries (PIB) in terms of differential item functioning (DIF) and the effect thereof on test scores of different race groups. The English Vocabulary (Index 12) and Spelling Tests (Index 22) of the PIB were analysed for white, black and coloured South Africans. Item response theory (IRT) methods were used to identify items which function differentially for white, black and coloured race groups.

  15. Score test for the effect of the overlap coefficient in first- and second-order response surface models

    Directory of Open Access Journals (Sweden)

    ENRIQUE DARGHAN

    2011-01-01

    This article proposes a test for the overlap coefficient in the Draper and Guttman model using first- and second-order response surface models. The test is based on Rao's score test and makes use of the theory of perpendicular projection operators. The test can be used with different neighbourhood patterns, provided that the nearest neighbour is taken to be the experimental unit directly affected by the treatments and that the surface models are of first and second order. The method is simple to adopt and can be implemented in agronomy or in market research, since its asymptotic nature is consistent with the large number of experimental units generally present in this type of research.

  16. Measurement of coronary calcium scores or exercise testing as initial screening tool in asymptomatic subjects with ST-T changes on the resting ECG: An evaluation study

    NARCIS (Netherlands)

    C.A. Geluk (Christiane); R. Dikkers (Riksta); J.A. Kors (Jan); R.A. Tio (René); R.H.J.A. Slart (Riemer); R. Vliegenthart (Rozemarijn); H.L. Hillege (Hans); T.P. Willems (Tineke); P. de Jong (Paul); W.H. van Gilst (Wiek); M. Oudkerk (Matthijs); F. Zijlstra (Felix)

    2007-01-01

    Background: Asymptomatic subjects at intermediate coronary risk may need diagnostic testing for risk stratification. Both measurement of coronary calcium scores and exercise testing are well established tests for this purpose. However, it is not clear which test should be preferred as in

  17. Performance of children with and without learning disabilities on Canter's Background Interference Procedure and Koppitz's scoring system for the Bender test.

    Science.gov (United States)

    Mitchell-Burns, J A

    2000-06-01

    The performance of 66 children, 30 with and 36 without learning disabilities, on the Bender Visual-Motor Gestalt Test was compared in four ways: first, the test with the standard Koppitz scoring procedure; second, the Canter Background Interference Procedure (BIP) sheet using the standard Koppitz scoring procedure; third, the Bender test on a standard sheet of paper using the Canter scoring procedure; and fourth, the Canter BIP sheet using the Canter scoring procedure. The effectiveness of the Canter procedure was examined when scored with an age-appropriate normative scoring system. This was accomplished by combining the Canter BIP interference sheet with the Koppitz scoring system. The children ranged in age from 6 to 10 years. Using discriminant analysis, all four methods correctly categorized statistically significant percentages of both types of students but there was a significant difference on the Canter BIP sheet using the Canter scoring procedure. This procedure classified students with the least absolute number and percentage of either false negatives or false positives when compared with the other three methods, suggesting that using this scoring method with the Bender Gestalt may be better for identifying younger children with learning disabilities.

  18. California mastitis test scores as indicators of subclinical intra-mammary infections at the end of lactation in dairy cows.

    Science.gov (United States)

    Bhutto, A L; Murray, R D; Woldehiwet, Z

    2012-02-01

    Intramammary infections (IMI) during the dry period can be reduced through the use of dry cow therapy (DCT); in the future, its blanket use is likely to be questioned in the light of public concern regarding the routine use of antibiotics in food producing animals. One possible alternative is to limit DCT to cows with IMI just before drying off, which would require a quick, simple identification of sub-clinical IMI. In the present study we examined quarter milk samples obtained from 240 cows one week before and on the day of drying off, using the California mastitis test (CMT) and for IMI by bacteriological culture. The results indicated that high CMT scores at drying off may be good indicators of IMI: there was a significant association between the frequency of isolation of major pathogens and the CMT score in milk samples obtained one week before (Pearson's χ²=27.04, df=4, p<0.001) and those at drying off (Pearson's χ²=25.87, df=4, p<0.001). Copyright © 2010 Elsevier Ltd. All rights reserved.
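
    A Pearson chi-square statistic of the kind reported here can be computed from a contingency table as sketched below. The abstract does not give the table layout or the counts. The 2 x 5 layout (pathogen isolated or not, by five CMT score categories) is an assumption that happens to give df = 4, and every count is invented for illustration only.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Invented counts, NOT the study's data: rows are culture results,
# columns are five hypothetical CMT score categories.
table = [
    [5, 8, 12, 15, 20],    # major pathogen isolated
    [60, 40, 30, 20, 10],  # no major pathogen isolated
]
stat, df = pearson_chi_square(table)
print(f"chi-square = {stat:.2f}, df = {df}")
```

    The p-value then follows from the chi-square distribution with the computed degrees of freedom, for example via a statistics library or a printed table.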

  19. The effects of Georgia's Choice curricular reform model on third grade science scores on the Georgia Criterion Referenced Competency Test

    Science.gov (United States)

    Phemister, Art W.

    The purpose of this study was to evaluate the effectiveness of the Georgia's Choice reading curriculum on third grade science scores on the Georgia Criterion Referenced Competency Test from 2002 to 2008. In assessing the effectiveness of the Georgia's Choice curriculum model this causal comparative study examined the 105 elementary schools that implemented Georgia's Choice and 105 randomly selected elementary schools that did not elect to use Georgia's Choice. The Georgia's Choice reading program used intensified instruction in an effort to increase reading levels for all students. The study used a non-equivalent control group with a pretest and posttest design to determine the effectiveness of the Georgia's Choice curriculum model. Findings indicated that third grade students in Non-Georgia's Choice schools outscored third grade students in Georgia's Choice schools across the span of the study.

  20. Cognitive capacity: no association with recovery of sensibility by Semmes Weinstein test score after peripheral nerve injury of the forearm.

    Science.gov (United States)

    Boender, Z J; Ultee, J; Hovius, S E R

    2010-02-01

    In the recovery process of sensibility after repair of a peripheral nerve injury of the forearm, not only age but also surgical repair techniques are of importance. If regenerating axons are misdirected, reorganisation or other adaptive processes are needed at the level of the somatosensory brain cortex. These processes are thought to be dependent on the patient's cognitive capacity. We conducted a prospective multicentre study to assess the association between cognitive capacity and recovery of sensibility after peripheral nerve damage of the forearm. Patients with a traumatic peripheral nerve lesion of the forearm and consecutive surgical repair were included. After 12 months, the patients were assessed with respect to recovery of sensibility (Semmes-Weinstein monofilaments) and cognitive capacity, with four tests assessing different aspects of cognitive functioning. Twenty-eight patients (25 male, three female; median age: 28.5 years; range: 15-79 years) with median and/or ulnar nerve injury of the forearm were included in the study. Younger age showed a positive association with sensory recovery (beta =-0.845, 95% CI: -1.456 to -0.233; p=0.01). No association was found between the cognitive-capacity tests used and sensory recovery. The present prospective study did not reveal any association between recovery of sensibility measured by Semmes-Weinstein test score and cognitive capacity. Further studies should be performed to confirm these results.

  1. Validation of phone interview for follow-up in clinical trials on dyspepsia: evaluation of the Glasgow Dyspepsia Severity Score and a Likert-scale symptoms test.

    Science.gov (United States)

    Calvet, X; Bustamante, E; Montserrat, A; Roqué, M; Campo, R; Gené, E; Brullet, E

    2000-08-01

    To validate two widely used dyspepsia scores performed by phone interview. Spanish translations of the Glasgow Dyspepsia Severity Score and a Likert-scale symptomatic test were evaluated. Responsiveness to the treatment, validity of the tests, and reproducibility of tests performed by phone interview were assessed. Gastroenterology and endoscopy unit of a county hospital. Group I consisted of 16 ulcer patients undergoing Helicobacter pylori eradication; Group II consisted of 29 healthy volunteers; and Group III consisted of 95 patients undergoing upper endoscopy. Glasgow Dyspepsia Severity Score and Likert test. Both tests showed adequate improvement (responsiveness) after H. pylori eradication. With regard to validity, the Glasgow and Likert test scores were significantly higher in 95 patients undergoing endoscopy than in 29 healthy controls. Analysis of reproducibility showed that intraobserver variation was low on both the Glasgow and Likert scores. No differences were found between consecutive tests regardless of whether both were performed by phone (24 patients) or one by phone and the other by clinical interview (40 patients). Interobserver variation analysis showed that the Glasgow test remained highly reproducible even when performed by different observers using different methods (clinical interview 8.83, phone 8.44, P = 0.12). By contrast, Likert-scale tests showed significant differences between observers for all symptoms except abdominal pain. (1) The Glasgow score is highly reproducible even when performed by different observers and using different methods. (2) By contrast, Likert tests show greater variability. To be reproducible in different conditions, they need to be performed by the same observer.

  2. Predictive value of grade point average (GPA), Medical College Admission Test (MCAT), internal examinations (Block) and National Board of Medical Examiners (NBME) scores on Medical Council of Canada qualifying examination part I (MCCQE-1) scores.

    Science.gov (United States)

    Roy, Banibrata; Ripstein, Ira; Perry, Kyle; Cohen, Barry

    2016-01-01

    To determine whether the pre-medical Grade Point Average (GPA), Medical College Admission Test (MCAT), Internal examinations (Block) and National Board of Medical Examiners (NBME) scores are correlated with and predict the Medical Council of Canada Qualifying Examination Part I (MCCQE-1) scores. Data from 392 admitted students in the graduating classes of 2010-2013 at University of Manitoba (UofM), College of Medicine was considered. Pearson's correlation to assess the strength of the relationship, multiple linear regression to estimate MCCQE-1 score and stepwise linear regression to investigate the amount of variance were employed. Complete data from 367 (94%) students were studied. The MCCQE-1 had a moderate-to-large positive correlation with NBME scores and Block scores but a low correlation with GPA and MCAT scores. The multiple linear regression model gives a good estimate of the MCCQE-1 (R² = 0.604). Stepwise regression analysis demonstrated that 59.2% of the variation in the MCCQE-1 was accounted for by the NBME, but only 1.9% by the Block exams, and negligible variation came from the GPA and the MCAT. Amongst all the examinations used at UofM, the NBME is most closely correlated with MCCQE-1.

  3. Linear-rank testing of a non-binary, responder-analysis, efficacy score to evaluate pharmacotherapies for substance use disorders.

    Science.gov (United States)

    Holmes, Tyson H; Li, Shou-Hua; McCann, David J

    2016-11-23

    The design of pharmacological trials for management of substance use disorders is shifting toward outcomes of successful individual-level behavior (abstinence or no heavy use). While binary success/failure analyses are common, McCann and Li (CNS Neurosci Ther 2012; 18: 414-418) introduced "number of beyond-threshold weeks of success" (NOBWOS) scores to avoid dichotomized outcomes. NOBWOS scoring employs an efficacy "hurdle" with values reflecting duration of success. Here, we evaluate NOBWOS scores rigorously. Formal analysis of mathematical structure of NOBWOS scores is followed by simulation studies spanning diverse conditions to assess operating characteristics of five linear-rank tests on NOBWOS scores. Simulations include assessment of Fisher's exact test applied to hurdle component. On average, statistical power was approximately equal for five linear-rank tests. Under none of conditions examined did Fisher's exact test exhibit greater statistical power than any of the linear-rank tests. These linear-rank tests provide good Type I and Type II error control for comparing distributions of NOBWOS scores between groups (e.g. active vs. placebo). All methods were applied to re-analyses of data from four clinical trials of differing lengths and substances of abuse. These linear-rank tests agreed across all trials in rejecting (or not) their null (equality of distributions) at ≤ 0.05. © The Author(s) 2016.

  4. From cutting edge to guideline: A first step in harmonization of the zebrafish embryotoxicity test (ZET) by describing the most optimal test conditions and morphology scoring system.

    Science.gov (United States)

    Beekhuijzen, Manon; de Koning, Coco; Flores-Guillén, Maria-Eugenia; de Vries-Buitenweg, Selinda; Tobor-Kaplon, Marysia; van de Waart, Beppy; Emmen, Harry

    2015-08-15

    In the last couple of years, the interest in the zebrafish embryotoxicity test (ZET) for use in developmental toxicity assessment has been growing exponentially. This is also evident from the recent proposal for updating the ICH S5 guideline. The methodology of the ZET used by the different groups varies greatly. To further evaluate its success and to take the ZET to the next level, harmonization of procedures is crucial. In the present study, based on literature and empirical data, the optimal study design regarding temperature, test chamber, exposure period, presence of chorion, solvent use, exposure method, choice of concentrations, and teratogenic classification is proposed. Furthermore, our morphology scoring system is reported in detail as a protocol to further enhance study design harmonization.

  5. Angoff Method of Setting Cut Scores for High-Stakes Testing: Foley Catheter Checkoff as an Exemplar.

    Science.gov (United States)

    Kardong-Edgren, Suzan; Mulcock, Pamela M

    2016-01-01

    The Angoff method is a commonly used and legally defensible method for setting passing or cut scores for high-stakes examinations. It also can be used for setting passing scores on clinical skill checklists. Two variations of the Angoff method were compared with a traditional and arbitrary 75% passing score, using a Foley catheter insertion checklist as an exemplar. Both Angoff methods produced slightly lower scores than our traditional scoring; because of "must pass" steps on our checklist, 12 of 13 students still failed the evaluation. The project uncovered multiple variations of checklists within different courses and variations in teaching practices for this skill.
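
    The core Angoff calculation can be sketched as follows. The abstract does not say which two variations were compared, so this shows only the basic form: each judge estimates, for every checklist step, the probability that a minimally competent student performs it correctly, and the cut score is the mean of the judges' summed estimates. The function name and all ratings are hypothetical, and "must pass" steps are not modelled.

```python
def angoff_cut_score(ratings):
    """Basic Angoff cut score.

    ratings[j][i] is judge j's estimated probability that a minimally
    competent examinee gets item (or checklist step) i right.
    Returns the cut score in raw points.
    """
    judge_totals = [sum(judge) for judge in ratings]  # expected score per judge
    return sum(judge_totals) / len(judge_totals)      # average across judges


# Hypothetical ratings from three judges on a four-step checklist.
ratings = [
    [0.9, 0.8, 0.7, 0.6],
    [0.8, 0.8, 0.6, 0.6],
    [1.0, 0.7, 0.7, 0.5],
]
cut = angoff_cut_score(ratings)
percent = 100.0 * cut / len(ratings[0])
print(f"cut score: {cut:.2f} of {len(ratings[0])} points ({percent:.1f}%)")
```

    A cut score obtained this way can then be set against a fixed passing mark such as the 75% standard mentioned in the abstract.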

  6. Outlier removal, sum scores, and the inflation of the type I error rate in independent samples t tests : The power of alternatives and recommendations

    NARCIS (Netherlands)

    Bakker, M.; Wicherts, J.M.

    2014-01-01

    In psychology, outliers are often excluded before running an independent samples t test, and data are often nonnormal because of the use of sum scores based on tests and questionnaires. This article concerns the handling of outliers in the context of independent samples t tests applied to nonnormal

  7. An Argument against Using Standardized Test Scores for Placement of International Undergraduate Students in English as a Second Language (ESL) Courses

    Science.gov (United States)

    Kokhan, Kateryna

    2013-01-01

    Development and administration of institutional ESL placement tests require a great deal of financial and human resources. Due to a steady increase in the number of international students studying in the United States, some US universities have started to consider using standardized test scores for ESL placement. The English Placement Test (EPT)…

  8. Comparison of gross anatomy test scores using traditional specimens vs. QuickTime Virtual Reality animated specimens

    Science.gov (United States)

    Maza, Paul Sadiri

    movie modules. The comparison of the two sample group means of the examinations show that there was no difference in results between using QTVR movie modules to test gross anatomy knowledge versus using physical specimens. The results of this study are discussed to explain the benefits of using such computer based anatomy resources in gross anatomy assessments.

  9. Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance.

    Science.gov (United States)

    McCarthy, Julie M; Van Iddekinge, Chad H; Lievens, Filip; Kung, Mei-Chuan; Sinar, Evan F; Campion, Michael A

    2013-09-01

    Considerable evidence suggests that how candidates react to selection procedures can affect their test performance and their attitudes toward the hiring organization (e.g., recommending the firm to others). However, very few studies of candidate reactions have examined one of the outcomes organizations care most about: job performance. We attempt to address this gap by developing and testing a conceptual framework that delineates whether and how candidate reactions might influence job performance. We accomplish this objective using data from 4 studies (total N = 6,480), 6 selection procedures (personality tests, job knowledge tests, cognitive ability tests, work samples, situational judgment tests, and a selection inventory), 5 key candidate reactions (anxiety, motivation, belief in tests, self-efficacy, and procedural justice), 2 contexts (industry and education), 3 continents (North America, South America, and Europe), 2 study designs (predictive and concurrent), and 4 occupational areas (medical, sales, customer service, and technological). Consistent with previous research, candidate reactions were related to test scores, and test scores were related to job performance. Further, there was some evidence that reactions affected performance indirectly through their influence on test scores. Finally, in no cases did candidate reactions affect the prediction of job performance by increasing or decreasing the criterion-related validity of test scores. Implications of these findings and avenues for future research are discussed.

  10. Power of IRT in GWAS: successful QTL mapping of sum score phenotypes depends on interplay between risk allele frequency, variance explained by the risk allele, and test characteristics.

    Science.gov (United States)

    van den Berg, Stéphanie M; Service, Susan K

    2012-12-01

    As data from sequencing studies in humans accumulate, rare genetic variants influencing liability to disease and disorders are expected to be identified. Three simulation studies show that characteristics and properties of diagnostic instruments interact with risk allele frequency to affect the power to detect a quantitative trait locus (QTL) based on a test score derived from symptom counts or questionnaire items. Clinical tests, that is, tests that show a positively skewed phenotypic sum score distribution in the general population, are optimal to find rare risk alleles of large effect. Tests that show a negatively skewed sum score distribution are optimal to find rare protective alleles of large effect. For alleles of small effect, tests with normally distributed item parameters give best power for a wide range of allele frequencies. The item-response theory framework can help understand why an existing measurement instrument has more power to detect risk alleles with either low or high frequency, or both kinds.

  11. Investigating the Value of Section Scores for the "TOEFL iBT"® Test. "TOEFL iBT"® Research Report. TOEFL iBT-21. ETS Research Report RR-13-35

    Science.gov (United States)

    Sawaki, Yasuyo; Sinharay, Sandip

    2013-01-01

    This study investigates the value of reporting the reading, listening, speaking, and writing section scores for the "TOEFL iBT"® test, focusing on 4 related aspects of the psychometric quality of the TOEFL iBT section scores: reliability of the section scores, dimensionality of the test, presence of distinct score profiles, and the…

  12. Animal source foods have a positive impact on the primary school test scores of Kenyan schoolchildren in a cluster-randomised, controlled feeding intervention trial.

    Science.gov (United States)

    Hulett, Judie L; Weiss, Robert E; Bwibo, Nimrod O; Galal, Osman M; Drorbaugh, Natalie; Neumann, Charlotte G

    2014-03-14

    Micronutrient deficiencies and suboptimal energy intake are widespread in rural Kenya, with detrimental effects on child growth and development. Sporadic school feeding programmes rarely include animal source foods (ASF). In the present study, a cluster-randomised feeding trial was undertaken to determine the impact of snacks containing ASF on district-wide, end-term standardised school test scores and nutrient intake. A total of twelve primary schools were randomly assigned to one of three isoenergetic feeding groups (a local plant-based stew (githeri) with meat, githeri plus whole milk or githeri with added oil) or a control group receiving no intervention feeding. After the initial term that served as baseline, children were fed at school for five consecutive terms over two school years from 1999 to 2001. Longitudinal analysis was used controlling for average energy intake, school attendance, and baseline socio-economic status, age, sex and maternal literacy. Children in the Meat group showed significantly greater improvements in test scores than those in all the other groups, and the Milk group showed significantly greater improvements in test scores than the Plain Githeri (githeri+oil) and Control groups. Compared with the Control group, the Meat group showed significant improvements in test scores in Arithmetic, English, Kiembu, Kiswahili and Geography. The Milk group showed significant improvements compared with the Control group in test scores in English, Kiswahili, Geography and Science. Folate, Fe, available Fe, energy per body weight, vitamin B₁₂, Zn and riboflavin intake were significant contributors to the change in test scores. The greater improvements in test scores of children receiving ASF indicate improved academic performance, which can result in greater academic achievement.

  13. Screening of intellectual maturity: exploring South African preschoolers' scores on the Goodenough-Harris Drawing Test and teachers' assessment.

    Science.gov (United States)

    Loxton, Helene; Mostert, Jemona; Moffatt, Diane

    2006-10-01

    The present study explored the relationship between South African preschool children's intelligence scores achieved on the Goodenough-Harris Drawing Test (GHD), and the accuracy of teachers' ratings of the human-figure drawings and teachers' general perceptions of children's intellectual maturity. The GHD was administered individually to 30 boys and 30 girls between the ages of 4 and 6 years (M = 4.5, SD = 0.7) from a multicultural (Black, Colored, and White) preschool near the Cape, South Africa. The three class teachers of these preschoolers provided the ratings and perceptions of each child's intellectual maturity. Results indicated that the teachers' assessments of children's intellectual maturity were fairly similar to the formal measures of children's intellectual maturity using human figure drawings and their own perceptions. It appears that teacher ratings of drawings could be relied upon as a means of assessment. General perceptions of children's intellectual maturity should not be solely relied upon, but instead these perceptions should be used as an aid for enhancing the teachers' assessment of children's intellectual maturity in addition to the rating of human figure drawings.

  14. Unexplained Graft Dysfunction after Heart Transplantation—Role of Novel Molecular Expression Test Score and QTc-Interval: A Case Report

    Directory of Open Access Journals (Sweden)

    Khurram Shahzad

    2010-01-01

    In the current era of immunosuppressive medications, there is an increased observed incidence of graft dysfunction in the absence of known histological criteria of rejection after heart transplantation. A noninvasive molecular expression diagnostic test was developed and validated to rule out histological acute cellular rejection. In this paper we present, for the first time, the longitudinal pattern of changes in this novel diagnostic test score along with the QTc-interval in a patient who was admitted with unexplained graft dysfunction. The patient presented with graft failure with negative findings on all known criteria of rejection, including acute cellular rejection, antibody-mediated rejection and cardiac allograft vasculopathy. The molecular expression test score showed a gradual increase and the QTc-interval showed gradual prolongation with the gradual decline in graft function. This paper exemplifies that in patients presenting with unexplained graft dysfunction, the GEP test score and QTc-interval correlate with changes in graft function.

  15. What "No Child Left Behind" Leaves behind: The Roles of IQ and Self-Control in Predicting Standardized Achievement Test Scores and Report Card Grades

    Science.gov (United States)

    Duckworth, Angela L.; Quinn, Patrick D.; Tsukayama, Eli

    2012-01-01

    The increasing prominence of standardized testing to assess student learning motivated the current investigation. We propose that standardized achievement test scores assess competencies determined more by intelligence than by self-control, whereas report card grades assess competencies determined more by self-control than by intelligence. In…

  17. A Case for Adjusting Subjectively Rated Scores in the Advanced Placement Tests. Program Statistics Research. Technical Report No. 94-5.

    Science.gov (United States)

    Longford, Nicholas T.

    A case is presented for adjusting the scores for free response items in the Advanced Placement (AP) tests. Using information about the rating process from the reliability studies, administrations of the AP test for three subject areas, psychology, computer science, and English language and composition, are analyzed. In the reliability studies, 299…

  18. Evaluation of the reliability of preoperative descriptive airway assessment tests in prediction of the Cormack-Lehane score: A prospective randomized clinical study.

    Science.gov (United States)

    Selvi, Onur; Kahraman, Tugce; Senturk, Ozgur; Tulgar, Serkan; Serifsoy, Ercan; Ozer, Zeliha

    2017-02-01

    In this study we investigated and compared the predictive values of different airway assessment tests, including the recently suggested thyromental height measurement test, in difficult laryngoscopy (Cormack and Lehane [C-L] scores 3 and 4). In addition, we compared the effectiveness of methods and C-L scores, by IDS, in terms of predicting difficult intubation. Prospective, blinded study. Maltepe University. Four hundred fifty-one patients selected randomly who underwent general anesthesia. In this study we compared the predictive value of the thyromental height measurement test (TMH), modified Mallampati test (MMT), upper lip bite test (ULBT), and thyromental distance measurement test (TMD) in difficult laryngoscopy. Final C-L scores were compared with the intubation difficulty scale (IDS) in terms of predicting difficult intubation. Patients' American Society of Anesthesiology score, age and weight were recorded. TMH, TMD, MMT, ULBT, IDS and C-L scores were measured and determined. The optimal cut-off point for predicting difficult laryngoscopy was 43.5 mm for TMH and 82.06 mm for TMD. Use of TMH <43.5 mm with MMT had the highest sensitivity for predicting difficult intubation (78.38%), with 75.36% specificity and 97.50% negative predictive value. TMH showed a sensitivity of 91.89% and a specificity of 52.17% at the 50 mm cut-off value. In the comparison of the area under the receiver operating characteristic curve values, none of the tests stood out individually or in combination with the MMT. The present study demonstrates the practicality of TMH as a digitalized test; however, the clinical benefits of TMH in daily medical practice are drawn into question. The additional variable of race may have had some bearing on this, and further studies with larger patient samples may need to use different methodology concerning age-, sex-, and race-dependent variables in evaluating these tests. Copyright © 2016 Elsevier Inc. All rights reserved.
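
    The sensitivity, specificity and negative predictive value quoted in this abstract follow from a 2x2 table of test result against laryngoscopy outcome. The sketch below shows that arithmetic. The abstract does not report the underlying counts, so the counts used here are hypothetical: they were chosen only because they sum to 451 patients and reproduce the reported percentages for TMH <43.5 mm with MMT. The function name `diagnostic_metrics` is likewise an illustration, not something from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, negative predictive value) as fractions.

    tp/fn: difficult cases flagged / missed by the test
    tn/fp: easy cases correctly cleared / wrongly flagged by the test
    """
    sensitivity = tp / (tp + fn)   # share of difficult cases the test catches
    specificity = tn / (tn + fp)   # share of easy cases the test clears
    npv = tn / (tn + fn)           # share of negative results that are truly easy
    return sensitivity, specificity, npv


# Hypothetical counts (NOT reported in the abstract), totalling 451 patients.
sens, spec, npv = diagnostic_metrics(tp=29, fn=8, tn=312, fp=102)
print(f"sensitivity={sens:.2%}  specificity={spec:.2%}  NPV={npv:.2%}")
```

    With few difficult cases in the sample, a high negative predictive value can coexist with only moderate sensitivity, which is why the three figures are best read together.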

  19. Poorer clock draw test scores are associated with greater functional impairment in peripheral artery disease: The Walking and Leg Circulation Study II

    Science.gov (United States)

    Zimmermann, Laura J; Ferrucci, Luigi; Liu, Kiang; Tian, Lu; Guralnik, Jack M; Criqui, Michael H; Liao, Yihua; McDermott, Mary M

    2016-01-01

    We hypothesized that, in the absence of clinically recognized dementia, cognitive dysfunction measured by the clock draw test (CDT) is associated with greater functional impairment in men and women with peripheral artery disease (PAD). Participants were men and women aged 60 years and older with Mini-Mental Status Examination scores ≥ 24 with PAD (n = 335) and without PAD (n = 234). We evaluated the 6-minute walk test, 4-meter walking velocity at usual and fastest pace, the Short Physical Performance Battery (SPPB), and accelerometer-measured physical activity. CDTs were scored using the Shulman system as follows: Category 1 (worst): CDT score 0–2; Category 2: CDT score 3; Category 3 (best): CDT score 4–5. Results were adjusted for age, sex, race, education, ankle–brachial index (ABI), and comorbidities. In individuals with PAD, lower CDT scores were associated with slower 4-meter usual-paced walking velocity (Category 1: 0.78 meters/second; Category 2: 0.83 meters/second; Category 3: 0.86 meters/second; p-trend = 0.025) and lower physical activity (Category 1: 420 activity units; Category 2: 677 activity units; Category 3: 701 activity units; p-trend = 0.045). Poorer CDT scores were also associated with worse functional performance in individuals without PAD (usual and fast-paced walking velocity and SPPB, p-trend = 0.022, 0.043, and 0.031, respectively). In conclusion, cognitive impairment identified with CDT is independently associated with greater functional impairment in older, dementia-free individuals with and without PAD. Longitudinal studies are necessary to explore whether baseline CDT scores and changes in CDT scores over time can predict long-term decline in functional performance in individuals with and without PAD. PMID:21636676
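
    The three-level grouping of clock draw test scores described in this abstract can be written as a small lookup. This is a minimal sketch that assumes integer CDT scores from 0 to 5; the function name `cdt_category` is hypothetical and does not come from the study.

```python
def cdt_category(score):
    """Map an integer CDT score (0-5) to the abstract's three categories.

    Category 1 (worst): score 0-2
    Category 2:         score 3
    Category 3 (best):  score 4-5
    """
    if not isinstance(score, int) or not 0 <= score <= 5:
        raise ValueError("CDT score must be an integer from 0 to 5")
    if score <= 2:
        return 1
    if score == 3:
        return 2
    return 3


# Category for every possible score, in order 0..5.
print([cdt_category(s) for s in range(6)])
```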

  20. Co-norming the WAIS-III and WMS-III: Is there a test-order effect on IQ and memory scores?

    Science.gov (United States)

    Zhu, J; Tulsky, D S

    2000-11-01

    Test-order effect on the WAIS-III and WMS-III scores was evaluated using the WMS-III standardization sample. Participants completed the standardization editions of the WAIS-III and WMS-III in one session, with the tests administered in roughly counterbalanced order. Repeated measure MANOVA analyses were conducted to determine if there was an overall test-order effect for subtest, index, or IQ scores. No significant test-order effects were found for either the WAIS-III index or IQ scores or for the WMS-III index scores. At the subtest level, the majority of the WAIS-III and WMS-III subtests did not show a significant test-order effect. The exceptions were Digit Span and Digit Symbol-Coding on the WAIS-III and Faces II and Logical Memory II on the WMS-III. Although statistically significant test-order effects were found on these subtests, the effect sizes were small. This study indicates that the test-order effect is not a potential threat to the internal validity of the WAIS-III and WMS-III normative data. The practical implications of the current study are discussed.

  1. Grouped to Achieve: Are There Benefits to Assigning Students to Heterogeneous Cooperative Learning Groups Based on Pre-Test Scores?

    Science.gov (United States)

    Werth, Arman Karl

    Cooperative learning has been one of the most widely used instructional practices around the world since the early 1980s. Small learning groups have been in existence since the beginning of the human race. These groups have grown in their variance and complexity over time. Classrooms are getting more diverse every year and instructors need a way to take advantage of this diversity to improve learning. The purpose of this study was to see if heterogeneous cooperative learning groups based on student achievement can be used as a differentiated instructional strategy to increase students' ability to demonstrate knowledge of science concepts and ability to do engineering design. This study includes two different groups made up of two different middle school science classrooms of 25-30 students. These students were given an engineering design problem to solve within cooperative learning groups. One class was put into heterogeneous cooperative learning groups based on students' pre-test scores. The other class was grouped based on random assignment. The study measured the difference between each class's pre-post gains, students' responses to a group interaction form and interview questions addressing their perceptions of the makeup of their groups. The findings of the study were that there was no significant difference between learning gains for the treatment and comparison groups. There was a significant difference between the treatment and comparison groups in student perceptions of their group's ability to stay on task and manage their time efficiently. Both the comparison and treatment groups had a positive perception of the composition of their cooperative learning groups.

  2. The statistical performance of an MCF-7 cell culture assay evaluated using generalized linear mixed models and a score test.

    Science.gov (United States)

    Rey deCastro, B; Neuberg, Donna

    2007-05-30

    Biological assays often utilize experimental designs where observations are replicated at multiple levels, and where each level represents a separate component of the assay's overall variance. Statistical analysis of such data usually ignores these design effects, whereas more sophisticated methods would improve the statistical power of assays. This report evaluates the statistical performance of an in vitro MCF-7 cell proliferation assay (E-SCREEN) by identifying the optimal generalized linear mixed model (GLMM) that accurately represents the assay's experimental design and variance components. Our statistical assessment found that 17beta-oestradiol cell culture assay data were best modelled with a GLMM configured with a reciprocal link function, a gamma error distribution, and three sources of design variation: plate-to-plate, well-to-well, and the interaction between plate-to-plate variation and dose. The gamma-distributed random error of the assay was estimated to have a coefficient of variation (COV) = 3.2 per cent, and a variance component score test described by X. Lin found that each of the three variance components was statistically significant. The optimal GLMM also confirmed the estrogenicity of five weakly oestrogenic polychlorinated biphenyls (PCBs 17, 49, 66, 74, and 128). Based on information criteria, the optimal gamma GLMM consistently out-performed equivalent naive normal and log-normal linear models, both with and without random effects terms. Because the gamma GLMM was by far the best model on conceptual and empirical grounds, and requires only trivially more effort to use, we encourage its use and suggest that naive models be avoided when possible. Copyright 2006 John Wiley & Sons, Ltd.

  3. Combining one-sample confidence procedures for inference in the two-sample case.

    Science.gov (United States)

    Fay, Michael P; Proschan, Michael A; Brittain, Erica

    2015-03-01

    We present a simple general method for combining two one-sample confidence procedures to obtain inferences in the two-sample problem. Some applications give striking connections to established methods; for example, combining exact binomial confidence procedures gives new confidence intervals on the difference or ratio of proportions that match inferences using Fisher's exact test, and numeric studies show the associated confidence intervals bound the type I error rate. Combining exact one-sample Poisson confidence procedures recreates standard confidence intervals on the ratio, and introduces new ones for the difference. Combining confidence procedures associated with one-sample t-tests recreates the Behrens-Fisher intervals. Other applications provide new confidence intervals with fewer assumptions than previously needed. For example, the method creates new confidence intervals on the difference in medians that do not require shift and continuity assumptions. We create a new confidence interval for the difference between two survival distributions at a fixed time point when there is independent censoring by combining the recently developed beta product confidence procedure for each single sample. The resulting interval is designed to guarantee coverage regardless of sample size or censoring distribution, and produces equivalent inferences to Fisher's exact test when there is no censoring. We show theoretically that when combining intervals asymptotically equivalent to normal intervals, our method has asymptotically accurate coverage. Importantly, all situations studied suggest guaranteed nominal coverage for our new interval whenever the original confidence procedures themselves guarantee coverage.

  4. The Effect of Nonnormality on Student's Two-Sample T Test.

    Science.gov (United States)

    Delaney, Harold D.; Vargha, Andras

    While violation of the homogeneity of variance assumption has received considerable attention, violation of the assumption of normally distributed data has not received as much attention. As a result, researchers may have the mistaken impression that as long as the assumptions of independence of observations and homogeneity of variance are…

  5. Two-Sample, Bivariate Hypothesis Testing Methods Based on Tukey's Depth.

    Science.gov (United States)

    Wilcox, Rand R.

    2003-01-01

    Conducted simulations to explore methods for comparing bivariate distributions corresponding to two independent groups, all of which are based on Tukey's "depth," a generalization of the notion of ranks to multivariate data. Discusses steps needed to control Type I error. (SLD)

  6. The Implementation of Role-Playing Model in Principles of Finance Accounting Learning to Improve Students’ Enjoyment and Students’ Test Scores

    Directory of Open Access Journals (Sweden)

    L. Saptono

    2010-01-01

    This study is classroom action research. Its goal is to improve students’ enjoyment level and their test scores by implementing the role-playing method. The research was conducted in the Accounting Education Study Program of Sanata Dharma University in the odd semester of academic year 2010/2011. The participants were divided into two classes. The first class received the treatment, while the second class was the control class. The results of the study showed that there was an improvement in students’ enjoyment level and test scores in the class which implemented the role-playing method.

  7. Prediction of mortality using on-line, self-reported health data: empirical test of the RealAge score.

    Directory of Open Access Journals (Sweden)

    William R Hobbs

    OBJECTIVE: We validate an online, personalized mortality risk measure called "RealAge" assigned to 30 million individuals over the past 10 years. METHODS: 188,698 RealAge survey respondents were linked to California Department of Public Health death records using a one-way cryptographic hash of first name, last name, and date of birth. 1,046 were identified as deceased. We used Cox proportional hazards models and receiver operating characteristic (ROC) curves to estimate the relative scales and predictive accuracies of chronological age, the RealAge score, and the Framingham ATP-III score for hard coronary heart disease (HCHD) in these data. To address concerns about selection and to examine possible heterogeneity, we compared the results by time to death at registration, underlying cause of death, and relative health among users. RESULTS: The RealAge score is accurately scaled (hazard ratios: age 1.076; RealAge-age 1.084) and more accurate than chronological age (age c-statistic: 0.748; RealAge c-statistic: 0.847) in predicting mortality from hard coronary heart disease following survey completion. The score is more accurate than the Framingham ATP-III score for hard coronary heart disease (c-statistic: 0.814), perhaps because self-reported cholesterol levels are relatively uninformative in the RealAge user sample. RealAge predicts deaths from malignant neoplasms, heart disease, and external causes. The score does not predict malignant neoplasm deaths when restricted to users with no smoking history, no prior cancer diagnosis, and no indicated health interest in cancer (p-value 0.820). CONCLUSION: The RealAge score is a valid measure of mortality risk in its user population.

  8. Predicting Second Grade Achievement Scores with the Slosson Intelligence Test, Peabody Picture Vocabulary Test, Goodenough-Harris Drawing Test, Developmental Test of Visual Motor and the Metropolitan Readiness Test.

    Science.gov (United States)

    Flynn, Timothy M.

    The predictive validity of the Slosson Intelligence Test, Peabody Picture Vocabulary Test, Goodenough-Harris Drawing Test, Developmental Test of Visual Motor Integration, and the Metropolitan Readiness Test was evaluated for use with kindergarten children. The criterion measure was the California Achievement Tests administered when the children…

  9. Might the Rorschach be a projective test after all? Social projection of an undesired trait alters Rorschach Oral Dependency scores.

    Science.gov (United States)

    Bornstein, Robert F

    2007-06-01

    The degree to which projection plays a role in Rorschach (Rorschach, 1921/1942) responding remains controversial, in part because extant data have yielded inconclusive results. In this investigation, I examined the impact of social projection on Rorschach Oral Dependency (ROD) scores using methods adapted from social cognition research. In Study 1, I prescreened 85 college students (40 women and 45 men) with the ROD scale and a widely used self-report measure of dependency, the Interpersonal Dependency Inventory (IDI; Hirschfeld et al., 1977). Results show that informing participants who scored low on the IDI that they were in fact highly dependent led to significant increases in ROD scores; I did not obtain parallel ROD increases for participants who scored high on the IDI or for participants who received low-dependent feedback. In Study 2, I examined a separate sample of 80 prescreened college students (40 women and 40 men) and showed that providing low self-report participants an opportunity to attribute dependency to a fictional target person prior to Rorschach responding attenuated the impact of high-dependent feedback on ROD scores. These results suggest that projection played a role in at least one domain of Rorschach responding. I discuss theoretical, clinical, and empirical implications of these results.

  10. Construct validity of change scores of the Chair Stand Test versus Timed Up and Go Test, KOOS questionnaire and the isometric muscle strength test in patients with severe knee osteoarthritis undergoing total knee replacement.

    Science.gov (United States)

    Huber, Erika O; Meichtry, Andre; de Bie, Rob A; Bastiaenen, Caroline H

    2016-02-01

    The Chair Stand Test (CST) is a frequently used performance-based test in clinical studies involving individuals with knee osteoarthritis and demonstrates good reliability. To assess the construct validity of change scores of the CST compared to three other measures in patients before and after total knee replacement surgery. The construct validity of change scores of the CST compared to the Timed Up and Go (TUG) test, the Knee Injury and Osteoarthritis Outcome Score questionnaire (KOOS, subscale ADL) and the isometric muscle strength test of the knee extensors (IMS sum) was measured 1-2 weeks before and 3 months after surgery. Change (%) CST = -4.45, TUG = -2.08, KOOS ADL = 43.90, IMS sum = -13.24. Correlations CST-TUG = 0.56 (95% confidence interval (CI) 0.29, 0.74), CST-KOOS = -0.31 (95% CI -0.57, 0.01), CST-IMS sum = -0.11 (95% CI -0.42, 0.22). Comparison of pairwise correlations: CST-KOOS versus CST-TUG (p < 0.0004), CST-TUG versus CST-IMS sum (p < 0.0068), CST-KOOS versus CST-IMS sum (p < 0.3100). For patients undergoing TKR, the CST might not be an ideal measure to assess change between pre-surgery and 3 months post-surgery. Construct validity of change scores was close to zero but the result might have been influenced by the relatively small homogeneous sample size and the chosen timespan of measurement. We ordered pairwise correlations based on the strength of correlation between the different instruments, which to our knowledge has never been done before. Copyright © 2015 Elsevier Ltd. All rights reserved.
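
    Confidence intervals for correlations such as those quoted in this abstract are often obtained with the Fisher z-transformation. The abstract states neither the method the authors used nor the sample size, so the sketch below is an assumption on both counts: `fisher_ci` is a hypothetical helper, and `n = 40` is a placeholder value rather than the study's sample size.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation r from n paired observations.

    Uses the Fisher z-transformation: atanh(r) is treated as approximately
    normal with standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# n = 40 is a placeholder; the abstract does not give the sample size.
low, high = fisher_ci(0.56, n=40)
print(f"r = 0.56, approximate 95% CI ({low:.2f}, {high:.2f})")
```

    Because the interval is built on the transformed scale, it is asymmetric around r and always stays inside (-1, 1), which matches the shape of the intervals reported above.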

  11. Relationships between state and trait anxiety inventory and alcohol use disorder identification test scores among Korean twins and families: the healthy twin study.

    Science.gov (United States)

    Sung, Joohon; Lee, Kayoung; Song, Yun-Mi; Kim, Ji-Hae

    2011-02-01

    We explored heritabilities of the State and Trait Anxiety Inventory (STAI) and the Alcohol Use Disorders Identification Test (AUDIT), and associations including genetic and environmental correlations between the phenotypes among Korean twins and their families. We analyzed the data of 1,748 participants (835 men, 913 women, 656 individuals of monozygotic twins, 173 individuals of same-sexed dizygotic twins, 919 non-twin family members, age 30-79 years) from the Healthy Twin study. Heritabilities and bivariate analyses were assessed using the SOLAR package software. In the methods of generalized estimation equations, women in the 4th quartile of state and trait scores were 17% and 15%, respectively more likely to be hazardous alcohol users compared to women in the lower three quartiles (P genetic correlation between the trait score and the AUDIT score, and a significant non-genetic correlation between the state score and the AUDIT score in women, while there were no significant genetic or non-genetic correlations between these phenotypes in men. The STAI and AUDIT scores are heritable in Koreans and the relationships between these phenotypes may be inconsistent by sex.

  12. Comparative evaluation of chest radiography, low-field MRI, the Shwachman-Kulczycki score and pulmonary function tests in patients with cystic fibrosis

    Energy Technology Data Exchange (ETDEWEB)

    Anjorin, Angela; Vogl, Thomas J. [Johann Wolfgang Goethe University, Institute for Diagnostic and Interventional Radiology, Frankfurt am Main (Germany); Schmidt, Helga [Johann Wolfgang Goethe University, Department of Pediatric Radiology, Institute for Diagnostic and Interventional Radiology, Frankfurt am Main (Germany); Posselt, Hans-Georg [Johann Wolfgang Goethe University, Clinics for Pediatry, Gastroenterology, Frankfurt am Main (Germany); Smaczny, Christina [Johann Wolfgang Goethe University, Medical Clinics I, Pneumology, Frankfurt am Main (Germany); Ackermann, Hanns [Johann Wolfgang Goethe University, Department of Biomathematics, Frankfurt am Main (Germany); Deimling, Michael [Siemens Medical Solutions, Erlangen (Germany); Abolmaali, Nasreddin [Dresden University of Technology, OncoRay - Molecular Imaging, Medical Faculty Carl Gustav Carus, Dresden (Germany)

    2008-06-15

    The aim of this study was to investigate whether the parenchymal lung damage in patients suffering from cystic fibrosis (CF) can be equivalently quantified by the Chrispin-Norman (CN) scores determined with low-field magnetic resonance imaging (MRI) and conventional chest radiography (CXR). Both scores were correlated with pulmonary function tests (PFT) and the Shwachman-Kulczycki method (SKM). To evaluate the comparability of MRI and CXR for different states of the disease, all scores were applied to patients divided into three age groups. Seventy-three CF patients (mean SKM score: 62 ± 8) with a median age (range) of 14 years (7-32) were included. The mean CN scores determined with both imaging methods were comparable (CXR: 12.1 ± 4.7; MRI: 12.0 ± 4.5) and showed high correlation (P < 0.05, R = 0.97). Only weak correlations were found between imaging, PFT, and SKM. Both imaging modalities revealed significantly more severe disease expression with age, while PFT and SKM failed to detect early signs of disease. We conclude that imaging of the lung in CF patients is capable of detecting subtle and early parenchymal destruction before lung function or clinical scoring is affected. Furthermore, low-field MRI revealed high consistency with chest radiography and may be used for a thorough follow-up while avoiding radiation exposure. (orig.)

  13. Adjusting the Passing Scores for Gearing up for Safety: Production Agriculture Safety Training for Youth Curriculum Test Instruments

    Science.gov (United States)

    Hoover, William Brian; French, Brian F.; Field, William E.; Tormoehlen, Roger L.

    2012-01-01

    Minimum passing scores for the Gearing Up for Safety: Production Agriculture Safety Training for Youth curriculum (Gearing Up for Safety) were set in 2006 with widely used and established procedures by efforts of subject matter experts (French, Breidenbach et al., 2007; French, Field, and Tormoehlen, 2006, 2007). While providing a research-based…

  14. Change and Continuity in Grades 3-5: Effects of Poverty and Grade on Standardized Test Scores

    Science.gov (United States)

    Burross, Heidi Legg

    2008-01-01

    Background/Context: The question of the influence of Comprehensive School Reform (CSR) on achievement is an important one because many policy makers use achievement scores as the measure of success for schools, classrooms, and students. Research has demonstrated that high-poverty schools have less experienced teachers and access to fewer resources…

  15. Mapping English Language Proficiency Test Scores onto the Common European Framework. TOEFL® Research Reports. RR-80. ETS RR-05-18

    Science.gov (United States)

    Tannenbaum, Richard J.; Wylie, E. Caroline

    2005-01-01

    The Common European Framework describes language proficiency in reading, writing, speaking, and listening on a six-level scale. The Framework provides a common language with which to discuss students' progress. This report describes a study conducted with two panels of English language experts to map scores from four tests that collectively assess…

  16. What Makes a Test Score? The Respective Contributions of Pupils, Schools and Peers in Achievement in English Primary Education. CEE DP 102

    Science.gov (United States)

    Kramarz, Francis; Machin, Stephen; Ouazad, Amine

    2009-01-01

    What makes a test score? There is a great deal of uncertainty surrounding the exact contribution of school quality, pupil background, and peers in educational achievement. If peers make most of the difference, then diversity and heterogeneous classrooms may narrow the gap between high- and low-performing students. If pupil background is the first…

  17. Using Logistic Regression for Validating or Invalidating Initial Statewide Cut-Off Scores on Basic Skills Placement Tests at the Community College Level

    Science.gov (United States)

    Secolsky, Charles; Krishnan, Sathasivam; Judd, Thomas P.

    2013-01-01

    The community colleges in the state of New Jersey went through a process of establishing statewide cut-off scores for English and mathematics placement tests. The colleges wanted to communicate to secondary schools a consistent preparation that would be necessary for enrolling in Freshman Composition and College Algebra at the community college…

  18. The Politics of Achievement Gaps: U.S. Public Opinion on Race-Based and Wealth-Based Differences in Test Scores

    Science.gov (United States)

    Valant, Jon; Newark, Daniel A.

    2016-01-01

    For decades, researchers have documented large differences in average test scores between minority and White students and between poor and wealthy students. These gaps are a focal point of reformers' and policymakers' efforts to address educational inequities. However, the U.S. public's views on achievement gaps have received little attention from…

  19. Measurement of coronary calcium scores by electron beam computed tomography or exercise testing as initial diagnostic tool in low-risk patients with suspected coronary artery disease

    NARCIS (Netherlands)

    Geluk, Christiane A.; Dikkers, Riksta; Perik, Patrick J.; Tio, Rene A.; Gotte, Marco J. W.; Hillege, Hans L.; Vliegenthart, Rozemarijn; Houwers, Janneke B.; Willems, Tineke P.; Oudkerk, Matthijs; Zijlstra, Felix

    2008-01-01

    We determined the efficiency of a screening protocol based on coronary calcium scores (CCS) compared with exercise testing in patients with suspected coronary artery disease (CAD), a normal ECG and troponin levels. Three-hundred-and-four patients were enrolled in a screening protocol including CCS b

  20. Beyond Standardized Test Scores: An Examination of Leadership and Climate as Leading Indicators of Future Success in the Transformation of Turnaround Schools

    Science.gov (United States)

    May, Judy Jackson; Sanders, Eugene T. W.

    2013-01-01

    Districts throughout the nation are engaged in comprehensive transformation to "turn around" low performing schools. Standardized test scores are used to gauge student achievement; however, academic gains may lag behind leading indicators such as improved school climate and effective leadership. This study examines 16 underperforming…

  1. Splitting statistical potentials into meaningful scoring functions: Testing the prediction of near-native structures from decoy conformations

    Directory of Open Access Journals (Sweden)

    Oliva Baldo

    2009-11-01

    Background: Recent advances in high-throughput technologies have produced a vast number of protein sequences, while the number of high-resolution structures has seen only a limited increase. This has prompted many strategies for building protein structures from their sequences, generating a considerable number of alternative models. Selecting the model closest to the native conformation has thus become crucial for structure prediction. Several methods have been developed to score protein models by energies, knowledge-based potentials, and combinations of both. Results: Here, we present and demonstrate a theory to split the knowledge-based potentials into biologically meaningful scoring terms and to combine them into new scores to predict near-native structures. Our strategy circumvents the problem of defining the reference state. In this approach we give the proof for a simple and linear application that can be further improved by optimizing the combination of Zscores. Using the simplest composite score, we obtained predictions similar to state-of-the-art methods. In addition, our approach has the advantage of identifying the most relevant terms involved in the stability of the protein structure. Finally, we also use the composite Zscores to assess the conformation of models and to detect local errors. Conclusion: We have introduced a method to split knowledge-based potentials and to solve the problem of defining a reference state. The new scores have detected near-native structures as accurately as state-of-the-art methods and have successfully identified wrongly modeled regions of many near-native conformations.

  2. The patterning of test scores of children living in proximity to an inactive toxic waste disposal site who are classified as neurologically impaired

    Energy Technology Data Exchange (ETDEWEB)

    Licata, L.

    1992-01-01

    This study investigated the relationship between the pattern of impairment on test scores of neurologically impaired children and proximity to an inactive toxic waste disposal site. Subjects (N = 147) were students, ages 6-16, classified as neurologically impaired. Seventy-six who lived within six miles of the site served as the experimental group and 71 who did not live near a site comprised the control group. Research was based on existing data available through the Child Study Team evaluation process. Attention was given to the ACID cluster of the WISC-R, the Arithmetic and Reading subtests on the WRAT, and the Koppitz scores of the Bender Visual Motor Gestalt Test. No significant difference was found between the experimental and control groups. Sex differences within the experimental group were not significant. Time of exposure and patterning of scores in the experimental group were investigated. Time had a significant main effect on WISC-R Arithmetic and Digit Span subtests, the ACID cluster and the Bender Test for the total group. Main effect for sex was significant for the WISC-R Information subtest. An interaction effect was found to be significant on the WRAT Arithmetic subtest. The longer the girls lived within the site area the lower they scored on the WISC-R Information subtest and the WRAT Arithmetic subtest. The variable exposure (interaction of distance and time) was related to lower scores on the WISC-R Arithmetic and Digit Span subtests. A two-way interaction was found on the WRAT Arithmetic subtest. The longer the females were exposed to the waste site area, the lower they scored on the WRAT Arithmetic subtest. A comparison of those children in the site area from birth and those in the area three years prior to the evaluation was done. A significant main effect was found for the Bender Gestalt.

  3. Relations among conceptual knowledge, procedural knowledge, and procedural flexibility in two samples differing in prior knowledge.

    Science.gov (United States)

    Schneider, Michael; Rittle-Johnson, Bethany; Star, Jon R

    2011-11-01

    Competence in many domains rests on children developing conceptual and procedural knowledge, as well as procedural flexibility. However, research on the developmental relations between these different types of knowledge has yielded unclear results, in part because little attention has been paid to the validity of the measures or to the effects of prior knowledge on the relations. To overcome these problems, we modeled the three constructs in the domain of equation solving as latent factors and tested (a) whether the predictive relations between conceptual and procedural knowledge were bidirectional, (b) whether these interrelations were moderated by prior knowledge, and (c) how both constructs contributed to procedural flexibility. We analyzed data from 2 measurement points each from two samples (Ns = 228 and 304) of middle school students who differed in prior knowledge. Conceptual and procedural knowledge had stable bidirectional relations that were not moderated by prior knowledge. Both kinds of knowledge contributed independently to procedural flexibility. The results demonstrate how changes in complex knowledge structures contribute to competence development.

  4. The Five Score and Percentage of Group Test Scores of Measurement Standard in Ancient Imperial Examination

    Institute of Scientific and Technical Information of China (English)

    黄裕泉; 干有成; 刘立云; 赵霖

    2016-01-01

    How to measure group test scores remains an open research question. The measurement standards and methods differ among the U.S. college entrance examinations, TOEFL, GRE, IELTS, China's international Chinese proficiency test, China's college admission examination, and the term examinations of schools at all levels. China's national metrology law requires that measurement follow a unified national standard, with unified units of measurement, unified measurement benchmarks, and unified supervision. The "five-level, hundred-point" measurement standard of China's ancient imperial examination is described as the most scientific, simple, and accurate measurement standard of any time or place.

  5. Impact of a computer-based auto-tutorial program on parasitology test scores of four consecutive classes of veterinary medical students.

    Science.gov (United States)

    Pinckney, R D; Mealy, M J; Thomas, C B; MacWilliams, P S

    2001-01-01

    A "Hard and Soft Tick" auto-tutorial that integrates basic knowledge of the parasite biology with practical aspects of tick identification, clinical presentation, pathology, disease transmission, treatment, and control was developed at the University of Wisconsin-Madison School of Veterinary Medicine. The purpose of this study was to assess impact of the auto-tutorial on parasitology test scores in four classes (1999, 2000, 2001, and 2002) of veterinary students. The analysis revealed a small but significant increase (p = 0.054) in mean percentage examination scores for students who used the tutorial over those who did not.

  6. Estimation of ability and item parameters in mathematics testing by using the combination of 3PLM/ GRM and MCM/ GPCM scoring model

    Directory of Open Access Journals (Sweden)

    Abadyo Abadyo

    2015-06-01

    The main purpose of the study was to investigate the superiority of scoring with the combined MCM/GPCM model over the combined 3PLM/GRM model within a mixed-item format of Mathematics tests. To this end, the impact of the two scoring models was investigated based on the test length, the sample size, and the M-C item proportion within the mixed-item format test, with respect to: (1) estimation of ability and item parameters, (2) optimization of the TIF, (3) standard error rates, and (4) model fit to the data. The investigation used simulated data generated from a fixed-effects factorial design of 2 x 3 x 3 x 3 with 5 replications, resulting in 270 data sets. The data were analyzed by means of fixed-effect MANOVA on the Root Mean Square Error (RMSE) of the ability and on the RMSE and Root Mean Square Deviation (RMSD) of the item parameters in order to identify significant main effects at the level of α = .05; the interaction effects were incorporated into the error term for statistical testing. The -2LL statistics were also used to evaluate the model fit to the data set. The results of the study show that the combined MCM/GPCM model provides more accurate estimation than the 3PLM/GRM model. In addition, the test information given by the combined MCM/GPCM model is three times higher than that of the 3PLM/GRM model, although the test information cannot offer a solid conclusion in relation to the sample size and the M-C item proportion on each test length which provides the optimal score of the test information. Finally, the differences in fit statistics between the two scoring models determine the position of the MCM/GPCM model rather than that of the 3PLM/GRM model.
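
    The design arithmetic in this abstract can be checked by enumerating the cells. The sketch below is only illustrative: the abstract gives the 2 x 3 x 3 x 3 layout and the 5 replications but not the factor levels, so every level name here is a hypothetical placeholder, and assigning the two-level factor to the scoring model is an assumption.

```python
from itertools import product

# Hypothetical factor levels; only the counts (2, 3, 3, 3) and the
# 5 replications come from the abstract.
scoring_models = ["3PLM/GRM", "MCM/GPCM"]
test_lengths = ["short", "medium", "long"]
sample_sizes = ["small", "medium", "large"]
mc_proportions = ["low", "medium", "high"]
replications = range(1, 6)

# One simulated data set per cell of the crossed design and replication.
conditions = list(product(scoring_models, test_lengths, sample_sizes,
                          mc_proportions, replications))

print(len(conditions))  # 2 * 3 * 3 * 3 * 5 = 270
```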

  7. Patterns of Various ESOL Proficiency Test Scores by Native Language and Proficiency Levels. Occasional Papers on Linguistics, No. 1.

    Science.gov (United States)

    Hisama, Kay K.

    A profile method was used to analyze the patterns of four English proficiency tests (Comprehensive English Language Test for Speakers of English as a Second Language: Structure, CELT: Listening, Reading for Understanding Test, and The New Cloze Test) regarding two examinee characteristics: their language proficiency levels and native language. One…

  8. Evaluation of the performance of 57 Japanese participating laboratories by two types of z-scores in proficiency test for the quantification of pesticide residues in brown rice.

    Science.gov (United States)

    Otake, Takamitsu; Yarita, Takashi; Aoyagi, Yoshie; Numata, Masahiko; Takatsu, Akiko

    2014-11-01

    A proficiency test for the analysis of pesticide residues in brown rice was carried out to support upgrading in analytical skills of participant laboratories. Brown rice containing three target pesticides (etofenprox, fenitrothion, and isoprothiolane) was used as the test samples. The test samples were distributed to the 57 participants and analyzed by appropriate analytical methods chosen by each participant. It was shown that there was no significant difference among the reported values obtained by different types of analytical method. The analytical results obtained by the National Metrology Institute of Japan (NMIJ) were 3 % to 10 % greater than those obtained by participants. The results reported by the participants were evaluated by using two types of z-scores: one based on the consensus values calculated from the analytical results of participants, and the other based on the reference values obtained by NMIJ with high reliability. Acceptable z-scores based on the consensus values and NMIJ reference values were achieved by 87 % to 89 % and 79 % to 94 % of the participants, respectively.
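
    The two kinds of z-score described in this abstract differ only in the assigned value. A minimal sketch, assuming the common form z = (x - X) / sigma with the median of the participant results as the consensus value; the data, the reference value, the standard deviation and the |z| <= 2 acceptance limit are illustrative assumptions, not values from the study.

```python
from statistics import median

def z_scores(results, assigned_value, sigma):
    """Return z = (x - assigned_value) / sigma for each reported result."""
    return [(x - assigned_value) / sigma for x in results]

def acceptable_fraction(scores, limit=2.0):
    """Fraction of participants with |z| <= limit (assumed criterion)."""
    return sum(abs(z) <= limit for z in scores) / len(scores)

# Hypothetical reported concentrations (mg/kg) from five laboratories.
reported = [0.90, 0.95, 1.00, 1.05, 1.40]
sigma = 0.10                    # hypothetical standard deviation for assessment
consensus = median(reported)    # consensus value derived from participants
reference = 1.08                # hypothetical reference-laboratory value

z_consensus = z_scores(reported, consensus, sigma)
z_reference = z_scores(reported, reference, sigma)

print(acceptable_fraction(z_consensus))
print(acceptable_fraction(z_reference))
```

    Because the reference value sits above the consensus value in this made-up data, each laboratory's z-score shifts downward under the reference-based scheme, which is the kind of difference the two evaluations are meant to expose.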

  9. Apgar Scores

    Science.gov (United States)

    ... As soon as your ... the syringe, but is blue; her one minute Apgar score would be 8, two points off because she ...

  10. A Study of Hypotheses Basic to the Use of Rights and Formula Scores. Phase I--Based on Experimental Administration of College Board Tests [and] Phase II--Based on Operational Administration of the GMAT.

    Science.gov (United States)

    Angoff, William H.; Schrader, William B.

    In a study to determine whether a shift from Formula scoring to Rights scoring can be made without causing a discontinuity in the test scale, the analysis of special administrations of the Scholastic Aptitude Test and Chemistry Achievement Test and the variable section of an operational form of the Graduate Management Admission Test (GMAT) is…

  11. Measurement of coronary calcium scores by electron beam computed tomography or exercise testing as initial diagnostic tool in low-risk patients with suspected coronary artery disease

    Energy Technology Data Exchange (ETDEWEB)

    Geluk, Christiane A.; Perik, Patrick J.; Tio, Rene A.; Goette, Marco J.W.; Hillege, Hans L.; Zijlstra, Felix [University Medical Center Groningen, Thoraxcenter, Department of Cardiology, Groningen (Netherlands); Dikkers, Riksta; Vliegenthart, Rozemarijn; Houwers, Janneke B.; Willems, Tineke P.; Oudkerk, Matthijs [University Medical Center Groningen, Department of Radiology, Groningen (Netherlands)

    2008-02-15

    We determined the efficiency of a screening protocol based on coronary calcium scores (CCS) compared with exercise testing in patients with suspected coronary artery disease (CAD), a normal ECG and troponin levels. Three-hundred-and-four patients were enrolled in a screening protocol including CCS by electron beam computed tomography (Agatston score), and exercise testing. Decision-making was based on CCS. When CCS ≥ 400, coronary angiography (CAG) was recommended. When CCS < 10, patients were discharged. Exercise tests were graded as positive, negative or nondiagnostic. The combined endpoint was defined as coronary event or obstructive CAD at CAG. During 12 ± 4 months, CCS ≥ 400, 10-399 and < 10 were found in 42, 103 and 159 patients and the combined endpoint occurred in 24 (57%), 14 (14%) and 0 patients (0%), respectively. In 22 patients (7%), myocardial perfusion scintigraphy was performed instead of exercise testing due to the inability to perform an exercise test. A positive, nondiagnostic and negative exercise test result was found in 37, 76 and 191 patients, and the combined endpoint occurred in 11 (30%), 15 (20%) and 12 patients (6%), respectively. Receiver-operator characteristics analysis showed that the area under the curve of 0.89 (95% CI: 0.85-0.93) for CCS was superior to 0.69 (95% CI: 0.61-0.78) for exercise testing (P < 0.0001). In conclusion, measurement of CCS is an appropriate initial screening test in a well-defined low-risk population with suspected CAD.
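
    The reported areas under the curve can be approximated from the counts given in this abstract. This is a rough check, assuming each test is reduced to the three ordered categories listed (with positive > nondiagnostic > negative for the exercise test) and that the AUC is taken as the Mann-Whitney probability with ties counted as one half. The study itself may have used the continuous CCS, so agreement is only approximate; the function name is hypothetical.

```python
def auc_from_ordered_categories(events, totals):
    """AUC for ordered categories listed from highest to lowest risk.

    events[i] is the number of patients with the endpoint in category i,
    totals[i] the number of patients in that category.
    """
    non_events = [t - e for e, t in zip(events, totals)]
    concordant = 0.0
    for i, e in enumerate(events):
        # Event patients ranked above non-event patients in lower categories.
        concordant += e * sum(non_events[i + 1:])
        # Ties within the same category count as one half.
        concordant += 0.5 * e * non_events[i]
    return concordant / (sum(events) * sum(non_events))

# CCS >= 400, 10-399, < 10: endpoints and patients from the abstract.
auc_ccs = auc_from_ordered_categories([24, 14, 0], [42, 103, 159])
# Exercise test positive, nondiagnostic, negative.
auc_exercise = auc_from_ordered_categories([11, 15, 12], [37, 76, 191])

print(round(auc_ccs, 2), round(auc_exercise, 2))  # 0.89 0.69
```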

  12. Comparison of Scores on Two Visual-Motor Tests for Children Referred for Learning or Adjustment Difficulties.

    Science.gov (United States)

    DeMers, Stephen T.; And Others

    1981-01-01

    This study compared the performance of school-aged children referred for learning or adjustment difficulties on Beery's Developmental Test of Visual-Motor Integration and Koppitz's version of the Bender-Gestalt test. Results indicated that the tests are related but not equivalent when administered to referred populations. (Author/AL)

  13. The Impact of Test-Taking Behaviors on WISC-IV Spanish Domain Scores in Its Standardization Sample

    Science.gov (United States)

    Oakland, Thomas; Callueng, Carmelo; Harris, Josette G.

    2012-01-01

    The use of individually administered measures of intelligence and other cognitive abilities requires clinicians to monitor a client's test behaviors, given the need for a client to be engaged fully, attentive, and cooperative during the testing process. The use of standardized and norm-referenced measures of test-taking behaviors facilitates this…

  15. Self-Discipline Gives Girls the Edge: Gender in Self-Discipline, Grades, and Achievement Test Scores

    Science.gov (United States)

    Duckworth, Angela Lee; Seligman, Martin E. P.

    2006-01-01

    Throughout elementary, middle, and high school, girls earn higher grades than boys in all major subjects. Girls, however, do not outperform boys on achievement or IQ tests. To date, explanations for the underprediction of girls' GPAs by standardized tests have focused on gender differences favoring boys on such tests. The authors' investigation…

  16. Evaluation of the validity of osteoporosis and fracture risk assessment tools (IOF One Minute Test, SCORE, and FRAX) in postmenopausal Palestinian women.

    Science.gov (United States)

    Kharroubi, Akram; Saba, Elias; Ghannam, Ibrahim; Darwish, Hisham

    2017-12-01

    Simple self-assessment tools are needed to identify women at high risk of developing osteoporosis. In this study, tools like the IOF One Minute Test, Fracture Risk Assessment Tool (FRAX), and Simple Calculated Osteoporosis Risk Estimation (SCORE) were found to be valid for Palestinian women. The threshold for predicting women at risk for each tool was estimated. The purpose of this study is to evaluate the validity of the updated IOF (International Osteoporosis Foundation) One Minute Osteoporosis Risk Assessment Test, FRAX, and SCORE, as well as age alone, to detect the risk of developing osteoporosis in postmenopausal Palestinian women. Three hundred eighty-two women 45 years and older were recruited, including 131 women with osteoporosis and 251 controls following bone mineral density (BMD) measurement; 287 completed questionnaires of the different risk assessment tools. Receiver operating characteristic (ROC) curves were evaluated for each tool using BMD as the gold standard for osteoporosis. The area under the ROC curve (AUC) was the highest for FRAX calculated with BMD for predicting hip fractures (0.897) followed by FRAX for major fractures (0.826) with cut-off values >1.5 and >7.8%, respectively. The IOF One Minute Test AUC (0.629) was the lowest compared to other tested tools but with sufficient accuracy for predicting the risk of developing osteoporosis with a cut-off value of >4 total yes answers out of 18 questions. The SCORE test and age alone were also good predictors of risk for developing osteoporosis. According to the ROC curve for age, women ≥64 years had a higher risk of developing osteoporosis. A higher percentage of women with low BMD (T-score ≤-1.5) or osteoporosis (T-score ≤-2.5) was found among women who were not exposed to the sun, who had menopause before the age of 45 years, or had lower body mass index (BMI) compared to controls. Women who often fall had lower BMI and approximately 27% of the recruited postmenopausal

  17. Cognitive disparities, lead plumbing, and water chemistry: prior exposure to water-borne lead and intelligence test scores among World War Two U.S. Army enlistees.

    Science.gov (United States)

    Ferrie, Joseph P; Rolf, Karen; Troesken, Werner

    2012-01-01

    Higher prior exposure to water-borne lead among male World War Two U.S. Army enlistees was associated with lower intelligence test scores. Exposure was proxied by urban residence and the water pH levels of the cities where enlistees lived in 1930. Army General Classification Test scores were six points lower (nearly 1/3 standard deviation) where pH was 6 (so the water lead concentration for a given amount of lead piping was higher) than where pH was 7 (so the concentration was lower). This difference rose with time exposed. At this time, the dangers of exposure to lead in water were not widely known and lead was ubiquitous in water systems, so these results are not likely the effect of individuals selecting into locations with different levels of exposure. Copyright © 2011 Elsevier B.V. All rights reserved.

  18. Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing. CRESST Report 830

    Science.gov (United States)

    Cai, Li

    2013-01-01

    Lord and Wingersky's (1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined…
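
    For orientation, the original unidimensional recursion can be sketched in a few lines. This shows the Lord and Wingersky (1984) summed-score recursion for dichotomous items at a single ability value, not the hierarchical Version 2.0 extension of the report; the function name and the item probabilities are hypothetical.

```python
def summed_score_distribution(correct_probs):
    """Probability of each summed score 0..n for n dichotomous items.

    correct_probs[i] is the probability of a correct response to item i
    at one fixed ability value; items are assumed conditionally independent.
    """
    dist = [1.0]  # with no items, the summed score is 0 with probability 1
    for p in correct_probs:
        updated = [0.0] * (len(dist) + 1)
        for score, mass in enumerate(dist):
            updated[score] += mass * (1.0 - p)   # item answered incorrectly
            updated[score + 1] += mass * p       # item answered correctly
        dist = updated
    return dist

# Hypothetical probabilities for three items at one ability value.
dist = summed_score_distribution([0.5, 0.5, 0.8])
print(dist)
```

    Repeating the recursion at each quadrature point and weighting by the prior would give the summed-score likelihoods that the abstract refers to, under the same independence assumption.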

  19. The Effects of Teacher and Teacher-librarian High-end Collaboration on Inquiry-based Project Reports and School Monthly Test Scores of Fifth-grade Students

    Directory of Open Access Journals (Sweden)

    Hai-Hon Chen

    2015-07-01

    The purpose of this study was twofold. The first purpose was to establish a high-level collaboration model of integrated instruction between a social studies teacher and a teacher-librarian. The second purpose was to investigate the effects of high-end collaboration on individual and group inquiry-based project reports, as well as on the monthly test scores of fifth-grade students. A quasi-experimental method was adopted, with two classes of elementary school fifth graders in Tainan Municipal City, Taiwan, as the sample. Students were randomly assigned to experimental conditions by class. The 28 students of the experimental group were taught collaboratively by the social studies teacher and the teacher-librarian, while the 27 students of the control group were taught separately by a teacher using a didactic teaching method. The Inquiry-Based Project Record, Inquiry-Based Project Rubrics, and school monthly test scores were used as instruments for collecting data. A t-test and correlation were used to analyze the data. The results indicate that: (1) The high-end collaboration model between the social studies teacher and the teacher-librarian was established and implemented well in the classroom. (2) There was a significant difference between the experimental group and the control group in individual and group inquiry-based project reports. Students taught by the collaborating teachers earned higher scores on both kinds of inquiry-based project report than those taught separately. Students in the experimental group also earned higher school monthly test scores than those in the control group. Suggestions for teachers' high-end collaboration and for future research are provided in this paper.

  20. Receiver-operating characteristic curves for somatic cell scores and California mastitis test in Valle del Belice dairy sheep

    NARCIS (Netherlands)

    Riggio, V.; Pesce, L.L.; Morreale, S.; Portolano, B.

    2013-01-01

    Using receiver-operating characteristic (ROC) curve methodology this study was designed to assess the diagnostic effectiveness of somatic cell count (SCC) and the California mastitis test (CMT) in Valle del Belice sheep, and to propose and evaluate threshold values for those tests that would optimal

  2. Walking in postpoliomyelitis syndrome: The relationships between time-scored tests, walking in daily life and perceived mobility problems

    NARCIS (Netherlands)

    H.L.D. Horemans (Herwin); J.B.J. Bussmann (Hans); A. Beelen (Anita); H.J. Stam (Henk); F. Nollet (Frans)

    2005-01-01

    Objective: To compare walking test results with walking in daily life, and to investigate the relationships between walking tests, walking activity in daily life, and perceived mobility problems in patients with post-poliomyelitis syndrome. Subjects: Twenty-four ambulant patients with po

  3. Adenosine testing during cryoballoon ablation and radiofrequency ablation of atrial fibrillation: A propensity score-matched analysis.

    Science.gov (United States)

    Tokuda, Michifumi; Matsuo, Seiichiro; Isogai, Ryota; Uno, Goki; Tokutake, Kenichi; Yokoyama, Kenichi; Kato, Mika; Narui, Ryohsuke; Tanigawa, Shinichi; Yamashita, Seigo; Inada, Keiichi; Yoshimura, Michihiro; Yamane, Teiichi

    2016-11-01

    The infusion of adenosine triphosphate after radiofrequency (RF) pulmonary vein (PV) isolation (PVI), which may result in acute transient PV-atrium reconnection, can unmask dormant conduction. The purpose of this study was to compare the incidence and characteristics of dormant conduction after cryoballoon (CB) and RF ablation of atrial fibrillation (AF). Of 414 consecutive patients undergoing initial catheter ablation of paroxysmal AF, 246 (59%) propensity score-matched patients (123 CB-PVI and 123 RF-PVI) were included. Dormant conduction was less frequently observed in patients who underwent CB-PVI than in those who underwent RF-PVI (4.5% vs 12.8% of all PVs; P PVI than in those who underwent RF-PVI in the left superior PV (P PVI. Multivariable analysis revealed that a longer time to the elimination of the PV potential (odds ratio 1.018; 95% confidence interval 1.001-1.036; P = .04) and the necessity of touch-up ablation (odds ratio 3.242; 95% confidence interval 2.761-7.111; P PVI. After the elimination of dormant conduction by additional ablation, the AF-free rate was similar in patients with and without dormant conduction after both CB-PVI and RF-PVI (P = .28 and P = .73, respectively). The results of the propensity score-matched analysis showed that dormant PV conduction was less frequent after CB ablation than after RF ablation and was not associated with ablation outcomes. Copyright © 2016 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.

  4. Effect of differing PowerPoint slide design on multiple-choice test scores for assessment of knowledge and retention in a theriogenology course.

    Science.gov (United States)

    Root Kustritz, Margaret V

    2014-01-01

    Third-year veterinary students in a required theriogenology diagnostics course were allowed to self-select attendance at a lecture in either the evening or the next morning. One group was presented with PowerPoint slides in a traditional format (T group), and the other group was presented with PowerPoint slides in the assertion-evidence format (A-E group), which uses a single sentence and a highly relevant graphic on each slide to ensure attention is drawn to the most important points in the presentation. Students took a multiple-choice pre-test, attended lecture, and then completed a take-home assignment. All students then completed an online multiple-choice post-test and, one month later, a different online multiple-choice test to evaluate retention. Groups did not differ on pre-test, assignment, or post-test scores, and both groups showed significant gains from pre-test to post-test and from pre-test to retention test. However, the T group showed significant decline from post-test to retention test, while the A-E group did not. Short-term differences between slide designs were most likely unaffected due to required coursework immediately after lecture, but retention of material was superior with the assertion-evidence slide design.

  5. The Validity of Scores from the "GRE"® revised General Test for Forecasting Performance in Business Schools: Phase One. ETS GRE® Board Research Report. ETS GRE®-14-01. ETS Research Report. RR-14-17

    Science.gov (United States)

    Young, John W.; Klieger, David; Bochenek, Jennifer; Li, Chen; Cline, Fred

    2014-01-01

    Scores from the "GRE"® revised General Test provide important information regarding the verbal and quantitative reasoning abilities and analytical writing skills of applicants to graduate programs. The validity and utility of these scores depend upon the degree to which the scores predict success in graduate and business school in…

  6. Scoring correction for MMPI-2 Hs scale with patients experiencing a traumatic brain injury: a test of measurement invariance.

    Science.gov (United States)

    Alkemade, Nathan; Bowden, Stephen C; Salzman, Louis

    2015-02-01

    It has been suggested that MMPI-2 scoring requires removal of some items when assessing patients after a traumatic brain injury (TBI). Gass (1991. MMPI-2 interpretation and closed head injury: A correction factor. Psychological assessment, 3, 27-31) proposed a correction procedure in line with the hypothesis that MMPI-2 endorsement may be affected by symptoms of TBI. This study assessed the validity of the Gass correction procedure. Participants were a sample of patients with a TBI (n = 242) and a random subset of the MMPI-2 normative sample (n = 1,786). The correction procedure implies a failure of measurement invariance across populations. This study examined measurement invariance of one of the MMPI-2 scales (Hs) that includes TBI correction items. A four-factor model of the MMPI-2 Hs items was defined. The factor model was found to meet the criteria for partial measurement invariance. Analysis of the change in sensitivity and specificity values implied by partial measurement invariance failed to indicate significant practical impact of partial invariance. Overall, the results support continued use of all Hs items to assess psychological well-being in patients with TBI. © The Author 2014. Published by Oxford University Press. All rights reserved.

  7. Validation of Victoria Symptom Validity Test Cutoff Scores among Mild Traumatic Brain Injury Litigants Using a Known-Groups Design.

    Science.gov (United States)

    Silk-Eglit, Graham M; Lynch, Julie K; McCaffrey, Robert J

    2016-05-01

    The Victoria Symptom Validity Test (VSVT) is one of the most accurate performance validity tests. Previous research has recommended several cutoffs for performance invalidity classification on the VSVT. However, only one of these studies used a known groups design and no study has investigated these cutoffs in an exclusively mild traumatic brain injury (mTBI) medico-legal sample. The current study used a known groups design to validate VSVT cutoffs among mild traumatic brain injury litigants and explored the best approach for using the multiple recommended cutoffs for this test. Cutoffs of 6, and <5 items correct on any block yielded the strongest classification accuracy. Using multiple cutoffs in conjunction reduced classification accuracy. Given convergence across studies, a cutoff of <18 Hard items correct is the most appropriate for use with mTBI litigants.

  8. Identification and estimation of nonlinear models using two samples with nonclassical measurement errors

    KAUST Repository

    Carroll, Raymond J.

    2010-05-01

    This paper considers identification and estimation of a general nonlinear Errors-in-Variables (EIV) model using two samples. Both samples consist of a dependent variable, some error-free covariates, and an error-prone covariate, for which the measurement error has unknown distribution and could be arbitrarily correlated with the latent true values; and neither sample contains an accurate measurement of the corresponding true variable. We assume that the regression model of interest - the conditional distribution of the dependent variable given the latent true covariate and the error-free covariates - is the same in both samples, but the distributions of the latent true covariates vary with observed error-free discrete covariates. We first show that the general latent nonlinear model is nonparametrically identified using the two samples when both could have nonclassical errors, without either instrumental variables or independence between the two samples. When the two samples are independent and the nonlinear regression model is parameterized, we propose sieve Quasi Maximum Likelihood Estimation (Q-MLE) for the parameter of interest, and establish its root-n consistency and asymptotic normality under possible misspecification, and its semiparametric efficiency under correct specification, with easily estimated standard errors. A Monte Carlo simulation and a data application are presented to show the power of the approach.

  9. Pre-season adductor squeeze test and HAGOS function sport and recreation subscale scores predict groin injury in Gaelic football players.

    Science.gov (United States)

    Delahunt, Eamonn; Fitzpatrick, Helen; Blake, Catherine

    2017-01-01

    To determine if pre-season adductor squeeze test and HAGOS function, sport and recreation subscale scores can identify Gaelic football players at risk of developing groin injury. Prospective study. Senior inter-county Gaelic football team. Fifty-five male elite Gaelic football players (age = 24.0 ± 2.8 years, body mass = 84.48 ± 7.67 kg, height = 1.85 ± 0.06 m, BMI = 24.70 ± 1.77 kg/m(2)) from a single senior inter-county Gaelic football team. Occurrence of groin injury during the season. Ten time-loss groin injuries were registered representing 13% of all injuries. The odds ratio for sustaining a groin injury if pre-season adductor squeeze test score was below 225 mmHg, was 7.78. The odds ratio for sustaining a groin injury if pre-season HAGOS function, sport and recreation subscale score was football players at risk of developing groin injury. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Bootstrap Score Tests for Fractional Integration in Heteroskedastic ARFIMA Models, with an Application to Price Dynamics in Commodity Spot and Futures Markets

    DEFF Research Database (Denmark)

    Cavaliere, Giuseppe; Nielsen, Morten Ørregaard; Taylor, A.M. Robert

    Empirical evidence from time series methods which assume the usual I(0)/I(1) paradigm suggests that the efficient market hypothesis, stating that spot and futures prices of a commodity should cointegrate with a unit slope on futures prices, does not hold. However, these statistical methods...... fractionally integrated model we are able to find a body of evidence in support of the efficient market hypothesis for a number of commodities. Our new tests are wild bootstrap implementations of score-based tests for the order of integration of a fractionally integrated time series. These tests are designed...... principle do. A Monte Carlo simulation study demonstrates that very significant improvements in finite sample behaviour can be obtained by the bootstrap vis-à-vis the corresponding asymptotic tests in both heteroskedastic and homoskedastic environments....

  11. Genetic analysis of somatic cell score in Danish dairy cattle using random regression test-day model

    DEFF Research Database (Denmark)

    Elsaid, Reda; Sabry, Ayman; Lund, Mogens Sandø

    2011-01-01

    over first lactation, genetic correlations are near unity between any time points in first lactation, and including a Wilmink term will improve the likelihood of more than an extra order Legendre polynomial. Ten data sets, consisting of 1,190,584 test day somatic cell count (SCC) records from 149...... with fifth order LP for PE effect and genetic effect were adequate to fit the data. The average heritability differed over the lactation and was lowest at the beginning (0.098) and higher at the end of lactation (0.138 to 0.151). Genetic correlations between daily SCS were high for adjacent tests (nearly 1...

  12. Score Correlation

    OpenAIRE

    Fabián, Z. (Zdeněk)

    2010-01-01

    In this paper, we study a distribution-dependent correlation coefficient based on the concept of scalar score. This new measure of association of continuous random variables is compared by means of simulation experiments with the Pearson, Kendall and Spearman correlation coefficients.

  13. Multiple Intelligence Scores of Science Stream Students and Their Relation with Reading Competency in Malaysian University English Test (MUET)

    Science.gov (United States)

    Razak, Norizan Abdul; Zaini, Nuramirah

    2014-01-01

    Much research has shown that different approaches are needed in analysing linear and non-linear reading comprehension texts and that different cognitive skills are required. This research attempts to discover the relationship between Science Stream students' reading competency on linear and non-linear texts in Malaysian University English Test (MUET) with…

  14. Course Enrollments and Subsequent Success after Being Advised of or 'Blind' to Assessment Test Scores and Course Recommendations.

    Science.gov (United States)

    Jue, Penny Y.

    Beginning in fall 1991, Napa Valley College (NVC), in California, switched from essentially mandatory placement of incoming students to an advisory, self-selection system, where students receive course recommendations based on assessment test results. In order to evaluate the validity of NVC's assessment procedures, a study was conducted of…

  15. Evaluation of a weighted test in the analysis of ordinal gait scores in an additivity model for five OP pesticides.

    Science.gov (United States)

    Appropriate statistical analyses are critical for evaluating interactions of mixtures with a common mode of action, as is often the case for cumulative risk assessments. Our objective is to develop analyses for use when a response variable is ordinal, and to test for interaction...

  16. Something That Test Scores Do Not Show: Engaging in Community Diversity as a Local Response to Global Education Trends

    Science.gov (United States)

    Valdiviezo, Laura A.

    2014-01-01

    At Smith Street Elementary School, the globalizing education trends that English language learner (ELL) teachers face focus on measuring student achievement through testing and the English mainstreaming of non-dominant students as opposed to the cultivation of the students' linguistic and cultural diversity. The ELL teachers at Smith Street…

  17. The Test Matters: The Relationship between Classroom Observation Scores and Teacher Value Added on Multiple Types of Assessment

    Science.gov (United States)

    Grossman, Pam; Cohen, Julie; Ronfeldt, Matthew; Brown, Lindsay

    2014-01-01

    In this study, we examined how the relationships between one observation protocol, the Protocol for Language Arts Teaching Observation (PLATO), and value-added measures shift when different tests are used to assess student achievement. Using data from the Measures of Effective Teaching Project, we found that PLATO was more strongly related to the…

  18. Effect of Frequent Peer-Monitored Testing and Personal Goal Setting on Fitnessgram Scores of Hispanic Middle School Students

    Science.gov (United States)

    Hill, Grant; Downing, Aaron

    2015-01-01

    The purpose of this study was to determine the effects of frequent peer-monitored Fitnessgram testing, with student goal setting, on the PACER and push-up performance of middle school students. Subjects were 176 females and 189 males in 10 physical education classes at a middle school with an 83.7% Hispanic student population. Students were…

  19. THE EFFECT OF AGE AS A VARIABLE ON THE SCORES OF THE HARRIS-GOODENOUGH DRAWING TEST OF EDUCABLE RETARDATES.

    Science.gov (United States)

    LEVY, IRWIN S.

    IN ORDER TO DETERMINE THE RELIABILITY OF PERFORMANCE OF RETARDED ADOLESCENTS ON THE HARRIS REVISION OF THE GOODENOUGH DRAW-A-MAN TEST (DAM) AND WHETHER THE DECLINE IN PERFORMANCE WHICH OCCURS IN NORMAL ADOLESCENTS AT THE MID-TEENS ALSO OCCURS WITH RETARDED ADOLESCENTS, 213 MALE AND 130 FEMALE SUBJECTS, AGED 11-20 YEARS AND WITH IQ'S OF 56-72, IN…

  20. Investigating the Effect of Sympathetic Skin Response Parameters on the Psychological Test Scores in Patients with Fibromyalgia Syndrome by Using ANNS

    Directory of Open Access Journals (Sweden)

    Murat Yıldız

    2013-01-01

    In this study, psychological tests such as the Visual Analogue Pain Scale, Verbal Pain Scale, Beck Depression Inventory, Beck Anxiety Inventory, Hamilton Depression Rating Scale and Hamilton Anxiety Scale were applied to selected healthy subjects and patients with Fibromyalgia Syndrome (FMS) in Suleyman Demirel University, Faculty of Medicine, Department of Physical Medicine and Rehabilitation, and the scores were recorded. A measurement system was established in the same department to measure the sympathetic skin response (SSR) from the subjects. The SSR was measured and recorded. Parameters such as latency time, maximum amplitude and elapsed time were calculated from the recorded SSR data using Matlab software. The SSR parameters were added to the scores, and the diagnosis accuracy percentages for FMS were calculated using artificial neural networks (ANNs). Results from the simulations showed that the specified SSR parameters and FMS were related and that these parameters can be used as a diagnostic method in FMS.

  1. Interpreting Standardized Assessment Test Scores and Setting Performance Goals in the Context of Student Characteristics: The Case of the Major Field Test in Business

    Science.gov (United States)

    Bielinska-Kwapisz, Agnieszka; Brown, F. William; Semenik, Richard

    2012-01-01

    The Major Field Test in Business (MFT-B), a standardized assessment test of business knowledge among undergraduate business seniors, is widely used to measure student achievement. The Educational Testing Service, publisher of the assessment, provides data that allow institutions to compare their own MFT-B performance to national norms, but that…

  2. Exploration of Analysis Methods for Diagnostic Imaging Tests: Problems with ROC AUC and Confidence Scores in CT Colonography.

    OpenAIRE

    Susan Mallett; Steve Halligan; Gary S Collins; Altman, Doug G.

    2014-01-01

    BACKGROUND: Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. METHODS: In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using ...

  3. Receiver-operating characteristic curves for somatic cell scores and California mastitis test in Valle del Belice dairy sheep.

    Science.gov (United States)

    Riggio, Valentina; Pesce, Lorenzo L; Morreale, Salvatore; Portolano, Baldassare

    2013-06-01

    Using receiver-operating characteristic (ROC) curve methodology this study was designed to assess the diagnostic effectiveness of somatic cell count (SCC) and the California mastitis test (CMT) in Valle del Belice sheep, and to propose and evaluate threshold values for those tests that would optimally discriminate between healthy and infected udders. Milk samples (n=1357) were collected from 684 sheep in four flocks. The prevalence of infection, as determined by positive bacterial culture was 0.36, 87.7% of which were minor and 12.3% major pathogens. Of the culture negative samples, 83.7% had an SCCCMT results were evaluated, the estimated area under the ROC curve was greater for glands infected with major compared to minor pathogens (0.88 vs. 0.73), whereas the area under the curve considering all pathogens was similar to the one for minor pathogens (0.75). The estimated optimal thresholds were 3.00 (CMT), 2.81 (SCS for the whole sample), 2.81 (SCS for minor pathogens), and 3.33 (SCS for major pathogens). These correctly classified, respectively, 69.0%, 73.5%, 72.6% and 91.0% of infected udders in the samples. The CMT appeared only to discriminate udders infected with major pathogens. In this population, SCS appeared to be the best indirect test of the bacteriological status of the udder. Copyright © 2012 Elsevier Ltd. All rights reserved.
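
    The abstract does not state how the optimal thresholds were chosen. As a rough sketch only, assuming the commonly used Youden index (sensitivity + specificity - 1) as the criterion, a threshold search and a rank-based area under the ROC curve could look like the following. The function names and the data are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random infected sample scores higher
    than a random healthy one (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximising Youden's J.
    A sample is classified as infected when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1:]


# Illustrative data only: 1 = culture positive, 0 = culture negative
scores = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 0, 1, 1]
print(roc_auc(scores, labels))
print(youden_threshold(scores, labels))
```

    Other criteria (for example, weighting false negatives more heavily than false positives) would generally give different thresholds, so the choice of criterion is part of the design.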

  4. Montreal Cognitive Assessment for screening mild cognitive impairment: variations in test performance and scores by education in Singapore.

    Science.gov (United States)

    Ng, Tze Pin; Feng, Lei; Lim, Wee Shiong; Chong, Mei Sian; Lee, Tih Shih; Yap, Keng Bee; Tsoi, Tung; Liew, Tau Ming; Gao, Qi; Collinson, Simon; Kandiah, Nagaendran; Yap, Philip

    2015-01-01

    The Montreal Cognitive Assessment (MoCA) was developed as a screening instrument for mild cognitive impairment (MCI). We evaluated the MoCA's test performance by educational groups among older Singaporean Chinese adults. The MoCA and Mini-Mental State Examination (MMSE) were evaluated in two independent studies (clinic-based sample and community-based sample) of MCI and normal cognition (NC) controls, using receiver operating characteristic curve analyses: area under the curve (AUC), sensitivity (Sn), and specificity (Sp). The MoCA modestly discriminated MCI from NC in both study samples (AUC = 0.63 and 0.65): Sn = 0.64 and Sp = 0.36 at a cut-off of 28/29 in the clinic-based sample, and Sn = 0.65 and Sp = 0.55 at a cut-off of 22/23 in the community-based sample. The MoCA's test performance was least satisfactory in the highest (>6 years) education group: AUC = 0.50 (p = 0.98), Sn = 0.54, and Sp = 0.51 at a cut-off of 27/28. Overall, the MoCA's test performance was not better than that of the MMSE. In multivariate analyses controlling for age and gender, MCI diagnosis was associated with a education was associated with a 3- to 5-point decrement (η(2) = 0.115 and η(2) = 0.162, respectively). The MoCA's ability to discriminate MCI from NC was modest in this Chinese population, because it was far more sensitive to the effect of education than MCI diagnosis. © 2015 S. Karger AG, Basel.

  5. Citizen Science: The Small World Initiative Improved Lecture Grades and California Critical Thinking Skills Test Scores of Nonscience Major Students at Florida Atlantic University.

    Science.gov (United States)

    Caruso, Joseph P; Israel, Natalie; Rowland, Kimberly; Lovelace, Matthew J; Saunders, Mary Jane

    2016-03-01

    Course-based undergraduate research is known to improve science, technology, engineering, and mathematics student achievement. We tested "The Small World Initiative, a Citizen-Science Project to Crowdsource Novel Antibiotic Discovery" to see if it also improved student performance and the critical thinking of non-science majors in Introductory Biology at Florida Atlantic University (a large, public, minority-dominant institution) in academic year 2014-15. California Critical Thinking Skills Test pre- and posttests were offered to both Small World Initiative (SWI) and control lab students for formative amounts of extra credit. SWI lab students earned significantly higher lecture grades than control lab students, had significantly fewer lecture grades of D+ or lower, and had significantly higher critical thinking posttest total scores than control students. Lastly, more SWI students were engaged while taking critical thinking tests. These results support the hypothesis that utilizing independent course-based undergraduate science research improves student achievement even in nonscience students.

  6. Citizen Science: The Small World Initiative Improved Lecture Grades and California Critical Thinking Skills Test Scores of Nonscience Major Students at Florida Atlantic University

    Directory of Open Access Journals (Sweden)

    Joseph Paul Caruso

    2015-12-01

    Course-based undergraduate research is known to improve science, technology, engineering, and mathematics student achievement. We tested “The Small World Initiative, a Citizen-Science Project to Crowdsource Novel Antibiotic Discovery” to see if it also improved student performance and the critical thinking of nonscience majors in Introductory Biology at Florida Atlantic University (a large, public, minority-dominant institution) in academic year 2014–15. California Critical Thinking Skills Test pre- and posttests were offered to both Small World Initiative (SWI) and control lab students for formative amounts of extra credit. SWI lab students earned significantly higher lecture grades than control lab students, had significantly fewer lecture grades of D+ or lower, and had significantly higher critical thinking posttest total scores than control students. Lastly, more SWI students were engaged while taking critical thinking tests. These results support the hypothesis that utilizing independent course-based undergraduate science research improves student achievement even in nonscience students.

  7. Poor visualization during direct laryngoscopy and high upper lip bite test score are predictors of difficult intubation with the GlideScope videolaryngoscope.

    Science.gov (United States)

    Tremblay, Marie-Hélène; Williams, Stephan; Robitaille, Arnaud; Drolet, Pierre

    2008-05-01

    The GlideScope videolaryngoscope allows equal or superior glottic visualization compared with direct laryngoscopy, but predictive features for difficult GlideScope intubation have not been identified. We undertook this prospective study to identify patient characteristics associated with difficult GlideScope intubation. Demographic and morphometric factors were recorded preoperatively for 400 patients undergoing anesthesia with endotracheal intubation. After induction, direct laryngoscopy was performed in all patients to assess the Cormack and Lehane grade of glottic visualization followed by GlideScope intubation. The number of attempts and time needed for intubation were recorded. Univariate and multivariate analyses were performed to identify the characteristics associated with difficult GlideScope intubation. Intubation required 1, 2, and 3 attempts in 342, 48, and 9 participants, respectively, with one failure. Mean time for intubation was 21 +/- 14 s. After univariate analysis, the following characteristics were significantly correlated (P intubate and/or multiple attempts: older age, male sex, history of snoring, high Mallampati class, small mouth opening, short sternothyroid and manubriomental distances, large neck circumference, high upper lip bite test score, and high Cormack and Lehane grade during direct laryngoscopy. However, after introducing these variables in nominal logistic and proportional hazard multiple regression models, only high Cormack and Lehane grade during direct laryngoscopy, high upper lip bite test score, and short sternothyroid distance were significantly associated with multiple attempts or lengthier intubations. Despite a high success rate, intubation with the GlideScope is likely to be more challenging in patients with high Cormack and Lehane grade during direct laryngoscopy, high upper lip bite test score, or short sternothyroid distance.

  8. Regression-Based Norms for a Bi-factor Model for Scoring the Brief Test of Adult Cognition by Telephone (BTACT).

    Science.gov (United States)

    Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E

    2015-05-01

    The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
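
    The general idea of a regression-based norm can be sketched as follows: fit a regression of the score on demographics in a normative sample, then express an individual's score as a standardized residual. This is a minimal illustration with a single predictor (age) and invented data. It is not the authors' eight models, and the function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (intercept, slope, residual standard error)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rse = math.sqrt(sum(r * r for r in resid) / (n - 2))
    return intercept, slope, rse


def adjusted_z(score, age, model):
    """Demographically adjusted z-score: observed minus predicted score,
    divided by the residual standard error of the normative model."""
    intercept, slope, rse = model
    return (score - (intercept + slope * age)) / rse


# Invented normative sample: age in years and a cognitive factor score
ages = [30, 40, 50, 60, 70, 80]
factor_scores = [0.9, 0.5, 0.2, -0.3, -0.5, -1.0]
model = fit_line(ages, factor_scores)
print(adjusted_z(-0.4, 75, model))  # a 75-year-old scoring -0.4
```

    With several predictors (education, gender, occupation) the same logic applies, with a multiple regression replacing the single-predictor fit.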

  9. Regression-Based Norms for a Bi-factor Model for Scoring the Brief Test of Adult Cognition by Telephone (BTACT)

    Science.gov (United States)

    Gurnani, Ashita S.; John, Samantha E.; Gavett, Brandon E.

    2015-01-01

    The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. PMID:25724515

  10. A more robust predictor of ideomotor dyspraxia: study on an alternative scoring method of the Bergès-Lézine's Imitation of Gestures test.

    Science.gov (United States)

    Vaivre-Douret, L

    2002-01-01

    Use of the traditional Bergès-Lézine standardization [Test d'imitation de gestes (1963).] allowed us to confirm praxic disorders in children who are encountering obvious motor difficulties. However, in comparison to other neuropsychological assessments carried out on these children, it does not enable us to precociously pinpoint disorders in praxic organization. By means of a newly evaluated method (1997) developed on the basis of the Bergès-Lézine Imitation of Gestures test (1963), we retroactively assessed a group of children (N=10) who had been observed in a longitudinal study at the age of 3-5 years and at 7-8 years and assessed with the Bergès-Lézine version (1963) of the Imitation of Gestures test. Our revised test (1997) takes into account the quantitative factor of success, as well as the qualitative factor of movement planning. It facilitates the early detection of motor organization disorders, in correlation with other neuropsychological assessments carried out on these children. Comparative clinical findings with the same group of children tested using the Bergès-Lézine version and ours indicate that our version detects, more robustly, children encountering difficulties resulting from ideomotor dyspraxia, not identified by the Bergès-Lézine test (1963). Our alternative scoring method of Bergès-Lézine's test contributes largely to early detection of instrumental difficulties in children. Additionally, its predictive capacity makes it possible to apprehend disorders in distal and digital neuromotor functions.

  11. The Relationship between the Test of English as a Foreign Language (TOEFL), the International English Language Testing System (IELTS) Scores and Academic Success of International Master's Students

    Science.gov (United States)

    Arcuino, Cathy Lee T.

    2013-01-01

    The purpose of this study was to examine if the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) are related to academic success defined by final cumulative grade point average (GPA). The data sample, from three Midwestern universities, was comprised of international graduate students who…

  13. Low bone mineral density in COPD patients with osteoporosis is related to low daily physical activity and high COPD assessment test scores

    Directory of Open Access Journals (Sweden)

    Liu WT

    2015-09-01

    Wen-Te Liu,1,2,* Han-Pin Kuo,3,* Tien-Hua Liao,4 Ling-Ling Chiang,1 Li-Fei Chen,3 Min-Fang Hsu,5 Hsiao-Chi Chuang,1 Kang-Yun Lee,2,6 Chien-Da Huang,3 Shu-Chuan Ho1. 1School of Respiratory Therapy, College of Medicine, Taipei Medical University, 2Division of Pulmonary Medicine, Department of Internal Medicine, Shuang Ho Hospital, Taipei Medical University, 3Department of Thoracic Medicine, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, 4Department of Respiratory Therapy, Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Taipei, 5Department of Healthcare Administration, Asia University, Wufeng, Taichung, 6Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan. *These authors contributed equally to this work. Abstract: COPD patients have an increased prevalence of osteoporosis (OP) compared with healthy people. Physical inactivity in COPD patients is a crucial risk factor for OP; the COPD assessment test (CAT) is the newest assessment tool for the health status and daily activities of COPD patients. This study investigated the relationship among daily physical activity (DPA), CAT scores, and bone mineral density (BMD) in COPD patients with or without OP. This study included 30 participants. Ambulatory DPA was measured using actigraphy and oxygen saturation by using a pulse oximeter. BMD was measured using dual-energy X-ray absorptiometry. OP was defined as a T-score (standard deviations from a young, sex-specific reference mean BMD) less than or equal to -2.5 SD for the lumbar spine, total hip, and femoral neck. We quantified oxygen desaturation during DPA by using a desaturation index and recorded all DPA, except during sleep. COPD patients with OP had lower DPA and higher CAT scores than those of patients without OP. DPA was significantly positively correlated with (lumbar spine, total hip, and femoral neck) BMD (r=0.399, 0.602, 0.438, respectively
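
    The T-score definition used in the abstract can be illustrated with a short sketch. The reference means and standard deviations below are placeholders rather than values from the study, and the abstract does not make clear whether the cutoff had to be met at one site or at all three, so the combination rule is left as a parameter (defaulting, as an assumption, to any site).

```python
def t_score(bmd, young_ref_mean, young_ref_sd):
    """Number of SDs the measured BMD lies from the young reference mean."""
    return (bmd - young_ref_mean) / young_ref_sd


def is_osteoporotic(site_t_scores, cutoff=-2.5, rule=any):
    """Apply the T-score <= cutoff definition across skeletal sites.
    rule=any flags OP if at least one site meets the cutoff (an assumption);
    pass rule=all to require every site."""
    return rule(t <= cutoff for t in site_t_scores.values())


# Hypothetical (mean, SD) reference values in g/cm^2, for illustration only
reference = {
    "lumbar_spine": (1.05, 0.12),
    "total_hip": (0.95, 0.12),
    "femoral_neck": (0.85, 0.12),
}
patient_bmd = {"lumbar_spine": 0.72, "total_hip": 0.80, "femoral_neck": 0.70}
patient_t = {
    site: t_score(patient_bmd[site], *reference[site]) for site in reference
}
print(patient_t)
print(is_osteoporotic(patient_t))
```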

  14. Questionnaire design: carry-over effects of overall acceptance question placement and pre-evaluation instructions on overall acceptance scores in central location tests.

    Science.gov (United States)

    Bastian, Mauresa; Eggett, Dennis L; Jefferies, Laura K

    2015-02-01

    Question placement and usage of pre-evaluation instructions (PEI) in questionnaires for food sensory analysis may bias consumers' scores via carry-over effects. Data from consumer sensory panels previously conducted at a central location, spanning 11 years and covering a broad range of food product categories, were compiled. Overall acceptance (OA) question placement was studied with categories designated as first (the first evaluation question following demographic questions), after nongustation questions (immediately following questions that do not require panelists to taste the product), and later (following all other hedonic and just-about-right [JAR] questions, but occasionally before ranking, open-ended comments, and/or intent to purchase questions). Each panel was categorized as having or not having PEI in the questionnaire; PEI are instructions that appear immediately before the first evaluation question and show panelists all attributes they will evaluate before receiving test samples. Postpanel surveys were administered regarding the self-reported effect of PEI on panelists' evaluation experience. OA scores were analyzed and compared (1) between OA question placement categories and (2) between panels with and without PEI. For most product categories, OA scores tended to be lower when asked later in the questionnaire, suggesting evidence of a carry-over effect. Usage of PEI increased OA scores by 0.10 of a 9-point hedonic scale point, which is not practically significant. Postpanel survey data showed that presence of PEI typically improved the panelists' experience. Using PEI does not appear to introduce a meaningful carry-over effect. © 2015 Institute of Food Technologists®

  15. In Vitro Testing of Scaffolds for Mesenchymal Stem Cell-Based Meniscus Tissue Engineering-Introducing a New Biocompatibility Scoring System.

    Science.gov (United States)

    Achatz, Felix P; Kujat, Richard; Pfeifer, Christian G; Koch, Matthias; Nerlich, Michael; Angele, Peter; Zellner, Johannes

    2016-04-07

    A combination of mesenchymal stem cells (MSCs) and scaffolds seems to be a promising approach for meniscus repair. To facilitate the search for an appropriate scaffold material a reliable and objective in vitro testing system is essential. This paper introduces a new scoring for this purpose and analyzes a hyaluronic acid (HA) gelatin composite scaffold and a polyurethane scaffold in combination with MSCs for tissue engineering of meniscus. The pore quality and interconnectivity of pores of a HA gelatin composite scaffold and a polyurethane scaffold were analyzed by surface photography and Berliner-Blau-BSA-solution vacuum filling. Further the two scaffold materials were vacuum-filled with human MSCs and analyzed by histology and immunohistochemistry after 21 days in chondrogenic media to determine cell distribution and cell survival as well as proteoglycan production, collagen type I and II content. The polyurethane scaffold showed better results than the hyaluronic acid gelatin composite scaffold, with signs of central necrosis in the HA gelatin composite scaffolds. The polyurethane scaffold showed good porosity, excellent pore interconnectivity, good cell distribution and cell survival, as well as an extensive content of proteoglycans and collagen type II. The polyurethane scaffold seems to be a promising biomaterial for a mesenchymal stem cell-based tissue engineering approach for meniscal repair. The new score could be applied as a new standard for in vitro scaffold testing.

  16. A study of the score test in discrimination poisson and zero-inflated poisson models - doi: 10.4025/actascitechnol.v35i2.15071

    Directory of Open Access Journals (Sweden)

    Vanessa Siqueira Peres da Silva

    2013-04-01

    In many experimental situations the sample may present an excess of zero observations, and zero-inflated probabilistic models are generally used to represent them. However, it is not known precisely how many zero observations these models support; depending on the sample size and the number of null observations, the Poisson model can still be used. Based on this question, the objective of this paper is to evaluate the Type I error and power of the score test proposed by Van Den Broek (1995) to discriminate between the Poisson and zero-inflated Poisson (ZIP) models, and to ascertain the most appropriate model to represent a sample with excess zeros without compromising the statistical inference. Through Monte Carlo simulation we concluded that, for a sample of size at least n = 40 with 30% null observations, the score test had high discriminatory power between the ZIP and Poisson models, indicating that use of the ZIP model is in fact relevant.
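
    For readers unfamiliar with the test, the sketch below computes the score statistic in the form it is usually given for a sample without covariates: with the Poisson mean estimated by the sample mean and p0 = exp(-mean), the statistic is [sum((1{y=0} - p0) / p0)]^2 / [n(1 - p0)/p0 - n*mean], referred to a chi-squared distribution with 1 degree of freedom. The function name and the data are hypothetical; the sample merely mimics the n = 40, 30% zeros setting and is not from the paper's simulations.

```python
import math


def zip_score_test(counts):
    """Score test of Poisson against zero-inflated Poisson (no covariates).
    Returns (statistic, p_value) using a chi-squared(1) reference."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("all counts are zero; the statistic is undefined")
    p0 = math.exp(-mean)              # Poisson probability of a zero
    n0 = sum(1 for y in counts if y == 0)
    numerator = ((n0 - n * p0) / p0) ** 2
    denominator = n * (1 - p0) / p0 - n * mean
    stat = numerator / denominator
    # Upper tail of chi-squared with 1 df: P(Z^2 > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Illustrative sample: 40 observations, 12 of them (30%) zeros
sample = [0] * 12 + [3] * 28
stat, p_value = zip_score_test(sample)
print(stat, p_value)
```

    A statistic above the chi-squared(1) critical value of about 3.84 would reject the plain Poisson model at the 5% level.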

  17. In Vitro Testing of Scaffolds for Mesenchymal Stem Cell-Based Meniscus Tissue Engineering—Introducing a New Biocompatibility Scoring System

    Directory of Open Access Journals (Sweden)

    Felix P. Achatz

    2016-04-01

    A combination of mesenchymal stem cells (MSCs) and scaffolds seems to be a promising approach for meniscus repair. To facilitate the search for an appropriate scaffold material a reliable and objective in vitro testing system is essential. This paper introduces a new scoring for this purpose and analyzes a hyaluronic acid (HA) gelatin composite scaffold and a polyurethane scaffold in combination with MSCs for tissue engineering of meniscus. The pore quality and interconnectivity of pores of a HA gelatin composite scaffold and a polyurethane scaffold were analyzed by surface photography and Berliner-Blau-BSA-solution vacuum filling. Further the two scaffold materials were vacuum-filled with human MSCs and analyzed by histology and immunohistochemistry after 21 days in chondrogenic media to determine cell distribution and cell survival as well as proteoglycan production, collagen type I and II content. The polyurethane scaffold showed better results than the hyaluronic acid gelatin composite scaffold, with signs of central necrosis in the HA gelatin composite scaffolds. The polyurethane scaffold showed good porosity, excellent pore interconnectivity, good cell distribution and cell survival, as well as an extensive content of proteoglycans and collagen type II. The polyurethane scaffold seems to be a promising biomaterial for a mesenchymal stem cell-based tissue engineering approach for meniscal repair. The new score could be applied as a new standard for in vitro scaffold testing.

  18. A structured approach to control of Salmonella Dublin in 10 Danish dairy herds based on risk scoring and test-and-manage procedures

    DEFF Research Database (Denmark)

    Nielsen, Liza Rosenbaum; Nielsen, Søren Saxmose

    2012-01-01

    stock and adult cattle in 10 case herds that were followed for more than three years. The five steps in the structured approach were: 1) risk scoring to determine transmission routes within the herd and into the herd; 2) determining a plan of action; 3) performing management changes to close important...... routes of infection; 4) interpretation of repeated testing of individual animals to detect high-risk animals for special hygienic management or culling; and 5) diagnostic testing of different age groups and bulk tank milk to evaluate progress of control over time. Serology, true prevalence estimates...... and changes in herd classification in the Danish surveillance programme for Salmonella Dublin were used to assess the progress in the herds during and after the control period. Effective control of Salmonella Dublin was achieved in all participating herds through management that focused on closing infection...

  19. Test anxiety and performance-avoidance goals explain gender differences in SAT-V, SAT-M, and overall SAT scores.

    Science.gov (United States)

    Hannon, Brenda

    2012-11-01

    This study uses analysis of co-variance in order to determine which cognitive/learning (working memory, knowledge integration, epistemic belief of learning) or social/personality factors (test anxiety, performance-avoidance goals) might account for gender differences in SAT-V, SAT-M, and overall SAT scores. The results revealed that none of the cognitive/learning factors accounted for gender differences in SAT performance. However, the social/personality factors of test anxiety and performance-avoidance goals each separately accounted for all of the significant gender differences in SAT-V, SAT-M, and overall SAT performance. Furthermore, when the influences of both of these factors were statistically removed simultaneously, all non-significant gender differences reduced further to become trivial by Cohen's (1988) standards. Taken as a whole, these results suggest that gender differences in SAT-V, SAT-M, and overall SAT performance are a consequence of social/learning factors.

  20. Finite sampling inequalities: an application to two-sample Kolmogorov-Smirnov statistics.

    Science.gov (United States)

    Greene, Evan; Wellner, Jon A

    2016-12-01

    We review a finite-sampling exponential bound due to Serfling and discuss related exponential bounds for the hypergeometric distribution. We then discuss how such bounds motivate some new results for two-sample empirical processes. Our development complements recent results by Wei and Dudley (2012) concerning exponential bounds for two-sided Kolmogorov-Smirnov statistics by giving corresponding results for one-sided statistics, with emphasis on "adjusted" inequalities of the type proved originally by Dvoretzky et al. (1956) and by Massart (1990) for one-sample versions of these statistics.
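    To make the objects in this abstract concrete, the following sketch computes the one-sided and two-sided two-sample Kolmogorov-Smirnov statistics from empirical distribution functions, together with the classical one-sample Dvoretzky-Kiefer-Wolfowitz bound with Massart's constant, P(sup|F_n - F| > eps) <= 2*exp(-2*n*eps^2). The function names are illustrative assumptions; the new two-sample bounds derived in the paper are not reproduced here.

```python
import math
from bisect import bisect_right


def ks_two_sample(x, y):
    """Return (D_plus, D_minus, D) for samples x and y.

    D_plus  = sup_t (F_m(t) - G_n(t)), D_minus = sup_t (G_n(t) - F_m(t)),
    D = max(D_plus, D_minus). The suprema are attained at observed points.
    """
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    d_plus = d_minus = 0.0
    for t in xs + ys:
        # Empirical CDF values at t for each sample.
        diff = bisect_right(xs, t) / m - bisect_right(ys, t) / n
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    return d_plus, d_minus, max(d_plus, d_minus)


def dkw_massart_bound(n, eps):
    """One-sample two-sided DKW bound with Massart's constant 2."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps * eps))


print(ks_two_sample([1, 2, 3], [4, 5, 6]))
print(dkw_massart_bound(100, 0.1))
```

    The one-sided statistics D_plus and D_minus are the quantities for which the abstract reports new exponential bounds; the one-sample bound is shown only as the familiar point of comparison.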

  1. Analysis of Two-sample Censored Data Using a Semiparametric Mixture Model

    Institute of Scientific and Technical Information of China (English)

    Gang Li; Chien-tai Lin

    2009-01-01

    In this article we study a semiparametric mixture model for the two-sample problem with right censored data. The model implies that the densities for the continuous outcomes are related by a parametric tilt but otherwise unspecified. It provides a useful alternative to the Cox (1972) proportional hazards model for the comparison of treatments based on right censored survival data. We propose an iterative algorithm for the semiparametric maximum likelihood estimates of the parametric and nonparametric components of the model. The performance of the proposed method is studied using simulation. We illustrate our method in an application to melanoma.

  2. From neural oscillations to reasoning ability: Simulating the effect of the theta-to-gamma cycle length ratio on individual scores in a figural analogy test.

    Science.gov (United States)

    Chuderski, Adam; Andrelczyk, Krzysztof

    2015-02-01

    Several existing computational models of working memory (WM) have predicted a positive relationship (later confirmed empirically) between WM capacity and the individual ratio of theta to gamma oscillatory band lengths. These models assume that each gamma cycle represents one WM object (e.g., a binding of its features), whereas the theta cycle integrates such objects into the maintained list. As WM capacity strongly predicts reasoning, it might be expected that this ratio also predicts performance in reasoning tasks. However, no computational model has yet explained how the differences in the theta-to-gamma ratio found among adult individuals might contribute to their scores on a reasoning test. Here, we propose a novel model of how WM capacity constrains figural analogical reasoning, aimed at explaining inter-individual differences in reasoning scores in terms of the characteristics of oscillatory patterns in the brain. In the model, the gamma cycle encodes the bindings between objects/features and the roles they play in the relations processed. Asynchrony between consecutive gamma cycles results from lateral inhibition between oscillating bindings. Computer simulations showed that achieving the highest WM capacity required reaching the optimal level of inhibition. When too strong, this inhibition eliminated some bindings from WM, whereas, when inhibition was too weak, the bindings became unstable and fell apart or became improperly grouped. The model aptly replicated several empirical effects and the distribution of individual scores, as well as the patterns of correlations found in the 100-person sample attempting the same reasoning task. Most importantly, the model's reasoning performance strongly depended on its theta-to-gamma ratio in the same way as the performance of human participants depended on their WM capacity. The data suggest that proper regulation of oscillations in the theta and gamma bands may be crucial for both high WM capacity and effective complex

  3. Evidence of linkage of HDL level variation to APOC3 in two samples with different ascertainment.

    Science.gov (United States)

    Gagnon, France; Jarvik, Gail P; Motulsky, Arno G; Deeb, Samir S; Brunzell, John D; Wijsman, Ellen M

    2003-11-01

    The APOA1-C3-A4-A5 gene complex encodes genes whose products are implicated in the metabolism of HDL and/or triglycerides. Although the relationship between polymorphisms in this gene cluster and dyslipidemias was first reported more than 15 years ago, association and linkage results have remained inconclusive. This is due, in part, to the oligogenic and multivariate nature of dyslipidemic phenotypes. Therefore, we investigate evidence of linkage of APOC3 and HDL using two samples of dyslipidemic pedigrees: familial combined hyperlipidemia (FCHL) and isolated low-HDL (ILHDL). We used a strategy that deals with several difficulties inherent in the study of complex traits: by using a Bayesian Markov Chain Monte Carlo (MCMC) approach we allow for oligogenic trait models, as well as simultaneous incorporation of covariates, in the context of multipoint analysis. By using this approach on extended pedigrees we provide evidence of linkage of APOC3 and HDL level variation in two samples with different ascertainment. In addition to APOC3, we estimate that two to three genes, each with a substantial effect on total variance, are responsible for HDL variation in both data sets. We also provide evidence, using the FCHL data set, for a pleiotropic effect between HDL, HDL3 and triglycerides at the APOC3 locus.

  4. Impact of clinical, psychological, and social factors on decreased Tinetti test score in community-living elderly subjects: a prospective study with two-year follow-up.

    Science.gov (United States)

    Manckoundia, Patrick; Thomas, Frédérique; Buatois, Séverine; Guize, Louis; Jégo, Bertrand; Aquino, Jean-Pierre; Benetos, Athanase

    2008-06-01

    Balance and gait are essential to maintain physical autonomy, particularly in elderly people. Thus the detection of risk factors of balance and gait impairment appears necessary in order to prevent falls and dependency. The objective of this study was to analyze the impact of demographic, social, clinical, psychological, and biological parameters on the decline in balance and gait assessed by the Tinetti test (TT) after a two-year follow-up. This prospective study was conducted among community-living, young elderly volunteers in the centre "Investigations Preventives et Cliniques" and "Observatoire De l'Age" (Paris, France). Three hundred and forty-four participants aged 63.5 on average were enrolled and performed the TT twice, once at inclusion and again two years later. After the two-year follow-up, two groups were constituted according to whether or not there was a decrease in the TT score: the "TT no-deterioration" group comprised subjects with a decrease of less than two points and the "TT deterioration" group comprised those with a decrease of two points or more. Selected demographic, social, clinical, psychological, and biological parameters for the two groups were then compared. Statistical analysis showed that female sex, advanced age, high body mass index, osteoarticular pain, and a high level of anxiety all have a negative impact on TT score. Knowledge of predictive factors of the onset or worsening of balance and gait disorders could allow clinicians to detect young elderly people who should benefit from a specific prevention program.

  5. Investigating the Utility of Analytic Scoring for the TOEFL Academic Speaking Test (TAST). TOEFL iBT Research Report. TOEFL iBT-01. ETS RR-06-07

    Science.gov (United States)

    Xi, Xiaoming; Mollaun, Pam

    2006-01-01

    This study explores the utility of analytic scoring for the TOEFL® Academic Speaking Test (TAST) in providing useful and reliable diagnostic information in three aspects of candidates' performance: delivery, language use, and topic development. G studies were used to investigate the dependability of the analytic scores, the distinctness of the…

  6. Healthy adolescent performance on the MATRICS Consensus Cognitive Battery (MCCB): Developmental data from two samples of volunteers.

    Science.gov (United States)

    Stone, William S; Mesholam-Gately, Raquelle I; Giuliano, Anthony J; Woodberry, Kristen A; Addington, Jean; Bearden, Carrie E; Cadenhead, Kristin S; Cannon, Tyrone D; Cornblatt, Barbara A; Mathalon, Daniel H; McGlashan, Thomas H; Perkins, Diana O; Tsuang, Ming T; Walker, Elaine F; Woods, Scott W; McCarley, Robert W; Heinssen, Robert; Green, Michael F; Nuechterlein, Keith; Seidman, Larry J

    2016-04-01

    The MATRICS Consensus Cognitive Battery (MCCB) fills a significant need for a standardized battery of cognitive tests to use in clinical trials for schizophrenia in adults aged 20-59. A need remains, however, to develop norms for younger individuals, who also show elevated risks for schizophrenia. Toward this end, we assessed performance in healthy adolescents. Baseline MCCB, reading and IQ data were obtained from healthy controls (ages 12-19) participating in two concurrent NIMH-funded studies: North American Prodromal Longitudinal Study phase 2 (NAPLS-2; n=126) and Boston Center for Intervention Development and Applied Research (CIDAR; n=13). All MCCB tests were administered except the Managing Emotions subtest from the Mayer-Salovey-Caruso Emotional Intelligence Test. Data were collected from 8 sites across North America. MCCB scores were presented in four 2-year age cohorts as T-scores for each test and cognitive domain, and analyzed for effects of age and sex. Due to IQ differences between age-grouped subsamples, IQ served as a covariate in analyses. Overall and sex-based raw scores for individual MCCB tests are presented for each age-based cohort. Adolescents generally showed improvement with age in most MCCB cognitive domains, with the clearest linear trends in Attention/Vigilance and Working Memory. These control data show that healthy adolescence is a dynamic period for cognitive development that is marked by substantial improvement in MCCB performance through the 12-19 age range. They also provide healthy comparison raw scores to facilitate clinical evaluations of adolescents, including those at risk for developing psychiatric disorders such as schizophrenia-related conditions.

  7. Simplified clinical prediction scores to target viral load testing in adults with suspected first line treatment failure in Phnom Penh, Cambodia.

    Directory of Open Access Journals (Sweden)

    Johan van Griensven

    BACKGROUND: For settings with limited laboratory capacity, 2013 World Health Organization (WHO) guidelines recommend targeted HIV-1 viral load (VL) testing to identify virological failure. We previously developed and validated a clinical prediction score (CPS) for targeted VL testing, relying on clinical, adherence and laboratory data. While outperforming the WHO failure criteria, it required substantial calculation and review of all previous laboratory tests. In response, we developed four simplified, less error-prone and broadly applicable CPS versions that can be done 'on the spot'. METHODOLOGY/PRINCIPAL FINDINGS: From May 2010 to June 2011, we validated the original CPS in a non-governmental hospital in Phnom Penh, Cambodia, applying the CPS to adults on first-line treatment >1 year. Virological failure was defined as a single VL >1000 copies/ml. The four CPSs included CPS1 with 'current CD4 count' instead of %-decline-from-peak CD4; CPS2 with hemoglobin measurements removed; CPS3 having 'decrease in CD4 count below baseline value' removed; CPS4 was purely clinical. Score development relied on the Spiegelhalter/Knill-Jones method. Variables independently associated with virological failure with a likelihood ratio ≥ 1.5 or ≤ 0.67 were retained. CPS performance was evaluated based on the area-under-the-ROC-curve (AUROC) and 95% confidence intervals (CI). The CPSs were validated in an independent dataset. A total of 1490 individuals (56.6% female; median age: 38 years (interquartile range (IQR) 33-44); median baseline CD4 count: 94 cells/µL (IQR 28-205); median time on antiretroviral therapy: 3.6 years (IQR 2.1-5.1)) were included. Forty-five (3.0%) individuals had virological failure. CPS1 yielded an AUROC of 0.69 (95% CI: 0.62-0.75) in validation, CPS2 an AUROC of 0.68 (95% CI: 0.62-0.74), and CPS3 an AUROC of 0.67 (95% CI: 0.61-0.73). The purely clinical CPS4 performed poorly (AUROC = 0.59; 95% CI: 0.53-0.65). CONCLUSIONS: Simplified CPSs retained
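    The two quantitative criteria named in this abstract can be sketched briefly. The code below computes an AUROC as the probability that a randomly chosen failure case receives a higher score than a randomly chosen non-failure case (ties count one half), and applies the stated retention rule for predictors (likelihood ratio ≥ 1.5 or ≤ 0.67). The function names and the toy data are hypothetical; the study's actual scoring weights and confidence-interval method are not given in the abstract and are not reproduced.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 = virological failure, 0 = no failure.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def retain_predictor(likelihood_ratio, upper=1.5, lower=0.67):
    """Retention rule quoted in the abstract: keep if LR >= 1.5 or <= 0.67."""
    return likelihood_ratio >= upper or likelihood_ratio <= lower


# Toy example with made-up scores.
print(auroc([0, 1, 2, 3], [0, 0, 1, 1]))
print(retain_predictor(1.8), retain_predictor(1.1))
```

    The pairwise form is quadratic in the sample size, which is acceptable for a cohort of this scale; a rank-based formula gives the same value for larger datasets.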

  8. Comparison of the efficiency between two sampling plans for aflatoxins analysis in maize.

    Science.gov (United States)

    Mallmann, Adriano Olnei; Marchioro, Alexandro; Oliveira, Maurício Schneider; Rauber, Ricardo Hummes; Dilkin, Paulo; Mallmann, Carlos Augusto

    2014-01-01

    Variance and performance of two sampling plans for aflatoxins quantification in maize were evaluated. Eight lots of maize were sampled using two plans: manual, using sampling spear for kernels; and automatic, using a continuous flow to collect milled maize. Total variance and sampling, preparation, and analysis variance were determined and compared between plans through multifactor analysis of variance. Four theoretical distribution models were used to compare aflatoxins quantification distributions in eight maize lots. The acceptance and rejection probabilities for a lot at a given aflatoxin concentration were determined using variance and the information on the selected distribution model to build the operational characteristic curves (OC). Sampling and total variance were lower for the automatic plan. The OC curve from the automatic plan reduced both consumer and producer risks in comparison to the manual plan. The automatic plan is more efficient than the manual one because it expresses more accurately the real aflatoxin contamination in maize.

  9. Comparison of the efficiency between two sampling plans for aflatoxins analysis in maize

    Directory of Open Access Journals (Sweden)

    Adriano Olnei Mallmann

    2014-01-01

    Variance and performance of two sampling plans for aflatoxins quantification in maize were evaluated. Eight lots of maize were sampled using two plans: manual, using sampling spear for kernels; and automatic, using a continuous flow to collect milled maize. Total variance and sampling, preparation, and analysis variance were determined and compared between plans through multifactor analysis of variance. Four theoretical distribution models were used to compare aflatoxins quantification distributions in eight maize lots. The acceptance and rejection probabilities for a lot at a given aflatoxin concentration were determined using variance and the information on the selected distribution model to build the operational characteristic curves (OC). Sampling and total variance were lower for the automatic plan. The OC curve from the automatic plan reduced both consumer and producer risks in comparison to the manual plan. The automatic plan is more efficient than the manual one because it expresses more accurately the real aflatoxin contamination in maize.

  10. Linking English-Language Test Scores onto the Common European Framework of Reference: An Application of Standard-Setting Methodology. TOEFL iBT Research Report TOEFL iBt-06. ETS RR-08-34

    Science.gov (United States)

    Tannenbaum, Richard J.; Wylie, E. Caroline

    2008-01-01

    The Common European Framework of Reference (CEFR) describes language proficiency in reading, writing, speaking, and listening on a 6-level scale. In this study, English-language experts from across Europe linked CEFR levels to scores on three tests: the TOEFL® iBT test, the TOEIC® assessment, and the TOEIC "Bridge"™ test.…

  11. Estimating Decision Indices Based on Composite Scores

    Science.gov (United States)

    Knupp, Tawnya Lee

    2009-01-01

    The purpose of this study was to develop an IRT model that would enable the estimation of decision indices based on composite scores. The composite scores, defined as a combination of unidimensional test scores, were either a total raw score or an average scale score. Additionally, estimation methods for the normal and compound multinomial models…

  12. Effects of learning-style environmental and tactual/kinesthetic preferences on the understanding of scientific terms and attitude test scores of fifth-grade students

    Science.gov (United States)

    Sullivan, Angela Tirino

    This investigator analyzed the effects of learning-style environmental and tactual/kinesthetic preferences on the understanding of scientific terms and attitude test scores of fifth-grade students. To identify individual preferences, the Learning-Styles Inventory (Dunn, Dunn & Price, 1996) was administered to students who attended a suburban elementary school. Forty-six general education students were given instruction through the gradual establishment of an environmentally- and perceptually-responsive learning-style classroom. Instructional units were divided into three phases of two weeks each. The units of scientific terms were taught for varied learning-style preferences and were gradually introduced during these instructional phases: Phase 1: Electricity was taught with traditional teaching methods; Phase 2: The Source of Energy was taught with accommodations for sound, light, temperature, and design elements; Phase 3: Pollution was taught with accommodations for tactual/kinesthetic modalities. Pre- and post-tests were administered in each of the three phases to determine scientific term gains. A repeated measures ANOVA and General Linear Model were employed to compare mean gains from phase to phase. Post-hoc comparisons were performed using the Bonferroni method and similar procedures were conducted on the Semantic Differential Scales (Pizzo, 1981). Correlations of relative gain scores during each phase were assessed by means of Pearson product-moment correlations. Differences in the strengths of correlated correlations were evaluated by means of t-tests for related correlation coefficients. Significant gains were found when students were instructed employing incremental learning-styles strategies. To determine attitudinal changes toward science terms, the Semantic Differential Scale (Pizzo, 1981) was administered three times throughout this study: after Phase 1, traditional teaching; Phases 2 and 3, after learning-styles intervention. Statistically higher

  13. Bayesian inference of genetic parameters for test-day milk yield, milk quality traits, and somatic cell score in Burlina cows.

    Science.gov (United States)

    Penasa, M; Cecchinato, A; Battagin, M; De Marchi, M; Pretto, D; Cassandro, M

    2010-01-01

    The aim of the study was to infer (co)variance components for daily milk yield, fat and protein contents, and somatic cell score (SCS) in Burlina cattle (a local breed in northeast Italy). Data consisted of 13,576 monthly test-day records of 666 cows (parities 1 to 8) collected in 10 herds between 1999 and 2009. Repeatability animal models were implemented using Bayesian methods. Flat priors were assumed for systematic effects of herd test date, days in milk, and parity, as well as for permanent environmental, genetic, and residual effects. On average, Burlina cows produced 17.0 kg of milk per day, with 3.66 and 3.33 percent of fat and protein, respectively, and 358,000 cells per mL of milk. Marginal posterior medians (highest posterior density of 95%) of heritability were 0.18 (0.09-0.28), 0.28 (0.21-0.36), 0.35 (0.25-0.49), and 0.05 (0.01-0.11) for milk yield, fat content, protein content, and SCS, respectively. Marginal posterior medians of genetic correlations between the traits were low and a 95 percent Bayesian confidence region included zero, with the exception of the genetic correlation between fat and protein contents. Despite the low number of animals in the population, results suggest that genetic variance for production and quality traits exists in Burlina cattle.

  14. Effects of ankle strengthening exercises combined with motor imagery training on the timed up and go test score and weight bearing ratio in stroke patients.

    Science.gov (United States)

    Kim, Sung Shin; Lee, Hyung Jin; You, Young Youl

    2015-07-01

    [Purpose] The purpose of the present study was to compare the effects of ankle strengthening exercises combined with motor imagery training and those of ankle strengthening exercises alone in stroke patients. [Subjects and Methods] Thirty stroke patients were randomly assigned to one of the following two groups: experimental group (15 patients) and control group (15 patients). The experimental group underwent motor imagery training for 15 minutes and ankle joint strengthening exercises for 15 minutes, while the control group underwent only ankle joint strengthening exercises for 30 minutes. Each session and training program was implemented four times a week for 4 weeks. The timed up and go (TUG) test score, affected-side weight bearing ratio, and affected-side front/rear weight bearing ratio were assessed. [Results] Both groups demonstrated improvement on the TUG test, and in the affected-side weight bearing ratios, affected-side front/rear weight bearing ratios, and balance errors. The experimental group demonstrated greater improvement than the control group in all variables. [Conclusion] Motor imagery training is an effective treatment method for improving static balance ability in stroke patients.

  15. Predictive Properties of the Gesell School Readiness Screening Test within Samples from Two Treatment Contexts.

    Science.gov (United States)

    Banerji, Madhabi

    The predictive properties of the Gesell School Readiness Screening Test (GSRT) were examined, taking into account the stated purposes of the test and the context of test use. Two samples were used: (1) a control sample of 55 students (21 males and 34 females) whose GSRT scores were not used for placement or tracking; and (2) a treatment sample of…

  16. Category fluency test: effects of age, gender and education on total scores, clustering and switching in Brazilian Portuguese-speaking subjects

    Directory of Open Access Journals (Sweden)

    Brucki S.M.D.

    2004-01-01

    Verbal fluency tests are used as a measure of executive functions and language, and can also be used to evaluate semantic memory. We analyzed the influence of education, gender and age on scores in a verbal fluency test using the animal category, and on number of categories, clustering and switching. We examined 257 healthy participants (152 females and 105 males) with a mean age of 49.42 years (SD = 15.75) and a mean educational level of 5.58 (SD = 4.25) years. We asked them to name as many animals as they could. Analysis of variance was performed to determine the effect of demographic variables. No significant effect of gender was observed for any of the measures. However, age seemed to influence the number of category changes, as expected for a sensitive frontal measure, after being controlled for the effect of education. Educational level had a statistically significant effect on all measures, except for clustering. Subject performance (mean number of animals named) according to schooling was: illiterates, 12.1; 1 to 4 years, 12.3; 5 to 8 years, 14.0; 9 to 11 years, 16.7; and more than 11 years, 17.8. We observed a decrease in performance in these five educational groups over time (more items recalled during the first 15 s, followed by a progressive reduction until the fourth interval). We conclude that education had the greatest effect on the category fluency test in this Brazilian sample. Therefore, we must take care in evaluating performance in subjects with lower levels of education.

  17. The foot posture index, ankle lunge test, Beighton scale and the lower limb assessment score in healthy children: a reliability study

    Directory of Open Access Journals (Sweden)

    Evans Angela M

    2012-01-01

    Background: Outcome measures are important when evaluating treatments and physiological progress in paediatric populations. Reliable, relevant measures of foot posture are important for such assessments to be accurate over time. The aim of the study was to assess the intra- and inter-rater reliability of common outcome measures for paediatric foot conditions. Methods: A repeated measures, same-subject design assessed the intra- and inter-rater reliability of measures of foot posture, joint hypermobility and ankle range: the Foot Posture Index (FPI-6), the ankle lunge test, the Beighton scale and the lower limb assessment scale (LLAS), used by two examiners in 30 healthy children (aged 7 to 15 years). The Oxford Ankle Foot Questionnaire (OxAFQ-C) was completed by participants and a parent, to assess the extent of foot and ankle problems. Results: The OxAFQ-C demonstrated a mean (SD) score of 6 (6) in adults and 7 (5) for children, showing good agreement between parents and children, and indicating mid-range (transient) disability. Intra-rater reliability was good for the FPI-6 (ICC = 0.93-0.94), ankle lunge test (ICC = 0.85-0.95), Beighton scale (ICC = 0.96-0.98) and LLAS (ICC = 0.90-0.98). Inter-rater reliability was largely good for each of the FPI-6 (ICC = 0.79), ankle lunge test (ICC = 0.83), Beighton scale (ICC = 0.73) and LLAS (ICC = 0.78). Conclusion: The four measures investigated demonstrated adequate intra-rater and inter-rater reliability in this paediatric sample, which further justifies their use in clinical practice.
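    For readers unfamiliar with the reliability index reported here, the sketch below computes one common form of the intraclass correlation coefficient, ICC(2,1) (two-way random effects, absolute agreement, single rater), from a subjects-by-raters table. The abstract does not state which ICC form the authors used, so this choice is an assumption made for illustration, as are the function name `icc_2_1` and the example ratings.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per subject, one column per rater.

    Uses the two-way ANOVA mean squares:
    (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings: 6 subjects scored by 4 raters (made-up example data).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

    Because this form penalizes systematic differences between raters, it can be much lower than a consistency-type ICC computed on the same table.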

  18. Could test length or order affect scores on letter number sequencing of the WAIS-III and WMS-III? Ruling out effects of fatigue.

    Science.gov (United States)

    Tulsky, D S; Zhu, J

    2000-11-01

    The Letter Number Sequencing subtest of the WAIS-III and WMS-III was administered at the end of the standardization edition of the WMS-III. It was not administered as part of the WAIS-III standardization battery. Nevertheless, the subtest was included in the published version of the WAIS-III. This study examines differences between examinees administered the Letter Number Sequencing subtest at three different times during a psychological battery: (1) as part of the published battery, (2) as part of the WMS-III when the WMS-III was administered as the first test in a sequence, and (3) as part of the WMS-III standardization when the WAIS-III was administered immediately preceding the WMS-III. The participants were 372 examinees (n = 124 in each condition) who were matched on key demographic variables. A repeated measures MANOVA yielded no difference in subtest scores when administered in any of these conditions. The results show no evidence of fatigue or ordering effects on the Letter Number Sequencing subtest.

  19. EXPLORACIÓN DE DIFERENCIAS NORMATIVAS EN EL SISTEMA DE CALIFICACIÓN CUALITATIVA PARA EL TEST GESTÁLTICO DE BENDER MODIFICADO/ EXPLORING NORMATIVE DIFFERENCES IN QUALITATIVE SCORING SYSTEM FOR MODIFIED THE BENDER GESTALT TEST

    Directory of Open Access Journals (Sweden)

    César Merino Soto

    2011-09-01

    This study explores the magnitude of the differences in scores on the Qualitative Scoring System for the Modified Bender Gestalt Test, using different normative data from Peru, the United States and China. In a sample of 324 Peruvian children (boys and girls) between 5 and 6 years of age, we analyzed the potential differences in density, central tendency, dispersion and visual-motor performance classifications. Large normative differences were found; therefore, participants' performance was highly overestimated or underestimated depending on the norm used. The impact of these results on appropriate assessment practice with children is discussed.

  20. Transitions in cognitive test scores over 5 and 10 years in elderly people: Evidence for a model of age-related deficit accumulation

    Directory of Open Access Journals (Sweden)

    Rockwood Kenneth

    2008-02-01

    Background: On average, health worsens with age, but many people have periods of improvement. A stochastic model provides an excellent description of how such changes occur. Given that cognition also changes with age, we wondered whether the same model might also describe the accumulation of errors in cognitive test scores in community-dwelling older adults. Methods: In this prospective cohort study, 8954 older people (aged 65+ at baseline) from the Canadian Study of Health and Aging were followed for 10 years. Cognitive status was defined by the number of errors on the 100-point Modified Mini-Mental State Examination. The error count was chosen to parallel the deficit count in the general model of aging, which is based on deficit accumulation. As with the deficit count, a Markov chain transition model was employed, with 4 parameters. Results: On average, the chance of making errors increased linearly with the number of errors present at each time interval. Changes in cognitive states were described with high accuracy (R² = 0.96) by a modified Poisson distribution, using four parameters: the background chance of accumulating additional errors, the chance of incurring more or fewer errors, given the existing number, and the corresponding background and incremental chances of dying. Conclusion: The change in the number of errors in a cognitive test corresponded to a general model that also summarizes age-related changes in deficits. The model accounts for both improvement and deterioration and appears to represent a clinically relevant means of quantifying how various aspects of health status change with age.
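    One way to picture a four-parameter transition model of this kind is sketched below. The abstract does not give the functional forms, so everything specific here is an assumption for illustration: the Poisson mean is taken as linear in the current error count (rho0 + beta_rho * n), the probability of dying is taken as death0 * exp(beta_death * n), and the parameter values are invented. The function name `transition_row` is likewise hypothetical.

```python
import math


def transition_row(n, rho0, beta_rho, death0, beta_death, max_errors=100):
    """Transition probabilities from n errors to 0..max_errors errors or death.

    Assumed forms (not taken from the paper):
      Poisson mean     rho_n = rho0 + beta_rho * n
      death probability      = death0 * exp(beta_death * n), capped at 1
    Survivors move to k errors with Poisson(rho_n) probability; the Poisson
    tail beyond max_errors is lumped into the final state.
    """
    rho = rho0 + beta_rho * n
    p_death = min(1.0, death0 * math.exp(beta_death * n))
    alive = 1.0 - p_death
    probs = []
    pmf = math.exp(-rho)                # Poisson P(K = 0)
    for k in range(max_errors):
        probs.append(alive * pmf)
        pmf *= rho / (k + 1)            # recurrence to P(K = k + 1)
    probs.append(max(0.0, alive - sum(probs)))
    return probs, p_death


# Invented parameter values, for illustration only.
row, dead = transition_row(n=5, rho0=2.0, beta_rho=0.9,
                           death0=0.05, beta_death=0.04)
expected_errors = sum(k * p for k, p in enumerate(row)) / (1.0 - dead)
print(round(sum(row) + dead, 6), round(expected_errors, 3))
```

    Because the Poisson mean can fall below the current count, a row of this kind assigns probability to both fewer and more errors, which is how such a model can represent improvement as well as deterioration.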