sample test scores: Topics by WorldWideScience.org

Sample records for sample test scores

What Do Test Scores Really Mean? A Latent Class Analysis of Danish Test Score Performance

DEFF Research Database (Denmark)

Munk, Martin D.; McIntosh, James

2014-01-01

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55, tested in 1968, and followed until 2011. The procedure takes account of unobservable effects as well as excessive zeros in the data. We show that the test scores...... of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture and possible incentive problems make it more difficult to understand what the tests measure....
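
As a rough illustration of the model family named in this abstract, the sketch below evaluates the probability mass function of a finite mixture of Poisson classes with an extra point mass at zero. It is not the authors' specification: the function names, the two-class setup, the rates, and the zero share are all hypothetical values chosen for illustration.

```python
import math


def poisson_pmf(k, rate):
    """Probability of observing count k under a Poisson distribution."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def latent_class_pmf(k, weights, rates, zero_share=0.0):
    """Finite mixture of Poisson classes with an optional extra mass at zero.

    weights    -- class membership probabilities (assumed to sum to 1)
    rates      -- Poisson rate for each latent class
    zero_share -- share of respondents who score zero regardless of class
    """
    mixture = sum(w * poisson_pmf(k, r) for w, r in zip(weights, rates))
    extra_zero = zero_share if k == 0 else 0.0
    return extra_zero + (1.0 - zero_share) * mixture


# Hypothetical parameters: two latent classes and 10% structural zeros.
WEIGHTS = [0.6, 0.4]
RATES = [2.0, 8.0]
ZERO_SHARE = 0.1

probabilities = [latent_class_pmf(k, WEIGHTS, RATES, ZERO_SHARE) for k in range(60)]
```

With these made-up parameters the extra zero mass raises the probability of a zero score above what the Poisson classes alone would give, which is the kind of "excessive zeros" pattern the abstract refers to.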
Prediction of true test scores from observed item scores and ancillary data.

Science.gov (United States)

Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

2015-05-01

In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE® General Analytical Writing and until 2009 in the case of TOEFL® iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater®. In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
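
The simplest classical-test-theory case of a best linear predictor of a true score is Kelley's formula, which shrinks an observed score toward the group mean in proportion to the test's reliability. The sketch below shows only that single-score case; it is not the multivariate predictor with ancillary data that the abstract describes, and the function name and numbers are hypothetical.

```python
def kelley_true_score(observed, group_mean, reliability):
    """Best linear predictor of a true score from one observed score.

    Kelley's formula: predicted = mean + reliability * (observed - mean).
    A reliability of 1 returns the observed score unchanged; a reliability
    of 0 returns the group mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return group_mean + reliability * (observed - group_mean)


# Hypothetical examinee: observed 80 on a test with mean 70, reliability 0.5.
predicted = kelley_true_score(80.0, 70.0, 0.5)
```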
The quantitative LOD score: test statistic and sample size for exclusion and linkage of quantitative traits in human sibships.

Science.gov (United States)

Page, G P; Amos, C I; Boerwinkle, E

1998-04-01

We present a test statistic, the quantitative LOD (QLOD) score, for the testing of both linkage and exclusion of quantitative-trait loci in randomly selected human sibships. As with the traditional LOD score, the boundary values of 3, for linkage, and -2, for exclusion, can be used for the QLOD score. We investigated the sample sizes required for inferring exclusion and linkage, for various combinations of linked genetic variance, total heritability, recombination distance, and sibship size, using fixed-size sampling. The sample sizes required for both linkage and exclusion were not qualitatively different and depended on the percentage of variance being linked or excluded and on the total genetic variance. Information regarding linkage and exclusion in sibships larger than size 2 increased as approximately all possible pairs n(n-1)/2 up to sibships of size 6. Increasing the recombination (θ) distance between the marker and the trait loci reduced empirically the power for both linkage and exclusion, as a function of approximately (1-2θ)^4.
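
The three quantities mentioned in this abstract (the decision boundaries, the number of sib pairs, and the approximate power factor) are easy to compute. The sketch below is illustrative only: the function names are hypothetical, and treating the boundaries as inclusive is an assumption, since the abstract does not say whether a score exactly at 3 or -2 counts.

```python
def sib_pairs(sibship_size):
    """Number of distinct sib pairs in a sibship: n(n-1)/2."""
    return sibship_size * (sibship_size - 1) // 2


def power_factor(theta):
    """Approximate scaling of power with recombination fraction: (1-2θ)^4."""
    return (1.0 - 2.0 * theta) ** 4


def lod_decision(lod, linkage_bound=3.0, exclusion_bound=-2.0):
    """Classify a LOD-type score using the conventional boundaries.

    Inclusive comparisons are an assumption made for this sketch.
    """
    if lod >= linkage_bound:
        return "linkage"
    if lod <= exclusion_bound:
        return "exclusion"
    return "inconclusive"
```

For example, a sibship of size 6 contributes 15 pairs, and at a recombination fraction of 0.5 (no linkage) the power factor drops to zero.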
Power and sample size evaluation for the Cochran-Mantel-Haenszel mean score (Wilcoxon rank sum) test and the Cochran-Armitage test for trend.

Science.gov (United States)

Lachin, John M

2011-11-10

The power of a chi-square test, and thus the required sample size, are a function of the noncentrality parameter that can be obtained as the limiting expectation of the test statistic under an alternative hypothesis specification. Herein, we apply this principle to derive simple expressions for two tests that are commonly applied to discrete ordinal data. The Wilcoxon rank sum test for the equality of distributions in two groups is algebraically equivalent to the Mann-Whitney test. The Kruskal-Wallis test applies to multiple groups. These tests are equivalent to a Cochran-Mantel-Haenszel mean score test using rank scores for a set of C-discrete categories. Although various authors have assessed the power function of the Wilcoxon and Mann-Whitney tests, herein it is shown that the power of these tests with discrete observations, that is, with tied ranks, is readily provided by the power function of the corresponding Cochran-Mantel-Haenszel mean scores test for two and R > 2 groups. These expressions yield results virtually identical to those derived previously for rank scores and also apply to other score functions. The Cochran-Armitage test for trend assesses whether there is a monotonically increasing or decreasing trend in the proportions with a positive outcome or response over the C-ordered categories of an ordinal independent variable, for example, dose. Herein, it is shown that the power of the test is a function of the slope of the response probabilities over the ordinal scores assigned to the groups that yields simple expressions for the power of the test. Copyright © 2011 John Wiley & Sons, Ltd.
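
The algebraic equivalence between the Wilcoxon rank sum and the Mann-Whitney statistic noted in this abstract can be checked directly on tied ordinal data. The sketch below uses midranks for ties and counts tied pairs as one half; the function names and the two small samples are hypothetical, and the code illustrates only the identity U = W - n1(n1+1)/2, not the power calculations derived in the paper.

```python
def midranks(values):
    """Map each distinct value to the average of its 1-based sorted positions."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}


def rank_sum(x, y):
    """Wilcoxon rank sum W for group x, using midranks of the pooled sample."""
    ranks = midranks(list(x) + list(y))
    return sum(ranks[value] for value in x)


def mann_whitney_u(x, y):
    """Mann-Whitney U for group x: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Hypothetical ordinal responses (category codes 1-4) in two groups.
group_x = [1, 2, 2, 3]
group_y = [2, 3, 3, 4]

w = rank_sum(group_x, group_y)
u = mann_whitney_u(group_x, group_y)
n1 = len(group_x)
```

Here `w - n1 * (n1 + 1) / 2` equals `u`, so either statistic can be recovered from the other even when ranks are tied.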
SKATE: a docking program that decouples systematic sampling from scoring.

Science.gov (United States)

Feng, Jianwen A; Marshall, Garland R

2010-11-15

SKATE is a docking prototype that decouples systematic sampling from scoring. This novel approach removes any interdependence between sampling and scoring functions to achieve better sampling and, thus, improves docking accuracy. SKATE systematically samples a ligand's conformational, rotational and translational degrees of freedom, as constrained by a receptor pocket, to find sterically allowed poses. Efficient systematic sampling is achieved by pruning the combinatorial tree using aggregate assembly, discriminant analysis, adaptive sampling, radial sampling, and clustering. Because systematic sampling is decoupled from scoring, the poses generated by SKATE can be ranked by any published, or in-house, scoring function. To test the performance of SKATE, ligands from the Astex/CDCC set, the Surflex set, and the Vertex set, a total of 266 complexes, were redocked to their respective receptors. The results show that SKATE was able to sample poses within 2 Å RMSD of the native structure for 98, 95, and 98% of the cases in the Astex/CDCC, Surflex, and Vertex sets, respectively. Cross-docking accuracy of SKATE was also assessed by docking 10 ligands to thymidine kinase and 73 ligands to cyclin-dependent kinase. © 2010 Wiley Periodicals, Inc.
Relationships between narrative language samples and norm-referenced test scores in language assessments of school-age children.

Science.gov (United States)

Danahy Ebert, Kerry; Scott, Cheryl M

2014-10-01

Both narrative language samples and norm-referenced language tests can be important components of language assessment for school-age children. The present study explored the relationship between these 2 tools within a group of children referred for language assessment. The study is a retrospective analysis of clinical records from 73 school-age children. Participants had completed an oral narrative language sample and at least one norm-referenced language test. Correlations between microstructural language sample measures and norm-referenced test scores were compared for younger (6- to 8-year-old) and older (9- to 12-year-old) children. Contingency tables were constructed to compare the 2 types of tools, at 2 different cutpoints, in terms of which children were identified as having a language disorder. Correlations between narrative language sample measures and norm-referenced tests were stronger for the younger group than the older group. Within the younger group, the level of language assessed by each measure contributed to associations among measures. Contingency analyses revealed moderate overlap in the children identified by each tool, with agreement affected by the cutpoint used. Narrative language samples may complement norm-referenced tests well, but age combined with narrative task can be expected to influence the nature of the relationship.
A Human Capital Model of Educational Test Scores

DEFF Research Database (Denmark)

McIntosh, James; D. Munk, Martin

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelated...... with observable parental attributes and, thus, are environmental rather than genetic in origin. We show that the test scores measure manifest or measured ability as it has evolved over the life of the respondent and is, thus, more a product of the human capital formation process than some latent or fundamental...... measure of pure cognitive ability. We find that variables which are not closely associated with traditional notions of intelligence explain a significant proportion of the variation in test scores. This adds to the complexity of interpreting test scores and suggests that school culture, attitudes...
The Effect of Mock Tests on Iranian EFL learners’ Test Scores

Directory of Open Access Journals (Sweden)

Hossein Khodabakhshzadeh

2016-07-01

The effect of using tests in test preparation courses has been subject to debate. While some scholars such as Yang and Badger (2015) believe it is a cause of positive washback effect, others argue that this issue is tentative and context-bound (Green, 2007). Therefore, this study investigated the effect of using Mock tests in International English Language Testing System (IELTS) preparation courses on students’ overall IELTS scores. Fifty-one IELTS students were selected non-randomly through the quota sampling approach out of 76 students at Mahan Language Institute in Birjand, Iran. These participants were distributed into Group 1 (n=25) and Group 2 (n=26). A complete IELTS test was administered to ensure that the groups were homogeneous and to serve as pretest. After 10 sessions of intervention, a different IELTS test was administered as posttest. The results of between-subject analysis through an independent samples t-test revealed that using Mock tests in the IELTS preparation courses can positively affect the participants’ scores on the IELTS exam. Pedagogical implications are discussed.
What do educational test scores really measure?

DEFF Research Database (Denmark)

McIntosh, James; D. Munk, Martin

Latent class Poisson count models are used to analyze a sample of Danish test score results from a cohort of individuals born in 1954-55 and tested in 1968. The procedure takes account of unobservable effects as well as excessive zeros in the data. The bulk of unobservable effects are uncorrelate......, and possible incentive problems make it more difficult to elicit true values of what the tests measure....
LOD score exclusion analyses for candidate QTLs using random population samples.

Science.gov (United States)

Deng, Hong-Wen

2003-11-01

While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes as putative QTLs using random population samples. Previously, we developed an LOD score exclusion mapping approach for candidate genes for complex diseases. Here, we extend this LOD score approach for exclusion analyses of candidate genes for quantitative traits. Under this approach, specific genetic effects (as reflected by heritability) and inheritance models at candidate QTLs can be analyzed and if an LOD score is ≤ -2.0, the locus can be excluded from having a heritability larger than that specified. Simulations show that this approach has high power to exclude a candidate gene from having moderate genetic effects if it is not a QTL and is robust to population admixture. Our exclusion analysis complements association analysis for candidate genes as putative QTLs in random population samples. The approach is applied to test the importance of Vitamin D receptor (VDR) gene as a potential QTL underlying the variation of bone mass, an important determinant of osteoporosis.
Do Test Scores Buy Happiness?

Science.gov (United States)

McCluskey, Neal

2017-01-01

Since at least the enactment of No Child Left Behind in 2002, standardized test scores have served as the primary measures of public school effectiveness. Yet, such scores fail to measure the ultimate goal of education: maximizing happiness. This exploratory analysis assesses nation level associations between test scores and happiness, controlling…
Group differences in the heritability of items and test scores

NARCIS (Netherlands)

Wicherts, J.M.; Johnson, W.

2009-01-01

It is important to understand potential sources of group differences in the heritability of intelligence test scores. On the basis of a basic item response model we argue that heritabilities which are based on dichotomous item scores normally do not generalize from one sample to the next. If groups
LOD score exclusion analyses for candidate genes using random population samples.

Science.gov (United States)

Deng, H W; Li, J; Recker, R R

2001-05-01

While extensive analyses have been conducted to test for, no formal analyses have been conducted to test against, the importance of candidate genes with random population samples. We develop a LOD score approach for exclusion analyses of candidate genes with random population samples. Under this approach, specific genetic effects and inheritance models at candidate genes can be analysed and if a LOD score is ≤ -2.0, the locus can be excluded from having an effect larger than that specified. Computer simulations show that, with sample sizes often employed in association studies, this approach has high power to exclude a gene from having moderate genetic effects. In contrast to regular association analyses, population admixture will not affect the robustness of our analyses; in fact, it renders our analyses more conservative and thus any significant exclusion result is robust. Our exclusion analysis complements association analysis for candidate genes in random population samples and is parallel to the exclusion mapping analyses that may be conducted in linkage analyses with pedigrees or relative pairs. The usefulness of the approach is demonstrated by an application to test the importance of vitamin D receptor and estrogen receptor genes underlying the differential risk to osteoporotic fractures.
Predicting occupational personality test scores.

Science.gov (United States)

Furnham, A; Drakeley, R

2000-01-01

The relationship between students' actual test scores and their self-estimated scores on the Hogan Personality Inventory (HPI; R. Hogan & J. Hogan, 1992), an omnibus personality questionnaire, was examined. Despite being given descriptive statistics and explanations of each of the dimensions measured, the students tended to overestimate their scores; yet all correlations between actual and estimated scores were positive and significant. Correlations between self-estimates and actual test scores were highest for sociability, ambition, and adjustment (r = .62 to r = .67). The results are discussed in terms of employers' use and abuse of personality assessment for job recruitment.
Outlier removal, sum scores, and the inflation of the Type I error rate in independent samples t tests: the power of alternatives and recommendations.

Science.gov (United States)

Bakker, Marjan; Wicherts, Jelte M

2014-09-01

In psychology, outliers are often excluded before running an independent samples t test, and data are often nonnormal because of the use of sum scores based on tests and questionnaires. This article concerns the handling of outliers in the context of independent samples t tests applied to nonnormal sum scores. After reviewing common practice, we present results of simulations of artificial and actual psychological data, which show that the removal of outliers based on commonly used Z value thresholds severely increases the Type I error rate. We found Type I error rates of above 20% after removing outliers with a threshold value of Z = 2 in a short and difficult test. Inflations of Type I error rates are particularly severe when researchers are given the freedom to alter threshold values of Z after having seen the effects thereof on outcomes. We recommend the use of nonparametric Mann-Whitney-Wilcoxon tests or robust Yuen-Welch tests without removing outliers. These alternatives to independent samples t tests are found to have nominal Type I error rates with a minimal loss of power when no outliers are present in the data and to have nominal Type I error rates and good power when outliers are present. PsycINFO Database Record (c) 2014 APA, all rights reserved.
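
To make concrete the practice this abstract warns against, the sketch below drops observations whose Z value exceeds a threshold before any test is run. It is shown only to illustrate what "removing outliers with a threshold value of Z = 2" means, not as a recommended step; the function name and the scores are hypothetical.

```python
import statistics


def drop_outliers_by_z(values, threshold=2.0):
    """Return the values whose absolute Z value is at most the threshold.

    Z values use the sample mean and the sample standard deviation of the
    same data, which is the common practice the abstract criticises.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return list(values)
    return [v for v in values if abs(v - mean) / sd <= threshold]


# Hypothetical sum scores with one extreme observation.
scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 40]
kept = drop_outliers_by_z(scores, threshold=2.0)
```

In this made-up sample only the score of 40 is removed. Because the analyst can rerun the filter with a different `threshold` after seeing the result, the choice adds the kind of flexibility that the authors report as inflating Type I error rates.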
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

Science.gov (United States)

Kosinski, Andrzej S

2013-03-15

Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
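
For readers unfamiliar with the quantities being compared, the sketch below computes positive and negative predictive values for two diagnostic tests applied to the same patients, which is the paired design the abstract describes. It does not implement the weighted generalized score statistic; the function name and the eight-patient data set are hypothetical.

```python
def predictive_values(test_results, disease_status):
    """Return (PPV, NPV) for one diagnostic test.

    test_results   -- 1 for a positive test, 0 for a negative test
    disease_status -- 1 if the patient has the disease, 0 otherwise
    """
    pairs = list(zip(test_results, disease_status))
    true_pos = sum(1 for t, d in pairs if t == 1 and d == 1)
    false_pos = sum(1 for t, d in pairs if t == 1 and d == 0)
    true_neg = sum(1 for t, d in pairs if t == 0 and d == 0)
    false_neg = sum(1 for t, d in pairs if t == 0 and d == 1)
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("need at least one positive and one negative result")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical paired design: every patient receives both tests.
disease = [1, 1, 0, 0, 0, 1, 1, 0]
test_a = [1, 1, 1, 0, 0, 0, 1, 0]
test_b = [1, 1, 0, 0, 0, 0, 1, 0]

ppv_a, npv_a = predictive_values(test_a, disease)
ppv_b, npv_b = predictive_values(test_b, disease)
```

Because both tests are scored on the same patients, the two sets of predictive values are correlated, which is why a dedicated paired test statistic is needed rather than a comparison of independent proportions.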
Exploring a Source of Uneven Score Equity across the Test Score Range

Science.gov (United States)

Huggins-Manley, Anne Corinne; Qiu, Yuxi; Penfield, Randall D.

2018-01-01

Score equity assessment (SEA) refers to an examination of population invariance of equating across two or more subpopulations of test examinees. Previous SEA studies have shown that score equity may be present for examinees scoring at particular test score ranges but absent for examinees scoring at other score ranges. No studies to date have…
TOEFL iBT Speaking Test Scores as Indicators of Oral Communicative Language Proficiency

Science.gov (United States)

Bridgeman, Brent; Powers, Donald; Stone, Elizabeth; Mollaun, Pamela

2012-01-01

Scores assigned by trained raters and by an automated scoring system (SpeechRater™) on the speaking section of the TOEFL iBT™ were validated against a communicative competence criterion. Specifically, a sample of 555 undergraduate students listened to speech samples from 184 examinees who took the Test of English as a Foreign Language…
High throughput sample processing and automated scoring

Directory of Open Access Journals (Sweden)

Gunnar Brunborg

2014-10-01

The comet assay is a sensitive and versatile method for assessing DNA damage in cells. In the traditional version of the assay, there are many manual steps involved and few samples can be treated in one experiment. High throughput modifications have been developed during recent years, and they are reviewed and discussed. These modifications include accelerated scoring of comets; other important elements that have been studied and adapted to high throughput are cultivation and manipulation of cells or tissues before and after exposure, and freezing of treated samples until comet analysis and scoring. High throughput methods save time and money but they are useful also for other reasons: large-scale experiments may be performed which are otherwise not practicable (e.g., analysis of many organs from exposed animals, and human biomonitoring studies), and automation gives more uniform sample treatment and less dependence on operator performance. The high throughput modifications now available vary largely in their versatility, capacity, complexity and costs. The bottleneck for further increase of throughput appears to be the scoring.
Modeling Floor Effects in Standardized Vocabulary Test Scores in a Sample of Low SES Hispanic Preschool Children under the Multilevel Structural Equation Modeling Framework

Directory of Open Access Journals (Sweden)

Leina Zhu

2017-12-01

Researchers and practitioners often use standardized vocabulary tests such as the Peabody Picture Vocabulary Test-4 (PPVT-4; Dunn and Dunn, 2007) and its companion, the Expressive Vocabulary Test-2 (EVT-2; Williams, 2007), to assess English vocabulary skills as an indicator of children's school readiness. Despite their psychometric excellence in the norm sample, issues arise when standardized vocabulary tests are used to assess children from culturally, linguistically and ethnically diverse backgrounds (e.g., Spanish-speaking English language learners) or delayed in some manner. One of the biggest challenges is establishing the appropriateness of these measures with non-English or non-standard English speaking children as often they score one to two standard deviations below expected levels (e.g., Lonigan et al., 2013). This study re-examines the issues in analyzing the PPVT-4 and EVT-2 scores in a sample of 4-to-5-year-old low SES Hispanic preschool children who were part of a larger randomized clinical trial on the effects of a supplemental English shared-reading vocabulary curriculum (Pollard-Durodola et al., 2016). It was found that data exhibited strong floor effects and the presence of floor effects made it difficult to differentiate the intervention group and the control group on their vocabulary growth in the intervention. A simulation study is then presented under the multilevel structural equation modeling (MSEM) framework and results revealed that in regular multilevel data analysis, ignoring floor effects in the outcome variables led to biased results in parameter estimates, standard error estimates, and significance tests. Our findings suggest caution in analyzing and interpreting scores of ethnically and culturally diverse children on standardized vocabulary tests (e.g., floor effects). It is recommended that appropriate analytical methods that take into account floor effects in outcome variables be considered.

Validating the Interpretations and Uses of Test Scores

Science.gov (United States)

Kane, Michael T.

2013-01-01

To validate an interpretation or use of test scores is to evaluate the plausibility of the claims based on the scores. An argument-based approach to validation suggests that the claims based on the test scores be outlined as an argument that specifies the inferences and supporting assumptions needed to get from test responses to score-based…
On the Representativeness of Norming Samples for Aptitude Test

National Research Council Canada - National Science Library

Sims, William

2003-01-01

...). We regressed aptitude test scores on demographics and concluded that: Norming sample for aptitude tests must be representative of the target population with respect to age, race/ethnicity, gender, respondent's education, and mother's...
Test/score/report: Simulation techniques for automating the test process

Science.gov (United States)

Hageman, Barbara H.; Sigman, Clayton B.; Koslosky, John T.

1994-01-01

A Test/Score/Report capability is currently being developed for the Transportable Payload Operations Control Center (TPOCC) Advanced Spacecraft Simulator (TASS) system which will automate testing of the Goddard Space Flight Center (GSFC) Payload Operations Control Center (POCC) and Mission Operations Center (MOC) software in three areas: telemetry decommutation, spacecraft command processing, and spacecraft memory load and dump processing. Automated computer control of the acceptance test process is one of the primary goals of a test team. With the proper simulation tools and user interface, acceptance testing, regression testing, and repetition of specific test procedures for a ground data system can be made simpler. Ideally, the goal for complete automation would be to plug the operational deliverable into the simulator, press the start button, execute the test procedure, accumulate and analyze the data, score the results, and report the results to the test team along with a go/no-go recommendation. In practice, this may not be possible because of inadequate test tools, pressures of schedules, limited resources, etc. Most tests are accomplished using a certain degree of automation and test procedures that are labor intensive. This paper discusses some simulation techniques that can improve the automation of the test process. The TASS system tests the POCC/MOC software and provides a score based on the test results. The TASS system displays statistics on the success of the POCC/MOC system processing in each of the three areas as well as event messages pertaining to the Test/Score/Report processing. The TASS system also provides formatted reports documenting each step performed during the tests and the results of each step. A prototype of the Test/Score/Report capability is available and currently being used to test some POCC/MOC software deliveries. When this capability is fully operational it should greatly reduce the time necessary
Adaptive testing with equated number-correct scoring

NARCIS (Netherlands)

van der Linden, Willem J.

1999-01-01

A constrained CAT algorithm is presented that automatically equates the number-correct scores on adaptive tests. The algorithm can be used to equate number-correct scores across different administrations of the same adaptive test as well as to an external reference test. The constraints are derived
Evaluation of Factors Affecting Continuous Performance Test Identical Pairs Version Score of Schizophrenic Patients in a Japanese Clinical Sample

Directory of Open Access Journals (Sweden)

Takayoshi Koide

2012-01-01

Aim. Cognitive impairment in schizophrenia strongly relates to social outcome and is a good candidate for endophenotypes. When we accurately measure drug efficacy or effects of genes or variants relevant to schizophrenia on cognitive impairment, clinical factors that can affect scores on cognitive tests, such as age and severity of symptoms, should be considered. To elucidate the effect of clinical factors, we conducted multiple regression analysis using scores of the Continuous Performance Test Identical Pairs Version (CPT-IP), which is often used to measure attention/vigilance in schizophrenia. Methods. We conducted the CPT-IP (4-4 digit) and examined clinical information (sex, age, education years, onset age, duration of illness, chlorpromazine-equivalent dose, and Positive and Negative Symptom Scale (PANSS) scores) in 126 schizophrenia patients in a Japanese population. Multiple regression analysis was used to evaluate the effect of clinical factors. Results. Age, chlorpromazine-equivalent dose, and PANSS-negative symptom score were associated with mean d′ score in patients. These three clinical factors explained about 28% of the variance in mean d′ score. Conclusions. In conclusion, CPT-IP score in schizophrenia patients is influenced by age, chlorpromazine-equivalent dose, and PANSS negative symptom score.
ITC Guidelines on Quality Control in Scoring, Test Analysis, and Reporting of Test Scores

Science.gov (United States)

Allalouf, Avi

2014-01-01

The Quality Control (QC) Guidelines are intended to increase the efficiency, precision, and accuracy of the scoring, analysis, and reporting process of testing. The QC Guidelines focus on large-scale testing operations where multiple forms of tests are created for use on set dates. However, they may also be used for a wide variety of other testing…
Summary of Score Changes (in other Tests).

Science.gov (United States)

Cleary, T. Anne; McCandless, Sam A.

Scholastic Aptitude Test (SAT) scores have declined during the last 14 years. Similar score declines have been observed in many different testing programs, many groups, and tested areas. The declines, while not large in any given year, have been consistent over time, area, and group. The period around 1965 is critical for the interpretation of…
Data-driven efficient score tests for deconvolution hypotheses

NARCIS (Netherlands)

Langovoy, M.

2008-01-01

We consider testing statistical hypotheses about densities of signals in deconvolution models. A new approach to this problem is proposed. We constructed score tests for the deconvolution density testing with the known noise density and efficient score tests for the case of unknown density. The
Validation of new prognostic and predictive scores by sequential testing approach

International Nuclear Information System (INIS)

Nieder, Carsten; Haukland, Ellinor; Pawinski, Adam; Dalhaug, Astrid

2010-01-01

Background and Purpose: For practitioners, the question arises how their own patient population differs from that used in large-scale analyses resulting in new scores and nomograms and whether such tools actually are valid at a local level and thus can be implemented. A recent article proposed an easy-to-use method for the in-clinic validation of new prediction tools with a limited number of patients, a so-called sequential testing approach. The present study evaluates this approach in scores related to radiation oncology. Material and Methods: Three different scores were used, each predicting short overall survival after palliative radiotherapy (bone metastases, brain metastases, metastatic spinal cord compression). For each scenario, a limited number of consecutive patients entered the sequential testing approach. The positive predictive value (PPV) was used for validation of the respective score and it was required that the PPV exceeded 80%. Results: For two scores, validity in the own local patient population could be confirmed after entering 13 and 17 patients, respectively. For the third score, no decision could be reached even after increasing the sample size to 30. Conclusion: In-clinic validation of new predictive tools with sequential testing approach should be preferred over uncritical adoption of tools which provide no significant benefit to local patient populations. Often the necessary number of patients can be reached within reasonable time frames even in small oncology practices. In addition, validation is performed continuously as the data are collected. (orig.)
Validation of new prognostic and predictive scores by sequential testing approach

Energy Technology Data Exchange (ETDEWEB)

Nieder, Carsten [Radiation Oncology Unit, Nordland Hospital, Bodo (Norway); Inst. of Clinical Medicine, Univ. of Tromso (Norway); Haukland, Ellinor; Pawinski, Adam; Dalhaug, Astrid [Radiation Oncology Unit, Nordland Hospital, Bodo (Norway)

2010-03-15

Background and Purpose: For practitioners, the question arises how their own patient population differs from that used in large-scale analyses resulting in new scores and nomograms and whether such tools actually are valid at a local level and thus can be implemented. A recent article proposed an easy-to-use method for the in-clinic validation of new prediction tools with a limited number of patients, a so-called sequential testing approach. The present study evaluates this approach in scores related to radiation oncology. Material and Methods: Three different scores were used, each predicting short overall survival after palliative radiotherapy (bone metastases, brain metastases, metastatic spinal cord compression). For each scenario, a limited number of consecutive patients entered the sequential testing approach. The positive predictive value (PPV) was used for validation of the respective score and it was required that the PPV exceeded 80%. Results: For two scores, validity in the own local patient population could be confirmed after entering 13 and 17 patients, respectively. For the third score, no decision could be reached even after increasing the sample size to 30. Conclusion: In-clinic validation of new predictive tools with sequential testing approach should be preferred over uncritical adoption of tools which provide no significant benefit to local patient populations. Often the necessary number of patients can be reached within reasonable time frames even in small oncology practices. In addition, validation is performed continuously as the data are collected. (orig.)
Do Standardized Tests Penalize Deep-Thinking, Creative, or Conscientious Students?: Some Personality Correlates of Graduate Record Examinations Test Scores

Science.gov (United States)

Powers, Donald E.; Kaufman, James C.

2004-01-01

The objective of the study reported here was to explore the relationship of Graduate Record Examinations (GRE) General Test scores to selected personality traits--conscientiousness, rationality, ingenuity, quickness, creativity, and depth. A sample of 342 GRE test takers completed short personality inventory scales for each trait. Analyses…
Facilitating the Interpretation of English Language Proficiency Scores: Combining Scale Anchoring and Test Score Mapping Methodologies

Science.gov (United States)

Powers, Donald; Schedl, Mary; Papageorgiou, Spiros

2017-01-01

The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…
The Truth about Scores Children Achieve on Tests.

Science.gov (United States)

Brown, Jonathan R.

1989-01-01

The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)
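As a minimal sketch of the kind of calculation this record refers to, the standard classical-test-theory formula SEm = SD × sqrt(1 − reliability) can be written as below. The abstract does not state which procedure the article uses, so the formula choice, the function names, and the example values (SD of 15, reliability of 0.91) are assumptions for illustration only.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEm = SD * sqrt(1 - reliability), the classical test theory formula."""
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sem: float, z: float = 1.96) -> tuple:
    """Band of +/- z * SEm around an observed score."""
    return (observed - z * sem, observed + z * sem)


# Hypothetical test: SD = 15, reliability = 0.91, observed score = 100.
sem = standard_error_of_measurement(15.0, 0.91)
print(round(sem, 2), score_band(100.0, sem))
```

The band is centred on the observed score, which is the common textbook simplification; other conventions centre it on an estimated true score.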
Cross-validation of the Dot Counting Test in a large sample of credible and non-credible patients referred for neuropsychological testing.

Science.gov (United States)

McCaul, Courtney; Boone, Kyle B; Ermshar, Annette; Cottingham, Maria; Victor, Tara L; Ziegler, Elizabeth; Zeller, Michelle A; Wright, Matthew

2018-01-18

To cross-validate the Dot Counting Test in a large neuropsychological sample. Dot Counting Test scores were compared in credible (n = 142) and non-credible (n = 335) neuropsychology referrals. Non-credible patients scored significantly higher than credible patients on all Dot Counting Test scores. While the original E-score cut-off of ≥17 achieved excellent specificity (96.5%), it was associated with mediocre sensitivity (52.8%). However, the cut-off could be substantially lowered to ≥13.80, while still maintaining adequate specificity (≥90%), and raising sensitivity to 70.0%. Examination of non-credible subgroups revealed that Dot Counting Test sensitivity in feigned mild traumatic brain injury (mTBI) was 55.8%, whereas sensitivity was 90.6% in patients with non-credible cognitive dysfunction in the context of claimed psychosis, and 81.0% in patients with non-credible cognitive performance in depression or severe TBI. Thus, the Dot Counting Test may have a particular role in detection of non-credible cognitive symptoms in claimed psychiatric disorders. Alternative to use of the E-score, failure on ≥1 cut-offs applied to individual Dot Counting Test scores (≥6.0″ for mean grouped dot counting time, ≥10.0″ for mean ungrouped dot counting time, and ≥4 errors), occurred in 11.3% of the credible sample, while nearly two-thirds (63.6%) of the non-credible sample failed one or more of these cut-offs. An E-score cut-off of 13.80, or failure on ≥1 individual score cut-offs, resulted in few false positive identifications in credible patients, and achieved high sensitivity (64.0-70.0%), and therefore appear appropriate for use in identifying neurocognitive performance invalidity.
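The trade-off described above, where lowering a cut-off raises sensitivity at some cost in specificity, can be sketched as follows. The scores and group labels are invented for illustration and are not data from the study; only the two cut-off values (17 and 13.80) come from the abstract.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float],
    is_noncredible: Sequence[bool],
    cutoff: float,
) -> Tuple[float, float]:
    """Flag scores >= cutoff and compare the flags with the known group labels."""
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, is_noncredible):
        flagged = score >= cutoff
        if positive and flagged:
            tp += 1
        elif positive:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both groups must contain at least one case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical E-scores: first five non-credible, last five credible.
scores = [10.0, 12.5, 14.0, 18.0, 20.0, 9.0, 11.0, 13.0, 15.0, 8.0]
labels = [True, True, True, True, True, False, False, False, False, False]

for cutoff in (17.0, 13.80):
    sens, spec = sensitivity_specificity(scores, labels, cutoff)
    print(f"cut-off >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```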
Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test

KAUST Repository

Cai, T.

2012-06-25

In recent years, genome-wide association studies (GWAS) and gene-expression profiling have generated a large number of valuable datasets for assessing how genetic variations are related to disease outcomes. With such datasets, it is often of interest to assess the overall effect of a set of genetic markers, assembled based on biological knowledge. Genetic marker-set analyses have been advocated as more reliable and powerful approaches compared with the traditional marginal approaches (Curtis and others, 2005. Pathways to the analysis of microarray data. TRENDS in Biotechnology 23, 429-435; Efroni and others, 2007. Identification of key processes underlying cancer phenotypes using biologic pathway analysis. PLoS One 2, 425). Procedures for testing the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an Empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models. Biometrics 63, 1079-1088; Liu and others, 2008. Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models. BMC bioinformatics 9, 292-2; Wu and others, 2010. Powerful SNP-set analysis for case-control genome-wide association studies. American Journal of Human Genetics 86, 929) have been proposed as powerful alternatives to the standard Rao score test (Rao, 1948. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Mathematical Proceedings of the Cambridge Philosophical Society, 44, 50-57). The advantages of these EB-based tests are most apparent when the markers are correlated, due to the reduction in the degrees of freedom. In this paper, we propose an adaptive score test which up- or down-weights the contributions from each member of the marker-set based on the Z-scores of
Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) scores generated from the MMPI-2 and MMPI-2-RF test booklets: internal structure comparability in a sample of criminal defendants.

Science.gov (United States)

Tarescavage, Anthony M; Alosco, Michael L; Ben-Porath, Yossef S; Wood, Arcangela; Luna-Jones, Lynn

2015-04-01

We investigated the internal structure comparability of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) scores derived from the MMPI-2 and MMPI-2-RF booklets in a sample of 320 criminal defendants (229 males and 54 females). After exclusion of invalid protocols, the final sample consisted of 96 defendants who were administered the MMPI-2-RF booklet and 83 who completed the MMPI-2. No statistically significant differences in MMPI-2-RF invalidity rates were observed between the two forms. Individuals in the final sample who completed the MMPI-2-RF did not statistically differ on demographics or referral question from those who were administered the MMPI-2 booklet. Independent t tests showed no statistically significant differences between MMPI-2-RF scores generated with the MMPI-2 and MMPI-2-RF booklets on the test's substantive scales. Statistically significant small differences were observed on the revised Variable Response Inconsistency (VRIN-r) and True Response Inconsistency (TRIN-r) scales. Cronbach's alpha and standard errors of measurement were approximately equal between the booklets for all MMPI-2-RF scales. Finally, MMPI-2-RF intercorrelations produced from the two forms yielded mostly small and a few medium differences, indicating that discriminant validity and test structure are maintained. Overall, our findings reflect the internal structure comparability of MMPI-2-RF scale scores generated from MMPI-2 and MMPI-2-RF booklets. Implications of these results and limitations of these findings are discussed. © The Author(s) 2014.
An examination of the RCMAS-2 scores across gender, ethnic background, and age in a large Asian school sample.

Science.gov (United States)

Ang, Rebecca P; Lowe, Patricia A; Yusof, Noradlin

2011-12-01

The present study investigated the factor structure, reliability, convergent and discriminant validity, and U.S. norms of the Revised Children's Manifest Anxiety Scale, Second Edition (RCMAS-2; C. R. Reynolds & B. O. Richmond, 2008a) scores in a Singapore sample of 1,618 school-age children and adolescents. Although there were small statistically significant differences in the average RCMAS-2 T scores found across various demographic groupings, on the whole, the U.S. norms appear adequate for use in the Asian Singapore sample. Results from item bias analyses suggested that biased items detected had small effects and were counterbalanced across gender and ethnicity, and hence, their relative impact on test score variation appears to be minimal. Results of factor analyses on the RCMAS-2 scores supported the presence of a large general anxiety factor, the Total Anxiety factor, and the 5-factor structure found in U.S. samples was replicated. Both the large general anxiety factor and the 5-factor solution were invariant across gender and ethnic background. Internal consistency estimates ranged from adequate to good, and 2-week test-retest reliability estimates were comparable to previous studies. Evidence providing support for convergent and discriminant validity of the RCMAS-2 scores was also found. Taken together, findings provide additional cross-cultural evidence of the appropriateness and usefulness of the RCMAS-2 as a measure of anxiety in Asian Singaporean school-age children and adolescents.
Improving personality facet scores with multidimensional computer adaptive testing

DEFF Research Database (Denmark)

Makransky, Guido; Mortensen, Erik Lykke; Glas, Cees A W

2013-01-01

personality tests contain many highly correlated facets. This article investigates the possibility of increasing the precision of the NEO PI-R facet scores by scoring items with multidimensional item response theory and by efficiently administering and scoring items with multidimensional computer adaptive...
Reporting Diagnostic Scores in Educational Testing: Temptations, Pitfalls, and Some Solutions

Science.gov (United States)

Sinharay, Sandip; Puhan, Gautam; Haberman, Shelby J.

2010-01-01

Diagnostic scores are of increasing interest in educational testing due to their potential remedial and instructional benefit. Naturally, the number of educational tests that report diagnostic scores is on the rise, as are the number of research publications on such scores. This article provides a critical evaluation of diagnostic score reporting…
The Effects of Video Game Experience on Computer-Based Air Traffic Controller Specialist, Air Traffic Scenario Test Scores.

Science.gov (United States)

1997-02-01

application with a strong resemblance to a video game, concern has been raised that prior video game experience might have a moderating effect on scores. Much...such as spatial ability. The effects of computer or video game experience on work sample scores have not been systematically investigated. The purpose...of this study was to evaluate the incremental validity of prior video game experience over that of general aptitude as a predictor of work sample test

Validity and reliability of Abbreviated Mental Test Score (AMTS) among older Iranian.

Science.gov (United States)

Foroughan, Mahshid; Wahlund, Lars-Olof; Jafari, Zahra; Rahgozar, Mehdi; Farahani, Ida G; Rashedi, Vahid

2017-11-01

Cognitive impairment is common among older people and is associated with increased morbidity and mortality. The main aim of this study was to evaluate the validity of the Persian version of the Abbreviated Mental Test Score (AMTS) as a screening tool for dementia. Data were obtained from a cross-sectional study. One hundred and one older adults who were members of Iranian Alzheimer Association and 101 of their siblings were entered into this study by convenience sampling. The Diagnostic and Statistical Manual of Mental Disorders, 4th edition, criteria for diagnosing dementia and the Mini-Mental State Examination were used as the study tools. The gathered data were analyzed by the Mann-Whitney U-test, the Kruskal-Wallis test, Spearman's rank correlation coefficient, and the receiver-operating characteristic. The AMTS could successfully differentiate the dementia group from the non-dementia group. Scores were significantly correlated with Diagnostic and Statistical Manual of Mental Disorders diagnosis for dementia and Mini-Mental State Examination scores (P < 0.001). Educational level (P < 0.001) and male sex (P = 0.015) were positively associated with AMTS, whereas (P < 0.001) was negatively associated with AMTS. Total Cronbach's α coefficient was 0.90. The scores 6 and 7 showed the optimum balance between sensitivity (99% and 94%, respectively) and specificity (85% and 86%, respectively). The Persian version of the AMTS is a valid cognitive assessment tool for older Iranian adults and can be used for dementia screening in Iran. © 2017 Japanese Psychogeriatric Society.
Clock Drawing Test and the diagnosis of amnestic mild cognitive impairment: can more detailed scoring systems do the work?

Science.gov (United States)

Rubínová, Eva; Nikolai, Tomáš; Marková, Hana; Siffelová, Kamila; Laczó, Jan; Hort, Jakub; Vyhnálek, Martin

2014-01-01

The Clock Drawing Test is a frequently used cognitive screening test with several scoring systems in elderly populations. We compare simple and complex scoring systems and evaluate the usefulness of the combination of the Clock Drawing Test with the Mini-Mental State Examination to detect patients with mild cognitive impairment. Patients with amnestic mild cognitive impairment (n = 48) and age- and education-matched controls (n = 48) underwent neuropsychological examinations, including the Clock Drawing Test and the Mini-Mental State Examination. Clock drawings were scored by three blinded raters using one simple (6-point scale) and two complex (17- and 18-point scales) systems. The sensitivity and specificity of these scoring systems used alone and in combination with the Mini-Mental State Examination were determined. Complex scoring systems, but not the simple scoring system, were significant predictors of the amnestic mild cognitive impairment diagnosis in logistic regression analysis. At equal levels of sensitivity (87.5%), the Mini-Mental State Examination showed higher specificity (31.3%, compared with 12.5% for the 17-point Clock Drawing Test scoring scale). The combination of Clock Drawing Test and Mini-Mental State Examination scores increased the area under the curve (0.72; p ... Drawing Test did not differentiate between healthy elderly and patients with amnestic mild cognitive impairment in our sample. Complex scoring systems were slightly more efficient, yet still were characterized by high rates of false-positive results. We found psychometric improvement using combined scores from the Mini-Mental State Examination and the Clock Drawing Test when complex scoring systems were used. The results of this study support the benefit of using combined scores from simple methods.
A process dissociation approach to objective-projective test score interrelationships.

Science.gov (United States)

Bornstein, Robert F

2002-02-01

Even when self-report and projective measures of a given trait or motive both predict theoretically related features of behavior, scores on the 2 tests correlate modestly with each other. This article describes a process dissociation framework for personality assessment, derived from research on implicit memory and learning, which can resolve these ostensibly conflicting results. Research on interpersonal dependency is used to illustrate 3 key steps in the process dissociation approach: (a) converging behavioral predictions, (b) modest test score intercorrelations, and (c) delineation of variables that differentially affect self-report and projective test scores. Implications of the process dissociation framework for personality assessment and test development are discussed.
Sample size calculation to externally validate scoring systems based on logistic regression models.

Directory of Open Access Journals (Sweden)

Antonio Palazón-Bru

A sample size containing at least 100 events and 100 non-events has been suggested to validate a predictive model, regardless of the model being validated and despite the fact that certain factors (discrimination, parameterization and incidence) can influence calibration of the predictive model. Scoring systems based on binary logistic regression models are a specific type of predictive model. The aim of this study was to develop an algorithm to determine the sample size for validating a scoring system based on a binary logistic regression model and to apply it to a case study. The algorithm was based on bootstrap samples in which the area under the ROC curve, the observed event probabilities through smooth curves, and a measure to determine the lack of calibration (estimated calibration index) were calculated. To illustrate its use for interested researchers, the algorithm was applied to a scoring system, based on a binary logistic regression model, to determine mortality in intensive care units. In the case study provided, the algorithm obtained a sample size with 69 events, which is lower than the value suggested in the literature. An algorithm is provided for finding the appropriate sample size to validate scoring systems based on binary logistic regression models. This could be applied to determine the sample size in other similar cases.
A prognostic scoring system for arm exercise stress testing.

Science.gov (United States)

Xie, Yan; Xian, Hong; Chandiramani, Pooja; Bainter, Emily; Wan, Leping; Martin, Wade H

2016-01-01

Arm exercise stress testing may be an equivalent or better predictor of mortality outcome than pharmacological stress imaging for the ≥50% of patients unable to perform leg exercise. Thus, our objective was to develop an arm exercise ECG stress test scoring system, analogous to the Duke Treadmill Score, for predicting outcome in these individuals. In this retrospective observational cohort study, arm exercise ECG stress tests were performed in 443 consecutive veterans aged 64.1 (11.1) years (mean (SD)) between 1997 and 2002. From multivariate Cox models, arm exercise scores were developed for prediction of 5-year and 12-year all-cause and cardiovascular mortality and 5-year cardiovascular mortality or myocardial infarction (MI). Arm exercise capacity in resting metabolic equivalents (METs), 1 min heart rate recovery (HRR) and ST segment depression ≥1 mm were the stress test variables independently associated with all-cause and cardiovascular mortality by step-wise Cox analysis (all p ... C-statistic of 0.81 before and 0.88 after adjustment for significant demographic and clinical covariates. Arm exercise scores for the other outcome end points yielded C-statistic values of 0.77-0.79 before and 0.82-0.86 after adjustment for significant covariates versus 0.64-0.72 for best fit pharmacological myocardial perfusion imaging models in a cohort of 1730 veterans who were evaluated over the same time period. Arm exercise scores, analogous to the Duke Treadmill Score, have good power for prediction of mortality or MI in patients who cannot perform leg exercise.
[Relationship between unipedal stance test score and center of pressure velocity in elderly].

Science.gov (United States)

Rodrigo Antonio, Guzmán; Rony, Silvestre; Francisco Aniceto, Rodríguez; David Andrés, Arriagada; Pablo Andrés, Ortega

2011-01-01

Frequent falls are one of the most important health problems in the elderly population. The unipedal stance test (UPST) assesses postural stability and is used in fall risk measures. Despite this, there is little information about its relationship with the posturographic parameters (PP) that characterize postural stability. Center of pressure velocity (CoPV) is one of the best PP for describing postural stability. The aim of this study was to analyze the relation between UPST score and CoPV in an elderly population. A sample of 38 healthy elderly subjects was divided into two groups according to UPST score: low performance (LP, n=11) and high performance (HP, n=27). The correlation between UPST score and COP mean velocity (CoPmV), recorded from a posturographic test, was analyzed in both groups. An inverse correlation between UPST score and CoPmV was found in both groups. However, this was higher in the LP group (r=-0.69, P=.02) compared to the HP (r=-0.39, P=.04). Based on the results of this investigation, it may be concluded that the achievement on UPST has an inverse relationship with CoPmV, especially in subjects with low performance in the UPST. Copyright © 2010 SEGG. Published by Elsevier Espana. All rights reserved.
Increased correlation coefficient between the written test score and tutors' performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia.

Science.gov (United States)

Jaiprakash, Heethal; Min, Aung Ko Ko; Ghosh, Sarmishtha

2016-03-01

This paper is aimed at finding if there was a change of correlation between the written test score and tutors' performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group's tutors did not receive tutor training; while the second group's tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors' performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors' scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
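For readers unfamiliar with the statistic used in this record, the Pearson correlation coefficient between two sets of scores can be computed as in the sketch below. The function name and the example scores are hypothetical and are not the study's data.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical written-test scores and tutor-assigned scores for five students.
written = [1.0, 2.0, 3.0, 4.0, 5.0]
tutor = [2.0, 1.0, 4.0, 3.0, 5.0]
print(round(pearson_r(written, tutor), 3))
```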
Effects of white noise on Callsign Acquisition Test and Modified Rhyme Test scores.

Science.gov (United States)

Blue-Terry, Misty; Letowski, Tomasz

2011-02-01

The Callsign Acquisition Test (CAT) is a speech intelligibility test developed by the US Army Research Laboratory. The test has been used to evaluate speech transmission through various communication systems but has not yet been sufficiently standardised and validated. The aim of this study was to compare CAT and Modified Rhyme Test (MRT) performance in the presence of white noise across a range of signal-to-noise ratios (SNRs). A group of 16 normal-hearing listeners participated in the study. The speech items were presented at 65 dB(A) in the background of white noise at SNRs of -18, -15, -12, -9 and -6 dB. The results showed a strong positive association (75.14%) between the two tests, but significant differences between the CAT and MRT absolute scores in the range of investigated SNRs. Based on the data, a function to predict CAT scores based on existing MRT scores and vice versa was formulated. STATEMENT OF RELEVANCE: This work compares performance data of a common speech intelligibility test (MRT) with a new test (CAT) in the presence of white noise. The results here can be used as a part of the standardisation procedures and provide insights to the predictive capabilities of the CAT to quantify speech intelligibility communication in high-noise military environments.
ANOVA Analysis of Student Daily Test Scores in Multi-Day Test Periods

Science.gov (United States)

Mouritsen, Matthew L.; Davis, Jefferson T.; Jones, Steven C.

2016-01-01

Instructors are often concerned when giving multiple-day tests because students taking the test later in the exam period may have an advantage over students taking the test early in the exam period due to information leakage. However, exam scores seemed to decline as students took the same test later in a multi-day exam period (Mouritsen and…
The Effect of Pretest Exercise on Baseline Computerized Neurocognitive Test Scores.

Science.gov (United States)

Pawlukiewicz, Alec; Yengo-Kahn, Aaron M; Solomon, Gary

2017-10-01

Baseline neurocognitive assessment plays a critical role in return-to-play decision making following sport-related concussions. Prior studies have assessed the effect of a variety of modifying factors on neurocognitive baseline test scores. However, relatively little investigation has been conducted regarding the effect of pretest exercise on baseline testing. The aim of our investigation was to determine the effect of pretest exercise on baseline Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores in adolescent and young adult athletes. We hypothesized that athletes undergoing self-reported strenuous exercise within 3 hours of baseline testing would perform more poorly on neurocognitive metrics and would report a greater number of symptoms than those who had not completed such exercise. Cross-sectional study; Level of evidence, 3. The ImPACT records of 18,245 adolescent and young adult athletes were retrospectively analyzed. After application of inclusion and exclusion criteria, participants were dichotomized into groups based on a positive (n = 664) or negative (n = 6609) self-reported history of strenuous exercise within 3 hours of the baseline test. Participants with a positive history of exercise were then randomly matched, based on age, sex, education level, concussion history, and hours of sleep prior to testing, on a 1:2 basis with individuals who had reported no pretest exercise. The baseline ImPACT composite scores of the 2 groups were then compared. Significant differences were observed for the ImPACT composite scores of verbal memory, visual memory, reaction time, and impulse control as well as for the total symptom score. No significant between-group difference was detected for the visual motor composite score. Furthermore, pretest exercise was associated with a significant increase in the overall frequency of invalid test results. Our results suggest a statistically significant difference in ImPACT composite scores between
Spinal appearance questionnaire: factor analysis, scoring, reliability, and validity testing.

Science.gov (United States)

Carreon, Leah Y; Sanders, James O; Polly, David W; Sucato, Daniel J; Parent, Stefan; Roy-Beaudry, Marjolaine; Hopkins, Jeffrey; McClung, Anna; Bratcher, Kelly R; Diamond, Beverly E

2011-08-15

Cross sectional. This study presents the factor analysis of the Spinal Appearance Questionnaire (SAQ) and its psychometric properties. Although the SAQ has been administered to a large sample of patients with adolescent idiopathic scoliosis (AIS) treated surgically, its psychometric properties have not been fully evaluated. This study presents the factor analysis and scoring of the SAQ and evaluates its psychometric properties. The SAQ and the Scoliosis Research Society-22 (SRS-22) were administered to AIS patients who were being observed, braced or scheduled for surgery. Standard demographic data and radiographic measures including Lenke type and curve magnitude were also collected. Of the 1802 patients, 83% were female; with a mean age of 14.8 years and mean initial Cobb angle of 55.8° (range, 0°-123°). From the 32 items of the SAQ, 15 loaded on two factors with consistent and significant correlations across all Lenke types. There is an Appearance (items 1-10) and an Expectations factor (items 12-15). Responses are summed giving a range of 5 to 50 for the Appearance domain and 5 to 20 for the Expectations domain. The Cronbach's α was 0.88 for both domains and Total score with a test-retest reliability of 0.81 for Appearance and 0.91 for Expectations. Correlations with major curve magnitude were higher for the SAQ Appearance and SAQ Total scores compared to correlations between the SRS Appearance and SRS Total scores. The SAQ and SRS-22 Scores were statistically significantly different in patients who were scheduled for surgery compared to those who were observed or braced. The SAQ is a valid measure of self-image in patients with AIS with greater correlation to curve magnitude than SRS Appearance and Total score. It also discriminates between patients who require surgery from those who do not.
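The internal-consistency statistic reported above, Cronbach's α, can be sketched with the usual formula α = k/(k − 1) × (1 − Σ item variances / variance of the summed score). The function name and the tiny response matrix are hypothetical and are not SAQ data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent and one column per item."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the summed (domain) score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("alpha is undefined when summed scores do not vary")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of three patients to a two-item domain.
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 3))
```

Summing the rows, as the sketch does for the total variance, corresponds to the domain scoring described in the abstract, where item responses are added to give a domain score.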
The Effect of Mock Tests on Iranian EFL learners’ Test Scores

OpenAIRE

Hossein Khodabakhshzadeh; Reza Zardkanloo

2016-01-01

The effect of using tests in test preparation courses has been subject to debate. While some scholars such as Yang and Badger (2015) believe it is a cause of positive washback effect, others argue that this issue is tentative and context-bound (Green, 2007). Therefore, this study investigated the effect of using Mock tests in International English Language Testing System (IELTS) preparation courses on students’ overall IELTS scores. Fifty-one IELTS students were selected non-randomly through ...
Biering-Sorensen test scores in coal miners

Energy Technology Data Exchange (ETDEWEB)

Tekin, Y.; Ortancil, O.; Ankarali, H.; Basaran, A.; Sarikaya, S.; Ozdolap, S. [Zonguldak Karaelmas University, Zonguldak (Turkey)

2009-05-15

Biering-Sorensen test is an isometric back endurance test. Biering-Sorensen test scores have varied in different cultural and occupational groups. The aims of this study were to collect normative data on Biering-Sorensen holding times, to determine the discriminative ability of the Biering-Sorensen test in Turkish coal miners, and to examine the association between Biering-Sorensen test result and functional disability. One hundred and fifty male coal miners participated in this study. Trunk extensor muscle strength was measured using the Biering-Sorensen test. Oswestry disability index was used to measure the functional disability level of low back pain. The mean Biering-Sorensen holding time for the total subject group was 107.3 ± 22.5 s. The mean time of Biering-Sorensen test of the subjects with and without low back pain were 99.9 ± 19.8 and 128.6 ± 15.2 s, respectively. The difference between the subjects with and without low back pain was statistically significant (p < 0.001). There was a statistically significant negative correlation between Oswestry functional disability score and Biering-Sorensen holding time (R = -0.824, p < 0.001). Turkish coal miners have low mean back extensor endurance holding times. Biering-Sorensen test had a good discriminative ability in our study group. Trunk muscle strength has a significant effect on the disability level of low back pain. Thus trunk muscle endurance training exercise therapy may be effective for the reduction of disability in patients with low back pain.
Increased correlation coefficient between the written test score and tutors’ performance test scores after training of tutors for assessment of medical students during problem-based learning course in Malaysia

Directory of Open Access Journals (Sweden)

Heethal Jaiprakash

2016-03-01

This paper is aimed at finding if there was a change of correlation between the written test score and tutors’ performance test scores in the assessment of medical students during a problem-based learning (PBL) course in Malaysia. This is a cross-sectional observational study, conducted among 264 medical students in two groups from November 2010 to November 2012. The first group’s tutors did not receive tutor training, while the second group’s tutors were trained in the PBL process. Each group was divided into high, middle and low achievers based on their end-of-semester exam scores. PBL scores were taken which included written test scores and tutors’ performance test scores. Pearson correlation coefficient was calculated between the two kinds of scores in each group. The correlation coefficient between the written scores and tutors’ scores in group 1 was 0.099 (p<0.001) and for group 2 was 0.305 (p<0.001). The higher correlation coefficient in the group where tutors received the PBL training reinforces the importance of tutor training before their participation in the PBL course.
WAIS-III index score profiles in the Canadian standardization sample.

Science.gov (United States)

Lange, Rael T

2007-01-01

Representative index score profiles were examined in the Canadian standardization sample of the Wechsler Adult Intelligence Scale-Third Edition (WAIS-III). The identification of profile patterns was based on the methodology proposed by Lange, Iverson, Senior, and Chelune (2002) that aims to maximize the influence of profile shape and minimize the influence of profile magnitude on the cluster solution. A two-step cluster analysis procedure was used (i.e., hierarchical and k-means analyses). Cluster analysis of the four index scores (i.e., Verbal Comprehension [VCI], Perceptual Organization [POI], Working Memory [WMI], Processing Speed [PSI]) identified six profiles in this sample. Profiles were differentiated by pattern of performance and were primarily characterized as (a) high VCI/POI, low WMI/PSI, (b) low VCI/POI, high WMI/PSI, (c) high PSI, (d) low PSI, (e) high VCI/WMI, low POI/PSI, and (f) low VCI, high POI. These profiles are potentially useful for determining whether a patient's WAIS-III performance is unusual in a normal population.
The effect of instructional methodology on high school students natural sciences standardized tests scores

Science.gov (United States)

Powell, P. E.

Educators have recently come to consider inquiry-based instruction as a more effective method of instruction than didactic instruction. Experience-based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness in preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify teaching methods of a convenience sample of 57 teacher participants and determined level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when taught using inquiry-based instruction. Subpopulation analyses indicated all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.
Gender, Stereotype Threat and Mathematics Test Scores

OpenAIRE

Ming Tsui; Xiao Y. Xu; Edmond Venator

2011-01-01

Problem statement: Stereotype threat has repeatedly been shown to depress women's scores on difficult math tests. An attempt to replicate these findings in China found no support for the stereotype threat hypothesis. Our math test was characterized as being personally important for the student participants, an atypical condition in most stereotype threat laboratory research. Approach: To evaluate the effects of this personal demand, we conducted three experiments. Results: ...
The Five-Factor Narcissism Inventory (FFNI): a test of the convergent, discriminant, and incremental validity of FFNI scores in clinical and community samples.

Science.gov (United States)

Miller, Joshua D; Few, Lauren R; Wilson, Lauren; Gentile, Brittany; Widiger, Thomas A; Mackillop, James; Keith Campbell, W

2013-09-01

The five-factor narcissism inventory (FFNI) is a new self-report measure that was developed to assess traits associated with narcissistic personality disorder (NPD), as well as grandiose and vulnerable narcissism from a five-factor model (FFM) perspective. In the current study, the FFNI was examined in relation to Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV; American Psychiatric Association, 2000) NPD, DSM-5 (http://www.dsm5.org) NPD traits, grandiose narcissism, and vulnerable narcissism in both community (N = 287) and clinical samples (N = 98). Across the samples, the FFNI scales manifested good convergent and discriminant validity such that FFNI scales derived from FFM neuroticism were primarily related to vulnerable narcissism scores, scales derived from FFM extraversion were primarily related to grandiose scores, and FFNI scales derived from FFM agreeableness were related to both narcissism dimensions, as well as the DSM-IV and DSM-5 NPD scores. The FFNI grandiose and vulnerable narcissism composites also demonstrated incremental validity in the statistical prediction of these scores, above and beyond existing measures of DSM NPD, grandiose narcissism, and vulnerable narcissism, respectively. The FFNI is a promising measure that provides a comprehensive assessment of narcissistic pathology while maintaining ties to the significant general personality literature on the FFM.
Reformulation of the Children's Eating Attitudes Test (ChEAT): factor structure and scoring method in a non-clinical population.

Science.gov (United States)

Anton, S D; Han, H; Newton, R L; Martin, C K; York-Crowe, E; Stewart, T M; Williamson, D A

2006-12-01

The primary aims of this study were to empirically test the factor structure of the Children's Eating Attitudes Test (ChEAT) through both exploratory and confirmatory factor analyses and to interpret the factor structure of the ChEAT within the context of a new scoring method. The ChEAT was administered to 728 children in the 2nd through 6th grades (from five schools) at two different time points. Exactly half the students were male and half were female. To the best of our knowledge, this is the first study to empirically test the merits of an alternative 6-point scoring system as compared to the traditionally used 4-point scoring system. With the new scoring procedure, the skewness for all factor scores decreased, which resulted in increased variance in the item scores, as well as the total ChEAT score. Since the internal consistency of two factors in a recently proposed model was not acceptable (ChEAT reported by previous investigations. Intercorrelations among the factors suggested three higher order constructs. These findings indicate that the ChEAT subscales may be sufficiently stable to allow use in non-clinical samples of children.
Equating error in observed-score equating

NARCIS (Netherlands)

van der Linden, Willem J.

2006-01-01

Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and population of test takers. But it is argued that if the goal of equating is to adjust the scores of

Polygenic scores predict alcohol problems in an independent sample and show moderation by the environment.

Science.gov (United States)

Salvatore, Jessica E; Aliev, Fazil; Edwards, Alexis C; Evans, David M; Macleod, John; Hickman, Matthew; Lewis, Glyn; Kendler, Kenneth S; Loukola, Anu; Korhonen, Tellervo; Latvala, Antti; Rose, Richard J; Kaprio, Jaakko; Dick, Danielle M

2014-04-10

Alcohol problems represent a classic example of a complex behavioral outcome that is likely influenced by many genes of small effect. A polygenic approach, which examines aggregate measured genetic effects, can have predictive power in cases where individual genes or genetic variants do not. In the current study, we first tested whether polygenic risk for alcohol problems, derived from genome-wide association estimates of an alcohol problems factor score from the age 18 assessment of the Avon Longitudinal Study of Parents and Children (ALSPAC; n = 4304 individuals of European descent; 57% female), predicted alcohol problems earlier in development (age 14) in an independent sample (FinnTwin12; n = 1162; 53% female). We then tested whether environmental factors (parental knowledge and peer deviance) moderated polygenic risk to predict alcohol problems in the FinnTwin12 sample. We found evidence for both polygenic association and for additive polygene-environment interaction. Higher polygenic scores predicted a greater number of alcohol problems (range of Pearson partial correlations 0.07-0.08, all p-values ≤ 0.01). Moreover, genetic influences were significantly more pronounced under conditions of low parental knowledge or high peer deviance (unstandardized regression coefficients (b), p-values (p), and percent of variance (R²) accounted for by interaction terms: b = 1.54, p = 0.02, R² = 0.33%; b = 0.94, p = 0.04, R² = 0.30%, respectively). Supplementary set-based analyses indicated that the individual top single nucleotide polymorphisms (SNPs) contributing to the polygenic scores were not individually enriched for gene-environment interaction. Although the magnitude of the observed effects is small, this study illustrates the usefulness of polygenic approaches for understanding the pathways by which measured genetic predispositions come together with environmental factors to predict complex behavioral outcomes.
Polygenic Scores Predict Alcohol Problems in an Independent Sample and Show Moderation by the Environment

Directory of Open Access Journals (Sweden)

Jessica E. Salvatore

2014-04-01

Alcohol problems represent a classic example of a complex behavioral outcome that is likely influenced by many genes of small effect. A polygenic approach, which examines aggregate measured genetic effects, can have predictive power in cases where individual genes or genetic variants do not. In the current study, we first tested whether polygenic risk for alcohol problems, derived from genome-wide association estimates of an alcohol problems factor score from the age 18 assessment of the Avon Longitudinal Study of Parents and Children (ALSPAC; n = 4304 individuals of European descent; 57% female), predicted alcohol problems earlier in development (age 14) in an independent sample (FinnTwin12; n = 1162; 53% female). We then tested whether environmental factors (parental knowledge and peer deviance) moderated polygenic risk to predict alcohol problems in the FinnTwin12 sample. We found evidence for both polygenic association and for additive polygene-environment interaction. Higher polygenic scores predicted a greater number of alcohol problems (range of Pearson partial correlations 0.07–0.08, all p-values ≤ 0.01). Moreover, genetic influences were significantly more pronounced under conditions of low parental knowledge or high peer deviance (unstandardized regression coefficients (b), p-values (p), and percent of variance (R²) accounted for by interaction terms: b = 1.54, p = 0.02, R² = 0.33%; b = 0.94, p = 0.04, R² = 0.30%, respectively). Supplementary set-based analyses indicated that the individual top single nucleotide polymorphisms (SNPs) contributing to the polygenic scores were not individually enriched for gene-environment interaction. Although the magnitude of the observed effects is small, this study illustrates the usefulness of polygenic approaches for understanding the pathways by which measured genetic predispositions come together with environmental factors to predict complex behavioral outcomes.
The Score Reliability of Draw-a-Person Intellectual Ability Test (DAP: IQ) for Rural Malawi Students

Science.gov (United States)

Khasu, Denis S.; Williams, Thomas O., Jr.

2016-01-01

In this brief article, the reliability of scores for the Draw-A-Person Intellectual Ability Test for Children, Adolescents, and Adults (DAP: IQ; Reynolds & Hickman, 2004) was examined through several analyses with a sample of 147 children from rural Malawi, Africa using a Chichewa translation of instructions. Cronbach alpha coefficients for…
The Weighted Airman Promotion System: Standardizing Test Scores

Science.gov (United States)

2008-01-01

[Figure: authorized Top 3/E6 ratio and inventory; axis ticks and data labels omitted] ... AFHRL convened a panel to identify the relevant factors to consider, and then sit as a promotion board and rank ... Costs: If the Air Force decided to standardize test scores, there would be three basic types of costs: implementation costs, marketing costs, and
Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

Science.gov (United States)

Kim, Seonghoon

2013-01-01

With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
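The recursion credited to Lord and Wingersky in the abstract above can be sketched as follows. This is a minimal illustration of the original number-correct case only (each item scored 0 or 1), not the generalized real-number algorithm the article presents. The function names and the two-parameter logistic item model used to produce the success probabilities are assumptions chosen for the example.

```python
import math


def item_prob_2pl(theta, a, b):
    """Probability of a correct response under an assumed 2PL IRT model.

    theta is the proficiency; a and b are hypothetical discrimination and
    difficulty parameters (the abstract does not specify the item model).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lord_wingersky(p_correct):
    """Conditional distribution of the number-correct score.

    p_correct holds, for one fixed proficiency, the probability of answering
    each item correctly. Returns a list where entry s is P(score = s).
    """
    dist = [1.0]  # before any item is added, the score is 0 with certainty
    for p in p_correct:
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)  # item answered incorrectly
            new[score + 1] += prob * p      # item answered correctly
        dist = new
    return dist


# Two items, each answered correctly with probability 0.5
print(lord_wingersky([0.5, 0.5]))  # [0.25, 0.5, 0.25]

# Three hypothetical 2PL items evaluated at proficiency theta = 0
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
probs = [item_prob_2pl(0.0, a, b) for a, b in items]
print(lord_wingersky(probs))
```

Each pass adds one item and splits every existing score's probability between "unchanged" and "one point higher", so the cost grows quadratically in the number of items rather than exponentially.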
Online pre-race education improves test scores for volunteers at a marathon.

Science.gov (United States)

Maxwell, Shane; Renier, Colleen; Sikka, Robby; Widstrom, Luke; Paulson, William; Christensen, Trent; Olson, David; Nelson, Benjamin

2017-09-01

This study examined whether an online course would lead to increased knowledge about the medical issues volunteers encounter during a marathon. Health care professionals who volunteered to provide medical coverage for an annual marathon were eligible for the study. Demographic information about medical volunteers including profession, specialty, education level and number of marathons they had volunteered for was collected. A 15-question test about the most commonly encountered medical issues was created by the authors and administered before and after the volunteers took the online educational course and compared to a pilot study the previous year. Seventy-four subjects completed the pre-test. Those who participated in the pilot study last year (N = 15) had pre-test scores that were an average of 2.4 points higher than those who did not (mean ranks: pilot study = 51.6 vs. non-pilot = 33.9, p = 0.004). Of the 74 subjects who completed the pre-test, 54 also completed the post-test. The overall post-pre mean score difference was 3.8 ± 2.7 (t = 10.5 df = 53 p online education demonstrated a long-term (one-year) increase in test scores. Testing also continued to show short-term improvement in post-course test scores, compared to pre-course test scores. In general, marathon medical volunteers who had no volunteer experience demonstrated greater improvement than those who had prior volunteer experience.
Using Raters from India to Score a Large-Scale Speaking Test

Science.gov (United States)

Xi, Xiaoming; Mollaun, Pam

2011-01-01

We investigated the scoring of the Speaking section of the Test of English as a Foreign Language[TM] Internet-based (TOEFL iBT[R]) test by speakers of English and one or more Indian languages. We explored the extent to which raters from India, after being trained and certified, were able to score the TOEFL examinees with mixed first languages…
A GMM-Based Test for Normal Disturbances of the Heckman Sample Selection Model

Directory of Open Access Journals (Sweden)

Michael Pfaffermayr

2014-10-01

The Heckman sample selection model relies on the assumption of normal and homoskedastic disturbances. However, before considering more general, alternative semiparametric models that do not need the normality assumption, it seems useful to test this assumption. Following Meijer and Wansbeek (2007), the present contribution derives a GMM-based pseudo-score LM test on whether the third and fourth moments of the disturbances of the outcome equation of the Heckman model conform to those implied by the truncated normal distribution. The test is easy to calculate and in Monte Carlo simulations it shows good performance for sample sizes of 1000 or larger.
Testing the applicability of the SASS5 scoring procedure for ...

African Journals Online (AJOL)

A study was undertaken between 29th January and 17th February 2004 to test the applicability of the South African Scoring System Version 5 (SASS5) scoring and calculation procedure in nutrient-enriched palustrine wetlands in the midlands of KwaZulu-Natal, South Africa. Four reference wetlands and three dairy-effluent ...
Evaluating the Predictive Validity of Graduate Management Admission Test Scores

Science.gov (United States)

Sireci, Stephen G.; Talento-Miller, Eileen

2006-01-01

Admissions data and first-year grade point average (GPA) data from 11 graduate management schools were analyzed to evaluate the predictive validity of Graduate Management Admission Test[R] (GMAT[R]) scores and the extent to which predictive validity held across sex and race/ethnicity. The results indicated GMAT verbal and quantitative scores had…
Dimensional Structure and Measurement Invariance of the Schizotypal Personality Questionnaire - Brief Revised (SPQ-BR) Scores Across American and Spanish Samples.

Science.gov (United States)

Fonseca-Pedrero, Eduardo; Cohen, Alex; Ortuño-Sierra, Javier; de Álbeniz, Alicia Pérez; Muñiz, José

2017-08-01

The main goal of the present study was to test the measurement equivalence of the Schizotypal Personality Questionnaire - Brief Revised (SPQ-BR) scores in a large sample of Spanish and American non-clinical young adults. The sample was made up of 5,625 young adults (M = 19.65 years; SD = 2.53; 38.5% males). Study of the internal structure, using confirmatory factor analysis (CFA), revealed that SPQ-BR items were grouped in a theoretical internal structure of nine first-order factors. Moreover, three or four second-order factor and bifactor models showed adequate goodness-of-fit indices. Multigroup CFA showed that the nine lower-order factor models of the SPQ-BR had configural and weak measurement invariance and partial strong measurement invariance across country. The reliability of the SPQ-BR scores, estimated with omega, ranged from 0.67 to 0.91. Using the item response theory framework, the SPQ-BR provides more accurate information at the medium and high end of the latent trait. Statistically significant differences were found in the raw scores of the SPQ-BR subscales and dimensions across samples. The American group scored higher than the Spanish group in all SPQ-BR domains except Ideas of Reference and Suspiciousness. The finding of comparable factor structure in cross-cultural samples would lend further support to the continuum model of psychosis spectrum disorders. In addition, these results provide new information about the factor structure of schizotypal traits and support the validity and utility of this measure in cross-cultural research.
Effects of Test Media on Different EFL Test-Takers in Writing Scores and in the Cognitive Writing Process

Science.gov (United States)

Zou, Xiao-Ling; Chen, Yan-Min

2016-01-01

The effects of computer and paper test media on EFL test-takers with different computer familiarity in writing scores and in the cognitive writing process have been comprehensively explored from the learners' aspect as well as on the basis of related theories and practice. The results indicate significant differences in test scores among the…
Poisson Approximation-Based Score Test for Detecting Association of Rare Variants.

Science.gov (United States)

Fang, Hongyan; Zhang, Hong; Yang, Yaning

2016-07-01

Genome-wide association study (GWAS) has achieved great success in identifying genetic variants, but the nature of GWAS has determined its inherent limitations. Under the common disease rare variants (CDRV) hypothesis, the traditional association analysis methods commonly used in GWAS for common variants do not have enough power for detecting rare variants with a limited sample size. As a solution to this problem, pooling rare variants by their functions provides an efficient way for identifying susceptible genes. Rare variants typically have low frequencies of minor alleles, and the distribution of the total number of minor alleles of the rare variants can be approximated by a Poisson distribution. Based on this fact, we propose a new test method, the Poisson Approximation-based Score Test (PAST), for association analysis of rare variants. Two testing methods, namely, ePAST and mPAST, are proposed based on different strategies of pooling rare variants. Simulation results and application to the CRESCENDO cohort data show that our methods are more powerful than the existing methods. © 2016 John Wiley & Sons Ltd/University College London.
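The approximation the abstract relies on, that a sum of many rare indicator variables is close to Poisson, can be checked numerically with a small sketch. This is not the PAST statistic itself. It assumes independent allele copies with hypothetical minor-allele frequencies, computes the exact distribution of their total by convolution, and compares it with a Poisson distribution of the same mean. Le Cam's inequality guarantees that the total variation distance is at most the sum of the squared frequencies.

```python
import math


def exact_count_distribution(probs):
    """Exact distribution of a sum of independent Bernoulli(p_i) indicators."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)  # this copy is not the minor allele
            new[k + 1] += mass * p      # this copy is the minor allele
        dist = new
    return dist


def poisson_pmf(k, lam):
    """Poisson probability mass at k for mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def total_variation_to_poisson(probs):
    """Total variation distance between the exact count and Poisson(sum p_i)."""
    exact = exact_count_distribution(probs)
    lam = sum(probs)
    n = len(probs)
    inside = sum(abs(exact[k] - poisson_pmf(k, lam)) for k in range(n + 1))
    # Poisson mass beyond n, where the exact distribution has no mass
    tail = max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1)))
    return 0.5 * (inside + tail)


# Hypothetical minor-allele frequencies for 30 rare allele copies
freqs = [0.01] * 20 + [0.005] * 10
tv = total_variation_to_poisson(freqs)
bound = sum(p * p for p in freqs)  # Le Cam upper bound on the distance
print(tv, bound)
```

The rarer the variants, the smaller the bound, which is consistent with the abstract's restriction of the approximation to low-frequency minor alleles.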
Does breastfeeding contribute to the racial gap in reading and math test scores?

Science.gov (United States)

Peters, Kristen E; Huang, Jin; Vaughn, Michael G; Witko, Christopher

2013-10-01

The aim of this study was to examine the impact of divergent breastfeeding practices between Caucasian and African American mothers on the lingering achievement test gap between Caucasian and African American children. The Child Development Supplement of the Panel Study of Income Dynamics, beginning in 1997, followed a cohort of 3563 children aged 0-12 years. Reading and math test scores from 2002 for 1928 children were linked with breastfeeding history. Regression analysis was used to examine associations between ever having been breastfed and duration of breastfeeding and test scores, controlling for characteristics of child, mother, and household. African American students scored significantly lower than Caucasian children by 10.6 and 10.9 points on reading and math tests, respectively. After accounting for the impact of having been breastfed during infancy, the racial test gap decreased by 17% for reading scores and 9% for math scores. Study findings indicate that breastfeeding explains 17% and 9% of the observed gaps in reading and math scores, respectively, between African Americans and Caucasians, an effect larger than most recent educational policy interventions. Renewed efforts around policies and clinical practices that promote and remove barriers for African American mothers to breastfeed should be implemented. Copyright © 2013 Elsevier Inc. All rights reserved.
Accountancy, teaching methods, sex, and American College Test scores.

Science.gov (United States)

Heritage, J; Harper, B S; Harper, J P

1990-10-01

This study examines the significance of sex, methodology, academic preparation, and age as related to development of judgmental and problem-solving skills. Sex, American College Test (ACT) Mathematics scores, Composite ACT scores, grades in course work, grade point average (GPA), and age were used in studying the effects of teaching method on 96 students' ability to analyze data in financial statements. Results reflect positively on accounting students compared to the general college population and the women students in particular.
Reduce, Reuse, Recycle: The Longitudinal Value of Local Cut Scores Using State Test Data

Science.gov (United States)

Nelson, Peter M.; Van Norman, Ethan R.; VanDerHeyden, Amanda

2017-01-01

We used existing reading (n = 1,498) and math (n = 2,260) data to evaluate state test scores for screening middle school students. In Phase 1, state test data were used to create a research-derived cut score that was optimal for predicting state test performance the following year. In Phase 2, those cut scores were applied with future cohorts.…
Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer

Science.gov (United States)

Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Deeks, Jon

2016-01-01

Introduction: Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). Methods and analysis: ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests are evaluated and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios. PMID:27507231
Validity of GRE General Test scores and TOEFL scores for graduate admission to a technical university in Western Europe

Science.gov (United States)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the Master's programme grade point average (GGPA) with and without the addition of the undergraduate GPA (UGPA) and the TOEFL score, and of GRE scores for study completion and Master's thesis performance. GRE scores explained 20% of the variation in the GGPA, while additional 7% were explained by the TOEFL score and 3% by the UGPA. Contrary to common belief, the GRE quantitative reasoning score showed only little explanatory power. GRE scores were also weakly related to study progress but not to thesis performance. Nevertheless, GRE and TOEFL scores were found to be sensible admissions instruments. Rigorous methodology was used to obtain highly reliable results.
The Performance of the Upper Limb scores correlate with pulmonary function test measures and Egen Klassifikation scores in Duchenne muscular dystrophy.

Science.gov (United States)

Lee, Ha Neul; Sawnani, Hemant; Horn, Paul S; Rybalsky, Irina; Relucio, Lani; Wong, Brenda L

2016-01-01

The Performance of the Upper Limb scale was developed as an outcome measure specifically for ambulant and non-ambulant patients with Duchenne muscular dystrophy and is implemented in clinical trials needing longitudinal data. The aim of this study is to determine whether this novel tool correlates with functional ability using pulmonary function test, cardiac function test and Egen Klassifikation scale scores as clinical measures. In this cross-sectional study, 43 non-ambulatory Duchenne males from ages 10 to 30 years and on long-term glucocorticoid treatment were enrolled. Cardiac and pulmonary function test results were analyzed to assess cardiopulmonary function, and Egen Klassifikation scores were analyzed to assess functional ability. The Performance of the Upper Limb scores correlated with pulmonary function measures and had inverse correlation with Egen Klassifikation scores. There was no correlation with left ventricular ejection fraction and left ventricular dysfunction. Body mass index and decreased joint range of motion affected total Performance of the Upper Limb scores and should be considered in clinical trial designs. Copyright © 2016 Elsevier B.V. All rights reserved.
Relative Merits of Four Methods for Scoring Cloze Tests.

Science.gov (United States)

Brown, James Dean

1980-01-01

Describes study comparing merits of exact answer, acceptable answer, clozentropy and multiple choice methods for scoring tests. Results show differences among reliability, mean item facility, discrimination and usability, but not validity. (BK)

A comparison of likelihood ratio tests and Rao's score test for three separable covariance matrix structures.

Science.gov (United States)

Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha

2017-01-01

The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach.

Science.gov (United States)

Xu, Jian

2017-01-01

The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers' listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms.
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach

Directory of Open Access Journals (Sweden)

Jian Xu

2017-12-01

The present study investigated test-taking motivation in L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating effect of listening metacognitive awareness on the relationship between expectancy, importance, interest, listening anxiety, and listening test score. According to the results, test takers’ listening scores can be predicted by expectancy, interest, and listening anxiety significantly. The relationship between expectancy, interest, listening anxiety, and listening test score was mediated by listening metacognitive awareness. The findings have implications for test takers to improve their test taking motivation and listening metacognitive awareness, as well as for L2 teachers to intervene in L2 listening classrooms.
The Dental Hygiene Aptitude Tests and the American College Testing Program Tests as Predictors of Scores on the National Board Dental Hygiene Examination.

Science.gov (United States)

Longenbecker, Sueann; Wood, Peter H.

1984-01-01

Scores from the National Board Dental Hygiene Examination (NBDHE) served as the criterion variable in a comparison of the predictive validity of the Dental Hygiene Aptitude Tests (DHAT) and the ACT Assessment tests. The DHAT-Science and Verbal tests combined to produce the highest multiple correlation with NBDHE scores. (Author/DWH)
Contributions of Hamstring Stiffness to Straight-Leg-Raise and Sit-and-Reach Test Scores.

Science.gov (United States)

Miyamoto, Naokazu; Hirata, Kosuke; Kimura, Noriko; Miyamoto-Mikami, Eri

2018-02-01

The passive straight-leg-raise (PSLR) and the sit-and-reach (SR) tests have been widely used to assess hamstring extensibility. However, it remains unclear to what extent hamstring stiffness (a measure of material properties) contributes to PSLR and SR test scores. Therefore, we aimed to clarify the relationship between hamstring stiffness and PSLR and SR scores using ultrasound shear wave elastography. Ninety-eight healthy subjects completed the study. Each subject completed PSLR testing, and classic and modified SR testing of the right leg. Muscle shear modulus of the biceps femoris, semitendinosus, and semimembranosus was quantified as an index of muscle stiffness. The relationships between shear modulus of each muscle and PSLR or SR scores were calculated using Pearson's product-moment correlation coefficients. Shear modulus of the semitendinosus and semimembranosus showed negative correlations with the two PSLR and two SR scores (absolute r value≤0.484). Shear modulus of the biceps femoris was significantly correlated with the PSLR score determined by the examiner and the modified SR score (absolute r value≤0.308). The present findings suggest that PSLR and SR test scores are strongly influenced by factors other than hamstring stiffness and therefore might not accurately evaluate hamstring stiffness. © Georg Thieme Verlag KG Stuttgart · New York.
Manual for Scoring the Test of Directed Imagination.

Science.gov (United States)

Veldman, Donald J.; And Others

A scoring manual for the Directed Imagination Test, a projective technique wherein the subject is instructed to write four fictional stories (four minutes are allowed for each) about teachers and their experiences, is presented. The manual provides detailed instructions for rating each story by fifteen dimensions relevant to teacher education…
AP Trends: Tests Soar, Scores Slip--Gaps between Groups Spur Equity Concerns

Science.gov (United States)

Cech, Scott J.

2008-01-01

More students are taking Advanced Placement tests, but the proportion of tests receiving what is deemed a passing score has dipped, and the mean score is down for the fourth year in a row. Data released here this week by the New York City-based nonprofit organization that owns the AP brand shows that a greater-than-ever proportion of students…
Validity of GRE General Test Scores and TOEFL Scores for Graduate Admission to a Technical University in Western Europe

Science.gov (United States)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the…
The Formalization of Fairness: Issues in Testing for Measurement Invariance Using Subtest Scores

Science.gov (United States)

Molenaar, Dylan; Borsboom, Denny

2013-01-01

Measurement invariance is an important prerequisite for the adequate comparison of group differences in test scores. In psychology, measurement invariance is typically investigated by means of linear factor analyses of subtest scores. These subtest scores typically result from summing the item scores. In this paper, we discuss 4 possible problems…
Explaining the black-white gap in cognitive test scores: Toward a theory of adverse impact.

Science.gov (United States)

Cottrell, Jonathan M; Newman, Daniel A; Roisman, Glenn I

2015-11-01

In understanding the causes of adverse impact, a key parameter is the Black-White difference in cognitive test scores. To advance theory on why Black-White cognitive ability/knowledge test score gaps exist, and on how these gaps develop over time, the current article proposes an inductive explanatory model derived from past empirical findings. According to this theoretical model, Black-White group mean differences in cognitive test scores arise from the following racially disparate conditions: family income, maternal education, maternal verbal ability/knowledge, learning materials in the home, parenting factors (maternal sensitivity, maternal warmth and acceptance, and safe physical environment), child birth order, and child birth weight. Results from a 5-wave longitudinal growth model estimated on children in the NICHD Study of Early Child Care and Youth Development from ages 4 through 15 years show significant Black-White cognitive test score gaps throughout early development that did not grow significantly over time (i.e., significant intercept differences, but not slope differences). Importantly, the racially disparate conditions listed above can account for the relation between race and cognitive test scores. We propose a parsimonious 3-Step Model that explains how cognitive test score gaps arise, in which race relates to maternal disadvantage, which in turn relates to parenting factors, which in turn relate to cognitive test scores. This model and results offer to fill a need for theory on the etiology of the Black-White ethnic group gap in cognitive test scores, and attempt to address a missing link in the theory of adverse impact. (c) 2015 APA, all rights reserved.
Performances on Rey Auditory Verbal Learning Test and Rey Complex Figure Test in a healthy, elderly Danish sample--reference data and validity issues

DEFF Research Database (Denmark)

Vogel, Asmus; Stokholm, Jette; Jørgensen, Kasper

2012-01-01

. The RCFT copy score was significantly related to age and the DART score. On RCFT recall a highly significant difference was found between persons who could make a faultless copy and persons with incomplete copy performance. Thus, this study presents separate data for RCFT recall scores according...... to the subjects' copying performance (in separate tables for age and education groups). For all measures on both RAVLT and RCFT wide distributions of scores were found and the impact of this broad score range on the tests' discriminative validity is discussed. RAVLT performances for elderly were similar...... to previous published meta-norms, but the included sample of elderly Danes performed better on RCFT (copy and recall) than elderly from the United States....
A Maturing Global Testing Regime Meets the World Economy: Test Scores and Economic Growth, 1960-2012

Science.gov (United States)

Kamens, David H.

2015-01-01

This article considers the growth of the international testing regime. It discusses sources of growth and empirically examines two related sets of issues: (1) the stability of countries' achievement scores, and (2) the influence of those national scores on subsequent economic development over different time lags. The article suggests that…
Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

Science.gov (United States)

Kolen, Michael J.; Lee, Won-Chan

2011-01-01

This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

Science.gov (United States)

Haverinen-Shaughnessy, Ulla; Shaughnessy, Richard J

2015-01-01

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.
Optimal Scoring Methods of Hand-Strength Tests in Patients with Stroke

Science.gov (United States)

Huang, Sheau-Ling; Hsieh, Ching-Lin; Lin, Jau-Hong; Chen, Hui-Mei

2011-01-01

The purpose of this study was to determine the optimal scoring methods for measuring strength of the more-affected hand in patients with stroke by examining the effect of reducing measurement errors. Three hand-strength tests of grip, palmar pinch, and lateral pinch were administered at two sessions in 56 patients with stroke. Five scoring methods…
Refining Ovarian Cancer Test accuracy Scores (ROCkeTS): protocol for a prospective longitudinal test accuracy study to validate new risk scores in women with symptoms of suspected ovarian cancer.

Science.gov (United States)

Sundar, Sudha; Rick, Caroline; Dowling, Francis; Au, Pui; Snell, Kym; Rai, Nirmala; Champaneria, Rita; Stobart, Hilary; Neal, Richard; Davenport, Clare; Mallett, Susan; Sutton, Andrew; Kehoe, Sean; Timmerman, Dirk; Bourne, Tom; Van Calster, Ben; Gentry-Maharaj, Aleksandra; Menon, Usha; Deeks, Jon

2016-08-09

Ovarian cancer (OC) is associated with non-specific symptoms such as bloating, making accurate diagnosis challenging: only 1 in 3 women with OC presents through primary care referral. National Institute for Health and Care Excellence guidelines recommend sequential testing with CA125 and routine ultrasound in primary care. However, these diagnostic tests have limited sensitivity or specificity. Improving accurate triage in women with vague symptoms is likely to improve mortality by streamlining referral and care pathways. The Refining Ovarian Cancer Test Accuracy Scores (ROCkeTS; HTA 13/13/01) project will derive and validate new tests/risk prediction models that estimate the probability of having OC in women with symptoms. This protocol refers to the prospective study only (phase III). ROCkeTS comprises four parallel phases. The full ROCkeTS protocol can be found at http://www.birmingham.ac.uk/ROCKETS. Phase III is a prospective test accuracy study. The study will recruit 2450 patients from 15 UK sites. Recruited patients complete symptom and anxiety questionnaires, donate a serum sample and undergo ultrasound scored as per International Ovarian Tumour Analysis (IOTA) criteria. Recruitment is at rapid access clinics, emergency departments and elective clinics. Models to be evaluated include those based on ultrasound derived by the IOTA group and novel models derived from analysis of existing data sets. Estimates of sensitivity, specificity, c-statistic (area under receiver operating curve), positive predictive value and negative predictive value of diagnostic tests are evaluated and a calibration plot for models will be presented. ROCkeTS has received ethical approval from the NHS West Midlands REC (14/WM/1241) and is registered on the controlled trials website (ISRCTN17160843) and the National Institute of Health Research Cancer and Reproductive Health portfolios. Published by the BMJ Publishing Group Limited.
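The accuracy measures named in this protocol (sensitivity, specificity, positive and negative predictive value) all follow from a 2x2 table of test result against reference diagnosis. A minimal Python sketch of those standard definitions is given here for orientation; the class name, function name and counts are hypothetical illustrations and are not ROCkeTS data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a 2x2 table of index test vs. reference standard."""
    tp: int  # disease present, test positive
    fp: int  # disease absent, test positive
    fn: int  # disease present, test negative
    tn: int  # disease absent, test negative


def accuracy_measures(c: ConfusionCounts) -> dict[str, float]:
    """Return the four standard test accuracy measures as proportions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # detected among diseased
        "specificity": c.tn / (c.tn + c.fp),  # cleared among non-diseased
        "ppv": c.tp / (c.tp + c.fp),          # diseased among test positives
        "npv": c.tn / (c.tn + c.fn),          # non-diseased among test negatives
    }


# Hypothetical counts chosen only to exercise the formulas.
example = ConfusionCounts(tp=80, fp=30, fn=20, tn=870)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Predictive values depend on disease prevalence in the recruited population, so figures from one setting may not transfer to another.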
Testing statistical significance scores of sequence comparison methods with structure similarity

Directory of Open Access Journals (Sweden)

Leunissen Jack AM

2006-10-01

Background: In the past years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search is not only determined by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested if this could be validated when applied to existing, evolutionary related protein sequences. Results: All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion: The compute intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
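The idea of a Monte Carlo Z-score for an alignment can be sketched as follows: score the real pair, score the query against many shuffled copies of the subject, and express the real score in standard deviations above the shuffled mean. The Python sketch below is an illustration under simplifying assumptions, not the implementation evaluated in the paper: it uses a flat match/mismatch score and a linear gap penalty instead of a substitution matrix with affine gaps, and all function names, parameter values and the example sequence are hypothetical.

```python
import random
import statistics


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score (linear gap penalty, score only)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def monte_carlo_z_score(query: str, subject: str,
                        n_shuffles: int = 200, seed: int = 0) -> float:
    """Z-score of the observed score against shuffled-subject scores."""
    rng = random.Random(seed)
    observed = smith_waterman_score(query, subject)
    letters = list(subject)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps composition, destroys order
        shuffled_scores.append(smith_waterman_score(query, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mean) / sd


# Hypothetical sequence, compared with itself as an obvious positive case.
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
z = monte_carlo_z_score(seq, seq)
print(f"Z-score: {z:.1f}")
```

Each Z-score requires `n_shuffles` extra alignments, which is the computational cost the abstract refers to when it calls the Z-score compute intensive.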
Sampling analytical tests and destructive tests for quality assurance

International Nuclear Information System (INIS)

Saas, A.; Pasquini, S.; Jouan, A.; Angelis, de; Hreen Taywood, H.; Odoj, R.

1990-01-01

In the context of the third programme of the European Communities on the monitoring of radioactive waste, various methods have been developed for the performance of sampling and measuring tests on encapsulated waste of low and medium level activity, on the one hand, and of high level activity, on the other hand. The purpose was to provide better quality assurance for products to be stored on an interim or long-term basis. Various testing sampling means are proposed such as: - sampling of raw waste before conditioning and determination of the representative aliquot, - sampling of encapsulated waste on process output, - sampling of core specimens subjected to measurement before and after cutting. Equipment suitable for these sampling procedures has been developed and, in the case of core samples, a comparison of techniques has been made. The results are described for the various analytical tests carried out on the samples such as: - mechanical tests, - radiation resistance, - fire resistance, - lixiviation, - determination of free water, - biodegradation, - water resistance, - chemical and radiochemical analysis. Every time it was possible, these tests were compared with non-destructive tests on full-scale packages and some correlations are given. This work has made it possible to improve and clarify sample optimization, with fine sampling techniques and methodologies and draw up characterization procedures. It also provided an occasion for a first collaboration between the laboratories responsible for these studies and which will be furthered in the scope of the 1990-1994 programme
Test-retest reliability and minimal detectable change scores for sit-to-stand-to-sit tests, the six-minute walk test, the one-leg heel-rise test, and handgrip strength in people undergoing hemodialysis.

Science.gov (United States)

Segura-Ortí, Eva; Martínez-Olmos, Francisco José

2011-08-01

Determining the relative and absolute reliability of outcomes of physical performance tests for people undergoing hemodialysis is necessary to discriminate between the true effects of exercise interventions and the inherent variability of this cohort. The aims of this study were to assess the relative reliability of sit-to-stand-to-sit tests (the STS-10, which measures the time [in seconds] required to complete 10 full stands from a sitting position, and the STS-60, which measures the number of repetitions achieved in 60 seconds), the Six-Minute Walk Test (6MWT), the one-leg heel-rise test, and the handgrip strength test and to calculate minimal detectable change (MDC) scores in people undergoing hemodialysis. This study was a prospective, nonexperimental investigation. Thirty-nine people undergoing hemodialysis at 2 clinics in Spain were contacted. Study participants performed the STS-10 (n=37), the STS-60 (n=37), and the 6MWT (n=36). At one of the settings, the participants also performed the one-leg heel-rise test (n=21) and the handgrip strength test (n=12) on both the right and the left sides. Participants attended 2 testing sessions 1 to 2 weeks apart. High intraclass correlation coefficients (≥.88) were found for all tests, suggesting good relative reliability. The MDC scores at 90% confidence intervals were as follows: 8.4 seconds for the STS-10, 4 repetitions for the STS-60, 66.3 m for the 6MWT, 3.4 kg for handgrip strength (force-generating capacity), 3.7 repetitions for the one-leg heel-rise test with the right leg, and 5.2 repetitions for the one-leg heel-rise test with the left leg. Limitations: A limited sample of patients was used in this study. The STS-10, STS-60, 6MWT, one-leg heel-rise test, and handgrip strength test are reliable outcome measures. The MDC scores at 90% confidence intervals for these tests will help to determine whether a change is due to error or to an intervention.
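The abstract does not state how the MDC scores were computed. A commonly used formula in the reliability literature, assumed here for illustration only, is MDC90 = 1.645 × SEM × √2 with SEM = SD × √(1 − ICC). The Python sketch below applies that assumed formula to hypothetical inputs; the function names and numbers are not taken from the study.

```python
import math

Z_90 = 1.645  # z value for a two-sided 90% confidence level


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC), with ICC as the reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd: float, icc: float, z: float = Z_90) -> float:
    """MDC = z * SEM * sqrt(2); sqrt(2) reflects error in both test sessions."""
    return z * standard_error_of_measurement(sd, icc) * math.sqrt(2.0)


# Hypothetical inputs: between-subject SD of 10 units and an ICC of 0.91.
mdc = minimal_detectable_change(sd=10.0, icc=0.91)
print(f"MDC90: {mdc:.2f} units")
```

Under this formula a change smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.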
High Test Scores: The Wrong Road to National Economic Success

Science.gov (United States)

Baker, Keith

2011-01-01

A widely held view is that good schools are essential to a nation's international economic success and that high test scores on international tests of academic skills and knowledge indicate how good a nation's schools are. The widespread belief that good schools are an important contributor to a nation's economic success in the world is supported…

Relationships between spatial activities and scores on the mental rotation test as a function of sex.

Science.gov (United States)

Ginn, Sheryl R; Pickens, Stefanie J

2005-06-01

Previous results suggested that female college students' scores on the Mental Rotations Test might be related to their prior experience with spatial tasks. For example, women who played video games scored better on the test than their non-game-playing peers, whereas playing video games was not related to men's scores. The present study examined whether participation in different types of spatial activities would be related to women's performance on the Mental Rotations Test. 31 men and 59 women enrolled at a small, private church-affiliated university and majoring in art or music as well as students who participated in intercollegiate athletics completed the Mental Rotations Test. Women's scores on the Mental Rotations Test benefitted from experience with spatial activities; the more types of experience the women had, the better their scores. Thus women who were athletes, musicians, or artists scored better than those women who had no experience with these activities. The opposite results were found for the men. Efforts are currently underway to assess how length of experience and which types of experience are related to scores.
Effects of Classroom Ventilation Rate and Temperature on Students' Test Scores.

Directory of Open Access Journals (Sweden)

Ulla Haverinen-Shaughnessy

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students' mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9-7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12-13 points per each 1°C decrease in temperature within the observed range of 20-25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students.
Effects of Classroom Ventilation Rate and Temperature on Students’ Test Scores

Science.gov (United States)

2015-01-01

Using a multilevel approach, we estimated the effects of classroom ventilation rate and temperature on academic achievement. The analysis is based on measurement data from a 70 elementary school district (140 fifth grade classrooms) from Southwestern United States, and student level data (N = 3109) on socioeconomic variables and standardized test scores. There was a statistically significant association between ventilation rates and mathematics scores, and it was stronger when the six classrooms with high ventilation rates that were indicated as outliers were filtered (> 7.1 l/s per person). The association remained significant when prior year test scores were included in the model, resulting in less unexplained variability. Students’ mean mathematics scores (average 2286 points) were increased by up to eleven points (0.5%) per each liter per second per person increase in ventilation rate within the range of 0.9–7.1 l/s per person (estimated effect size 74 points). There was an additional increase of 12–13 points per each 1°C decrease in temperature within the observed range of 20–25°C (estimated effect size 67 points). Effects of similar magnitude but higher variability were observed for reading and science scores. In conclusion, maintaining adequate ventilation and thermal comfort in classrooms could significantly improve academic achievement of students. PMID:26317643
A knowledge-based theory of rising scores on "culture-free" tests.

Science.gov (United States)

Fox, Mark C; Mitchum, Ainsley L

2013-08-01

Secular gains in intelligence test scores have perplexed researchers since they were documented by Flynn (1984, 1987). Gains are most pronounced on abstract, so-called culture-free tests, prompting Flynn (2007) to attribute them to problem-solving skills availed by scientifically advanced cultures. We propose that recent-born individuals have adopted an approach to analogy that enables them to infer higher level relations requiring roles that are not intrinsic to the objects that constitute initial representations of items. This proposal is translated into item-specific predictions about differences between cohorts in pass rates and item-response patterns on the Raven's Matrices (Flynn, 1987), a seemingly culture-free test that registers the largest Flynn effect. Consistent with predictions, archival data reveal that individuals born around 1940 are less able to map objects at higher levels of relational abstraction than individuals born around 1990. Polytomous Rasch models verify predicted violations of measurement invariance, as raw scores are found to underestimate the number of analogical rules inferred by members of the earlier cohort relative to members of the later cohort who achieve the same overall score. The work provides a plausible cognitive account of the Flynn effect, furthers understanding of the cognition of matrix reasoning, and underscores the need to consider how test-takers select item responses. PsycINFO Database Record (c) 2013 APA, all rights reserved.
A Latent Class Approach to Estimating Test-Score Reliability

Science.gov (United States)

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

2011-01-01

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
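Of the single-administration methods named above, Cronbach's alpha is the simplest to write down: alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The Python sketch below illustrates only that classical coefficient, not the latent class reliability approach the study derives; the function name and the score matrix are hypothetical.

```python
import statistics


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha for a matrix with rows = examinees, columns = items."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variances = [statistics.variance(col) for col in columns]
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores of five examinees on four items.
scores = [
    [3, 4, 3, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
]
print(f"alpha: {cronbach_alpha(scores):.3f}")
```

Sample variances (n − 1 denominator) are used for both the items and the total score, so the choice of denominator cancels in the ratio.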
Computerized scoring algorithms for the Autobiographical Memory Test.

Science.gov (United States)

Takano, Keisuke; Gutenbrunner, Charlotte; Martens, Kris; Salmon, Karen; Raes, Filip

2018-02-01

Reduced specificity of autobiographical memories is a hallmark of depressive cognition. Autobiographical memory (AM) specificity is typically measured by the Autobiographical Memory Test (AMT), in which respondents are asked to describe personal memories in response to emotional cue words. Due to this free descriptive responding format, the AMT relies on experts' hand scoring for subsequent statistical analyses. This manual coding potentially impedes research activities in big data analytics such as large epidemiological studies. Here, we propose computerized algorithms to automatically score AM specificity for the Dutch (adult participants) and English (youth participants) versions of the AMT by using natural language processing and machine learning techniques. The algorithms showed reliable performances in discriminating specific and nonspecific (e.g., overgeneralized) autobiographical memories in independent testing data sets (area under the receiver operating characteristic curve > .90). Furthermore, outcome values of the algorithms (i.e., decision values of support vector machines) showed a gradient across similar (e.g., specific and extended memories) and different (e.g., specific memory and semantic associates) categories of AMT responses, suggesting that, for both adults and youth, the algorithms well capture the extent to which a memory has features of specific memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Test sample handling apparatus

International Nuclear Information System (INIS)

1981-01-01

A test sample handling apparatus using automatic scintillation counting for gamma detection, for use in such fields as radioimmunoassay, is described. The apparatus automatically and continuously counts large numbers of samples rapidly and efficiently by the simultaneous counting of two samples. By means of sequential ordering of non-sequential counting data, it is possible to obtain precisely ordered data while utilizing sample carrier holders having a minimum length. (U.K.)
Robust joint score tests in the application of DNA methylation data analysis.

Science.gov (United States)

Li, Xuan; Fu, Yuejiao; Wang, Xiaogang; Qiu, Weiliang

2018-05-18

Recently differential variability has been shown to be valuable in evaluating the association of DNA methylation to the risks of complex human diseases. The statistical tests based on both differential methylation level and differential variability can be more powerful than those based only on differential methylation level. Anh and Wang (2013) proposed a joint score test (AW) to simultaneously detect differential methylation and differential variability. However, AW's method seems to be quite conservative and has not been fully compared with existing joint tests. We proposed three improved joint score tests, namely iAW.Lev, iAW.BF, and iAW.TM, and have made extensive comparisons with the joint likelihood ratio test (jointLRT), the Kolmogorov-Smirnov (KS) test, and the AW test. Systematic simulation studies showed that: 1) the three improved tests performed better (i.e., having larger power, while keeping nominal Type I error rates) than the other three tests for data with outliers and having different variances between cases and controls; 2) for data from normal distributions, the three improved tests had slightly lower power than jointLRT and AW. The analyses of two Illumina HumanMethylation27 data sets GSE37020 and GSE20080 and one Illumina Infinium MethylationEPIC data set GSE107080 demonstrated that the three improved tests had higher true validation rates than those from jointLRT, KS, and AW. The three proposed joint score tests are robust against the violation of the normality assumption and the presence of outlying observations in comparison with the other three existing tests. Among the three proposed tests, iAW.BF seems to be the most robust and effective one for all simulated scenarios and also in real data analyses.
America's Mediocre Test Scores: Education Crisis or Poverty Crisis?

Science.gov (United States)

Petrilli, Michael J.; Wright, Brandon L.

2016-01-01

At a time when the national conversation is focused on lagging upward mobility, it is no surprise that many educators point to poverty as the explanation for mediocre test scores among U.S. students compared to those of students in other countries. If American teachers in struggling U.S. schools taught in Finland, says Finnish educator Pasi…
Comparison of physical therapy anatomy performance and anxiety scores in timed and untimed practical tests.

Science.gov (United States)

Schwartz, Sarah M; Evans, Cathy; Agur, Anne M R

2015-01-01

Students in health care professional programs face many stressful tests that determine successful completion of their program. Test anxiety during these high stakes examinations can affect working memory and lead to poor outcomes. Methods of decreasing test anxiety include lengthening the time available to complete examinations or evaluating students using untimed examinations. There is currently no consensus in the literature regarding whether untimed examinations provide a benefit to test performance in clinical anatomy. This study aimed to determine the impact of timed versus untimed practical tests on Master of Physical Therapy student anatomy performance and test anxiety. Test anxiety was measured using the State-Trait Anxiety Inventory (STAI). Differences in performance, anxiety scores, and time taken were compared using paired sample Student's t-tests. Eighty-one of the 84 students completed the study and provided feedback. Students performed significantly higher on the untimed test (P = 0.005), with a significant reduction in test anxiety (P anxiety. If the intended goal of evaluating health care professional students is to determine fundamental competencies, these factors should be considered when designing future curricula. © 2014 American Association of Anatomists.
The Implications of Family Size and Birth Order for Test Scores and Behavioral Development

Science.gov (United States)

Silles, Mary A.

2010-01-01

This article, using longitudinal data from the National Child Development Study, presents new evidence on the effects of family size and birth order on test scores and behavioral development at age 7, 11 and 16. Sibling size is shown to have an adverse causal effect on test scores and behavioral development. For any given family size, first-borns…
A score based on screening tests to differentiate mild cognitive impairment from subjective memory complaints

Directory of Open Access Journals (Sweden)

Fábio Henrique de Gobbi Porto

2013-09-01

It is not easy to differentiate patients with mild cognitive impairment (MCI) from subjective memory complainers (SMC). Assessments with screening cognitive tools are essential, particularly in primary care where most patients are seen. The objective of this study was to evaluate the diagnostic accuracy of screening cognitive tests and to propose a score derived from screening tests. Elderly subjects with memory complaints were evaluated using the Mini Mental State Examination (MMSE) and the Brief Cognitive Battery (BCB). We added two delayed recalls in the MMSE (a delayed recall and a late-delayed recall, LDR), and also a phonemic fluency test of letter P fluency (LPF). A score was created based on these tests. The diagnoses were made on the basis of clinical consensus and neuropsychological testing. Receiver operating characteristic curve analyses were used to determine area under the curve (AUC), the sensitivity and specificity for each test separately and for the final proposed score. MMSE, LDR, LPF and delayed recall of BCB scores reached statistically significant differences between groups (P=0.000, 0.03, 0.001 and 0.01, respectively). Sensitivity, specificity and AUC were MMSE: 64%, 79% and 0.75 (cut off <29); LDR: 56%, 62% and 0.62 (cut off <3); LPF: 71%, 71% and 0.71 (cut off <14); delayed recall of BCB: 56%, 82% and 0.68 (cut off <9). The proposed score reached a sensitivity of 88% and 76% and specificity of 62% and 75% for cut off over 1 and over 2, respectively. AUC was 0.81. In conclusion, a score created from screening tests is capable of discriminating MCI from SMC with moderate to good accuracy.
A physical function test for use in the intensive care unit: validity, responsiveness, and predictive utility of the physical function ICU test (scored).

Science.gov (United States)

Denehy, Linda; de Morton, Natalie A; Skinner, Elizabeth H; Edbrooke, Lara; Haines, Kimberley; Warrillow, Stephen; Berney, Sue

2013-12-01

Several tests have recently been developed to measure changes in patient strength and functional outcomes in the intensive care unit (ICU). The original Physical Function ICU Test (PFIT) demonstrates reliability and sensitivity. The aims of this study were to further develop the original PFIT, to derive an interval score (the PFIT-s), and to test the clinimetric properties of the PFIT-s. A nested cohort study was conducted. One hundred forty-four and 116 participants performed the PFIT at ICU admission and discharge, respectively. Original test components were modified using principal component analysis. Rasch analysis examined the unidimensionality of the PFIT, and an interval score was derived. Correlations tested validity, and multiple regression analyses investigated predictive ability. Responsiveness was assessed using the effect size index (ESI), and the minimal clinically important difference (MCID) was calculated. The shoulder lift component was removed. Unidimensionality of combined admission and discharge PFIT-s scores was confirmed. The PFIT-s displayed moderate convergent validity with the Timed "Up & Go" Test (r=-.60), the Six-Minute Walk Test (r=.41), and the Medical Research Council (MRC) sum score (rho=.49). The ESI of the PFIT-s was 0.82, and the MCID was 1.5 points (interval scale range=0-10). A higher admission PFIT-s score was predictive of: an MRC score of ≥48, increased likelihood of discharge home, reduced likelihood of discharge to inpatient rehabilitation, and reduced acute care hospital length of stay. Scoring of sit-to-stand assistance required is subjective, and cadence cutpoints used may not be generalizable. The PFIT-s is a safe and inexpensive test of physical function with high clinical utility. It is valid, responsive to change, and predictive of key outcomes. It is recommended that the PFIT-s be adopted to test physical function in the ICU.
Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance.

Science.gov (United States)

McCarthy, Julie M; Van Iddekinge, Chad H; Lievens, Filip; Kung, Mei-Chuan; Sinar, Evan F; Campion, Michael A

2013-09-01

Considerable evidence suggests that how candidates react to selection procedures can affect their test performance and their attitudes toward the hiring organization (e.g., recommending the firm to others). However, very few studies of candidate reactions have examined one of the outcomes organizations care most about: job performance. We attempt to address this gap by developing and testing a conceptual framework that delineates whether and how candidate reactions might influence job performance. We accomplish this objective using data from 4 studies (total N = 6,480), 6 selection procedures (personality tests, job knowledge tests, cognitive ability tests, work samples, situational judgment tests, and a selection inventory), 5 key candidate reactions (anxiety, motivation, belief in tests, self-efficacy, and procedural justice), 2 contexts (industry and education), 3 continents (North America, South America, and Europe), 2 study designs (predictive and concurrent), and 4 occupational areas (medical, sales, customer service, and technological). Consistent with previous research, candidate reactions were related to test scores, and test scores were related to job performance. Further, there was some evidence that reactions affected performance indirectly through their influence on test scores. Finally, in no cases did candidate reactions affect the prediction of job performance by increasing or decreasing the criterion-related validity of test scores. Implications of these findings and avenues for future research are discussed. PsycINFO Database Record (c) 2013 APA, all rights reserved
Gleeble Testing of Tungsten Samples

Science.gov (United States)

2013-02-01

temperature on an Instron load frame with a 222.41 kN (50 kip) load cell . The samples were compressed at the same strain rate as on the Gleeble...ID % RE Initial Density (cm 3 ) Density after Compression (cm 3 ) % Change in Density Test Temperature NT1 0 18.08 18.27 1.06 1000 NT3 0...4.1 Nano-Tungsten The results for the compression of the nano-tungsten samples are shown in tables 2 and 3 and figure 5. During testing, sample NT1
Norm Block Sample Sizes: A Review of 17 Individually Administered Intelligence Tests

Science.gov (United States)

Norfolk, Philip A.; Farmer, Ryan L.; Floyd, Randy G.; Woods, Isaac L.; Hawkins, Haley K.; Irby, Sarah M.

2015-01-01

The representativeness, recency, and size of norm samples strongly influence the accuracy of inferences drawn from their scores. Inadequate norm samples may lead to inflated or deflated scores for individuals and poorer prediction of developmental and academic outcomes. The purpose of this study was to apply Kranzler and Floyd's method for…
Validation of the Cognition Test Battery for Spaceflight in a Sample of Highly Educated Adults.

Science.gov (United States)

Moore, Tyler M; Basner, Mathias; Nasrini, Jad; Hermosillo, Emanuel; Kabadi, Sushila; Roalf, David R; McGuire, Sarah; Ecker, Adrian J; Ruparel, Kosha; Port, Allison M; Jackson, Chad T; Dinges, David F; Gur, Ruben C

2017-10-01

Neuropsychological changes that may occur due to the environmental and psychological stressors of prolonged spaceflight motivated the development of the Cognition Test Battery. The battery was designed to assess multiple domains of neurocognitive functions linked to specific brain systems. Tests included in Cognition have been validated, but not in high-performing samples comparable to astronauts, which is an essential step toward ensuring their usefulness in long-duration space missions. We administered Cognition (on laptop and iPad) and the WinSCAT, counterbalanced for order and version, in a sample of 96 subjects (50% women; ages 25-56 yr) with at least a Master's degree in science, technology, engineering, or mathematics (STEM). We assessed the associations of age, sex, and administration device with neurocognitive performance, and compared the scores on the Cognition battery with those of WinSCAT. Confirmatory factor analysis compared the structure of the iPad and laptop administration methods using Wald tests. Age was associated with longer response times (mean β = 0.12) and less accurate (mean β = -0.12) performance, women had longer response times on psychomotor (β = 0.62), emotion recognition (β = 0.30), and visuo-spatial (β = 0.48) tasks, men outperformed women on matrix reasoning (β = -0.34), and performance on an iPad was generally faster (mean β = -0.55). The WinSCAT appeared heavily loaded with tasks requiring executive control, whereas Cognition assessed a larger variety of neurocognitive domains. Overall results supported the interpretation of Cognition scores as measuring their intended constructs in high performing astronaut analog samples. Moore TM, Basner M, Nasrini J, Hermosillo E, Kabadi S, Roalf DR, McGuire S, Ecker AJ, Ruparel K, Port AM, Jackson CT, Dinges DF, Gur RC. Validation of the Cognition Test Battery for spaceflight in a sample of highly educated adults. Aerosp Med Hum Perform. 2017; 88(10):937-946.
Allele-sharing models: LOD scores and accurate linkage tests.

Science.gov (United States)

Kong, A; Cox, N J

1997-11-01

Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.
Comparing the Effects of Elementary Music and Visual Arts Lessons on Standardized Mathematics Test Scores

Science.gov (United States)

King, Molly Elizabeth

2016-01-01

The purpose of this quantitative, causal-comparative study was to compare the effect elementary music and visual arts lessons had on third through sixth grade standardized mathematics test scores. Inferential statistics were used to compare the differences between test scores of students who took in-school, elementary, music instruction during the…
Factor structure and invariance test of the alcohol use disorder identification test (AUDIT): Comparison and further validation in a U.S. and Philippines college student sample.

Science.gov (United States)

Tuliao, Antover P; Landoy, Bernice Vania N; McChargue, Dennis E

2016-01-01

The Alcohol Use Disorder Identification Test's factor structure varies depending on population and culture. Because of this inconsistency, this article examined the factor structure of the test and conducted a factorial invariance test between a U.S. and a Philippines college sample. Confirmatory factor analyses indicated that a three-factor solution outperforms the one- and two-factor solutions in both samples. Factorial invariance analyses further support the confirmatory findings by showing that factor loadings were generally invariant across groups; however, item intercepts show non-invariance. Country differences between factors show that Filipino consumption factor mean scores were significantly lower than their U.S. counterparts.

REPRODUCIBILITY OF THE MODIFIED STAR EXCURSION BALANCE TEST COMPOSITE AND SPECIFIC REACH DIRECTION SCORES.

Science.gov (United States)

van Lieshout, Remko; Reijneveld, Elja A E; van den Berg, Sandra M; Haerkens, Gijs M; Koenders, Niek H; de Leeuw, Arina J; van Oorsouw, Roel G; Paap, Davy; Scheffer, Else; Weterings, Stijn; Stukstette, Mirelle J

2016-06-01

The mSEBT is a screening tool used to evaluate dynamic balance. Most research investigating measurement properties focused on intrarater reliability and was done in small samples. To know whether the mSEBT is useful to discriminate dynamic balance between persons and to evaluate changes in dynamic balance, more research into intra- and interrater reliability and smallest detectable change (synonymous with minimal detectable change) is needed. To estimate intra- and interrater reliability and smallest detectable change of the mSEBT in adults at risk for ankle sprain. Cross-sectional, test-retest design. Fifty-five healthy young adults participating in sports at risk for ankle sprain participated (mean ± SD age, 24.0 ± 2.9 years). Each participant performed three test sessions within one hour and was rated by two physical therapists (session 1, rater 1; session 2, rater 2; session 3, rater 1). Participants and raters were blinded for previous measurements. Normalized composite and reach direction scores for the right and left leg were collected. Analysis of variance was used to calculate intraclass correlation coefficient values for intra- and interrater reliability. Smallest detectable change values were calculated based on the standard error of measurement. Intra- and interrater reliability for both legs was good to excellent (intraclass correlation coefficient ranging from 0.87 to 0.94). The intrarater smallest detectable change for the composite score of the right leg was 7.2% and for the left 6.2%. The interrater smallest detectable change for the composite score of the right leg was 6.9% and for the left 5.0%. The mSEBT is a reliable measurement instrument to discriminate dynamic balance between persons. Most smallest detectable change values of the mSEBT appear to be large. More research is needed to investigate if the mSEBT is usable for evaluative purposes. Level 2.
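The abstract says the smallest detectable change was derived from the standard error of measurement but does not give the formula. A common convention, assumed here rather than taken from the study, is SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM. The Python sketch below illustrates that convention; the function names and the example inputs are hypothetical.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a between-subject SD and an ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """SDC at the 95% level: z * sqrt(2) * SEM.

    The sqrt(2) factor reflects that a change score involves two measurements.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs (not from the study): SD of 8 percentage points, ICC of 0.91.
sem = sem_from_icc(sd=8.0, icc=0.91)
sdc = smallest_detectable_change(sem)
print(f"SEM = {sem:.2f}, SDC = {sdc:.2f}")
```

Under this convention, a measured change smaller than the SDC cannot be distinguished from measurement error, which is why the abstract treats large SDC values as a limitation for evaluative use.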
Test and Score Data Summary for TOEFL[R] Internet-Based and Paper-Based Tests. January 2008-December 2008 Test Data

Science.gov (United States)

Educational Testing Service, 2008

2008-01-01

The Test of English as a Foreign Language[TM], better known as TOEFL[R], is designed to measure the English-language proficiency of people whose native language is not English. TOEFL scores are accepted by more than 6,000 colleges, universities, and licensing agencies in 130 countries. The test is also used by governments, and scholarship and…
Use of Standardized Test Scores to Predict Success in a Computer Applications Course

Science.gov (United States)

Harris, Robert V.; King, Stephanie B.

2016-01-01

The purpose of this study was to see if a relationship existed between American College Testing (ACT) scores (i.e., English, reading, mathematics, science reasoning, and composite) and student success in a computer applications course at a Mississippi community college. The study showed that while the ACT scores were excellent predictors of…
Similar predictions of etravirine sensitivity regardless of genotypic testing method used: comparison of available scoring systems.

Science.gov (United States)

Vingerhoets, Johan; Nijs, Steven; Tambuyzer, Lotke; Hoogstoel, Annemie; Anderson, David; Picchio, Gaston

2012-01-01

The aims of this study were to compare various genotypic scoring systems commonly used to predict virological outcome to etravirine, and examine their concordance with etravirine phenotypic susceptibility. Six etravirine genotypic scoring systems were assessed: Tibotec 2010 (based on 20 mutations; TBT 20), Monogram, Stanford HIVdb, ANRS, Rega (based on 37, 30, 27 and 49 mutations, respectively) and virco(®)TYPE HIV-1 (predicted fold change based on genotype). Samples from treatment-experienced patients who participated in the DUET trials and with both genotypic and phenotypic data (n=403) were assessed using each scoring system. Results were retrospectively correlated with virological response in DUET. κ coefficients were calculated to estimate the degree of correlation between the different scoring systems. Correlation between the five scoring systems and the TBT 20 system was approximately 90%. Virological response by etravirine susceptibility was comparable regardless of which scoring system was utilized, with 70-74% of DUET patients determined as susceptible to etravirine by the different scoring systems achieving plasma viral load <50 HIV-1 RNA copies/ml. In samples classed as phenotypically susceptible to etravirine (fold change in 50% effective concentration ≤3), correlations with genotypic score were consistently high across scoring systems (≥70%). In general, the etravirine genotypic scoring systems produced similar results, and genotype-phenotype concordance was high. As such, phenotypic interpretations, and in their absence all genotypic scoring systems investigated, may be used to reliably predict the activity of etravirine.
A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

Science.gov (United States)

Lee, Guemin; Park, In-Yong

2012-01-01

Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Dichotomous scoring of Trails B in patients referred for a dementia evaluation.

Science.gov (United States)

Schmitt, Andrew L; Livingston, Ronald B; Smernoff, Eric N; Waits, Bethany L; Harris, James B; Davis, Kent M

2010-04-01

The Trail Making Test is a popular neuropsychological test and its interpretation has traditionally used time-based scores. This study examined an alternative approach to scoring that is simply based on the examinees' ability to complete the test. If an examinee is able to complete Trails B successfully, they are coded as "completers"; if not, they are coded as "noncompleters." To assess this approach to scoring Trails B, the performance of 97 diagnostically heterogeneous individuals referred for a dementia evaluation was examined. In this sample, 55 individuals successfully completed Trails B and 42 individuals were unable to complete it. Point-biserial correlations indicated a moderate-to-strong association (r(pb)=.73) between the Trails B completion variable and the Total Scale score of the Repeatable Battery for the Assessment of Neurological Status (RBANS), which was larger than the correlation between the Trails B time-based score and the RBANS Total Scale score (r(pb)=.60). As a screen for dementia status, Trails B completion showed a sensitivity of 69% and a specificity of 100% in this sample. These results suggest that dichotomous scoring of Trails B might provide a brief and clinically useful measure of dementia status.
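Sensitivity and specificity for a dichotomous screen such as completer versus noncompleter follow directly from a 2 × 2 table. The sketch below assumes that noncompletion counts as a positive screen. The abstract does not report the underlying cell counts, so the counts used here are hypothetical and serve only to show the calculation.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from the four cells of a 2x2 table.

    tp: condition present, screen positive    fn: condition present, screen negative
    tn: condition absent, screen negative     fp: condition absent, screen positive
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, NOT the study's data.
sens, spec = sensitivity_specificity(tp=9, fn=4, tn=12, fp=0)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

A specificity of 100% corresponds to a table with no false positives, meaning that every individual without the condition completed the test.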
Racial Differences in Mathematics Test Scores for Advanced Mathematics Students

Science.gov (United States)

Minor, Elizabeth Covay

2016-01-01

Research on achievement gaps has found that achievement gaps are larger for students who take advanced mathematics courses compared to students who do not. Focusing on the advanced mathematics student achievement gap, this study found that African American advanced mathematics students have significantly lower test scores and are less likely to be…
Association between the gait pattern characteristics of older people and their two-step test scores.

Science.gov (United States)

Kobayashi, Yoshiyuki; Ogata, Toru

2018-04-27

The Two-Step test is one of three official tests authorized by the Japanese Orthopedic Association to evaluate the risk of locomotive syndrome (a condition of reduced mobility caused by an impairment of the locomotive organs). It has been reported that the Two-Step test score has a good correlation with one's walking ability; however, its association with the gait pattern of older people during normal walking is still unknown. Therefore, this study aims to clarify the associations between the gait patterns of older people observed during normal walking and their Two-Step test scores. We analyzed the whole waveforms obtained from the lower-extremity joint angles and joint moments of 26 older people in various stages of locomotive syndrome using principal component analysis (PCA). The PCA was conducted using a 260 × 2424 input matrix constructed from the participants' time-normalized pelvic and right-lower-limb-joint angles along three axes (ten trials of 26 participants, 101 time points, 4 angles, 3 axes, and 2 variable types per trial). The Pearson product-moment correlation coefficient between the scores of the principal component vectors (PCVs) and the scores of the Two-Step test revealed that only one PCV (PCV 2) among the 61 obtained relevant PCVs is significantly related to the score of the Two-Step test. We therefore concluded that the joint angles and joint moments related to PCV 2-ankle plantar-flexion, ankle plantar-flexor moments during the late stance phase, ranges of motion and moments on the hip, knee, and ankle joints in the sagittal plane during the entire stance phase-are the motions associated with the Two-Step test.
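The stated dimensions of the PCA input matrix can be checked from the counts given in the abstract. The short Python check below uses only those counts; the variable names are illustrative.

```python
# Counts as stated in the abstract.
participants, trials_per_participant = 26, 10
time_points, angles, axes, variable_types = 101, 4, 3, 2

rows = participants * trials_per_participant            # one row per trial
cols = time_points * angles * axes * variable_types     # one column per waveform sample

print(rows, cols)  # prints: 260 2424
```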
Acceptance test procedure for core sample trucks

International Nuclear Information System (INIS)

Smalley, J.L.

1995-01-01

The purpose of this Acceptance Test Procedure is to provide instruction and documentation for acceptance testing of the rotary mode core sample trucks, HO-68K-4600 and HO-68K-4647. The rotary mode core sample trucks were based upon the design of the second core sample truck (HO-68K-4345) which was constructed to implement rotary mode sampling of the waste tanks at Hanford. Acceptance testing of the rotary mode core sample trucks will verify that the design requirements have been met. All testing will be non-radioactive and stand-in materials shall be used to simulate waste tank conditions. Compressed air will be substituted for nitrogen during the majority of testing, with nitrogen being used only for flow characterization
A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

Science.gov (United States)

Bersabé, Rosa; Rivas, Teresa

2010-05-01

The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
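The idea of a cut-off at the score where two adjacent categories are equally likely can be sketched as follows. Assume a baseline-category logit model, log(P_j / P_0) = a_j + b_j·x. Categories j and j+1 are then equally likely when a_j + b_j·x = a_{j+1} + b_{j+1}·x, which gives x = (a_j − a_{j+1}) / (b_{j+1} − b_j). This is a minimal illustration under that parameterisation, not the paper's own equation, and the coefficients below are hypothetical.

```python
def mlr_cutoff(a_j: float, b_j: float, a_next: float, b_next: float) -> float:
    """Score x at which categories j and j+1 have equal fitted probability.

    Assumes log(P_j / P_0) = a_j + b_j * x for each non-reference category,
    with the reference category represented by a = 0 and b = 0.
    """
    if b_next == b_j:
        raise ValueError("equal slopes: the two categories never cross")
    return (a_j - a_next) / (b_next - b_j)


# Hypothetical (intercept, slope) pairs for three ordered categories;
# the first pair is the reference category.
coefs = [(0.0, 0.0), (-4.0, 0.2), (-10.0, 0.4)]
cutoffs = [mlr_cutoff(*coefs[j], *coefs[j + 1]) for j in range(len(coefs) - 1)]
print([round(c, 2) for c in cutoffs])
```

With three ordinal categories this yields two cut-off scores, matching the structure of the EAT-26 example described in the abstract.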
School accountability and the black-white test score gap.

Science.gov (United States)

Gaddis, S Michael; Lauen, Douglas Lee

2014-03-01

Since at least the 1960s, researchers have closely examined the respective roles of families, neighborhoods, and schools in producing the black-white achievement gap. Although many researchers minimize the ability of schools to eliminate achievement gaps, the No Child Left Behind Act (NCLB) increased pressure on schools to do so by 2014. In this study, we examine the effects of NCLB's subgroup-specific accountability pressure on changes in black-white math and reading test score gaps using a school-level panel dataset on all North Carolina public elementary and middle schools between 2001 and 2009. Using difference-in-difference models with school fixed effects, we find that accountability pressure reduces black-white achievement gaps by raising mean black achievement without harming mean white achievement. We find no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black-white math achievement gap. Thus, our findings suggest that school-based interventions have the potential to close test score gaps, but differences in school composition and resources play a significant role in the ability of schools to reduce racial inequality. Copyright © 2013 Elsevier Inc. All rights reserved.
Source Country Differences in Test Score Gaps: Evidence from Denmark

Science.gov (United States)

Rangvid, Beatrice Schindler

2010-01-01

We combine data from three studies for Denmark in the PISA 2000 framework to investigate differences in the native-immigrant test score gap by country of origin. In addition to the controls available from PISA data sources, we use student-level data on home background and individual migration histories linked from administrative registers. We find…
Short communication prevalence of susceptibility to etravirine by genotype and phenotype in samples received for routine HIV type 1 resistance testing in the United States.

Science.gov (United States)

Picchio, Gaston; Vingerhoets, Johan; Tambuyzer, Lotke; Coakley, Eoin; Haddad, Mojgan; Witek, James

2011-12-01

The prevalence of susceptibility to etravirine was investigated among clinical samples submitted for routine clinical testing in the United States using two separate weighted genotypic scoring systems. The presence of etravirine mutations and susceptibility to etravirine by phenotype of clinical samples from HIV-1-infected patients, submitted to Monogram Biosciences for routine resistance testing between June 2008 and June 2009, were analyzed. Susceptibility by genotype was determined using the Monogram and Tibotec etravirine-weighted genotypic scoring systems, with scores of ≤3 and ≤2, respectively, indicating full susceptibility. Susceptibility by phenotype was determined using the PhenoSense HIV assay, with lower and higher clinical cut-offs of 2.9 and 10, respectively. The frequency of individual etravirine mutations and the impact of the K103N mutation on susceptibility to etravirine by genotype were also determined. Among the 5482 samples with ≥1 defined nonnucleoside reverse transcriptase inhibitor (NNRTI) mutations associated with resistance, 67% were classed as susceptible to etravirine by genotype by both scoring systems. Susceptibility to etravirine by phenotype was higher (76%). The proportion of first-generation NNRTI-resistant samples with (n=3598) and without (n=1884) K103N with susceptibility to etravirine by genotype was 77% and 49%, respectively. Among samples susceptible to first-generation NNRTIs (n=9458), >99% of samples were susceptible to etravirine by phenotype (FC <2.9); the remaining samples had FC ≥2.9-10. In summary, among samples submitted for routine clinical testing in the United States, a high proportion of samples with first-generation NNRTI resistance was susceptible to etravirine by genotype and phenotype. A higher proportion of NNRTI-resistant samples with K103N than without was susceptible to etravirine.
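The subgroup figures in this abstract can be cross-checked against the overall figure with a weighted average. The Python check below uses only numbers reported in the abstract. Because the reported percentages are rounded, the pooled value is approximate, and the variable names are illustrative.

```python
# Figures as reported in the abstract.
n_with_k103n, n_without_k103n = 3598, 1884
share_with, share_without = 0.77, 0.49   # susceptible to etravirine by genotype

total = n_with_k103n + n_without_k103n
pooled = (n_with_k103n * share_with + n_without_k103n * share_without) / total

print(total, round(100 * pooled))  # prints: 5482 67
```

The two subgroups add up to the 5482 NNRTI-resistant samples, and the pooled share agrees with the 67% overall genotypic susceptibility stated earlier in the abstract.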
Effects of correcting for prematurity on cognitive test scores in childhood.

Science.gov (United States)

Wilson-Ching, Michelle; Pascoe, Leona; Doyle, Lex W; Anderson, Peter J

2014-03-01

The American Academy of Pediatrics recommends that test scores should be corrected for prematurity up to 3 years of age, but this practice varies greatly in both clinical and research settings. The aim of this study was to contrast the effects of using chronological age and those of using corrected age on measures of cognitive outcome across childhood. A theoretical model was constructed using norms from the Bayley Scales of Infant and Toddler Development, Third Edition; the Wechsler Preschool and Primary Scale of Intelligence, Third Edition Australian; and the Wechsler Intelligence Scales for Children, Fourth Edition Australian. Baseline scores representing different levels of functioning (70, below average; 85, borderline; and 100, average) were recalculated using the normative data for ages 6 months to 16 years to account for 1, 2, 3 and 4 months of prematurity. The model created depicted the difference in standardised scores between chronological and corrected age. Compared with scores corrected for prematurity, the absolute reduction in scores using chronological age was greater for increasing degree of prematurity, younger ages at assessment and higher baseline scores and was substantial even beyond 3 years of age. However, the pattern was erratic, with considerable fluctuation evident across different ages and baseline scores. Chronological age results in a lowering of scores at all ages for preterm-born subjects that is greater in the first few years and in those born at earlier gestational ages. Whether or not to correct for prematurity depends upon the context of the assessment. © 2014 The Authors. Journal of Paediatrics and Child Health © 2014 Paediatrics and Child Health Division (Royal Australasian College of Physicians).
The Effects of Group Members' Personalities on a Test Taker's L2 Group Oral Discussion Test Scores

Science.gov (United States)

Ockey, Gary J.

2009-01-01

The second language group oral is a test of second language speaking proficiency, in which a group of three or more English language learners discuss an assigned topic without interaction with interlocutors. Concerns expressed about the extent to which test takers' personal characteristics affect the scores of others in the group have limited its…
Standardised test protocol (Constant Score) for evaluation of functionality in patients with shoulder disorders

DEFF Research Database (Denmark)

Ban, Ilija; Troelsen, Anders; Christiansen, David Høyrup

2013-01-01

INTRODUCTION: The Constant Score (CS), developed as a scoring system to evaluate overall functionality of patients with shoulder disorders, is widely used but has been criticised for relying on an imprecise terminology and for lack of a standardised methodology. A modified guideline was therefore...... differences. One of the authors of the modified CS approved both the English and the Danish test protocol. CONCLUSION: A simple test protocol of the modified CS was developed in both English and Danish. With precise terminology and definitions, the test protocol is the first of its kind. We suggest its use...
Psychometric Quality of the Dutch Version of the Children's Eating Attitude Test in a Community Sample and a Sample of Overweight Youngsters

Directory of Open Access Journals (Sweden)

Lotte Theuwis

2010-12-01

Introduction. Disturbed eating attitudes may be important precursors of pathological eating patterns and therefore need to be researched adequately. The Children's Eating Attitude Test (ChEAT) is indicated for detecting at-risk attitudes and concerns in youngsters. Method. The present study was designed to provide a preliminary psychometric evaluation of the Dutch version of the ChEAT by examining reliability and validity in a sample of 166 youngsters. Results. Generally, the ChEAT seems to be a reliable instrument. Concurrent validity was demonstrated by positive correlations with measures assessing pathological eating behaviour and with related psychological problems. The discriminant validity was good. Based on ChEAT scores we can distinguish overweight youngsters from the community sample and “dieters” from “non dieters”. Divergent validity and factor structure still show shortcomings. Discussion. The Dutch version of the ChEAT seems to be a promising screening and research instrument. Future prospective research could focus on a cut-off score for identifying at-risk youngsters.
Opportunity to learn: Investigating possible predictors for pre-course Test Of Astronomy STandards (TOAST) scores

Science.gov (United States)

Berryhill, Katie J.

As astronomy education researchers become more interested in experimentally testing innovative teaching strategies to enhance learning in introductory astronomy survey courses ("ASTRO 101"), scholars are placing increased attention toward better understanding factors impacting student gain scores on the widely used Test Of Astronomy STandards (TOAST). Usually used in a pre-test and post-test research design, one might naturally assume that the pre-course differences observed between high- and low-scoring college students might be due in large part to their pre-existing motivation, interest, experience in science, and attitudes about astronomy. To explore this notion, 11 non-science majoring undergraduates taking ASTRO 101 at west coast community colleges were interviewed in the first few weeks of the course to better understand students' pre-existing affect toward learning astronomy with an eye toward predicting student success. In answering this question, we hope to contribute to our understanding of the incoming knowledge of students taking undergraduate introductory astronomy classes, but also gain insight into how faculty can best meet those students' needs and assist them in achieving success. Perhaps surprisingly, there was only weak correlation between students' motivation toward learning astronomy and their pre-test scores. Instead, the most fruitful predictor of TOAST pre-test scores was the quantity of pre-existing, informal, self-directed astronomy learning experiences.
Individual Differences in Digit Span, Susceptibility to Proactive Interference, and Aptitude/Achievement Test Scores.

Science.gov (United States)

Dempster, Frank N.; Cooney, John B.

1982-01-01

Individual differences in digit span, susceptibility to proactive interference, and various aptitude/achievement test scores were investigated in two experiments with college students. Results indicated that digit span was strongly correlated with aptitude/achievement scores, but did not indicate that susceptibility to proactive interference…
Tests on standard concrete samples

CERN Multimedia

CERN PhotoLab

1973-01-01

Compression and tensile tests on standard concrete samples. The use of centrifugal force in tensile testing has been developed by the SB Division and the instruments were built in the Central workshops.

The Alzheimer's prevention initiative composite cognitive test score: sample size estimates for the evaluation of preclinical Alzheimer's disease treatments in presenilin 1 E280A mutation carriers.

Science.gov (United States)

Ayutyanont, Napatkamon; Langbaum, Jessica B S; Hendrix, Suzanne B; Chen, Kewei; Fleisher, Adam S; Friesenhahn, Michel; Ward, Michael; Aguirre, Camilo; Acosta-Baena, Natalia; Madrigal, Lucìa; Muñoz, Claudia; Tirado, Victoria; Moreno, Sonia; Tariot, Pierre N; Lopera, Francisco; Reiman, Eric M

2014-06-01

To identify a cognitive composite that is sensitive to tracking preclinical Alzheimer's disease decline to be used as a primary end point in treatment trials. We capitalized on longitudinal data collected from 1995 to 2010 from cognitively unimpaired presenilin 1 (PSEN1) E280A mutation carriers from the world's largest known early-onset autosomal dominant Alzheimer's disease kindred to identify a composite cognitive test with the greatest statistical power to track preclinical Alzheimer's disease decline and estimate the number of carriers age 30 years and older needed to detect a treatment effect in the Alzheimer's Prevention Initiative's (API) preclinical Alzheimer's disease treatment trial. The mean-to-standard-deviation ratios (MSDRs) of change over time were calculated in a search for the optimal combination of 1 to 7 cognitive tests/subtests drawn from the neuropsychological test battery in cognitively unimpaired mutation carriers during a 2- and 5-year follow-up period (n = 78 and 57), using data from noncarriers (n = 31 and 56) during the same time period to correct for aging and practice effects. Combinations that performed well were then evaluated for robustness across follow-up years, occurrence of selected items within top-performing combinations, and representation of relevant cognitive domains. The optimal test combination included Consortium to Establish a Registry for Alzheimer's Disease (CERAD) Word List Recall, CERAD Boston Naming Test (high frequency items), Mini-Mental State Examination (MMSE) Orientation to Time, CERAD Constructional Praxis, and Raven's Progressive Matrices (Set A), with an MSDR of 1.62. This composite is more sensitive than using either the CERAD Word List Recall (MSDR = 0.38) or the entire CERAD-Col battery (MSDR = 0.76). A sample size of 75 cognitively normal PSEN1 E280A mutation carriers aged 30 years and older per treatment arm allows for a detectable treatment effect of 29% in a 60-month trial (80% power, P = .05). We
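The mean-to-standard-deviation ratio (MSDR) used in this abstract to rank candidate composites can be sketched as below. This is only an illustrative reading: the abstract does not say exactly how noncarrier data were used to correct for aging and practice effects, so subtracting the mean noncarrier change is an assumption, as is the convention of expressing decline as a positive number. The function names are hypothetical.

```python
from statistics import mean, stdev


def msdr(declines):
    """Mean-to-standard-deviation ratio of per-subject change (decline as positive)."""
    return mean(declines) / stdev(declines)


def adjusted_msdr(carrier_declines, noncarrier_declines):
    """MSDR after removing the average noncarrier change.

    Assumption: aging and practice effects are approximated by the mean
    change in noncarriers, which is subtracted from each carrier's change.
    """
    baseline = mean(noncarrier_declines)
    return msdr([d - baseline for d in carrier_declines])
```

A larger MSDR means the average decline is large relative to its between-subject variability, which is why a composite with a higher MSDR needs fewer participants to detect a given treatment effect.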
Construction of an Exome-Wide Risk Score for Schizophrenia Based on a Weighted Burden Test.

Science.gov (United States)

Curtis, David

2018-01-01

Polygenic risk scores obtained as a weighted sum of associated variants can be used to explore association in additional data sets and to assign risk scores to individuals. The methods used to derive polygenic risk scores from common SNPs are not suitable for variants detected in whole exome sequencing studies. Rare variants, which may have major effects, are seen too infrequently to judge whether they are associated and may not be shared between training and test subjects. A method is proposed whereby variants are weighted according to their frequency, their annotations and the genes they affect. A weighted sum across all variants provides an individual risk score. Scores constructed in this way are used in a weighted burden test and are shown to be significantly different between schizophrenia cases and controls using a five-way cross-validation procedure. This approach represents a first attempt to summarise exome sequence variation into a summary risk score, which could be combined with risk scores from common variants and from environmental factors. It is hoped that the method could be developed further. © 2017 John Wiley & Sons Ltd/University College London.
Pediatric residents' learning styles and temperaments and their relationships to standardized test scores.

Science.gov (United States)

Tuli, Sanjeev Y; Thompson, Lindsay A; Saliba, Heidi; Black, Erik W; Ryan, Kathleen A; Kelly, Maria N; Novak, Maureen; Mellott, Jane; Tuli, Sonal S

2011-12-01

Board certification is an important professional qualification and a prerequisite for credentialing, and the Accreditation Council for Graduate Medical Education (ACGME) assesses board certification rates as a component of residency program effectiveness. To date, research has shown that preresidency measures, including National Board of Medical Examiners scores, Alpha Omega Alpha Honor Medical Society membership, or medical school grades poorly predict postresidency board examination scores. However, learning styles and temperament have been identified as factors that affect test-taking performance. The purpose of this study is to characterize the learning styles and temperaments of pediatric residents and to evaluate their relationships to yearly in-service and postresidency board examination scores. This cross-sectional study analyzed the learning styles and temperaments of current and past pediatric residents by administration of 3 validated tools: the Kolb Learning Style Inventory, the Keirsey Temperament Sorter, and the Felder-Silverman Learning Style test. These results were compared with known, normative, general and medical population data and evaluated for correlation to in-service examination and postresidency board examination scores. The predominant learning style for pediatric residents was converging 44% (33 of 75 residents) and the predominant temperament was guardian 61% (34 of 56 residents). The learning style and temperament distribution of the residents was significantly different from published population data (P = .002 and .04, respectively). Learning styles, with one exception, were found to be unrelated to standardized test scores. The predominant learning style and temperament of pediatric residents is significantly different than that of the populations of general and medical trainees. However, learning styles and temperament do not predict outcomes on standardized in-service and board examinations in pediatric residents.
A Summary Score for the Framingham Heart Study Neuropsychological Battery.

Science.gov (United States)

Downer, Brian; Fardo, David W; Schmitt, Frederick A

2015-10-01

To calculate three summary scores of the Framingham Heart Study neuropsychological battery and determine which score best differentiates between subjects classified as having normal cognition, test-based impaired learning and memory, test-based multidomain impairment, and dementia. The final sample included 2,503 participants. Three summary scores were assessed: (a) composite score that provided equal weight to each subtest, (b) composite score that provided equal weight to each cognitive domain assessed by the neuropsychological battery, and (c) abbreviated score comprised of subtests for learning and memory. Receiver operating characteristic analysis was used to determine which summary score best differentiated between the four cognitive states. The summary score that provided equal weight to each subtest best differentiated between the four cognitive states. A summary score that provides equal weight to each subtest is an efficient way to utilize all of the cognitive data collected by a neuropsychological battery. © The Author(s) 2015.
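The difference between the first two summary scores in this abstract (equal weight per subtest versus equal weight per cognitive domain) can be illustrated with a short sketch. It assumes each subtest has already been converted to a z-score; the subtest and domain names are hypothetical and are not the actual Framingham battery.

```python
from statistics import mean


def subtest_weighted(z_scores):
    """Composite (a): every subtest contributes equally."""
    return mean(z_scores.values())


def domain_weighted(z_scores, domains):
    """Composite (b): average within each domain first, then across domains."""
    return mean(mean(z_scores[s] for s in subtests) for subtests in domains.values())


# Hypothetical example: two memory subtests and one executive subtest.
z = {"memory_a": -1.0, "memory_b": -2.0, "executive_a": 0.0}
domains = {"memory": ["memory_a", "memory_b"], "executive": ["executive_a"]}
```

With these values the subtest-weighted composite is -1.0 and the domain-weighted composite is -0.75: domains measured by more subtests count for more under scheme (a), whereas scheme (b) evens them out.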
Experimental and Sampling Design for the INL-2 Sample Collection Operational Test

Energy Technology Data Exchange (ETDEWEB)

Piepel, Gregory F.; Amidan, Brett G.; Matzke, Brett D.

2009-02-16

This report describes the experimental and sampling design developed to assess sampling approaches and methods for detecting contamination in a building and clearing the building for use after decontamination. An Idaho National Laboratory (INL) building will be contaminated with BG (Bacillus globigii, renamed Bacillus atrophaeus), a simulant for Bacillus anthracis (BA). The contamination, sampling, decontamination, and re-sampling will occur per the experimental and sampling design. This INL-2 Sample Collection Operational Test is being planned by the Validated Sampling Plan Working Group (VSPWG). The primary objectives are: 1) Evaluate judgmental and probabilistic sampling for characterization as well as probabilistic and combined (judgment and probabilistic) sampling approaches for clearance, 2) Conduct these evaluations for gradient contamination (from low or moderate down to absent or undetectable) for different initial concentrations of the contaminant, 3) Explore judgment composite sampling approaches to reduce sample numbers, 4) Collect baseline data to serve as an indication of the actual levels of contamination in the tests. A combined judgmental and random (CJR) approach uses Bayesian methodology to combine judgmental and probabilistic samples to make clearance statements of the form "X% confidence that at least Y% of an area does not contain detectable contamination” (X%/Y% clearance statements). The INL-2 experimental design has five test events, which 1) vary the floor of the INL building on which the contaminant will be released, 2) provide for varying the amount of contaminant released to obtain desired concentration gradients, and 3) investigate overt as well as covert release of contaminants. Desirable contaminant gradients would have moderate to low concentrations of contaminant in rooms near the release point, with concentrations down to zero in other rooms. Such gradients would provide a range of contamination levels to challenge the sampling
Use of Verbal Descriptors, Thermal Scores and Electrical Pulp Testing Scores as Predictors of Tooth Pain Before and After Application of Benzocaine Gels into Cavities of Teeth with Pulpitis

Science.gov (United States)

Gangarosa, Louis P.; Ciarlone, Alfred E.; Neaverth, Elmer J.; Johnston, Carey A.; Snowden, J. Douglas; Thompson, William O.

1989-01-01

A double-blind pilot study was conducted on 27 consenting human volunteers who had irreversible pulpitis associated with persistent toothache pain from open carious lesions. Formulations tested contained either 0, 10%, or 20% benzocaine and were identified only by a numbered code. Before the experiment started, a small amount of a known 5% benzocaine gel was placed for 1 minute on the tongue of each patient to assure a sensation of numbness within the oral cavity. Then the test tooth was washed with a gentle stream of warm water and dried with gauze. A randomly selected test medication was placed into the open cavity and around the gingival margins for 5 minutes. Pre- and posttreatment tests were conducted at the following timed intervals: 0, 5, 15, 30, 45, 60, 75 and 90 minutes. The tests included degree of pain (rated: 0 = none, 1 = mild, 2 = moderate, 3 = severe); electrical pulp testing (EPT) by a modified, voltage-ramping instrument; and ice water testing (0.5 mL directed quickly onto sound enamel of the tooth and rated: 0 to 4, with 4 being intolerable). After testing, or when pain returned to baseline, endodontic procedures were performed. There was a significant increase (p pulpitis and control teeth, 3) there were no correlations between direction of EPT scores and pain relief, 4) cold water testing was a good predictor of whether or not a tooth had pulpitis, and 5) changes in cold water testing scores after treatment could not be correlated to relief of pain according to verbal descriptors. The effectiveness of benzocaine in relieving toothache pain verifies previous studies; however, a difference between 10% and 20% benzocaine could not be demonstrated probably because of two factors: 1) the present experiment had a small sample size, and 2) there was no direct measurement of duration of local anesthesia. PMID:2490060
¿Exito en California? A Validity Critique of Language Program Evaluations and Analysis of English Learner Test Scores

Directory of Open Access Journals (Sweden)

Marilyn S. Thompson

2002-01-01

Several states have recently faced ballot initiatives that propose to functionally eliminate bilingual education in favor of English-only approaches. Proponents of these initiatives have argued that an overall rise in standardized achievement scores of California's limited English proficient (LEP) students is largely due to the implementation of English immersion programs mandated by Proposition 227 in 1998; hence, they claim Exito en California (Success in California). However, many such arguments presented in the media were based on flawed summaries of these data. We first discuss the background, media coverage, and previous research associated with California's Proposition 227. We then present a series of validity concerns regarding use of Stanford-9 achievement data to address policy for educating LEP students; these concerns include the language of the test, alternative explanations, sample selection, and data analysis decisions. Finally, we present a comprehensive summary of scaled-score achievement means and trajectories for California's LEP and non-LEP students for 1998-2000. Our analyses indicate that although scores have risen overall, the achievement gap between LEP and EP students does not appear to be narrowing.
ACER Mathematics Profile Series: Number Test. (Test Booklet, Answer and Record Sheet, Score Key, and Teachers Handbook).

Science.gov (United States)

Cornish, Greg; Wines, Robin

The Number Test of the ACER Mathematics Profile Series contains 30 items for each of three suggested grade levels: 7-8, 8-9, and 9-10. Raw scores on all tests in the ACER Mathematics Profile Series (Number, Operations, Space and Measurement) are converted to a common scale called MAPS, a major feature of the Series. Based on the Rasch Model,…
Sample Size Determination for One- and Two-Sample Trimmed Mean Tests

Science.gov (United States)

Luh, Wei-Ming; Olejnik, Stephen; Guo, Jiin-Huarng

2008-01-01

Formulas to determine the necessary sample sizes for parametric tests of group comparisons are available from several sources and appropriate when population distributions are normal. However, in the context of nonnormal population distributions, researchers recommend Yuen's trimmed mean test, but formulas to determine sample sizes have not been…
The Alzheimer’s Prevention Initiative composite cognitive test score: Sample size estimates for the evaluation of preclinical Alzheimer’s disease treatments in presenilin 1 E280A mutation carriers

Science.gov (United States)

Ayutyanont, Napatkamon; Langbaum, Jessica B.; Hendrix, Suzanne B.; Chen, Kewei; Fleisher, Adam S.; Friesenhahn, Michel; Ward, Michael; Aguirre, Camilo; Acosta-Baena, Natalia; Madrigal, Lucìa; Muñoz, Claudia; Tirado, Victoria; Moreno, Sonia; Tariot, Pierre N.; Lopera, Francisco; Reiman, Eric M.

2014-01-01

Objective There is a need to identify a cognitive composite that is sensitive to tracking preclinical AD decline to be used as a primary endpoint in treatment trials. Method We capitalized on longitudinal data, collected from 1995 to 2010, from cognitively unimpaired presenilin 1 (PSEN1) E280A mutation carriers from the world’s largest known early-onset autosomal dominant AD (ADAD) kindred to identify a composite cognitive test with the greatest statistical power to track preclinical AD decline and estimate the number of carriers age 30 and older needed to detect a treatment effect in the Alzheimer’s Prevention Initiative’s (API) preclinical AD treatment trial. The mean-to-standard-deviation ratios (MSDRs) of change over time were calculated in a search for the optimal combination of one to seven cognitive tests/sub-tests drawn from the neuropsychological test battery in cognitively unimpaired mutation carriers during a two and five year follow-up period, using data from non-carriers during the same time period to correct for aging and practice effects. Combinations that performed well were then evaluated for robustness across follow-up years, occurrence of selected items within top performing combinations and representation of relevant cognitive domains. Results This optimal test combination included CERAD Word List Recall, CERAD Boston Naming Test (high frequency items), MMSE Orientation to Time, CERAD Constructional Praxis and Ravens Progressive Matrices (Set A) with an MSDR of 1.62. This composite is more sensitive than using either the CERAD Word List Recall (MSDR=0.38) or the entire CERAD-Col battery (MSDR=0.76). A sample size of 75 cognitively normal PSEN1-E280A mutation carriers age 30 and older per treatment arm allows for a detectable treatment effect of 29% in a 60-month trial (80% power, p=0.05). Conclusions We have identified a composite cognitive test score representing multiple cognitive domains that has improved power compared to the most
Mini mental Parkinson test: standardization and normative data on an Italian sample.

Science.gov (United States)

Costa, Alberto; Bagoj, Eriola; Monaco, Marco; Zabberoni, Silvia; De Rosa, Salvatore; Mundi, Ciro; Caltagirone, Carlo; Carlesimo, Giovanni Augusto

2013-10-01

The mini mental Parkinson (MMP) is a test built to overcome the limits of the mini mental state examination (MMSE) in the short-time screening of cognitive disorders in individuals with Parkinson's disease (PD). In fact, in this scale, items tapping executive functioning are included to better capture PD-related cognitive changes. Some data sustain the sensitivity and validity of the MMP in the short neuropsychological screening of these individuals. Here, we report normative data on the MMP we collected on a sample of 307 Italian healthy subjects ranging from 40 to 91 years. The results document a detrimental effect of age and an ameliorative effect of education on the MMP total performance score. We provide for correction grids for age and literacy that derive from results of the regression analyses. Moreover, we also computed equivalent scores in order to allow a direct and fast comparison between the performance on the MMP and on other psychometric measures that can be administered to the subjects.
Linear-rank testing of a non-binary, responder-analysis, efficacy score to evaluate pharmacotherapies for substance use disorders.

Science.gov (United States)

Holmes, Tyson H; Li, Shou-Hua; McCann, David J

2016-11-23

The design of pharmacological trials for management of substance use disorders is shifting toward outcomes of successful individual-level behavior (abstinence or no heavy use). While binary success/failure analyses are common, McCann and Li (CNS Neurosci Ther 2012; 18: 414-418) introduced "number of beyond-threshold weeks of success" (NOBWOS) scores to avoid dichotomized outcomes. NOBWOS scoring employs an efficacy "hurdle" with values reflecting duration of success. Here, we evaluate NOBWOS scores rigorously. Formal analysis of mathematical structure of NOBWOS scores is followed by simulation studies spanning diverse conditions to assess operating characteristics of five linear-rank tests on NOBWOS scores. Simulations include assessment of Fisher's exact test applied to hurdle component. On average, statistical power was approximately equal for five linear-rank tests. Under none of conditions examined did Fisher's exact test exhibit greater statistical power than any of the linear-rank tests. These linear-rank tests provide good Type I and Type II error control for comparing distributions of NOBWOS scores between groups (e.g. active vs. placebo). All methods were applied to re-analyses of data from four clinical trials of differing lengths and substances of abuse. These linear-rank tests agreed across all trials in rejecting (or not) their null (equality of distributions) at ≤ 0.05. © The Author(s) 2016.
Estimating Sample Size for Usability Testing

Directory of Open Access Journals (Sweden)

Alex Cazañas

2017-02-01

One strategy used to assure that an interface meets user requirements is to conduct usability testing. When conducting such testing, one of the unknowns is sample size. Since extensive testing is costly, minimizing the number of participants can contribute greatly to successful resource management of a project. Even though a significant number of models have been proposed to estimate sample size in usability testing, there is still no consensus on the optimal size. Several studies claim that 3 to 5 users suffice to uncover 80% of problems in a software interface. However, many other studies challenge this assertion. This study analyzed data collected from the user testing of a web application to verify the rule of thumb, commonly known as the “magic number 5”. The outcomes of the analysis showed that the 5-user rule significantly underestimates the required sample size to achieve reasonable levels of problem detection.
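The abstract does not say which sample-size model the study tested, but the 5-user rule is usually tied to a cumulative binomial problem-discovery model, in which each user independently reveals a given problem with probability p. The sketch below assumes that model; the function names and the example values of p are illustrative, not taken from the study.

```python
import math


def proportion_discovered(p, n_users):
    """Expected share of problems found by n users, each detecting with probability p."""
    return 1.0 - (1.0 - p) ** n_users


def users_needed(p, target):
    """Smallest number of users whose expected discovery reaches the target share."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
```

Under this model, `users_needed(0.31, 0.80)` gives 5, which matches the rule of thumb, while a lower per-user detection probability such as 0.10 raises the requirement to 16. This sensitivity to p is one way a study could find that five users underestimate the required sample size.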
Decision making under internal uncertainty: the case of multiple-choice tests with different scoring rules.

Science.gov (United States)

Bereby-Meyer, Yoella; Meyer, Joachim; Budescu, David V

2003-02-01

This paper assesses framing effects on decision making with internal uncertainty, i.e., partial knowledge, by focusing on examinees' behavior in multiple-choice (MC) tests with different scoring rules. In two experiments participants answered a general-knowledge MC test that consisted of 34 solvable and 6 unsolvable items. Experiment 1 studied two scoring rules involving Positive (only gains) and Negative (only losses) scores. Although answering all items was the dominating strategy for both rules, the results revealed a greater tendency to answer under the Negative scoring rule. These results are in line with the predictions derived from Prospect Theory (PT) [Econometrica 47 (1979) 263]. The second experiment studied two scoring rules, which allowed respondents to exhibit partial knowledge. Under the Inclusion-scoring rule the respondents mark all answers that could be correct, and under the Exclusion-scoring rule they exclude all answers that might be incorrect. As predicted by PT, respondents took more risks under the Inclusion rule than under the Exclusion rule. The results illustrate that the basic process that underlies choice behavior under internal uncertainty and especially the effect of framing is similar to the process of choice under external uncertainty and can be described quite accurately by PT. Copyright 2002 Elsevier Science B.V.
An Analysis of Cross Racial Identity Scale Scores Using Classical Test Theory and Rasch Item Response Models

Science.gov (United States)

Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie

2013-01-01

Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…
The Impact of Correction for Guessing Formula on MC and Yes/No Vocabulary Tests' Scores

Directory of Open Access Journals (Sweden)

abdollah baradaran

2009-10-01

A standard correction for random guessing (cfg) formula on multiple-choice and Yes/No examinations was examined retrospectively in the scores of intermediate female EFL learners in an English language school. The correction was a weighting formula for points awarded for correct answers, incorrect answers, and unanswered questions so that the expected value of the increase in test score due to guessing was zero. The researcher compared uncorrected and corrected scores on examinations using multiple-choice and Yes/No formats. These short-answer formats eliminated or at least greatly reduced the potential for guessing the correct answer. The expectation for students to improve their grade by guessing on multiple-choice and Yes/No format examinations is well known. The researcher examined a method for correcting for random "no knowledge" guessing (cfg) on multiple-choice and Yes/No vocabulary examinations by comparing application and non-application of the cfg formula to scores on these examinations. This was done to determine whether the test takers really knew the correct answer or had resorted to guessing. This study represented a unique opportunity to compare scores from multiple-choice and Yes/No examinations in a setting in which students were given the same number of questions in each of the two format types testing their knowledge of the same subject matter. The results indicated significant differences between the subjects' scores when the cfg formula was applied and when it was not.
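The abstract does not reproduce the formula it used. A conventional correction-for-guessing rule that has the stated property (zero expected gain from blind guessing) subtracts W/(k-1) from the number right, where W is the number wrong and k the number of options per item, and gives omitted items zero. The sketch below assumes that conventional rule; the function names are illustrative.

```python
def corrected_score(n_right, n_wrong, n_options):
    """Conventional formula score: right minus wrong/(k-1); omitted items score 0."""
    return n_right - n_wrong / (n_options - 1)


def expected_gain_from_guess(n_options):
    """Expected corrected-score change from one blind guess among k options."""
    p_right = 1.0 / n_options
    penalty = 1.0 / (n_options - 1)
    return p_right * 1.0 - (1.0 - p_right) * penalty
```

For a Yes/No item (k = 2) the rule reduces to right minus wrong, so 20 right and 5 wrong gives 15; for four-option items each wrong answer costs one third of a point. In both cases the expected gain from a blind guess is zero, which is the property the abstract describes.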
Sample Selectivity and the Validity of International Student Achievement Tests in Economic Research. NBER Working Paper No. 15867

Science.gov (United States)

Hanushek, Eric A.; Woessmann, Ludger

2010-01-01

Critics of international student comparisons argue that results may be influenced by differences in the extent to which countries adequately sample their entire student populations. In this research note, we show that larger exclusion and non-response rates are related to better country average scores on international tests, as are larger…
The Fagerström Test for Nicotine Dependence in a Dutch sample of daily smokers and ex-smokers

NARCIS (Netherlands)

Vink, Jacqueline M.; Willemsen, Gonneke; Beem, A. Leo; Boomsma, Dorret I.

2005-01-01

We explored the performance of the Fagerström Test for Nicotine Dependence (FTND) in a sample of 1378 daily smokers and 1058 ex-smokers who participated in a survey study of the Netherlands Twin Register. FTND scores were higher for smokers than for ex-smokers. Nicotine dependence level was not
Association testing for next-generation sequencing data using score statistics

DEFF Research Database (Denmark)

Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders

2012-01-01

computationally feasible due to the use of score statistics. As part of the joint likelihood, we model the distribution of the phenotypes using a generalized linear model framework, which works for both quantitative and discrete phenotypes. Thus, the method presented here is applicable to case-control studies...... of genotype calls into account have been proposed; most require numerical optimization which for large-scale data is not always computationally feasible. We show that using a score statistic for the joint likelihood of observed phenotypes and observed sequencing data provides an attractive approach...... to association testing for next-generation sequencing data. The joint model accounts for the genotype classification uncertainty via the posterior probabilities of the genotypes given the observed sequencing data, which gives the approach higher power than methods based on called genotypes. This strategy remains...
Stability of Scores on Super's Work Values Inventory-Revised

Science.gov (United States)

Leuty, Melanie E.

2013-01-01

Test-retest data on Super's Work Values Inventory-Revised for a group of predominantly White ("N" = 995) women (mean age = 23.5 years, SD = 8.07) and men (mean age = 21.5 years, SD = 5.80) showed stability in mean-level scores over a period of 1 year for the sample as a whole. However, low raw score and rank order stability coefficients…

A Prorating Method for Estimating MMPI-2-RF Scores From MMPI Responses: Examination of Score Fidelity and Illustration of Empirical Utility in the PERSEREC Police Integrity Study Sample.

Science.gov (United States)

Tarescavage, Anthony M; Corey, David M; Ben-Porath, Yossef S

2016-04-01

The purpose of the current study was to identify Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) correlates of police officer integrity violations and other problem behaviors in an archival database with original MMPI item responses and collateral information regarding integrity violations obtained for 417 male officers. In Study 1, we estimated MMPI-2-RF scores from the MMPI item pool (which includes approximately 80% of the MMPI-2-RF items) in a normative sample, a psychiatric inpatient sample, and a police officer sample, and conducted analyses that demonstrated the comparability of estimated and full scale scores for 41 of the 51 MMPI-2-RF scales. In Study 2, we correlated estimated MMPI-2-RF scores with information about subsequent integrity violations and problem behaviors from the integrity violation data set. Several meaningful associations were obtained, predominately with scales from the emotional, thought, and behavioral dysfunction domains of the MMPI-2-RF. Application of a correction for range restriction yielded substantially improved validity estimates. Finally, we calculated relative risk ratios for the statistically significant findings using cutoffs lower than 65T, which is traditionally used to identify clinically significant elevations, and found several meaningful relative risk ratios. © The Author(s) 2015.
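The relative risk ratios mentioned in the abstract above compare the rate of an outcome among officers above a scale cutoff with the rate among those below it. A minimal sketch of that calculation follows; the function name and all counts are hypothetical and are not taken from the study.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the elevated-score group divided by risk in the comparison group."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group sizes must be positive")
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the comparison risk is zero")
    return risk_exposed / risk_unexposed


# Hypothetical counts: 12 of 40 officers above the cutoff had a violation,
# versus 30 of 300 officers below the cutoff.
rr = relative_risk(12, 40, 30, 300)
print(f"relative risk = {rr:.2f}")
```

A ratio above 1 means the outcome was more common in the group scoring above the cutoff.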
Effects of Analytical and Holistic Scoring Patterns on Scorer Reliability in Biology Essay Tests

Science.gov (United States)

Ebuoh, Casmir N.

2018-01-01

Literature revealed that the patterns/methods of scoring essay tests had been criticized for not being reliable, and this unreliability is likely to be greater in internal examinations than in external examinations. The purpose of this study is to find out the effects of analytical and holistic scoring patterns on scorer reliability in…
The Dysexecutive Questionnaire advanced: item and test score characteristics, 4-factor solution, and severity classification.

Science.gov (United States)

Bodenburg, Sebastian; Dopslaff, Nina

2008-01-01

The Dysexecutive Questionnaire (DEX; Behavioral assessment of the dysexecutive syndrome, 1996) is a standardized instrument to measure possible behavioral changes as a result of the dysexecutive syndrome. Although initially intended only as a qualitative instrument, the DEX has also been used increasingly to address quantitative problems. Until now there have not been more fundamental statistical analyses of the questionnaire's testing quality. The present study is based on an unselected sample of 191 patients with acquired brain injury and reports on the data relating to the quality of the items, the reliability and the factorial structure of the DEX. Item 3 displayed too great an item difficulty, whereas item 11 was not sufficiently discriminating. The DEX's reliability in self-rating is r = 0.85. In addition to presenting the statistical values of the tests, a clinical severity classification of the overall scores of the 4 found factors and of the questionnaire as a whole is carried out on the basis of quartile standards.
Linkage analysis in nuclear families. 2: Relationship between affected sib-pair tests and lod score analysis.

Science.gov (United States)

Knapp, M; Seuchter, S A; Baur, M P

1994-01-01

It is believed that the main advantage of affected sib-pair tests is that their application requires no information about the underlying genetic mechanism of the disease. However, here it is proved that the mean test, which can be considered the most prominent of the affected sib-pair tests, is equivalent to lod score analysis for an assumed recessive mode of inheritance, irrespective of the true mode of the disease. Further relationships of certain sib-pair tests and lod score analysis under specific assumed genetic modes are investigated.
Association of Health Sciences Reasoning Test scores with academic and experiential performance.

Science.gov (United States)

Cox, Wendy C; McLaughlin, Jacqueline E

2014-05-15

To assess the association of scores on the Health Sciences Reasoning Test (HSRT) with academic and experiential performance in a doctor of pharmacy (PharmD) curriculum. The HSRT was administered to 329 first-year (P1) PharmD students. Performance on the HSRT and its subscales was compared with academic performance in 29 courses throughout the curriculum and with performance in advanced pharmacy practice experiences (APPEs). Significant positive correlations were found between course grades in 8 courses and HSRT overall scores. All significant correlations were accounted for by pharmaceutical care laboratory courses, therapeutics courses, and a law and ethics course. There was a lack of moderate to strong correlation between HSRT scores and academic and experiential performance. The usefulness of the HSRT as a tool for predicting student success may be limited.
Impact of Answer-Switching Behavior on Multiple-Choice Test Scores in Higher Education

Directory of Open Access Journals (Sweden)

Ramazan BAŞTÜRK

2011-06-01

The multiple-choice format is one of the most popular selected-response item formats used in educational testing. Researchers have shown that the multiple-choice test is a useful vehicle for student assessment in core university subjects that usually have large student numbers. Even though educators, test experts, and various test resources maintain that the first answer should be retained, many researchers have argued that this advice is not supported by empirical findings. The main purpose of this study is to examine how answer-switching behavior affects multiple-choice test scores. Additionally, gender differences and the relationship between the number of answer switches and item parameters (item difficulty and item discrimination) were investigated. The participants in this study were 207 upper-level College of Education students from mid-sized universities. A midterm exam consisting of 20 multiple-choice questions was used. According to the results of this study, answer-switching behavior significantly increases test scores. On the other hand, there is no significant gender difference in answer-switching behavior. Additionally, there is a significant negative relationship between answer-switching behavior and item difficulties.
Testing a groundwater sampling tool: Are the samples representative?

International Nuclear Information System (INIS)

Kaback, D.S.; Bergren, C.L.; Carlson, C.A.; Carlson, C.L.

1989-01-01

A ground water sampling tool, the HydroPunch™, was tested at the Department of Energy's Savannah River Site in South Carolina to determine if representative ground water samples could be obtained without installing monitoring wells. Chemical analyses of ground water samples collected with the HydroPunch™ from various depths within a borehole were compared with chemical analyses of ground water from nearby monitoring wells. The site selected for the test was in the vicinity of a large coal storage pile and a coal pile runoff basin that was constructed to collect the runoff from the coal storage pile. Existing monitoring wells in the area indicate the presence of a ground water contaminant plume that: (1) contains elevated concentrations of trace metals; (2) has an extremely low pH; and (3) contains elevated concentrations of major cations and anions. Ground water samples collected with the HydroPunch™ provide an excellent estimate of ground water quality at discrete depths. Ground water chemical data collected from various depths using the HydroPunch™ can be averaged to simulate what a screen zone in a monitoring well would sample. The averaged depth-discrete data compared favorably with the data obtained from the nearby monitoring wells.
CaPTHUS scoring model in primary hyperparathyroidism: can it eliminate the need for ioPTH testing?

Science.gov (United States)

Elfenbein, Dawn M; Weber, Sara; Schneider, David F; Sippel, Rebecca S; Chen, Herbert

2015-04-01

The CaPTHUS model was reported to have a positive predictive value of 100 % to correctly predict single-gland disease in patients with primary hyperparathyroidism, thus obviating the need for intraoperative parathyroid hormone (ioPTH) testing. We sought to apply the CaPTHUS scoring model in our patient population and assess its utility in predicting long-term biochemical cure. We retrospectively reviewed all parathyroidectomies for primary hyperparathyroidism performed at our university hospital from 2003 to 2012. We routinely perform ioPTH testing. Biochemical cure was defined as a normal calcium level at 6 months. A total of 1,421 patients met the inclusion criteria: 78 % of patients had a single adenoma at the time of surgery, 98 % had a normal serum calcium at 1 week postoperatively, and 96 % had a normal serum calcium level 6 months postoperatively. Using the CaPTHUS scoring model, 307 patients (22.5 %) had a score of ≥ 3, with a positive predictive value of 91 % for single adenoma. A CaPTHUS score of ≥ 3 had a positive predictive value of 98 % for biochemical cure at 1 week as well as at 6 months. In our population, where ioPTH testing is used routinely to guide use of bilateral exploration, patients with a preoperative CaPTHUS score of ≥ 3 had good long-term biochemical cure rates. However, the model only predicted adenoma in 91 % of cases. If minimally invasive parathyroidectomy without ioPTH testing had been done for these patients, the cure rate would have dropped from 98 % to an unacceptable 89 %. Even in these patients with high CaPTHUS scores, multigland disease is present in almost 10 %, and ioPTH testing is necessary.
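The positive predictive value quoted in the abstract above is the share of positive predictions that turn out to be correct. The sketch below shows the arithmetic only; the split of the 307 high-score patients into 279 and 28 is an illustrative assumption chosen to land near the reported 91 %, not a count given in the abstract.

```python
def positive_predictive_value(true_positives, false_positives):
    """Fraction of positive predictions that are correct."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no positive predictions to evaluate")
    return true_positives / total


# Assumed split (not from the abstract): of 307 patients with a score >= 3,
# 279 had a single adenoma and 28 did not.
ppv = positive_predictive_value(279, 28)
print(f"PPV = {ppv:.1%}")
```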
[The Amsterdam Dementia Screening Test in cognitively healthy and clinical samples. An update of normative data].

Science.gov (United States)

van Toutert, Meta; Diesfeldt, Han; Hoek, Dirk

2016-10-01

The six tests in the Amsterdam Dementia Screening Test (ADST) examine the cognitive domains of episodic memory (delayed picture recognition, word learning), orientation, category fluency (animals and occupations), constructional ability (figure copying) and executive function (alternating sequences). New normative data were collected in a sample of 102 elderly volunteers (aged 65-94), including subjects with medical or other health conditions, except dementia or frank cognitive impairment (MMSE > 24). Included subjects were independent in complex instrumental activities of daily living. Fluency, not the other tests, needed adjustment for age and education. A deficit score (0-1) was computed for each test. Summation (range 0-6) proved useful in differentiating patients with dementia (N = 741) from normal elderly (N = 102). Positive and negative predictive power across a range of summed deficit scores and base rates are displayed in Bayesian probability tables. In the normal elderly, delayed recall for eight words was tested and adjusted for initial recall. A recognition test mixed the target words with eight distractors. Delayed recognition was adjusted for immediate and delayed recall. The ADST and the normative data in this paper help the clinical neuropsychologist to make decisions concerning the presence or absence of neurocognitive disorder in individual elderly examinees.
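The summed deficit score and the Bayesian probability tables described in the abstract above can be sketched as follows. This is an illustration under stated assumptions: the sensitivity and specificity values are hypothetical placeholders, not figures from the paper, and the function names are invented for the example.

```python
def summed_deficit(deficits):
    """Add six per-test deficit flags (each 0 or 1) into a 0-6 total."""
    if len(deficits) != 6 or any(d not in (0, 1) for d in deficits):
        raise ValueError("expected six deficit scores, each 0 or 1")
    return sum(deficits)


def predictive_values(sensitivity, specificity, base_rate):
    """Bayes' rule: positive and negative predictive power at a given base rate."""
    if not 0 < base_rate < 1:
        raise ValueError("base_rate must lie strictly between 0 and 1")
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    true_neg = specificity * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Hypothetical sensitivity and specificity for some cutoff on the summed score.
SENSITIVITY, SPECIFICITY = 0.90, 0.95
for base_rate in (0.10, 0.30, 0.50):
    ppp, npp = predictive_values(SENSITIVITY, SPECIFICITY, base_rate)
    print(f"base rate {base_rate:.2f}: PPP = {ppp:.3f}, NPP = {npp:.3f}")
```

The loop reproduces one row of such a table per base rate; a full table would repeat this for each cutoff on the 0-6 scale.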
Score Gains on g-loaded Tests: No g

NARCIS (Netherlands)

te Nijenhuis, J.; van Vianen, A.E.M.; van der Flier, H.

2007-01-01

IQ scores provide the best general predictor of success in education, job training, and work. However, there are many ways in which IQ scores can be increased, for instance by means of retesting or participation in learning potential training programs. What is the nature of these score gains? Jensen
Optimization of Sample Preparation for the Identification and Quantification of Saxitoxin in Proficiency Test Mussel Sample using Liquid Chromatography-Tandem Mass Spectrometry

Directory of Open Access Journals (Sweden)

Kirsi Harju

2015-11-01

Saxitoxin (STX) and some selected paralytic shellfish poisoning (PSP) analogues in mussel samples were identified and quantified with liquid chromatography-tandem mass spectrometry (LC-MS/MS). Sample extraction and purification methods for the mussel samples were optimized for LC-MS/MS analysis. The developed method was applied to the analysis of the homogenized mussel samples in the proficiency test (PT) within the EQuATox project (Establishment of Quality Assurance for the Detection of Biological Toxins of Potential Bioterrorism Risk). Ten laboratories from eight countries participated in the STX PT. Identification of PSP toxins in naturally contaminated mussel samples was performed by comparison of product ion spectra and retention times with those of reference standards. The quantitative results were obtained with LC-MS/MS by spiking reference standards in toxic mussel extracts. The results were within the z-score of ±1 when compared to the results measured with the official AOAC (Association of Official Analytical Chemists) method 2005.06, pre-column oxidation high-performance liquid chromatography with fluorescence detection (HPLC-FLD).
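A proficiency-test z-score expresses how far a laboratory's result lies from the assigned value, in units of the standard deviation set for the test. The sketch below assumes that conventional definition; the concentrations and the standard deviation are hypothetical and do not come from the EQuATox exercise.

```python
def z_score(result, assigned_value, sigma_pt):
    """Deviation of a reported result from the assigned value, in sigma units."""
    if sigma_pt <= 0:
        raise ValueError("sigma_pt must be positive")
    return (result - assigned_value) / sigma_pt


def within_limit(z, limit=1.0):
    """True when the z-score lies inside +/- limit (the abstract reports +/-1)."""
    return abs(z) <= limit


# Hypothetical values in arbitrary concentration units.
z = z_score(result=0.92, assigned_value=1.00, sigma_pt=0.15)
print(f"z = {z:.2f}, within +/-1: {within_limit(z)}")
```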
The Addenbrooke's Cognitive Examination Revised (ACE-R) and its sub-scores: normative values in an Italian population sample.

Science.gov (United States)

Siciliano, Mattia; Raimo, Simona; Tufano, Dario; Basile, Giuseppe; Grossi, Dario; Santangelo, Franco; Trojano, Luigi; Santangelo, Gabriella

2016-03-01

The Addenbrooke's Cognitive Examination Revised (ACE-R) is a rapid screening battery, including five sub-scales to explore different cognitive domains: attention/orientation, memory, fluency, language and visuospatial. ACE-R is considered useful in discriminating cognitively normal subjects from patients with mild dementia. The aim of the present study was to provide normative values for ACE-R total score and sub-scale scores in a large sample of Italian healthy subjects. Five hundred twenty-six Italian healthy subjects (282 women and 246 men) of different ages (age range 20-93 years) and educational level (from primary school to university) underwent ACE-R and Montreal Cognitive Assessment (MoCA). Multiple linear regression analysis revealed that age and education significantly influenced performance on ACE-R total score and sub-scale scores. A significant effect of gender was found only in sub-scale attention/orientation. From the derived linear equation, a correction grid for raw scores was built. Inferential cut-off scores were estimated using a non-parametric technique and equivalent scores (ES) were computed. Correlation analysis showed a good significant correlation between ACE-R adjusted scores and MoCA adjusted scores (r = 0.612, p < 0.001). The present study provided normative data for the ACE-R in an Italian population useful for both clinical and research purposes.
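A regression-based correction of the kind the abstract above describes removes the part of a raw score that the fitted equation attributes to age and education. The sketch below assumes a simple linear form; the coefficients, the reference age and education, and the clamping to a 0-100 range are all hypothetical choices for illustration, not the published correction grid.

```python
# Hypothetical regression coefficients (points per year); NOT the published values.
B_AGE = -0.10        # assumed change in raw score per additional year of age
B_EDUCATION = 0.50   # assumed change in raw score per additional year of schooling
REF_AGE = 50.0       # assumed reference age
REF_EDUCATION = 10.0 # assumed reference years of education


def adjusted_score(raw, age, education, max_score=100.0):
    """Subtract the predicted demographic effect, then clamp to the score range."""
    demographic_effect = B_AGE * (age - REF_AGE) + B_EDUCATION * (education - REF_EDUCATION)
    return min(max(raw - demographic_effect, 0.0), max_score)


# An older examinee with little schooling gets points added back.
print(adjusted_score(raw=80, age=70, education=5))
```

A correction grid is then just this function tabulated over bands of age and education.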
Effect on intelligence test score of prenatal exposure to ionizing radiation in Hiroshima and Nagasaki

International Nuclear Information System (INIS)

Schull, W.J.; Otake, Masanori; Yoshimaru, Hiroshi.

1988-10-01

Analyses of intelligence test scores (Koga) at 10-11 years of age of individuals exposed prenatally to the atomic bombing of Hiroshima and Nagasaki using estimates of the uterine absorbed dose based on the recently introduced system of dosimetry, the Dosimetry System 1986 (DS86), reveal the following: 1) there is no evidence of a radiation-related effect on intelligence among those individuals exposed within 0-7 weeks after fertilization or in the 26th or subsequent weeks; 2) for individuals exposed at 8-15 weeks after fertilization, and to a lesser extent those exposed at 16-25 weeks, the mean test scores but not the variances are significantly heterogeneous among exposure categories; 3) the cumulative distribution of test scores suggests a progressive shift downwards in individual scores with increasing exposure; and 4) within the group most sensitive to the occurrence of clinically recognizable severe mental retardation, individuals exposed 8 through 15 weeks after fertilization, the regression of intelligence score on estimated DS86 uterine absorbed dose is more linear than with T65DR fetal dose; the diminution in intelligence score under the linear model is 21-29 points at 1 Gy. The effect is somewhat greater when the controls receiving less than 0.01 Gy are excluded, 24-33 points at 1 Gy. These findings are discussed in the light of the earlier analysis of the frequency of occurrence of mental retardation among the prenatally exposed survivors of the A-bombing of Hiroshima and Nagasaki. It is suggested that both are the consequences of the same underlying biological process or processes. (author)
Associations of maximal strength and muscular endurance test scores with cardiorespiratory fitness and body composition.

Science.gov (United States)

Vaara, Jani P; Kyröläinen, Heikki; Niemi, Jaakko; Ohrankämmen, Olli; Häkkinen, Arja; Kocay, Sheila; Häkkinen, Keijo

2012-08-01

The purpose of the present study was to assess the relationships between maximal strength and muscular endurance test scores in addition to previously widely studied measures of body composition and maximal aerobic capacity. 846 young men (25.5 ± 5.0 yrs) participated in the study. Maximal strength was measured using isometric bench press, leg extension and grip strength. Muscular endurance tests consisted of push-ups, sit-ups and repeated squats. An indirect graded cycle ergometer test was used to estimate maximal aerobic capacity (V(O2)max). Body composition was determined with bioelectrical impedance. Moreover, waist circumference (WC) and height were measured and body mass index (BMI) calculated. Maximal bench press was positively correlated with push-ups (r = 0.61, p … strength (r = 0.34, p … strength correlated positively (r = 0.36-0.44, p … test scores were related to maximal aerobic capacity and body fat content, while fat free mass was associated with maximal strength test scores and thus is a major determinant for maximal strength. A contributive role of maximal strength to muscular endurance tests could be identified for the upper, but not the lower extremities. These findings suggest that push-up test is not only indicative of body fat content and maximal aerobic capacity but also maximal strength of upper body, whereas repeated squat test is mainly indicative of body fat content and maximal aerobic capacity, but not maximal strength of lower extremities.
Gaze Stabilization Test Asymmetry Score as an Indicator of Previous Concussion in a Cohort of Collegiate Football Players.

Science.gov (United States)

Honaker, Julie A; Criter, Robin E; Patterson, Jessie N; Jones, Sherri M

2015-07-01

Vestibular dysfunction may lead to decreased visual acuity with head movements, which may impede athletic performance and result in injury. The purpose of this study was to test the hypothesis that athletes with history of concussion would have differences in gaze stabilization test (GST) as compared with those without a history of concussion. Cross-sectional, descriptive. University Athletic Medicine Facility. Fifteen collegiate football players with a history of concussion, 25 collegiate football players without a history of concussion. Participants completed the dizziness handicap inventory (DHI), static visual acuity, perception time test, active yaw plane GST, stability evaluation test (SET), and a bedside oculomotor examination. Independent samples t test was used to compare GST, SET, and DHI scores per group, with Bonferroni-adjusted alpha at P … history of concussion. The results support further research on the use of GST for sport-related concussion evaluation and monitoring. Inclusion of objective vestibular tests in the concussion protocol may reveal the presence of peripheral vestibular or visual-vestibular deficits. Therefore, the GST may add an important perspective on the effects of concussion.
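A Bonferroni adjustment divides the family-wise alpha by the number of comparisons, and each p-value is then judged against that stricter threshold. The sketch below assumes a family-wise alpha of 0.05 and three comparisons (GST, SET, DHI); both are assumptions for illustration, since the threshold actually used in the study is lost from the abstract text, and the p-values are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni adjustment."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return family_alpha / n_comparisons


def flag_significant(p_values, family_alpha=0.05):
    """Map each measure to True when its p-value passes the adjusted threshold."""
    threshold = bonferroni_alpha(family_alpha, len(p_values))
    return {name: p <= threshold for name, p in p_values.items()}


# Hypothetical p-values from three independent-samples t tests.
example_p = {"GST": 0.004, "SET": 0.030, "DHI": 0.200}
print(bonferroni_alpha(0.05, len(example_p)))
print(flag_significant(example_p))
```

Note that the hypothetical SET result would pass an unadjusted 0.05 threshold but not the adjusted one, which is the practical effect of the correction.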
Testing measurement invariance of the schizotypal personality questionnaire-brief scores across Spanish and Swiss adolescents.

Directory of Open Access Journals (Sweden)

Javier Ortuño-Sierra

BACKGROUND: Schizotypy is a complex construct intimately related to psychosis. Empirical evidence indicates that participants with high scores on schizotypal self-report are at a heightened risk for the later development of psychotic disorders. Schizotypal experiences represent the behavioural expression of liability for psychotic disorders. Previous factorial studies have shown that schizotypy is a multidimensional construct similar to that found in patients with schizophrenia. Specifically, using the Schizotypal Personality Questionnaire-Brief (SPQ-B), the three-dimensional model has been widely replicated. However, there has been no in-depth investigation of whether the dimensional structure underlying the SPQ-B scores is invariant across countries. METHODS: The main goal of this study was to examine the measurement invariance of the SPQ-B scores across Spanish and Swiss adolescents. The final sample was made up of 261 Spanish participants (51.7% men; M = 16.04 years) and 241 Swiss participants (52.3% men; M = 15.94 years). RESULTS: The results indicated that Raine et al.'s three-factor model presented adequate goodness-of-fit indices. Moreover, the results supported the measurement invariance (configural and partial strong invariance) of the SPQ-B scores across the two samples. Spanish participants scored higher on the Interpersonal dimension than Swiss participants when latent means were compared. DISCUSSION: The study of measurement equivalence across countries provides preliminary evidence for Raine et al.'s three-factor model and for the cross-cultural validity of the SPQ-B scores in the adolescent population. Future studies should continue to examine the measurement invariance of the schizotypy and psychosis-risk syndromes across cultures.
Student Test Scores: How the Sausage Is Made and Why You Should Care. Evidence Speaks Reports, Vol 1, #25

Science.gov (United States)

Jacob, Brian A.

2016-01-01

Contrary to popular belief, modern cognitive assessments--including the new Common Core tests--produce test scores based on sophisticated statistical models rather than the simple percent of items a student answers correctly. While there are good reasons for this, it means that reported test scores depend on many decisions made by test designers,…
7 CFR 28.952 - Testing of samples.

Science.gov (United States)

2010-01-01

... 7 Agriculture 2 2010-01-01 2010-01-01 false Testing of samples. 28.952 Section 28.952 Agriculture Regulations of the Department of Agriculture AGRICULTURAL MARKETING SERVICE (Standards, Inspections, Marketing... processing tests of the properties of cotton samples and report the results thereof to the persons from whom...
Critique of the Watson-Glaser Critical Thinking Appraisal Test: The More You Know, the Lower Your Score

Directory of Open Access Journals (Sweden)

Kevin Possin

2014-12-01

The Watson-Glaser Critical Thinking Appraisal Test is one of the oldest, most frequently used, multiple-choice critical-thinking tests on the market in business, government, and legal settings for purposes of hiring and promotion. I demonstrate, however, that the test has serious construct-validity issues, stemming primarily from its ambiguous, unclear, misleading, and sometimes mysterious instructions, which have remained unaltered for decades. Erroneously scored items further diminish the test’s validity. As a result, having enhanced knowledge of formal and informal logic could well result in test subjects receiving lower scores on the test. That’s not how things should work for a CT assessment test.
Test Scores, Class Rank and College Performance: Lessons for Broadening Access and Promoting Success.

Science.gov (United States)

Niu, Sunny X; Tienda, Marta

2012-04-01

Using administrative data for five Texas universities that differ in selectivity, this study evaluates the relative influence of two key indicators for college success: high school class rank and standardized tests. Empirical results show that class rank is the superior predictor of college performance and that test score advantages do not insulate lower ranked students from academic underperformance. Using the UT-Austin campus as a test case, we conduct a simulation to evaluate the consequences of capping students admitted automatically using both achievement metrics. We find that using class rank to cap the number of students eligible for automatic admission would have roughly uniform impacts across high schools, but imposing a minimum test score threshold on all students would have highly unequal consequences by greatly reducing the admission eligibility of the highest performing students who attend poor high schools while not jeopardizing admissibility of students who attend affluent high schools. We discuss the implications of the Texas admissions experiment for higher education in Europe.

Relationships between the handball-specific complex test, non-specific field tests and the match performance score in elite professional handball players.

Science.gov (United States)

Hermassi, Souhail; Chelly, Mohamed-Souhaiel; Wollny, Rainer; Hoffmeyer, Birgit; Fieseler, Georg; Schulze, Stephan; Irlenbusch, Lars; Delank, Karl-Stefan; Shephard, Roy J; Bartels, Thomas; Schwesig, René

2018-06-01

This study assessed the validity of the handball-specific complex test (HBCT) and two non-specific field tests in professional elite handball athletes, using the match performance score (MPS) as the gold standard of performance. Thirteen elite male handball players (age: 27.4±4.8 years; premier German league) performed the HBCT, the Yo-Yo Intermittent Recovery (YYIR) test and a repeated shuttle sprint ability (RSA) test at the beginning of pre-season training. The RSA results were evaluated in terms of best time, total time, and fatigue decrement. Heart rates (HR) were assessed at selected times throughout all tests; the recovery HR was measured immediately post-test and 10 minutes later. The match performance score was based on various handball specific parameters (e.g., field goals, assists, steals, blocks, and technical mistakes) as seen during all matches of the immediately subsequent season (2015/2016). The parameters of run 1, run 2, and HR recovery at minutes 6 and 10 of the RSA test all showed a variance of more than 10% (range: 11-15%). However, the variance of scores for the YYIR test was much smaller (range: 1-7%). The resting HR (r2=0.18), HR recovery at minute 10 (r2=0.10), lactate concentration at rest (r2=0.17), recovery of heart rate from 0 to 10 minutes (r2=0.15), and velocity of second throw at first trial (r2=0.37) were the most valid HBCT parameters. Much effort is necessary to assess MPS and to develop valid tests. Speed and the rate of functional recovery seem the best predictors of competitive performance for elite handball players.
College Math Assessment: SAT Scores vs. College Math Placement Scores

Science.gov (United States)

Foley-Peres, Kathleen; Poirier, Dawn

2008-01-01

Many colleges and universities use SAT math scores or math placement tests to place students in the appropriate math course. This study compares the use of math placement scores and SAT scores for 188 freshman students. The students' grades and faculty observations were analyzed to determine if the SAT scores and/or college math assessment scores…
A Comparison of Scores on the WISC-R and Lorge-Thorndike Intelligence Test for Disadvantaged Black Elementary School Children

Science.gov (United States)

Lowe, James D.; Karnes, Frances A.

1976-01-01

It is indicated that, although the scores [obtained on both tests] are significantly correlated, the tests yield significantly different scores with the Lorge-Thorndike consistently overestimating the WISC-R full scale I.Q. (Author)
The Mediating Effect of Listening Metacognitive Awareness between Test-Taking Motivation and Listening Test Score: An Expectancy-Value Theory Approach

OpenAIRE

Xu, Jian

2017-01-01

The present study investigated test-taking motivation in an L2 listening testing context by applying Expectancy-Value Theory as the framework. Specifically, this study was intended to examine the complex relationships among expectancy, importance, interest, listening anxiety, listening metacognitive awareness, and listening test score using data from a large-scale and high-stakes language test among Chinese first-year undergraduates. Structural equation modeling was used to examine the mediating...
International Test Score Comparisons and Educational Policy: A Review of the Critiques

Science.gov (United States)

Carnoy, Martin

2015-01-01

Stanford education professor Martin Carnoy examines four main critiques of how international test results are used in policymaking. Of particular interest are critiques of the policy analyses published by the Program for International Student Assessment (PISA). Using average PISA scores as a comparative measure of student achievement is misleading…
Parent Ratings of Impulsivity and Inhibition Predict State Testing Scores

Directory of Open Access Journals (Sweden)

Rebecca A. Lundwall

2018-03-01

One principle of cognitive development is that earlier intervention for educational difficulties tends to improve outcomes such as future educational and career success. One possible way to help students who struggle is to determine if they process information differently. Such determination might lead to clues for interventions. For example, early information processing requires attention before the information can be identified, encoded, and stored. The aim of the present study was to investigate whether parent ratings of inattention, inhibition, and impulsivity, and error rate on a reflexive attention task, could be used to predict child scores on state standardized tests. Finding such an association could provide assistance to educators in identifying academically struggling children who might require targeted educational interventions. Children (N = 203) were invited to complete a peripheral cueing task (which measures the automatic reorienting of the brain’s attentional resources from one location to another). While the children completed the task, their parents completed a questionnaire. The questionnaire gathered information on broad indicators of child functioning, including observable behaviors of impulsivity, inattention, and inhibition, as well as state academic scores (which the parent retrieved online from their school). We used sequential regression to analyze contributions of error rate and parent-rated behaviors in predicting six academic scores. In one of the six analyses (for science), we found that the improvement was significant from the simplified model (with only family income, child age, and sex as predictors) to the full model (adding error rate and three parent-rated behaviors). Two additional analyses (reading and social studies) showed near significant improvement from simplified to full models. Parent-rated behaviors were significant predictors in all three of these analyses. In the reading score analysis
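In sequential regression, the improvement from a simplified model to a full model is commonly judged with an F test on the change in R². The sketch below assumes that standard test; the predictor counts (3 and 7) follow the abstract above, using N = 203 assumes every child had complete data, and the two R² values are hypothetical.

```python
def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic and degrees of freedom for the R-squared change between nested models."""
    df_num = k_full - k_reduced   # number of predictors added
    df_den = n - k_full - 1       # residual degrees of freedom of the full model
    if df_num <= 0 or df_den <= 0:
        raise ValueError("full model must add predictors and leave residual df")
    if not 0 <= r2_reduced <= r2_full < 1:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    f_stat = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f_stat, df_num, df_den


# Simplified model: family income, child age, sex (3 predictors).
# Full model: adds error rate and three parent-rated behaviors (7 predictors).
# The R-squared values below are hypothetical.
f_stat, df_num, df_den = f_change(0.10, 0.18, n=203, k_reduced=3, k_full=7)
print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
```

The resulting statistic would be compared against the F distribution with those degrees of freedom to obtain a p-value.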
The Impact of the Use of Hierarchical Teaching on Test Scores of Students’ Technology

Directory of Open Access Journals (Sweden)

Zhao Guorong

2015-01-01

Students’ technique test scores are the main basis for the physical examination of college students, and fitness evaluation is based on the test results. Replacing the uniform teaching mode with a stratified (hierarchical) teaching method significantly improved students’ technical level in specialized movements.
The Effects of Listening to Music Just Before Reading Test on Students’ Test Score

OpenAIRE

MAHDAVI, Mojtaba

2015-01-01

Abstract. In this study the researcher examined the effect on reading comprehension of music played just before the test. Because the emotional consequences of music listening are evident in stress and anxiety removal, it was used as a tool to pacify the minds of the testees and boost their memory and the related cognitive processes. Experimental group did well with the mean score of) and control group (). This study confirmed that using multimedia devices such as music can not only i...
The Apgar score has survived the test of time.

Science.gov (United States)

Finster, Mieczyslaw; Wood, Margaret

2005-04-01

In 1953, Virginia Apgar, M.D. published her proposal for a new method of evaluation of the newborn infant. The avowed purpose of this paper was to establish a simple and clear classification of newborn infants which can be used to compare the results of obstetric practices, types of maternal pain relief and the results of resuscitation. Having considered several objective signs pertaining to the condition of the infant at birth she selected five that could be evaluated and taught to the delivery room personnel without difficulty. These signs were heart rate, respiratory effort, reflex irritability, muscle tone and color. Sixty seconds after the complete birth of the baby a rating of zero, one or two was given to each sign, depending on whether it was absent or present. Virginia Apgar reviewed anesthesia records of 1025 infants born alive at Columbia Presbyterian Medical Center during the period of this report. All had been rated by her method. Infants in poor condition scored 0-2, infants in fair condition scored 3-7, while scores 8-10 were achieved by infants in good condition. The most favorable score 1 min after birth was obtained by infants delivered vaginally with the occiput the presenting part (average 8.4). Newborns delivered by version and breech extraction had the lowest score (average 6.3). Infants delivered by cesarean section were more vigorous (average score 8.0) when spinal was the method of anesthesia versus an average score of 5.0 when general anesthesia was used. Correlating the 60 s score with neonatal mortality, Virginia found that mature infants receiving 0, 1 or 2 scores had a neonatal death rate of 14%; those scoring 3, 4, 5, 6 or 7 had a death rate of 1.1%; and those in the 8-10 score group had a death rate of 0.13%. She concluded that the prognosis of an infant is excellent if he receives one of the upper three scores, and poor if one of the lowest three scores.
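The scoring arithmetic described in this abstract can be sketched in a few lines. This is a minimal illustration, not clinical software: the function and key names are hypothetical, and the condition bands (0-2 poor, 3-7 fair, 8-10 good) are taken directly from the abstract.

```python
# The five signs named in the abstract; each one is rated 0, 1 or 2.
APGAR_SIGNS = (
    "heart_rate",
    "respiratory_effort",
    "reflex_irritability",
    "muscle_tone",
    "color",
)


def apgar_total(ratings):
    """Sum the five sign ratings into a total score between 0 and 10."""
    missing = [sign for sign in APGAR_SIGNS if sign not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for sign in APGAR_SIGNS:
        if ratings[sign] not in (0, 1, 2):
            raise ValueError(f"{sign} must be rated 0, 1 or 2")
    return sum(ratings[sign] for sign in APGAR_SIGNS)


def apgar_condition(total):
    """Map a total score to the condition bands quoted in the abstract."""
    if not 0 <= total <= 10:
        raise ValueError("total must be between 0 and 10")
    if total <= 2:
        return "poor"
    if total <= 7:
        return "fair"
    return "good"
```

For example, an infant rated 2 on every sign except muscle tone (rated 1) has a total of 9 and falls in the "good" band.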
Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test

KAUST Repository

Cai, T.; Lin, X.; Carroll, R. J.

2012-01-01

the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an Empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least
Gender Gaps in High School GPA and ACT Scores: High School Grade Point Average and ACT Test Score by Subject and Gender. Information Brief 2014-12

Science.gov (United States)

ACT, Inc., 2014

2014-01-01

Female students who graduated from high school in 2013 averaged higher grades than their male counterparts in all subjects, but male graduates earned higher scores on the math and science sections of the ACT. This information brief looks at high school grade point average and ACT test score by subject and gender
High Baseline Postconcussion Symptom Scores and Concussion Outcomes in Athletes.

Science.gov (United States)

Custer, Aimee; Sufrinko, Alicia; Elbin, R J; Covassin, Tracey; Collins, Micky; Kontos, Anthony

2016-02-01

Some healthy athletes report high levels of baseline concussion symptoms, which may be attributable to several factors (eg, illness, personality, somaticizing). However, the role of baseline symptoms in outcomes after sport-related concussion (SRC) has not been empirically examined. To determine if athletes with high symptom scores at baseline performed worse than athletes without baseline symptoms on neurocognitive testing after SRC. Cohort study. High school and collegiate athletic programs. A total of 670 high school and collegiate athletes participated in the study. Participants were divided into groups with either no baseline symptoms (Postconcussion Symptom Scale [PCSS] score = 0, n = 247) or a high level of baseline symptoms (PCSS score > 18 [top 10% of sample], n = 68). Participants were evaluated at baseline and 2 to 7 days after SRC with the Immediate Post-concussion Assessment and Cognitive Test and PCSS. Outcome measures were Immediate Post-concussion Assessment and Cognitive Test composite scores (verbal memory, visual memory, visual motor processing speed, and reaction time) and total symptom score on the PCSS. The groups were compared using repeated-measures analyses of variance with Bonferroni correction to assess interactions between group and time for symptoms and neurocognitive impairment. The no-symptoms group represented 38% of the original sample, whereas the high-symptoms group represented 11% of the sample. The high-symptoms group experienced a larger decline from preinjury to postinjury than the no-symptoms group in verbal (P = .03) and visual memory (P = .05). However, total concussion-symptom scores increased from preinjury to postinjury for the no-symptoms group (P = .001) but remained stable for the high-symptoms group. Reported baseline symptoms may help identify athletes at risk for worse outcomes after SRC. Clinicians should examine baseline symptom levels to better identify patients for earlier referral and treatment for their
Airflow Test of Acoustic Board Samples

DEFF Research Database (Denmark)

Jensen, Rasmus Lund; Jensen, Lise Mellergaard

In the laboratory of Indoor Environmental Engineering, Department of Civil Engineering, Aalborg University, an airflow test on 2x10 samples of acoustic board was carried out on the 2nd of June 2012. The tests were carried out for Rambøll and STO AG. The test includes connected values of volume flow...
From Test Scores to Language Use: Emergent Bilinguals Using English to Accomplish Academic Tasks

Science.gov (United States)

Rodriguez-Mojica, Claudia

2018-01-01

Prominent discourses about emergent bilinguals' academic abilities tend to focus on performance as measured by test scores and perpetuate the message that emergent bilinguals trail far behind their peers. When we remove the constraints of formal testing situations, what can emergent bilinguals do in English as they engage in naturally occurring…
Assessing Exhaustiveness of Stochastic Sampling for Integrative Modeling of Macromolecular Structures.

Science.gov (United States)

Viswanath, Shruthi; Chemmama, Ilan E; Cimermancic, Peter; Sali, Andrej

2017-12-05

Modeling of macromolecular structures involves structural sampling guided by a scoring function, resulting in an ensemble of good-scoring models. By necessity, the sampling is often stochastic, and must be exhaustive at a precision sufficient for accurate modeling and assessment of model uncertainty. Therefore, the very first step in analyzing the ensemble is an estimation of the highest precision at which the sampling is exhaustive. Here, we present an objective and automated method for this task. As a proxy for sampling exhaustiveness, we evaluate whether two independently and stochastically generated sets of models are sufficiently similar. The protocol includes testing 1) convergence of the model score, 2) whether model scores for the two samples were drawn from the same parent distribution, 3) whether each structural cluster includes models from each sample proportionally to its size, and 4) whether there is sufficient structural similarity between the two model samples in each cluster. The evaluation also provides the sampling precision, defined as the smallest clustering threshold that satisfies the third, most stringent test. We validate the protocol with the aid of enumerated good-scoring models for five illustrative cases of binary protein complexes. Passing the proposed four tests is necessary, but not sufficient for thorough sampling. The protocol is general in nature and can be applied to the stochastic sampling of any set of models, not just structural models. In addition, the tests can be used to stop stochastic sampling as soon as exhaustiveness at desired precision is reached, thereby improving sampling efficiency; they may also help in selecting a model representation that is sufficiently detailed to be informative, yet also sufficiently coarse for sampling to be exhaustive. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
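The second test in the protocol asks whether the scores of two independently generated samples come from the same parent distribution. The abstract does not say which statistic is used; the sketch below assumes a two-sample Kolmogorov-Smirnov statistic, one common choice for this kind of comparison, and the function name is hypothetical.

```python
def ks_statistic(scores_a, scores_b):
    """Largest vertical gap between the empirical CDFs of two score samples.

    Returns a value in [0, 1]: 0 means the empirical distributions coincide,
    1 means the two samples do not overlap at all.
    """
    a, b = sorted(scores_a), sorted(scores_b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    largest_gap = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, so ties are handled
        # together before the two CDFs are compared.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / n - j / m))
    return largest_gap
```

A small statistic is consistent with both runs sampling the same score distribution. How small is small enough depends on the sample sizes and the chosen significance level, which this sketch leaves to the caller.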
46 CFR 160.050-5 - Sampling, tests, and inspection.

Science.gov (United States)

2010-10-01

... one from which any sample ring life buoy failed the buoyancy or strength test, the sample shall... ring life buoys with this subpart. The manufacturer shall provide means to secure any test that is not... procedures. Table 160.050-5(e)—Sampling for Buoyancy Tests Lot size Number of life buoys in sample 100 and...
A Case Study About Why It Can Be Difficult To Test Whether Propensity Score Analysis Works in Field Experiments

Directory of Open Access Journals (Sweden)

Thomas D. Cook

2012-01-01

Peikes, Moreno and Orzol (2008) sensibly caution researchers that propensity score analysis may not lead to valid causal inference in field applications. But at the same time, they made the far stronger claim to have performed an ideal test of whether propensity score matching in quasi-experimental data is capable of approximating the results of a randomized experiment in their dataset, and that this ideal test showed that such matching could not do so. In this article we show that their study does not support that conclusion because it failed to meet a number of basic criteria for an ideal test. By implication, many other purported tests of the effectiveness of propensity score analysis probably also fail to meet these criteria, and are therefore questionable contributions to the literature on the effects of propensity score analysis.
The Effects of Teacher and Teacher-librarian High-end Collaboration on Inquiry-based Project Reports and School Monthly Test Scores of Fifth-grade Students

Directory of Open Access Journals (Sweden)

Hai-Hon Chen

2015-07-01

The purpose of this study was twofold. The first purpose was to establish a high-level collaborative model of integrated instruction between a social studies teacher and a teacher-librarian. The second purpose was to investigate the effects of high-end collaboration on individual and group inquiry-based project reports, as well as on the monthly test scores of fifth-grade students. A quasi-experimental method was adopted, with two classes of elementary school fifth graders in Tainan Municipal city, Taiwan, used as samples. Students were randomly assigned to experimental conditions by class. Twenty-eight students in the experimental group were taught through the collaboration of the social studies teacher and the teacher-librarian, while 27 students in the control group were taught separately by the teacher using a didactic teaching method. The Inquiry-Based Project Record, Inquiry-Based Project Rubrics, and school monthly test scores were used as instruments for collecting data. A t-test and correlation were used to analyze the data. The results indicate that: (1) A high-end collaboration model between the social studies teacher and the teacher-librarian was established and implemented well in the classroom. (2) There was a significant difference between the experimental group and the control group in individual and group inquiry-based project reports. Students taught by the collaborating teachers obtained higher scores on both individual and group inquiry-based project reports than those taught separately by the teachers. Students in the experimental group also obtained higher school monthly test scores than those in the control group. Suggestions for teachers’ high-end collaboration and for future research are provided in this paper.
The Disaggregation of Value-Added Test Scores to Assess Learning Outcomes in Economics Courses

Science.gov (United States)

Walstad, William B.; Wagner, Jamie

2016-01-01

This study disaggregates posttest, pretest, and value-added or difference scores in economics into four types of economic learning: positive, retained, negative, and zero. The types are derived from patterns of student responses to individual items on a multiple-choice test. The micro and macro data from the "Test of Understanding in College…
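The record is truncated before it defines the four learning types, so the mapping below is an assumption about how pretest and posttest item responses would be classified: wrong then right is positive, right then right is retained, right then wrong is negative, and wrong then wrong is zero. All names are hypothetical.

```python
from collections import Counter

LEARNING_TYPES = ("positive", "retained", "negative", "zero")


def learning_type(pre_correct, post_correct):
    """Classify one item from its pretest and posttest correctness."""
    if post_correct:
        return "retained" if pre_correct else "positive"
    return "negative" if pre_correct else "zero"


def disaggregate(pre, post):
    """Count the four learning types over one student's item responses."""
    if len(pre) != len(post):
        raise ValueError("pretest and posttest must cover the same items")
    counts = Counter(learning_type(a, b) for a, b in zip(pre, post))
    return {name: counts.get(name, 0) for name in LEARNING_TYPES}
```

Under these assumed definitions the aggregate scores decompose cleanly: the posttest score equals positive plus retained, the pretest score equals retained plus negative, and the value-added (difference) score equals positive minus negative. A difference score near zero can therefore hide substantial positive and negative learning that cancel out.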
The TSCA interagency testing committee's approaches to screening and scoring chemicals and chemical groups: 1977-1983

Energy Technology Data Exchange (ETDEWEB)

Walker, J.D. [Environmental Protection Agency, Washington, DC (United States)

1990-12-31

This paper describes the TSCA interagency testing committee's (ITC) approaches to screening and scoring chemicals and chemical groups between 1977 and 1983. During this time the ITC conducted five scoring exercises to select chemicals and chemical groups for detailed review and to determine which of these chemicals and chemical groups should be added to the TSCA Section 4(e) Priority Testing List. 29 refs., 1 fig., 2 tabs.

Analysis of the Raven CPM Subtest Scores for a Sample of Gifted Children.

Science.gov (United States)

Kluever, Raymond C.; Green, Kathy E.

The inter-subject/intra-subject subtest patterns (profiles) of the same sample of gifted children were examined based on factors found in a previous study of the Raven Coloured Progressive Matrices Test (CPM) that investigated structural properties with specific application to a sample of gifted children. The sample consisted of 166 children (78…
Comparing the MMPI-2 Scale Scores of Parents Involved in Parental Competency and Child Custody Assessments

Science.gov (United States)

Resendes, John; Lecci, Len

2012-01-01

MMPI-2 scores from a parent competency sample (N = 136 parents) are compared with a previously published data set of MMPI-2 scores for child custody litigants (N = 508 parents; Bathurst et al., 1997). Independent samples t tests yielded significant and in some cases substantial differences on the standard MMPI-2 clinical scales (especially Scales…
A Case Study About Why It Can Be Difficult To Test Whether Propensity Score Analysis Works in Field Experiments

Directory of Open Access Journals (Sweden)

William R. Shadish

2013-02-01

Peikes, Moreno and Orzol (2008) sensibly caution researchers that propensity score analysis may not lead to valid causal inference in field applications. But at the same time, they made the far stronger claim to have performed an ideal test of whether propensity score matching in quasi-experimental data is capable of approximating the results of a randomized experiment in their dataset, and that this ideal test showed that such matching could not do so. In this article we show that their study does not support that conclusion because it failed to meet a number of basic criteria for an ideal test. By implication, many other purported tests of the effectiveness of propensity score analysis probably also fail to meet these criteria, and are therefore questionable contributions to the literature on the effects of propensity score analysis. DOI: 10.2458/azu_jmmss.v3i2.16475
The Alcohol Use Disorders Identification Scale (AUDIT) normative scores for a multiracial sample of Rhodes University residence students.

Science.gov (United States)

Young, Charles; Mayson, Tamara

2010-06-01

The objective of this research is to obtain accurate drinking norms for students living in the university residences in preparation for future social norms interventions that would allow individual students to compare their drinking to an appropriate reference group. Random cluster sampling was used to obtain data from 318 residence students who completed the Alcohol Use Disorders Identification Test (AUDIT), a brief, reliable and valid screening measure designed by the World Health Organisation (Babor et al. 2001). The Cronbach alpha coefficient of 0.83 reported for this multicultural sample is high, suggesting that the AUDIT may be reliably used in this and similar contexts. Normative scores are reported in the form of percentiles. Comparisons between the portions of students drinking safely and hazardously according to race and gender indicate that while male students are drinking no more hazardously than female students, white students drink far more hazardously than black students. These differences suggest that both race- and gender-specific norms would be essential for an effective social norms intervention in this multicultural South African context. Finally, the racialised drinking patterns might reflect an informal segregation of social space at Rhodes University.
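The Cronbach alpha coefficient quoted above is a standard internal-consistency statistic. The sketch below uses the usual textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and input layout are hypothetical and are not taken from the study.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a questionnaire.

    item_scores holds one list per item, each containing one score per
    respondent, with respondents in the same order in every list.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("every item needs the same number (>= 2) of respondents")
    item_variance_sum = sum(statistics.variance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

With the toy input `[[1, 2, 3, 4], [2, 1, 4, 3]]` (two items, four respondents) the function returns 0.75. Values closer to 1 indicate that the items vary together more strongly.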
Are students' impressions of improved learning through active learning methods reflected by improved test scores?

Science.gov (United States)

Everly, Marcee C

2013-02-01

To report the transformation from lecture to more active learning methods in a maternity nursing course and to evaluate whether student perception of improved learning through active-learning methods is supported by improved test scores. The process of transforming a course into an active-learning model of teaching is described. A voluntary mid-semester survey for student acceptance of the new teaching method was conducted. Course examination results, from both a standardized exam and a cumulative final exam, among students who received lecture in the classroom and students who had active learning activities in the classroom were compared. Active learning activities were very acceptable to students. The majority of students reported learning more from having active-learning activities in the classroom rather than lecture-only and this belief was supported by improved test scores. Students who had active learning activities in the classroom scored significantly higher on a standardized assessment test than students who received lecture only. The findings support the use of student reflection to evaluate the effectiveness of active-learning methods and help validate the use of student reflection of improved learning in other research projects. Copyright © 2011 Elsevier Ltd. All rights reserved.
Empirical Correlates of Low Scores on MMPI-2/MMPI-2-RF Restructured Clinical Scales in a Sample of University Students

Science.gov (United States)

Avdeyeva, Tatyana V.; Tellegen, Auke; Ben-Porath, Yossef S.

2012-01-01

In the present study, the authors explored the meaning of low scores on the MMPI-2/MMPI-2-RF Restructured Clinical (RC) scales. Using responses of a sample of university students (N = 811), the authors examined whether low (T less than 39), within-normal-limits (T = 39-64), and high (T greater than 65) score levels on the RC scales are…
Changes in Student Populations and Average Test Scores of Dutch Primary Schools

Science.gov (United States)

Luyten, Hans; de Wolf, Inge

2011-01-01

This article focuses on the relation between student population characteristics and average test scores per school in the final grade of primary education from a dynamic perspective. Aggregated data of over 5,000 Dutch primary schools covering a 6-year period were used to study the relation between changes in school populations and shifts in mean…
Effect of Mindfulness Meditation on Perceived Stress Scores and Autonomic Function Tests of Pregnant Indian Women.

Science.gov (United States)

Muthukrishnan, Shobitha; Jain, Reena; Kohli, Sangeeta; Batra, Swaraj

2016-04-01

Various pregnancy complications like hypertension, preeclampsia have been strongly correlated with maternal stress. One of the connecting links between pregnancy complications and maternal stress is mind-body intervention which can be part of Complementary and Alternative Medicine (CAM). Biologic measures of stress during pregnancy may get reduced by such interventions. To evaluate the effect of Mindfulness meditation on perceived stress scores and autonomic function tests of pregnant Indian women. Pregnant Indian women of 12 weeks gestation were randomised to two treatment groups: Test group with Mindfulness meditation and control group with their usual obstetric care. The effect of Mindfulness meditation on perceived stress scores and cardiac sympathetic functions and parasympathetic functions (Heart rate variation with respiration, lying to standing ratio, standing to lying ratio and respiratory rate) were evaluated on pregnant Indian women. There was a significant decrease in perceived stress scores, a significant decrease of blood pressure response to cold pressor test and a significant increase in heart rate variability in the test group (pwomen. The results of this study suggest that mindfulness meditation improves parasympathetic functions in pregnant women and is a powerful modulator of the sympathetic nervous system during pregnancy.
The Impact of the 2004 Hurricanes on Florida Comprehensive Assessment Test Scores: Implications for School Counselors

Science.gov (United States)

Baggerly, Jennifer; Ferretti, Larissa K.

2008-01-01

What is the impact of natural disasters on students' statewide assessment scores? To answer this question, Florida Comprehensive Assessment Test (FCAT) scores of 55,881 students in grades 4 through 10 were analyzed to determine if there were significant decreases after the 2004 hurricanes. Results reveal that there was statistical but no practical…
Lower Quarter Y-Balance Test Scores and Lower Extremity Injury in NCAA Division I Athletes.

Science.gov (United States)

Lai, Wilson C; Wang, Dean; Chen, James B; Vail, Jeremy; Rugg, Caitlin M; Hame, Sharon L

2017-08-01

Functional movement tests that are predictive of injury risk in National Collegiate Athletic Association (NCAA) athletes are useful tools for sports medicine professionals. The Lower Quarter Y-Balance Test (YBT-LQ) measures single-leg balance and reach distances in 3 directions. To assess whether the YBT-LQ predicts the laterality and risk of sports-related lower extremity (LE) injury in NCAA athletes. Case-control study; Level of evidence, 3. The YBT-LQ was administered to 294 NCAA Division I athletes from 21 sports during preparticipation physical examinations at a single institution. Athletes were followed prospectively over the course of the corresponding season. Correlation analysis was performed between the laterality of reach asymmetry and composite scores (CS) versus the laterality of injury. Receiver operating characteristic (ROC) analysis was used to determine the optimal asymmetry cutoff score for YBT-LQ. A multivariate regression analysis adjusting for sex, sport type, body mass index, and history of prior LE surgery was performed to assess predictors of earlier and higher rates of injury. Neither the laterality of reach asymmetry nor the CS correlated with the laterality of injury. ROC analysis found optimal cutoff scores of 2, 9, and 3 cm for anterior, posteromedial, and posterolateral reach, respectively. All of these potential cutoff scores, along with a cutoff score of 4 cm used in the majority of prior studies, were associated with poor sensitivity and specificity. Furthermore, none of the asymmetric cutoff scores were associated with earlier or increased rate of injury in the multivariate analyses. YBT-LQ scores alone do not predict LE injury in this collegiate athlete population. Sports medicine professionals should be cautioned against using the YBT-LQ alone to screen for injury risk in collegiate athletes.
Operability test procedure for PFP wastewater sampling facility

International Nuclear Information System (INIS)

Hirzel, D.R.

1995-01-01

Document provides instructions for performing the Operability Test of the 225-WC Wastewater Sampling Station which monitors the discharge to the Treated Effluent Disposal Facility from the Plutonium Finishing Plant. This Operability Test Procedure (OTP) has been prepared to verify correct configuration and performance of the PFP Wastewater sampling system installed in Building 225-WC located outside the perimeter fence southeast of the Plutonium Finishing Plant (PFP). The objective of this test is to ensure the equipment in the sampling facility operates in a safe and reliable manner. The sampler consists of two Manning Model S-5000 units which are rate controlled by the Milltronics Ultrasonic flowmeter at manhole No.C4 and from a pH measuring system with the sensor in the stream adjacent to the sample point. The intent of the dual sampling system is to utilize one unit to sample continuously at a rate proportional to the wastewater flow rate so that the aggregate tests are related to the overall flow and thereby eliminate isolated analyses. The second unit will only operate during a high or low pH excursion of the stream (hence the need for a pH control). The major items in this OTP include testing of the Manning Sampler System and associated equipment including the pH measuring and control system, the conductivity monitor, and the flow meter
Mixing and sampling tests for Radiochemical Plant

International Nuclear Information System (INIS)

Ehinger, M.N.; Marfin, H.R.; Hunt, B.

1999-01-01

The paper describes results and test procedures used to evaluate uncertainty and bias effects introduced by the sampler systems of a radiochemical plant, and similar parameters associated with mixing. This report will concentrate on experiences at the Barnwell Nuclear Fuels Plant. Mixing and sampling tests can be conducted to establish the statistical parameters for those activities related to overall measurement uncertainties. Density measurements by state-of-the-art, commercially available equipment are the key to conducting those tests. Experience in the U.S. suggests the statistical contribution of mixing and sampling can be controlled to less than 0.01 % and, with new equipment and new tests in operating facilities, might be controlled to better accuracy.
Depressive status explains a significant amount of the variance in COPD assessment test (CAT) scores.

Science.gov (United States)

Miravitlles, Marc; Molina, Jesús; Quintano, José Antonio; Campuzano, Anna; Pérez, Joselín; Roncero, Carlos

2018-01-01

COPD assessment test (CAT) is a short, easy-to-complete health status tool that has been incorporated into the multidimensional assessment of COPD in order to guide therapy; therefore, it is important to understand the factors determining CAT scores. This is a post hoc analysis of a cross-sectional, observational study conducted in respiratory medicine departments and primary care centers in Spain with the aim of identifying the factors determining CAT scores, focusing particularly on the cognitive status measured by the Mini-Mental State Examination (MMSE) and levels of depression measured by the short Beck Depression Inventory (BDI). A total of 684 COPD patients were analyzed; 84.1% were men, the mean age of patients was 68.7 years, and the mean forced expiratory volume in 1 second (%) was 55.1%. Mean CAT score was 21.8. CAT scores correlated with the MMSE score (Pearson's coefficient r =-0.371) and the BDI ( r =0.620), both p CAT scores and explained 45% of the variability. However, a model including only MMSE and BDI scores explained up to 40% and BDI alone explained 38% of the CAT variance. CAT scores are associated with clinical variables of severity of COPD. However, cognitive status and, in particular, the level of depression explain a larger percentage of the variance in the CAT scores than the usual COPD clinical severity variables.
Bootstrap Score Tests for Fractional Integration in Heteroskedastic ARFIMA Models, with an Application to Price Dynamics in Commodity Spot and Futures Markets

DEFF Research Database (Denmark)

Cavaliere, Giuseppe; Nielsen, Morten Ørregaard; Taylor, A.M. Robert

Empirical evidence from time series methods which assume the usual I(0)/I(1) paradigm suggests that the efficient market hypothesis, stating that spot and futures prices of a commodity should cointegrate with a unit slope on futures prices, does not hold. However, these statistical methods...... fractionally integrated model we are able to find a body of evidence in support of the efficient market hypothesis for a number of commodities. Our new tests are wild bootstrap implementations of score-based tests for the order of integration of a fractionally integrated time series. These tests are designed...... principle do. A Monte Carlo simulation study demonstrates that very significant improvements in finite sample behaviour can be obtained by the bootstrap vis-à-vis the corresponding asymptotic tests in both heteroskedastic and homoskedastic environments....
Advantages of micronuclei analysis through images autocapturing and screen scoring

International Nuclear Information System (INIS)

González, J.E.; Martínez-López, W.

2015-01-01

The cytokinesis-block micronucleus (CBMN) test is a quantitative assay for genetic toxicity assessment. One of the advantages of the MN assay is its amenability to automation. Different types of cells have been used to evaluate genetic damage through the MN assay, such as human lymphocytes and rodent cell lines (i.e. CHO, V79, CHL and L5178Y). MN quantification is a time-consuming process and several efforts have been made toward its automation. Some of them include an operator checking step, like the PathFinder CellScan System, or are fully automated, such as MNScore from MetaSystems. Usually, fully automated systems detect two or three times fewer MN than visual scoring. In some cases, the impact of false positive detection is reduced with a visual detection step. In the present work we have tested a combination of image autocapturing of CHOK1 cells previously treated with bleomycin (0, 2.5, 5.0 and 10.0 μg/ml) or UVC (0, 4, 8 and 16 J/m²) with screen scoring. Capturing images using the AutoCapture option of Metafer 4 from MetaSystems (GmbH, Germany) plus screen scoring renders similar results in terms of MN cell frequency to live microscopic scoring. The resultant bias from the Bland–Altman analysis was -1.1% with confidence intervals between -2.2% and -0.1%, indicating an acceptable agreement between both MN scoring methods. However, the mean time devoted to live microscope scoring per sample was 159 minutes compared to 39 minutes for microscope image autocapturing and screen scoring. Therefore, it becomes advantageous to combine autocapturing of microscope images with screen scoring when many samples have to be analyzed for radiological biodosimetry purposes. (authors)
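A Bland–Altman comparison of two scoring methods reduces to simple statistics on the paired differences. The sketch below is illustrative only: the function name is hypothetical, and the 1.96 multiplier is a normal approximation that the abstract does not confirm the authors used.

```python
import math
import statistics


def bland_altman(method_a, method_b):
    """Bias, 95% interval for the bias, and limits of agreement.

    method_a and method_b are paired measurements of the same samples,
    for example MN frequencies from screen scoring and live scoring.
    """
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    bias = statistics.mean(diffs)   # mean difference between the methods
    sd = statistics.stdev(diffs)    # sample standard deviation of differences
    half_ci = 1.96 * sd / math.sqrt(n)
    return {
        "bias": bias,
        "bias_ci": (bias - half_ci, bias + half_ci),
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
    }
```

The interval for the bias narrows as more samples are compared, while the limits of agreement describe how far individual paired measurements are expected to differ.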
40 CFR 205.171-3 - Test motorcycle sample selection.

Science.gov (United States)

2010-07-01

... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Test motorcycle sample selection. 205... ABATEMENT PROGRAMS TRANSPORTATION EQUIPMENT NOISE EMISSION CONTROLS Motorcycle Exhaust Systems § 205.171-3 Test motorcycle sample selection. A test motorcycle to be used for selective enforcement audit testing...
Sequential Neighborhood Effects: The Effect of Long-Term Exposure to Concentrated Disadvantage on Children's Reading and Math Test Scores.

Science.gov (United States)

Hicks, Andrew L; Handcock, Mark S; Sastry, Narayan; Pebley, Anne R

2018-02-01

Prior research has suggested that children living in a disadvantaged neighborhood have lower achievement test scores, but these studies typically have not estimated causal effects that account for neighborhood choice. Recent studies used propensity score methods to account for the endogeneity of neighborhood exposures, comparing disadvantaged and nondisadvantaged neighborhoods. We develop an alternative propensity function approach in which cumulative neighborhood effects are modeled as a continuous treatment variable. This approach offers several advantages. We use our approach to examine the cumulative effects of neighborhood disadvantage on reading and math test scores in Los Angeles. Our substantive results indicate that recency of exposure to disadvantaged neighborhoods may be more important than average exposure for children's test scores. We conclude that studies of child development should consider both average cumulative neighborhood exposure and the timing of this exposure.
Using College Admission Test Scores to Clarify High School Placement. Leading Indicator Spotlight

Science.gov (United States)

Flug, Susanna

2010-01-01

In "Beyond Test Scores: Leading Indicators for Education," Foley and colleagues (2008) define leading indicators as those that "provide early signals of progress toward academic achievement" (p. 1) and stress that educators "need leading indicators to help them see the direction their efforts are going in and to take…
Differences of wells scores accuracy, caprini scores and padua scores in deep vein thrombosis diagnosis

Science.gov (United States)

Gatot, D.; Mardia, A. I.

2018-03-01

Deep vein thrombosis (DVT) is a venous thrombus in the lower limbs. It is diagnosed by venography or compression ultrasound. However, these examinations are not yet available in some health facilities. Therefore, many scoring systems have been developed for the diagnosis of DVT. The scoring method is practical and safe to use, in addition to being efficacious and effective in terms of treatment and costs. The existing scoring systems are the Wells, Caprini and Padua scores. Many studies have compared the accuracy of these scores, but none in Medan. Therefore, we were interested in a comparative study of the Wells, Caprini and Padua scores in Medan. An observational, analytical, case-control study was conducted to perform diagnostic tests on the Wells, Caprini and Padua scores to predict the risk of DVT. The study took place at H. Adam Malik Hospital in Medan. Of a total of 72 subjects, 39 (54.2%) were men, and the mean age was 53.14 years. The Wells, Caprini and Padua scores had sensitivities of 80.6%, 61.1% and 50%, specificities of 80.6%, 66.7% and 75%, and accuracies of 87.5%, 64.3% and 65.7%, respectively. The Wells score has better sensitivity, specificity and accuracy than the Caprini and Padua scores in diagnosing DVT.
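The three measures compared in this abstract follow directly from a 2 × 2 table of score result against reference diagnosis. A minimal sketch is given below; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy from 2x2 diagnostic counts."""
    sensitivity = tp / (tp + fn)   # diseased subjects correctly flagged
    specificity = tn / (tn + fp)   # non-diseased subjects correctly cleared
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 50 subjects with DVT, 50 without.
sens, spec, acc = diagnostic_metrics(tp=40, fn=10, fp=5, tn=45)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}, accuracy={acc:.1%}")
```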
Test plan for core sampling drill bit temperature monitor

International Nuclear Information System (INIS)

Francis, P.M.

1994-01-01

At WHC, one of the functions of the Tank Waste Remediation System division is sampling waste tanks to characterize their contents. The push-mode core sampling truck is currently used to take samples of liquid and sludge. Sampling of tanks containing hard salt cake is to be performed with the rotary-mode core sampling system, consisting of the core sample truck, mobile exhauster unit, and ancillary subsystems. When drilling through the salt cake material, friction and heat can be generated in the drill bit. Based upon tank safety reviews, it has been determined that the drill bit temperature must not exceed 180 °C, due to the potential reactivity of tank contents at this temperature. Consequently, a drill bit temperature limit of 150 °C was established for operation of the core sample truck to provide an adequate margin of safety. The margin is this large because of unpredictable factors such as localized heating. The most desirable safeguard against exceeding this threshold is bit temperature monitoring. This document describes the recommended plan for testing the prototype of a drill bit temperature monitor developed for core sampling by Sandia National Labs. The device will be tested at their facilities. This test plan documents the tests that Westinghouse Hanford Company considers necessary for effective testing of the system.

A high COPD assessment test score may predict anxiety in COPD

Directory of Open Access Journals (Sweden)

Harryanto H

2018-03-01

Hilman Harryanto (1), Sally Burrows (2), Yuben Moodley (1, 2). (1) Department of Respiratory Medicine, Fiona Stanley Hospital, Perth, WA, Australia; (2) Faculty of Health and Medical Sciences, Medical School, University of Western Australia, Perth, WA, Australia. The prevalence of anxiety is 55% in patients with COPD [1], and it is associated with worse disease control. Therefore, early recognition and institution of treatment of this comorbidity significantly improve patients' quality of life. Recently, a questionnaire called the COPD assessment test (CAT) has been incorporated into the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines for the management of COPD, and a higher score is associated with increased COPD symptoms [2]. Considering the regular use of CAT, it was evaluated whether this tool can also be used to identify anxiety. The CAT score was correlated with the Hospital Anxiety and Depression Scale (HADS) to determine the level at which CAT may predict anxiety.
Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography.

Science.gov (United States)

Mallett, Susan; Halligan, Steve; Collins, Gary S; Altman, Doug G

2014-01-01

Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
Estimation of sample size and testing power (part 5).

Science.gov (United States)

Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

2012-02-01

Estimating sample size and testing power is an important component of research design. This article introduces methods for estimating sample size and testing power for difference tests on quantitative and qualitative data under the single-group, paired, and crossover designs. Specifically, it presents the formulas for these three designs, shows how to implement them both directly and with the POWER procedure of SAS software, and illustrates them with examples, helping researchers implement the repetition principle.
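To make the idea of a sample size formula for the paired design concrete, the sketch below uses the common normal-approximation formula n = ((z_{1-α/2} + z_{power}) · σ_d / δ)², where δ is the mean paired difference to detect and σ_d the standard deviation of the differences. This is a textbook approximation chosen for illustration, not necessarily the formula presented in the article, and the function name and input values are hypothetical; exact t-based procedures such as SAS PROC POWER typically give a slightly larger n.

```python
from math import ceil
from statistics import NormalDist

def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.80):
    """Approximate number of pairs for a two-sided paired difference test.

    delta   -- mean paired difference to detect (assumed input)
    sd_diff -- standard deviation of the paired differences (assumed input)
    """
    z = NormalDist().inv_cdf  # standard normal quantile function
    n = ((z(1 - alpha / 2) + z(power)) * sd_diff / delta) ** 2
    return ceil(n)

# Hypothetical planning values: detect a mean difference of 5 with SD 10.
print(paired_sample_size(delta=5, sd_diff=10))              # 80% power
print(paired_sample_size(delta=5, sd_diff=10, power=0.90))  # 90% power
```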
Comprehensive School Reform and Standardized Test Scores in Illinois Elementary and Middle Schools

Science.gov (United States)

McEnroe, James D.

2010-01-01

The study examined the effects of the federally funded Comprehensive School Reform (CSR) program on student performance on mandated standardized tests. The study focused on the mathematics and reading scores of Illinois public elementary and middle and junior high school students. The federal CSR program provided Illinois schools with an annual…
Testing of Small Graphite Samples for Nuclear Qualification

Energy Technology Data Exchange (ETDEWEB)

Julie Chapman

2010-11-01

Accurately determining the mechanical properties of small irradiated samples is crucial to predicting the behavior of the overall irradiated graphite components within a Very High Temperature Reactor. The sample size allowed in a material test reactor, however, is limited, and this poses some difficulties with respect to mechanical testing. In the case of graphite with a larger grain size, a small sample may exhibit characteristics not representative of the bulk material, leading to inaccuracies in the data. A study to determine a potential size effect on the tensile strength was pursued under the Next Generation Nuclear Plant program. It focused first on optimizing the tensile testing procedure identified in the American Society for Testing and Materials (ASTM) Standard C 781-08. Once the testing procedure was verified, a size effect was assessed by gradually reducing the diameter of the specimens. By monitoring the material response, a size effect was successfully identified.
Specific algorithm method of scoring the Clock Drawing Test applied in cognitively normal elderly

Directory of Open Access Journals (Sweden)

Liana Chaves Mendes-Santos

The Clock Drawing Test (CDT) is an inexpensive, fast and easily administered measure of cognitive function, especially in the elderly. This instrument is a popular clinical tool widely used in screening for cognitive disorders and dementia. The CDT can be applied in different ways and scoring procedures also vary. OBJECTIVE: The aims of this study were to analyze the performance of the elderly on the CDT and evaluate inter-rater reliability of the CDT scored by using a specific algorithm method adapted from Sunderland et al. (1989). METHODS: We analyzed the CDT of 100 cognitively normal elderly aged 60 years or older. The CDT ("free-drawn") and Mini-Mental State Examination (MMSE) were administered to all participants. Six independent examiners scored the CDT of 30 participants to evaluate inter-rater reliability. RESULTS AND CONCLUSION: A score of 5 on the proposed algorithm ("Numbers in reverse order or concentrated"), equivalent to 5 points on the original Sunderland scale, was the most frequent (53.5%). The CDT specific algorithm method used had high inter-rater reliability (p<0.01), and the mean score ranged from 5.06 to 5.96. The high frequency of an overall score of 5 points may suggest the need to create more nuanced evaluation criteria, which are sensitive to differences in levels of impairment in visuoconstructive and executive abilities during aging.
Diagnosing unilateral primary aldosteronism - comparison of a clinical prediction score, computed tomography and adrenal venous sampling.

Science.gov (United States)

Sze, W C Candy; Soh, Lip Min; Lau, Jeshen H; Reznek, Rodney; Sahdev, Anju; Matson, Matthew; Riddoch, Fiona; Carpenter, Robert; Berney, Dan; Grossman, Ashley B; Chew, Shern L; Akker, Scott A; Druce, Maralyn R; Waterhouse, Mona; Monson, John P; Drake, William M

2014-07-01

In patients with primary aldosteronism (PA), adrenalectomy is potentially curative for those correctly identified as having unilateral excessive aldosterone production. It has been suggested that a recently developed and published clinical prediction score (CPS) may correctly identify some patients as having unilateral disease, without recourse to adrenal venous sampling. We have applied the CPS to a large cohort of PA patients with defined and documented outcomes. We also incorporated a minor modification to the CPS and a radiological grading score (RGS) into our analysis to assess whether its performance could be augmented. A total of 75 patients with a robust diagnosis following bilateral adrenal venous cannulation and/or strictly defined surgical outcome were analysed. Applying the CPS to this group of patients produced a sensitivity of 38·8% and a specificity of 88·5% of correctly identifying unilateral aldosterone production. Using a suggested modification to the CPS, in which different levels of hypokalaemia were given different weightings, the sensitivity rose to 40·8%, with an identical specificity. Using the RGS alone improved sensitivity to 91·7%, but specificity was reduced to 62·5%. Applying the recently developed CPS to this cohort of patients, it was not possible to reproduce the 100% specificity reported in the original publication. Using the modified score or incorporating the RGS did not improve its performance. In this cohort, we were unable to show superiority of the CPS over an imaging-based strategy. CPS may have a role in guiding clinical decision-making, especially in those whose adrenal venous sampling (AVS) has been unsuccessful. © 2013 John Wiley & Sons Ltd.
Examining the reliability of ADAS-Cog change scores.

Science.gov (United States)

Grochowalski, Joseph H; Liu, Ying; Siedlecki, Karen L

2016-09-01

The purpose of this study was to estimate and examine ways to improve the reliability of change scores on the Alzheimer's Disease Assessment Scale, Cognitive Subtest (ADAS-Cog). The sample, provided by the Alzheimer's Disease Neuroimaging Initiative, included individuals with Alzheimer's disease (AD) (n = 153) and individuals with mild cognitive impairment (MCI) (n = 352). All participants were administered the ADAS-Cog at baseline and 1 year, and change scores were calculated as the difference in scores over the 1-year period. Three types of change score reliabilities were estimated using multivariate generalizability. Two methods to increase change score reliability were evaluated: reweighting the subtests of the scale and adding more subtests. Reliability of ADAS-Cog change scores over 1 year was low for both the AD sample (ranging from .53 to .64) and the MCI sample (.39 to .61). Reweighting the change scores from the AD sample improved reliability (.68 to .76), but lengthening provided no useful improvement for either sample. The MCI change scores had low reliability, even with reweighting and adding additional subtests. The ADAS-Cog scores had low reliability for measuring change. Researchers using the ADAS-Cog should estimate and report reliability for their use of the change scores. The ADAS-Cog change scores are not recommended for assessment of meaningful clinical change.
Distribution of Total Depressive Symptoms Scores and Each Depressive Symptom Item in a Sample of Japanese Employees.

Science.gov (United States)

Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Yamada, Hiroshi; Miyake, Hirotsugu; Furukawa, Toshiaki A

2016-01-01

In a previous study, we reported that the distribution of total depressive symptoms scores according to the Center for Epidemiologic Studies Depression Scale (CES-D) in a general population is stable throughout middle adulthood and follows an exponential pattern except for at the lowest end of the symptom score. Furthermore, the individual distributions of 16 negative symptom items of the CES-D exhibit a common mathematical pattern. To confirm the reproducibility of these findings, we investigated the distribution of total depressive symptoms scores and 16 negative symptom items in a sample of Japanese employees. We analyzed 7624 employees aged 20-59 years who had participated in the Northern Japan Occupational Health Promotion Centers Collaboration Study for Mental Health. Depressive symptoms were assessed using the CES-D. The CES-D contains 20 items, each of which is scored in four grades: "rarely," "some," "much," and "most of the time." The descriptive statistics and frequency curves of the distributions were then compared according to age group. The distribution of total depressive symptoms scores appeared to be stable from 30-59 years. The right tail of the distribution for ages 30-59 years exhibited a linear pattern with a log-normal scale. The distributions of the 16 individual negative symptom items of the CES-D exhibited a common mathematical pattern which displayed different distributions with a boundary at "some." The distributions of the 16 negative symptom items from "some" to "most" followed a linear pattern with a log-normal scale. The distributions of the total depressive symptoms scores and individual negative symptom items in a Japanese occupational setting show the same patterns as those observed in a general population. These results show that the specific mathematical patterns of the distributions of total depressive symptoms scores and individual negative symptom items can be reproduced in an occupational population.
Percentiles of the null distribution of 2 maximum lod score tests.

Science.gov (United States)

Ulgen, Ayse; Yoo, Yun Joo; Gordon, Derek; Finch, Stephen J; Mendell, Nancy R

2004-01-01

We here consider the null distribution of the maximum lod score (LOD-M) obtained upon maximizing over transmission model parameters (penetrance values, dominance, and allele frequency) as well as the recombination fraction. Also considered is the lod score maximized over a fixed choice of genetic model parameters and recombination-fraction values set prior to the analysis (MMLS) as proposed by Hodge et al. The objective is to fit parametric distributions to MMLS and LOD-M. Our results are based on 3,600 simulations of samples of n = 100 nuclear families ascertained for having one affected member and at least one other sibling available for linkage analysis. Each null distribution is approximately a mixture pχ²(0) + (1 - p)χ²(v). The values of MMLS appear to fit the mixture 0.20χ²(0) + 0.80χ²(1.6). The mixture distribution 0.13χ²(0) + 0.87χ²(2.8) appears to describe the null distribution of LOD-M. From these results we derive a simple method for obtaining critical values of LOD-M and MMLS. Copyright 2004 S. Karger AG, Basel
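One way to read the "simple method for obtaining critical values" is as a tail calculation on the fitted mixture: because the χ²(0) component is a point mass at zero, P(X > c) = (1 - p) · P(χ²(v) > c) for c > 0. The sketch below solves this for c by bisection. It is an illustration under that assumption, not the authors' published procedure; the function names are hypothetical, and the result is on the chi-square scale (any conversion to lod units is not shown).

```python
from math import exp, lgamma, log

def chi2_cdf(x, df):
    """Chi-square CDF for possibly non-integer df, using the power series
    for the regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= t / (a + n)
        total += term
    return total * exp(a * log(t) - t - lgamma(a))

def mixture_critical_value(p_zero, df, alpha):
    """Value c with P(X > c) = alpha when X ~ p_zero*chi2(0) + (1-p_zero)*chi2(df)."""
    if not 0.0 <= p_zero < 1.0 or not 0.0 < alpha < 1.0:
        raise ValueError("need 0 <= p_zero < 1 and 0 < alpha < 1")
    if alpha >= 1.0 - p_zero:
        return 0.0  # the point mass at zero already covers 1 - alpha
    target = 1.0 - alpha / (1.0 - p_zero)
    lo, hi = 0.0, 200.0  # assumed bracket; wide enough for moderate alpha
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Mixture weights and degrees of freedom as quoted in the abstract.
c_mmls = mixture_critical_value(0.20, 1.6, 0.05)
c_lodm = mixture_critical_value(0.13, 2.8, 0.05)
print(c_mmls, c_lodm)
```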
Using Automated Essay Scores as an Anchor When Equating Constructed Response Writing Tests

Science.gov (United States)

Almond, Russell G.

2014-01-01

Assessments consisting of only a few extended constructed response items (essays) are not typically equated using anchor test designs as there are typically too few essay prompts in each form to allow for meaningful equating. This article explores the idea that output from an automated scoring program designed to measure writing fluency (a common…
A One-Sample Test for Normality with Kernel Methods

OpenAIRE

Kellner, Jérémie; Celisse, Alain

2015-01-01

We propose a new one-sample test for normality in a Reproducing Kernel Hilbert Space (RKHS). Namely, we test the null-hypothesis of belonging to a given family of Gaussian distributions. Hence our procedure may be applied either to test data for normality or to test parameters (mean and covariance) if data are assumed Gaussian. Our test is based on the same principle as the MMD (Maximum Mean Discrepancy) which is usually used for two-sample tests such as homogeneity or independence testing. O...
Implications of Deployed and Nondeployed Fathers on Seventh Graders' California Achievement Test Scores during a Military Crisis.

Science.gov (United States)

Pisano, Mark C.

The differences in California Achievement Test (CAT) scores from 1990 to 1991 in seventh graders, currently enrolled in Albritton Junior High School in the Fort Bragg Schools, of deployed and nondeployed fathers were analyzed. CAT percentile scores from 1990 and 1991 (1991 being the year of "Desert Storm") were obtained in reading, math…
40 CFR 205.160-2 - Test sample selection and preparation.

Science.gov (United States)

2010-07-01

... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Test sample selection and preparation... sample selection and preparation. (a) Vehicles comprising the sample which are required to be tested... maintained in any manner unless such preparation, tests, modifications, adjustments or maintenance are part...
How Well Does the Sum Score Summarize the Test? Summability as a Measure of Internal Consistency

NARCIS (Netherlands)

Goeman, J.J.; De Jong, N.H.

2018-01-01

Many researchers use Cronbach's alpha to demonstrate internal consistency, even though it has been shown numerous times that Cronbach's alpha is not suitable for this. Because the intention of questionnaire and test constructers is to summarize the test by its overall sum score, we advocate
Zertifikat Deutsch als Fremdsprache and the Oral Proficiency Interview: A Comparison of Test Scores and Examinations.

Science.gov (United States)

Lalande, John F.; Schweckendiek, Jurgen

1986-01-01

Investigates what correlations might exist between an individual's score on the Zertifikat Deutsch als Fremdsprache and on the Oral Proficiency Interview. The tests themselves are briefly described. Results indicate that the two tests appear to correlate well in their evaluation of speaking skills. (SED)
Rey's Auditory Verbal Learning Test scores can be predicted from whole brain MRI in Alzheimer's disease

Directory of Open Access Journals (Sweden)

Elaheh Moradi

2017-01-01

Rey's Auditory Verbal Learning Test (RAVLT) is a powerful neuropsychological tool for testing episodic memory, which is widely used for cognitive assessment in dementia and pre-dementia conditions. Several studies have shown that an impairment in RAVLT scores reflects well the underlying pathology caused by Alzheimer's disease (AD), thus making RAVLT an effective early marker to detect AD in persons with memory complaints. We investigated the association between RAVLT scores (RAVLT Immediate and RAVLT Percent Forgetting) and the structural brain atrophy caused by AD. The aim was to comprehensively study to what extent the RAVLT scores are predictable based on structural magnetic resonance imaging (MRI) data using machine learning approaches, as well as to find the most important brain regions for the estimation of RAVLT scores. For this, we built a predictive model to estimate RAVLT scores from gray matter density via an elastic net penalized linear regression model. The proposed approach provided highly significant cross-validated correlation between the estimated and observed RAVLT Immediate (R = 0.50) and RAVLT Percent Forgetting (R = 0.43) in a dataset consisting of 806 AD, mild cognitive impairment (MCI) or healthy subjects. In addition, the selected machine learning method provided more accurate estimates of RAVLT scores than the relevance vector regression used earlier for the estimation of RAVLT based on MRI data. The top predictors were medial temporal lobe structures and amygdala for the estimation of RAVLT Immediate, and angular gyrus, hippocampus and amygdala for the estimation of RAVLT Percent Forgetting. Further, the conversion of MCI subjects to AD within 3 years could be predicted based on either observed or estimated RAVLT scores with an accuracy comparable to MRI-based biomarkers.
Clinical score and rapid antigen detection test to guide antibiotic use for sore throats: randomised controlled trial of PRISM (primary care streptococcal management).

Science.gov (United States)

Little, Paul; Hobbs, F D Richard; Moore, Michael; Mant, David; Williamson, Ian; McNulty, Cliodna; Cheng, Ying Edith; Leydon, Geraldine; McManus, Richard; Kelly, Joanne; Barnett, Jane; Glasziou, Paul; Mullee, Mark

2013-10-10

To determine the effect of clinical scores that predict streptococcal infection or rapid streptococcal antigen detection tests compared with delayed antibiotic prescribing. Open adaptive pragmatic parallel group randomised controlled trial. Primary care in United Kingdom. Patients aged ≥ 3 with acute sore throat. An internet programme randomised patients to targeted antibiotic use according to: delayed antibiotics (the comparator group for analyses), clinical score, or antigen test used according to clinical score. During the trial a preliminary streptococcal score (score 1, n=1129) was replaced by a more consistent score (score 2, n=631; features: fever during previous 24 hours; purulence; attends rapidly (within three days after onset of symptoms); inflamed tonsils; no cough/coryza (acronym FeverPAIN). Symptom severity reported by patients on a 7 point Likert scale (mean severity of sore throat/difficulty swallowing for days two to four after the consultation (primary outcome)), duration of symptoms, use of antibiotics. For score 1 there were no significant differences between groups. For score 2, symptom severity was documented in 80% (168/207 (81%) in delayed antibiotics group; 168/211 (80%) in clinical score group; 166/213 (78%) in antigen test group). Reported severity of symptoms was lower in the clinical score group (-0.33, 95% confidence interval -0.64 to -0.02; P=0.04), equivalent to one in three rating sore throat a slight versus moderate problem, with a similar reduction for the antigen test group (-0.30, -0.61 to -0.00; P=0.05). Symptoms rated moderately bad or worse resolved significantly faster in the clinical score group (hazard ratio 1.30, 95% confidence interval 1.03 to 1.63) but not the antigen test group (1.11, 0.88 to 1.40). In the delayed antibiotics group, 75/164 (46%) used antibiotics. Use of antibiotics in the clinical score group (60/161) was 29% lower (adjusted risk ratio 0.71, 95% confidence interval 0.50 to 0.95; P=0.02) and in the
Towards reporting standards for neuropsychological study results: A proposal to minimize communication errors with standardized qualitative descriptors for normalized test scores.

Science.gov (United States)

Schoenberg, Mike R; Rum, Ruba S

2017-11-01

Rapid, clear and efficient communication of neuropsychological results is essential to benefit patient care. Errors in communication are a leading cause of medical errors; nevertheless, there remains a lack of consistency in how neuropsychological scores are communicated. A major limitation in the communication of neuropsychological results is the inconsistent use of qualitative descriptors for standardized test scores and the use of vague terminology. A PubMed search from 1 Jan 2007 to 1 Aug 2016 was conducted to identify guidelines or consensus statements for the description and reporting of qualitative terms to communicate neuropsychological test scores. The review found the use of confusing and overlapping terms to describe various ranges of percentile standardized test scores. In response, we propose a simplified set of qualitative descriptors for normalized test scores (Q-Simple) as a means to reduce errors in communicating test results. The Q-Simple qualitative terms are: 'very superior', 'superior', 'high average', 'average', 'low average', 'borderline' and 'abnormal/impaired'. A case example illustrates the proposed Q-Simple qualitative classification system to communicate neuropsychological results for neurosurgical planning. The Q-Simple qualitative descriptor system is intended as a means to improve and standardize communication of standardized neuropsychological test scores. Research is needed to further evaluate neuropsychological communication errors. Conveying the clinical implications of neuropsychological results in a manner that minimizes risk for communication errors is a quintessential component of evidence-based practice. Copyright © 2017 Elsevier B.V. All rights reserved.
30 CFR 14.5 - Test samples.

Science.gov (United States)

2010-07-01

... MINING PRODUCTS REQUIREMENTS FOR THE APPROVAL OF FLAME-RESISTANT CONVEYOR BELTS General Provisions § 14.5 Test samples. Upon request by MSHA, the applicant must submit 3 precut, unrolled, flat conveyor belt...

Relationship between substances in seminal plasma and Acrobeads Test results.

Science.gov (United States)

Komori, Kazuhiko; Tsujimura, Akira; Okamoto, Yoshio; Matsuoka, Yasuhiro; Takao, Tetsuya; Miyagawa, Yasushi; Takada, Shingo; Nonomura, Norio; Okuyama, Akihiko

2009-01-01

To assess the effects of seminal plasma on sperm function. Retrospective case-control study. University hospital. One hundred fourteen infertile men. Acrobeads Test scores (0-4) and measurement of interleukin (IL)-6, soluble IL-6 receptor, epidermal growth factor, insulin-like growth factor-I (IGF-I), transforming growth factor-beta I, superoxide dismutase, calcitonin, and macrophage migration inhibitory factor (MIF) levels in seminal plasma. Kruskal-Wallis test to compare the concentrations of substances as a nonparametric test for differences among Acrobeads Test scores and a multivariable logistic regression model to find independent risk factors associated with abnormal Acrobeads Test results. The Acrobeads Test score was 0 for 7 samples, 1 for 20 samples, 2 for 18 samples, 3 for 28 samples, and 4 for 41 samples. Age, abstinence period, and semen parameters, except for sperm motility and percentage of sperm with abnormal morphology, had no effect on the Acrobeads Test results. Concentrations of IGF-I and MIF were significantly higher in patients with abnormal Acrobeads Test results. Multivariate analysis indicated that MIF and IGF-I were significantly associated with abnormal Acrobeads Test results (scores 0 to 1). Although further studies are needed, IGF-I and MIF in seminal plasma may have negative effects on sperm function.
Forward selection two sample binomial test

Science.gov (United States)

Wong, Kam-Fai; Wong, Weng-Kee; Lin, Miao-Shan

2016-01-01

Fisher’s exact test (FET) is a conditional method that is frequently used to analyze data in a 2 × 2 table for small samples. This test is conservative and attempts have been made to modify the test to make it less conservative. For example, Crans and Shuster (2008) proposed adding more points in the rejection region to make the test more powerful. We provide another way to modify the test to make it less conservative by using two independent binomial distributions as the reference distribution for the test statistic. We compare our new test with several methods and show that our test has advantages over existing methods in terms of control of the type 1 and type 2 errors. We reanalyze results from an oncology trial using our proposed method and our software which is freely available to the reader. PMID:27335577
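For readers unfamiliar with the baseline being modified, the sketch below implements the standard two-sided Fisher's exact test for a 2 × 2 table by summing hypergeometric probabilities no larger than that of the observed table. It is the conventional conditional test, not the authors' modified procedure or their software, and the function name and example table are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as extreme (probability <= observed, with a
    # small relative tolerance to absorb floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical small-sample table: 3 of 4 responders vs 1 of 4.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

Because the test conditions on both margins, only a handful of tables are possible in small samples, which is the source of the conservativeness the abstract describes.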
Evaluation of the Discrepancy between the European Pharmacopoeia Test and an Adopted United States Pharmacopoeia Test Regarding the Weight Uniformity of Scored Tablet Halves: Is Harmonization Required?

Science.gov (United States)

Zaid, Abdel Naser; Ghoush, Abeer Abu; Al-Ramahi, Rowa'; Are'r, Mohammed

2012-01-01

The aim of this study was to evaluate whether there exists any discrepancy between the European Pharmacopoeia (Ph. Eur.) and adopted United States Pharmacopeia (USP) tests concerning the weight uniformity measurements of tablet halves after splitting. The USP method does not contain provisions to evaluate split tablets, so here we adopt their whole tablet weight uniformity method. Twenty-nine different commercial scored tablets (local and imported) were divided. The split units were individually weighed and the relative standard deviation (RSD) for each product was calculated and then evaluated according to both the adopted USP and the Ph. Eur. tests of weight uniformity. Twenty out of the 29 products tested failed the USP test, while 14 of them failed the Ph. Eur. test. Nine products passed both the USP and Ph. Eur. tests. Six products passed the Ph. Eur. test but failed the USP test, with all of these products having an RSD greater than 6%. The correlation coefficient between the weight and content of split halves for three randomly selected products (corotenol 100 mg, corotenol 50 mg, and lorazepam 2.5 mg) was found to be 0.986, 0.998, and 0.72, respectively. A clear difference can be seen between outcomes obtained by the two compendial tablet splitting methods with regard to weight uniformity. Results from the USP test showed that tighter measures are needed to pass the test. Our results argue that the Ph. Eur. should revise the existing weight uniformity test on scored tablets to include the RSD parameter in it. The USP should include this adopted test as a specific test for scored tablet halves, not just whole tablets. Manufacturers in some cases will need to improve the quality of the produced scored tablets in order to pass the USP test, especially those with low therapeutic indices. Finally, harmonization between the pharmacopoeias regarding the weight uniformity testing of split tablets is warranted.
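The relative standard deviation used in this comparison is the sample standard deviation expressed as a percentage of the mean. A minimal sketch follows; the function name and the half-tablet weights are hypothetical, and no pharmacopoeial acceptance criterion is encoded here.

```python
from statistics import mean, stdev

def relative_standard_deviation(weights):
    """RSD in percent: 100 * sample standard deviation / mean."""
    return 100.0 * stdev(weights) / mean(weights)

# Hypothetical weights (mg) of split tablet halves from one product.
half_weights_mg = [98.0, 100.0, 102.0]
rsd = relative_standard_deviation(half_weights_mg)
print(f"RSD = {rsd:.1f}%")
```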
Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) predictors of police officer problem behavior and collateral self-report test scores.

Science.gov (United States)

Tarescavage, Anthony M; Fischler, Gary L; Cappo, Bruce M; Hill, David O; Corey, David M; Ben-Porath, Yossef S

2015-03-01

The current study examined the predictive validity of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008/2011) scores in police officer screenings. We utilized a sample of 712 police officer candidates (82.6% male) from 2 Midwestern police departments. The sample included 426 hired officers, most of whom had supervisor ratings of problem behaviors and human resource records of civilian complaints. With the full sample, we calculated zero-order correlations between MMPI-2-RF scale scores and scale scores from the California Psychological Inventory (Gough, 1956) and Inwald Personality Inventory (Inwald, 2006) by gender. In the hired sample, we correlated MMPI-2-RF scale scores with the outcome data for males only, owing to the relatively small number of hired women. Several scales demonstrated meaningful correlations with the criteria, particularly in the thought dysfunction and behavioral/externalizing dysfunction domains. After applying a correction for range restriction, the correlation coefficient magnitudes were generally in the moderate to large range. The practical implications of these findings were explored by means of risk ratio analyses, which indicated that officers who produced elevations at cutscores lower than the traditionally used 65 T-score level were as much as 10 times more likely than those scoring below the cutoff to exhibit problem behaviors. Overall, the results supported the validity of the MMPI-2-RF in this setting. Implications and limitations of this study are discussed. 2015 APA, all rights reserved
Unexplained Graft Dysfunction after Heart Transplantation—Role of Novel Molecular Expression Test Score and QTc-Interval: A Case Report

Directory of Open Access Journals (Sweden)

Khurram Shahzad

2010-01-01

In the current era of immunosuppressive medications, there is an increased observed incidence of graft dysfunction in the absence of known histological criteria of rejection after heart transplantation. A noninvasive molecular expression diagnostic test was developed and validated to rule out histological acute cellular rejection. In this paper we present, for the first time, the longitudinal pattern of changes in this novel diagnostic test score along with the QTc-interval in a patient who was admitted with unexplained graft dysfunction. The patient presented with graft failure with negative findings on all known criteria of rejection, including acute cellular rejection, antibody-mediated rejection and cardiac allograft vasculopathy. The molecular expression test score showed a gradual increase and the QTc-interval showed gradual prolongation with the gradual decline in graft function. This paper exemplifies that in patients presenting with unexplained graft dysfunction, the GEP test score and QTc-interval correlate with the changes in graft function.
Polygenic Risk Score for Alzheimer's Disease: Implications for Memory Performance and Hippocampal Volumes in Early Life.

Science.gov (United States)

Axelrud, Luiza K; Santoro, Marcos L; Pine, Daniel S; Talarico, Fernanda; Gadelha, Ary; Manfro, Gisele G; Pan, Pedro M; Jackowski, Andrea; Picon, Felipe; Brietzke, Elisa; Grassi-Oliveira, Rodrigo; Bressan, Rodrigo A; Miguel, Eurípedes C; Rohde, Luis A; Hakonarson, Hakon; Pausova, Zdenka; Belangero, Sintia; Paus, Tomas; Salum, Giovanni A

2018-06-01

Alzheimer's disease is a heritable neurodegenerative disorder in which early-life precursors may manifest in cognition and brain structure. The authors evaluate this possibility by examining, in youths, associations among polygenic risk score for Alzheimer's disease, cognitive abilities, and hippocampal volume. Participants were children 6-14 years of age in two Brazilian cities, constituting the discovery (N=364) and replication samples (N=352). As an additional replication, data from a Canadian sample (N=1,029), with distinct tasks, MRI protocol, and genetic risk, were included. Cognitive tests quantified memory and executive function. Reading and writing abilities were assessed by standardized tests. Hippocampal volumes were derived from the Multiple Automatically Generated Templates (MAGeT) multi-atlas segmentation brain algorithm. Genetic risk for Alzheimer's disease was quantified using summary statistics from the International Genomics of Alzheimer's Project. Analyses showed that for the Brazilian discovery sample, each one-unit increase in z-score for Alzheimer's polygenic risk score significantly predicted a 0.185 decrement in z-score for immediate recall and a 0.282 decrement for delayed recall. Findings were similar for the Brazilian replication sample (immediate and delayed recall, β=-0.259 and β=-0.232, both significant). Quantile regressions showed lower hippocampal volumes bilaterally for individuals with high polygenic risk scores. Associations fell short of significance for the Canadian sample. Genetic risk for Alzheimer's disease may affect early-life cognition and hippocampal volumes, as shown in two independent samples. These data support previous evidence that some forms of late-life dementia may represent developmental conditions with roots in childhood. This result may vary depending on a sample's genetic risk and may be specific to some types of memory tasks.
Clinical use of the ABO-Scoring Index: reliability and subtraction frequency.

Science.gov (United States)

Lieber, William S; Carlson, Sean K; Baumrind, Sheldon; Poulton, Donald R

2003-10-01

This study tested the reliability and subtraction frequency of the study model-scoring system of the American Board of Orthodontists (ABO). We used a sample of 36 posttreatment study models that were selected randomly from six different orthodontic offices. Intrajudge and interjudge reliability was calculated using nonparametric statistics (Spearman rank coefficient, Wilcoxon, Kruskal-Wallis, and Mann-Whitney tests). We found differences ranging from 3 to 6 subtraction points (total score) for intrajudge scoring between two sessions. For overall total ABO score, the average correlation was .77. Intrajudge correlation was greatest for occlusal relationships and least for interproximal contacts. Interjudge correlation for ABO score averaged r = .85. Correlation was greatest for buccolingual inclination and least for overjet. The data show that some judges, on average, were much more lenient than others and that this resulted in a range of total scores between 19.7 and 27.5. Most of the deductions were found in the buccal segments and most were related to the second molars. We present these findings in the context of clinicians preparing for the ABO phase III examination and for orthodontists in their ongoing evaluation of clinical results.
Rugby versus Soccer in South Africa: Content Familiarity Contributes to Cross-Cultural Differences in Cognitive Test Scores

Science.gov (United States)

Malda, Maike; van de Vijver, Fons J. R.; Temane, Q. Michael

2010-01-01

In this study, cross-cultural differences in cognitive test scores are hypothesized to depend on a test's cultural complexity (Cultural Complexity Hypothesis: CCH), here conceptualized as its content familiarity, rather than on its cognitive complexity (Spearman's Hypothesis: SH). The content familiarity of tests assessing short-term memory,…
Semiparametric Copula Models for Biometric Score Level

NARCIS (Netherlands)

Caselli, M.

2016-01-01

In biometric recognition systems, biometric samples (images of faces, fingerprints, voices, gaits, etc.) of people are compared and classifiers (matchers) indicate the level of similarity between any pair of samples by a score. If two samples of the same person are compared, a genuine score is
Importance of Statistical Evidence in Estimating Valid DEA Scores.

Science.gov (United States)

Barnum, Darold T; Johnson, Matthew; Gleason, John M

2016-03-01

Data Envelopment Analysis (DEA) allows healthcare scholars to measure productivity in a holistic manner. It combines a production unit's multiple outputs and multiple inputs into a single measure of its overall performance relative to other units in the sample being analyzed. It accomplishes this task by aggregating a unit's weighted outputs and dividing the output sum by the unit's aggregated weighted inputs, choosing output and input weights that maximize its output/input ratio when the same weights are applied to other units in the sample. Conventional DEA assumes that inputs and outputs are used in different proportions by the units in the sample. So, for the sample as a whole, inputs have been substituted for each other and outputs have been transformed into each other. Variables are assigned different weights based on their marginal rates of substitution and marginal rates of transformation. If in truth inputs have not been substituted nor outputs transformed, then there will be no marginal rates and therefore no valid basis for differential weights. This paper explains how to statistically test for the presence of substitutions among inputs and transformations among outputs. Then, it applies these tests to the input and output data from three healthcare DEA articles, in order to identify the effects on DEA scores when input substitutions and output transformations are absent in the sample data. It finds that DEA scores are badly biased when substitution and transformation are absent and conventional DEA models are used.
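The output/input ratio described in this abstract is commonly written as a fractional program. The form below is the standard ratio formulation found in DEA textbooks, not one quoted from the article, and the symbols are introduced here for illustration only: unit $o$ is the unit being evaluated, $y_{rj}$ and $x_{ij}$ are output $r$ and input $i$ of unit $j$, and $u_r$, $v_i$ are the output and input weights.

```latex
\max_{u,\,v}\; \theta_o = \frac{\sum_{r} u_r\, y_{ro}}{\sum_{i} v_i\, x_{io}}
\quad \text{subject to} \quad
\frac{\sum_{r} u_r\, y_{rj}}{\sum_{i} v_i\, x_{ij}} \le 1 \;\; \text{for every unit } j,
\qquad u_r \ge 0,\; v_i \ge 0 .
```

The constraint expresses the requirement that the weights chosen for unit $o$ must not give any unit in the sample a ratio above one.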
Matrix Sampling of Items in Large-Scale Assessments

Directory of Open Access Journals (Sweden)

Ruth A. Childs

2003-07-01

Matrix sampling of items (that is, division of a set of items into different versions of a test form) is used by several large-scale testing programs. Like other test designs, matrixed designs have both advantages and disadvantages. For example, testing time per student is less than if each student received all the items, but the comparability of student scores may decrease. Also, curriculum coverage is maintained, but reporting of scores becomes more complex. In this paper, matrixed designs are compared with more traditional designs in nine categories of costs: development costs, materials costs, administration costs, educational costs, scoring costs, reliability costs, comparability costs, validity costs, and reporting costs. In choosing among test designs, a testing program should examine the costs in light of its mandate(s), the content of the tests, and the financial resources available, among other considerations.
Failure-censored accelerated life test sampling plans for Weibull distribution under expected test time constraint

International Nuclear Information System (INIS)

Bai, D.S.; Chun, Y.R.; Kim, J.G.

1995-01-01

This paper considers the design of life-test sampling plans based on failure-censored accelerated life tests. The lifetime distribution of products is assumed to be Weibull with a scale parameter that is a log linear function of a (possibly transformed) stress. Two levels of stress higher than the use condition stress, high and low, are used. Sampling plans with equal expected test times at high and low test stresses which satisfy the producer's and consumer's risk requirements and minimize the asymptotic variance of the test statistic used to decide lot acceptability are obtained. The properties of the proposed life-test sampling plans are investigated.
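The lifetime model stated in this abstract can be written compactly as follows. The notation is an assumption made here for illustration and is not taken from the paper: $\theta(s)$ is the Weibull scale parameter at (possibly transformed) stress $s$, $m$ is the shape parameter, and $\beta_0$, $\beta_1$ are the coefficients of the log-linear relationship.

```latex
F(t \mid s) = 1 - \exp\!\left[-\left(\frac{t}{\theta(s)}\right)^{m}\right],
\qquad
\ln \theta(s) = \beta_0 + \beta_1 s .
```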
Test of a sample container for shipment of small size plutonium samples with PAT-2

International Nuclear Information System (INIS)

Kuhn, E.; Aigner, H.; Deron, S.

1981-11-01

A light-weight container for the air transport of plutonium, to be designated PAT-2, has been developed in the USA and is presently undergoing licensing. The very limited effective space for bearing plutonium required the design of small size sample canisters to meet the needs of international safeguards for the shipment of plutonium samples. The applicability of a small canister for the sampling of small size powder and solution samples has been tested in an intralaboratory experiment. The results of the experiment, based on the concept of pre-weighed samples, show that the tested canister can successfully be used for the sampling of small size PuO2 powder samples of homogeneous source material, as well as for dried aliquands of plutonium nitrate solutions. (author)
A risk score to predict type 2 diabetes mellitus in an elderly Spanish Mediterranean population at high cardiovascular risk.

Directory of Open Access Journals (Sweden)

Marta Guasch-Ferré

INTRODUCTION: To develop and test a diabetes risk score to predict incident diabetes in an elderly Spanish Mediterranean population at high cardiovascular risk. MATERIALS AND METHODS: A diabetes risk score was derived from a subset of 1381 nondiabetic individuals from three centres of the PREDIMED study (derivation sample). Multivariate Cox regression model β-coefficients were used to weigh each risk factor. PREDIMED-personal Score included body-mass-index, smoking status, family history of type 2 diabetes, alcohol consumption and hypertension as categorical variables; PREDIMED-clinical Score included also high blood glucose. We tested the predictive capability of these scores in the DE-PLAN-CAT cohort (validation sample). The discrimination of Finnish Diabetes Risk Score (FINDRISC), German Diabetes Risk Score (GDRS) and our scores was assessed with the area under curve (AUC). RESULTS: The PREDIMED-clinical Score varied from 0 to 14 points. In the subset of the PREDIMED study, 155 individuals developed diabetes during the 4.75-years follow-up. The PREDIMED-clinical score at a cutoff of ≥6 had sensitivity of 72.2%, and specificity of 72.5%, whereas AUC was 0.78. The AUC of the PREDIMED-clinical Score was 0.66 in the validation sample (sensitivity = 85.4%; specificity = 26.6%), and was significantly higher than the FINDRISC and the GDRS in both the derivation and validation samples. DISCUSSION: We identified classical risk factors for diabetes and developed the PREDIMED-clinical Score to determine those individuals at high risk of developing diabetes in elderly individuals at high cardiovascular risk. The predictive capability of the PREDIMED-clinical Score was significantly higher than the FINDRISC and GDRS, and also used fewer items in the questionnaire.
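To make the cutoff-based figures in this abstract concrete, the Python sketch below shows how sensitivity and specificity are obtained once a point score is dichotomized at a threshold such as ≥6. The scores, outcomes, and function name are hypothetical and are not PREDIMED data.

```python
def sensitivity_specificity(scores, outcomes, cutoff):
    """Classify score >= cutoff as 'high risk' and compare with observed outcomes."""
    tp = fn = tn = fp = 0
    for score, developed in zip(scores, outcomes):
        flagged = score >= cutoff
        if developed and flagged:
            tp += 1
        elif developed:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # share of true cases that were flagged
    specificity = tn / (tn + fp)  # share of non-cases that were not flagged
    return sensitivity, specificity


# Hypothetical risk scores (0 to 14 points) and follow-up outcomes.
scores = [2, 7, 9, 4, 6, 3, 8, 5]
developed_diabetes = [False, True, True, False, False, False, True, True]
sens, spec = sensitivity_specificity(scores, developed_diabetes, cutoff=6)
print(sens, spec)
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is the trade-off that the AUC summarizes over all cutoffs.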
Differences in distribution of T-scores and Z-scores among bone densitometry tests in postmenopausal women (a comparative study)

International Nuclear Information System (INIS)

Wendlova, J.

2002-01-01

To determine the character of T-score and Z-score value distribution in individually selected methods of bone densitometry and to compare them using statistical analysis. We examined 56 postmenopausal women with an age between 43 and 68 years with osteopenia or osteoporosis according to the WHO classification. The following measurements were made in each patient: T-score and Z-score for: 1) Stiffness index (S) of the left heel bone, USM (index); 2) Bone mineral density of the left heel bone (BMDh), DEXA (g of Ca hydroxyapatite per cm²); 3) Bone mineral density of trabecular bone of the L1 vertebra (BMDL1), QCT (mg of Ca hydroxyapatite per cm³). The densitometers used in the study were: ultrasonometer to measure heel bone, Achilles Plus, LUNAR, USA; DEXA to measure heel bone, PIXI, LUNAR, USA; QCT to measure the L1 vertebra, CT, SOMATOM Plus, Siemens, Germany. Statistical analysis: differences between measured values of T-scores (Z-scores) were evaluated by parametric or non-parametric methods of determining the 95 % confidence intervals (C.I.). Differences between Z-score and T-score values for compared measurements were statistically significant; however, these differences were lower for Z-scores. Largest differences in 95 % C.I., characterizing individual measurements of T-score values (in comparison with Z-scores), were found for those densitometers whose age range of the reference groups of young adults differed the most, and conversely, the smallest differences in T-score values were found when the differences between the age ranges of reference groups were smallest. The higher variation in T-score values in comparison to Z-scores is also caused by a non-standard selection of the reference groups of young adults for the QCT, PIXI and Achilles Plus densitometers used in the study. Age characteristics of the reference group for T-scores should be standardized for all types of densitometers. (author)
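The dependence of T-scores on the choice of reference group follows directly from how the two scores are defined: both standardize the same measurement, but the T-score uses a young-adult reference population and the Z-score an age-matched one. The Python sketch below illustrates this with hypothetical reference means and standard deviations; none of the numbers come from the study or from any densitometer's reference database.

```python
def standardized_score(measured, reference_mean, reference_sd):
    """Number of reference SDs by which a measurement differs from the reference mean."""
    return (measured - reference_mean) / reference_sd


# Hypothetical BMD value and reference statistics (g/cm^2).
bmd = 0.80
t_score = standardized_score(bmd, reference_mean=1.00, reference_sd=0.10)  # young adults
z_score = standardized_score(bmd, reference_mean=0.85, reference_sd=0.10)  # age-matched
print(round(t_score, 2), round(z_score, 2))
```

Changing only `reference_mean` or `reference_sd` for the young-adult group shifts the T-score for the same patient, which is why differing reference age ranges between devices can produce differing T-scores.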
Comparing the Scoring of Human Decomposition from Digital Images to Scoring Using On-site Observations.

Science.gov (United States)

Dabbs, Gretchen R; Bytheway, Joan A; Connor, Melissa

2017-09-01

When in-person assessment of human decomposition is not possible in forensic casework or empirical research, the sensible substitution is color photographic images. To date, no research has confirmed the utility of color photographic images as a proxy for in situ observation of the level of decomposition. Sixteen observers scored photographs of 13 human cadavers in varying decomposition stages (PMI 2-186 days) using the Total Body Score system (total n = 929 observations). The on-site TBS was compared with recorded observations from digital color images using a paired samples t-test. The average difference between on-site and photographic observations was -0.20 (t = -1.679, df = 928, p = 0.094). Individually, only two observers, both students with [...] human decomposition based on digital images can be substituted for assessments based on observation of the corpse in situ, when necessary. © 2017 American Academy of Forensic Sciences.
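For readers unfamiliar with the paired samples t-test reported above, the Python sketch below shows how the test statistic and its degrees of freedom are formed from paired scores. The scores and the function name are hypothetical; the sketch stops at the statistic and does not compute a p-value.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, df) for paired observations, using the per-pair differences."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Hypothetical Total Body Scores for the same five cases, scored two ways.
on_site_scores = [10, 12, 15, 20, 22]
photo_scores = [11, 12, 14, 22, 23]
t_stat, df = paired_t_statistic(on_site_scores, photo_scores)
print(round(t_stat, 3), df)
```

The degrees of freedom equal the number of pairs minus one, which matches the abstract's df = 928 for 929 paired observations.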
A Study on Variables that Affect Class Scores of Primary Education Students in Placement Test

OpenAIRE

Yavuz, Mustafa

2010-01-01

This study aims to determine the variables that predict class scores which are obtained by adding 70 % of the Placement Test (PT) scores of the primary education sixth and seventh grade students who took it for the first time in the 2007-2008 academic year within the framework of the system of passing to secondary education reorganized by the MNE, 25 % of their end-of-the-year passing grades. The study is of general survey model. The study group consists of students who took the PT in the 200...
Validity of the Wechsler Test of Adult Reading (WTAR): effort considered in a clinical sample of U.S. military veterans.

Science.gov (United States)

Whitney, Kriscinda A; Shepard, Polly H; Mariner, Jennifer; Mossbarger, Brad; Herman, Steven M

2010-07-01

The current study represents an examination of the construct validity of the Wechsler Test of Adult Reading (WTAR) among a sample of U.S. military veterans referred for outpatient neuropsychological evaluation that included a measure of negative response bias, namely, the Test of Memory Malingering (TOMM). This retrospective data analysis examined the relationship between the WTAR and measures of current verbal general intellectual function and current cognitive skills. Findings showed that, among patients passing the TOMM (N = 98), WTAR scores were most highly correlated with current verbal IQ but also showed significant correlations with verbal memory and lesser, but still significant, correlations with measures of visual-spatial memory. Discriminant validity for the WTAR was also shown among the group passing the TOMM in the sense that the WTAR, which is designed to measure verbal premorbid general intellectual skill, was not as highly correlated with measures of learning and memory as was a measure of current verbal general intellectual skill. Whereas scores on most study measures did significantly differ between the groups that passed versus failed the TOMM (N = 26), scores on the WTAR did not, suggesting that the WTAR may remain robust even in the face of suboptimal effort.
An analysis of aviation test scores to characterize Student Naval Aviator disqualification

OpenAIRE

Wahl, Erich J.

1998-01-01

Approved for public release; distribution is unlimited. The U.S. Navy uses the Aviation Selection Test Battery (ASTB) to identify those Student Naval Aviator (SNA) applicants most likely to succeed in flight training. Using classification and regression trees, this thesis concludes that individual answers to an ASTB subtest, the Biographical Inventory, are not good predictors of SNA primary flight grades. It also concludes that those SNA who score less than a 6 on the Pilot Biographical Inv...
Just as smart but not as successful: obese students obtain lower school grades but equivalent test scores to nonobese students.

Science.gov (United States)

MacCann, C; Roberts, R D

2013-01-01

The obesity epidemic in industrialized nations has important implications for education, as research demonstrates lower academic achievement among obese students. The current paper compares the test scores and school grades of obese, overweight and normal-weight students in secondary and further education, controlling for demographic variables, personality, ability and well-being confounds. This study included 383 eighth-grade students (49% female; study 1) and 1036 students from 24 community colleges and universities (64% female, study 2), both drawn from five regions across the United States. In study 1, body mass index (BMI) was calculated using self-reports and parent reports of weight and height. In study 2, BMI was calculated from self-reported weight and height only. Both samples completed age-appropriate assessments of mathematics, vocabulary and the personality trait conscientiousness. Eighth-grade students additionally completed a measure of life satisfaction, with both self-reports and parent reports of their grades from the previous semester also obtained. Higher education students additionally completed measures of positive and negative affect, and self-reported their grades and college entrance scores. Obese students receive significantly lower grades in middle school (d=0.83), community college (d=0.34) and university (d=0.36), but show no statistically significant differences in intelligence or achievement test scores. Even after controlling for demographic variables, intelligence, personality and well-being, obese students obtain significantly lower grades than normal-weight students in the eighth grade (d=0.39), community college (d=0.42) and university (d=0.31). Lower grades may reflect peer and teacher prejudice against overweight and obese students rather than lack of ability among these students.

Outgassing tests on IRAS solar panel samples

Science.gov (United States)

Premat, G.; Zwaal, A.; Pennings, N. H.

1980-01-01

Several outgassing tests were carried out on representative solar panel samples in order to determine the extent of contamination that could be expected from this source. The materials for the construction of the solar panels were selected as a result of contamination obtained in micro volatile condensable materials tests.
Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score.

Science.gov (United States)

Hung, Man; Hon, Shirley D; Cheng, Christine; Franklin, Jeremy D; Aoki, Stephen K; Anderson, Mike B; Kapron, Ashley L; Peters, Christopher L; Pelt, Christopher E

2014-12-01

The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Cohort study (diagnosis); Level of evidence, 2. Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidences were examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (ie, ceiling and floor effects), and reliability evidences were examined through Cronbach alpha and person separation index. All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior in all psychometric aspects examined in this study. Future
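Cronbach alpha, one of the reliability measures named in this abstract, can be computed from item-level responses with the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The Python sketch below applies it to a small hypothetical response matrix; the data and function name are illustrative assumptions, not values from the study.

```python
import statistics


def cronbach_alpha(responses):
    """responses: one list per respondent, each holding that person's k item scores."""
    k = len(responses[0])
    item_variances = [
        statistics.variance([row[i] for row in responses]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four respondents, three items that rise and fall together.
responses = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Because the three hypothetical items move in lockstep, alpha reaches its upper bound here; items that share little variance push the value toward zero, as with the low figure reported for the mHHS.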
Validating Score Interpretations and Uses: Messick Lecture, Language Testing Research Colloquium, Cambridge, April 2010

Science.gov (United States)

Kane, Michael

2012-01-01

The argument-based approach to validation involves two steps: specification of the proposed interpretations and uses of the test scores as an interpretive argument, and the evaluation of the plausibility of the proposed interpretive argument. More ambitious interpretations and uses tend to involve an extended network of inferences and assumptions…
GalaxyDock BP2 score: a hybrid scoring function for accurate protein-ligand docking

Science.gov (United States)

Baek, Minkyung; Shin, Woong-Hee; Chung, Hwan Won; Seok, Chaok

2017-07-01

Protein-ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein-ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein-ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.
Testing Homogeneity in a Semiparametric Two-Sample Problem

Directory of Open Access Journals (Sweden)

Yukun Liu

2012-01-01

We study a two-sample homogeneity testing problem, in which one sample comes from a population with density f(x) and the other is from a mixture population with mixture density (1−λ)f(x)+λg(x). This problem arises naturally from many statistical applications such as tests for partial differential gene expression in microarray studies or genetic studies for gene mutation. Under the semiparametric assumption g(x)=f(x)e^{α+βx}, a penalized empirical likelihood ratio test could be constructed, but its implementation is hindered by the fact that there is neither a feasible algorithm for computing the test statistic nor available research results on its theoretical properties. To circumvent these difficulties, we propose an EM test based on the penalized empirical likelihood. We prove that the EM test has a simple chi-square limiting distribution, and we also demonstrate its competitive testing performances by simulations. A real-data example is used to illustrate the proposed methodology.
A Fault Sample Simulation Approach for Virtual Testability Demonstration Test

Institute of Scientific and Technical Information of China (English)

ZHANG Yong; QIU Jing; LIU Guanjun; YANG Peng

2012-01-01

Virtual testability demonstration test has many advantages, such as low cost, high efficiency, low risk and few restrictions. It brings new requirements to the fault sample generation. A fault sample simulation approach for virtual testability demonstration test based on stochastic process theory is proposed. First, the similarities and differences of fault sample generation between physical testability demonstration test and virtual testability demonstration test are discussed. Second, it is pointed out that the fault occurrence process subject to perfect repair is a renewal process. Third, the interarrival time distribution function of the next fault event is given. Steps and flowcharts of fault sample generation are introduced. The number of faults and their occurrence time are obtained by statistical simulation. Finally, experiments are carried out on a stable tracking platform. Because a variety of types of life distributions and maintenance modes are considered and some assumptions are removed, the sample size and structure of fault sample simulation results are more similar to the actual results and more reasonable. The proposed method can effectively guide the fault injection in virtual testability demonstration test.
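The renewal-process idea in this abstract can be sketched in a few lines of Python: under perfect repair, successive fault times are cumulative sums of independent, identically distributed interarrival times. The sketch below assumes a Weibull life distribution and arbitrary parameter values purely for illustration; the paper considers several life distributions and maintenance modes, and its actual procedure is not reproduced here.

```python
import random


def simulate_fault_times(test_duration, shape, scale, rng):
    """Simulate fault occurrence times of a renewal process up to test_duration.

    Interarrival times are i.i.d. Weibull (an assumption of this sketch),
    which corresponds to perfect repair after every fault.
    """
    fault_times = []
    # random.weibullvariate takes (scale, shape) in that order.
    t = rng.weibullvariate(scale, shape)
    while t <= test_duration:
        fault_times.append(t)
        t += rng.weibullvariate(scale, shape)
    return fault_times


rng = random.Random(42)  # fixed seed so the run is repeatable
fault_times = simulate_fault_times(1000.0, shape=1.5, scale=200.0, rng=rng)
print(len(fault_times), [round(t, 1) for t in fault_times])
```

The length of the returned list is the simulated number of faults and its entries are the occurrence times, the two quantities the abstract says are obtained by statistical simulation.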
Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions.

Science.gov (United States)

Liu, Zhihai; Su, Minyi; Han, Li; Liu, Jie; Yang, Qifan; Li, Yan; Wang, Renxiao

2017-02-21

In structure-based drug design, scoring functions are widely used for fast evaluation of protein-ligand interactions. They are often applied in combination with molecular docking and de novo design methods. Since the early 1990s, a whole spectrum of protein-ligand interaction scoring functions have been developed. Regardless of their technical difference, scoring functions all need data sets combining protein-ligand complex structures and binding affinity data for parametrization and validation. However, data sets of this kind used to be rather limited in terms of size and quality. On the other hand, standard metrics for evaluating scoring function used to be ambiguous. Scoring functions are often tested in molecular docking or even virtual screening trials, which do not directly reflect the genuine quality of scoring functions. Collectively, these underlying obstacles have impeded the invention of more advanced scoring functions. In this Account, we describe our long-lasting efforts to overcome these obstacles, which involve two related projects. On the first project, we have created the PDBbind database. It is the first database that systematically annotates the protein-ligand complexes in the Protein Data Bank (PDB) with experimental binding data. This database has been updated annually since its first public release in 2004. The latest release (version 2016) provides binding data for 16 179 biomolecular complexes in PDB. Data sets provided by PDBbind have been applied to many computational and statistical studies on protein-ligand interaction and various subjects. In particular, it has become a major data resource for scoring function development. On the second project, we have established the Comparative Assessment of Scoring Functions (CASF) benchmark for scoring function evaluation. Our key idea is to decouple the "scoring" process from the "sampling" process, so scoring functions can be tested in a relatively pure context to reflect their quality. In our
Testing of candidate non-lethal sampling methods for detection of Renibacterium salmoninarum in juvenile Chinook salmon Oncorhynchus tshawytscha

Science.gov (United States)

Elliott, Diane G.; McKibben, Constance L.; Conway, Carla M.; Purcell, Maureen K.; Chase, Dorothy M.; Applegate, Lynn M.

2015-01-01

Non-lethal pathogen testing can be a useful tool for fish disease research and management. Our research objectives were to determine if (1) fin clips, gill snips, surface mucus scrapings, blood draws, or kidney biopsies could be obtained non-lethally from 3 to 15 g Chinook salmon Oncorhynchus tshawytscha, (2) non-lethal samples could accurately discriminate between fish exposed to the bacterial kidney disease agent Renibacterium salmoninarum and non-exposed fish, and (3) non-lethal samples could serve as proxies for lethal kidney samples to assess infection intensity. Blood draws and kidney biopsies caused ≥5% post-sampling mortality (Objective 1) and may be appropriate only for larger fish, but the other sample types were non-lethal. Sampling was performed over 21 wk following R. salmoninarum immersion challenge of fish from 2 stocks (Objectives 2 and 3), and nested PCR (nPCR) and real-time quantitative PCR (qPCR) results from candidate non-lethal samples were compared with kidney tissue analysis by nPCR, qPCR, bacteriological culture, enzyme-linked immunosorbent assay (ELISA), fluorescent antibody test (FAT) and histopathology/immunohistochemistry. R. salmoninarum was detected by PCR in >50% of fin, gill, and mucus samples from challenged fish. Mucus qPCR was the only non-lethal assay exhibiting both diagnostic sensitivity and specificity estimates >90% for distinguishing between R. salmoninarum-exposed and non-exposed fish and was the best candidate for use as an alternative to lethal kidney sample testing. Mucus qPCR R. salmoninarum quantity estimates reflected changes in kidney bacterial load estimates, as evidenced by significant positive correlations with kidney R. salmoninarum infection intensity scores at all sample times and in both fish stocks, and were not significantly impacted by environmental R. salmoninarum concentrations.
Male-female differences in Scoliosis Research Society-30 scores in adolescent idiopathic scoliosis.

Science.gov (United States)

Roberts, David W; Savage, Jason W; Schwartz, Daniel G; Carreon, Leah Y; Sucato, Daniel J; Sanders, James O; Richards, Benjamin Stephens; Lenke, Lawrence G; Emans, John B; Parent, Stefan; Sarwark, John F

2011-01-01

Longitudinal cohort study. To compare functional outcomes between male and female patients before and after surgery for adolescent idiopathic scoliosis (AIS). There is no clear consensus in the existing literature with respect to sex differences in functional outcomes in the surgical treatment of AIS. A prospective, consecutive, multicenter database of patients who underwent surgical correction for adolescent idiopathic scoliosis was analyzed retrospectively. All patients completed Scoliosis Research Society-30 (SRS-30) questionnaires before and 2 years after surgery. Patients with previous spine surgery were excluded. Data were collected for sex, age, Risser grade, previous bracing history, maximum preoperative Cobb angle, curve correction at 2 years, and SRS-30 domain scores. Paired sample t tests were used to compare preoperative and postoperative scores within each sex. Independent sample t tests were used to compare scores between sexes. A P value of Self-image/appearance had the greatest relative improvement. Males had better self-image/appearance scores preoperatively, better pain scores at 2 years, and better mental health and total scores both preoperatively and at 2 years. Both males and females were similarly satisfied with surgery. Males treated with surgery for AIS report better preoperative self-image, less postoperative pain, and better mental health than females. These differences may be clinically significant. For both males and females, the most beneficial effect of surgery is improved self-image/appearance. Overall, the benefits of surgery for AIS are similar for both sexes.
Utilizing the Six Realms of Meaning in Improving Campus Standardized Test Scores through Team Teaching and Strategic Planning

Science.gov (United States)

Stevenson, Rosnisha D.; Kritsonis, William Allan

2009-01-01

This article will seek to utilize Dr. William Allan Kritsonis' book "Ways of Knowing Through the Realms of Meaning" (2007) as a framework to improve a campus's standardized test scores, more specifically, their TAKS (Texas Assessment of Knowledge and Skills) scores. Many campuses have an improvement plan, also known as a Campus…
Integrating GIS in the Middle School Curriculum: Impacts on Diverse Students' Standardized Test Scores

Science.gov (United States)

Goldstein, Donna; Alibrandi, Marsha

2013-01-01

This case study conducted with 1,425 middle school students in Palm Beach County, Florida, included a treatment group receiving GIS instruction (256) and a control group without GIS instruction (1,169). Quantitative analyses on standardized test scores indicated that inclusion of GIS in middle school curriculum had a significant effect on student…
Virginia Tech freshman class becoming more competitive; Rise in grades and test scores noted

OpenAIRE

Virginia Tech News

2004-01-01

Admission to Virginia Tech continues to become more competitive as applicants report higher grade point averages and test scores than previous years. The incoming class of 4,975 students has an average grade point average (GPA) of 3.68 and SAT 1203, up from 3.60 GPA and 1197 SAT in 2003.
Data Quality Objectives For Selecting Waste Samples To Test The Fluid Bed Steam Reformer Test

International Nuclear Information System (INIS)

Banning, D.L.

2010-01-01

This document describes the data quality objectives to select archived samples located at the 222-S Laboratory for Fluid Bed Steam Reformer testing. The type, quantity and quality of the data required to select the samples for Fluid Bed Steam Reformer testing are discussed. In order to maximize the efficiency and minimize the time to treat Hanford tank waste in the Waste Treatment and Immobilization Plant, additional treatment processes may be required. One of the potential treatment processes is the fluid bed steam reformer (FBSR). A determination of the adequacy of the FBSR process to treat Hanford tank waste is required. The initial step in determining the adequacy of the FBSR process is to select archived waste samples from the 222-S Laboratory that will be used to test the FBSR process. Analyses of the selected samples will be required to confirm the samples meet the testing criteria.
The Health Professions Admission Test (HPAT) score and leaving certificate results can independently predict academic performance in medical school: do we need both tests?

LENUS (Irish Health Repository)

Halpenny, D

2010-11-01

A recent study raised concerns regarding the ability of the health professions admission test (HPAT) Ireland to improve the selection process in Irish medical schools. We aimed to establish whether performance in a mock HPAT correlated with academic success in medicine. A modified HPAT examination and a questionnaire were administered to a group of doctors and medical students. There was a significant correlation between HPAT score and college results (r2: 0.314, P = 0.018, Spearman Rank) and between leaving cert score and college results (r2: 0.306, P = 0.049, Spearman Rank). There was no correlation between leaving cert points score and HPAT score. There was no difference in HPAT score across a number of other variables including gender, age and medical speciality. Our results suggest that both the HPAT Ireland and the leaving certificate examination could act as independent predictors of academic achievement in medicine.
Genotoxicity assessment of water sampled from R-11 reservoir by means of allium test

Energy Technology Data Exchange (ETDEWEB)

Bukatich, E.; Pryakhin, E. [Urals Research Center for Radiation Medicine (Russian Federation); Geraskin, S. [Russian Institute of Agricultural Radiology and Agroecology (Russian Federation)

2014-07-01

slides of root tips meristem were dyed with aceto-orcein. Approximately 150 ana-telophases were scored for each root. 20-40 roots were analyzed for each water sample. In total 3000 - 6000 ana-telophases for each water sample were analyzed. Chromosome aberrations in ana-telophases (chromatid and chromosomal bridges and fragments), mitotic abnormalities (multipolar mitosis and laggards) were scored. The data analysis was arranged using R statistics. Aberration frequency in water samples from the natural control reservoir (0.46 ± 0.12%) exceeded insignificantly the frequency of aberrations in distilled (0.15 ± 0.08%) and bottled waters (0.33 ± 0.08%). Average frequency of aberrant cells in root meristem of onion germinated in water samples from R-11 reservoir (1.36 ± 0.24%) was about 3 times higher compared to control ones. Mitotic activity in root meristem was slightly inhibited in bulbs germinated in R-11 sample, but this effect was statistically insignificant. There was no difference in types of aberrations among all water samples but only in the frequency of abnormalities. So genotoxicity assessment of water sampled from R-11 reservoir by means of allium test shows the presence of genotoxic factor in water from the reservoir. Document available in abstract form only. (authors)
Scoring System Improvements to Three Leadership Predictors

National Research Council Canada - National Science Library

Dela

1997-01-01

.... The modified scoring systems were evaluated by rescoring responses randomly selected from the sample which had been scored according to the scoring systems originally developed for the leadership research...
Science Teacher Efficacy and Outcome Expectancy as Predictors of Students' End-of-Instruction (EOI) Biology I Test Scores

Science.gov (United States)

Angle, Julie; Moseley, Christine

2009-01-01

The purpose of this study was to compare teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the statewide End-of-Instruction (EOI) Biology I test met or exceeded the state academic proficiency level (Proficient Group) to teacher efficacy beliefs of secondary Biology I teachers whose students' mean scores on the…
Acceptance test report for core sample trucks 3 and 4

International Nuclear Information System (INIS)

Corbett, J.E.

1996-01-01

The purpose of this Acceptance Test Report is to provide documentation for the acceptance testing of the rotary mode core sample trucks 3 and 4, designated as HO-68K-4600 and HO-68K-4647, respectively. This report conforms to the guidelines established in WHC-IP-1026, "Engineering Practice Guidelines," Appendix M, "Acceptance Test Procedures and Reports." Rotary mode core sample trucks 3 and 4 were based upon the design of the second core sample truck (HO-68K-4345), which was constructed to implement rotary mode sampling of the waste tanks at Hanford. Successful completion of acceptance testing on June 30, 1995 verified that all design requirements were met. This report is divided into four sections, beginning with general information. Acceptance testing was performed on trucks 3 and 4 during the months of March through June, 1995. All testing was performed at the "Rock Slinger" test site in the 200 West area. The sequence of testing was determined by equipment availability, and the initial revision of the Acceptance Test Procedure (ATP) was used for both trucks. Testing was directed by ICF-KH, with the support of WHC Characterization Equipment Engineering and Characterization Project Operations. Testing was completed per the ATP without discrepancies or deviations, except as noted
Optimum sample size allocation to minimize cost or maximize power for the two-sample trimmed mean test.

Science.gov (United States)

Guo, Jiin-Huarng; Luh, Wei-Ming

2009-05-01

When planning a study, sample size determination is one of the most important tasks facing the researcher. The size will depend on the purpose of the study, the cost limitations, and the nature of the data. By specifying the standard deviation ratio and/or the sample size ratio, the present study considers the problem of heterogeneous variances and non-normality for Yuen's two-group test and develops sample size formulas to minimize the total cost or maximize the power of the test. For a given power, the sample size allocation ratio can be manipulated so that the proposed formulas can minimize the total cost, the total sample size, or the sum of total sample size and total cost. On the other hand, for a given total cost, the optimum sample size allocation ratio can maximize the statistical power of the test. After the sample size is determined, the present simulation applies Yuen's test to the sample generated, and then the procedure is validated in terms of Type I errors and power. Simulation results show that the proposed formulas can control Type I errors and achieve the desired power under the various conditions specified. Finally, the implications for determining sample sizes in experimental studies and future research are discussed.
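As a rough illustration of the test the abstract above relies on, the following sketch computes Yuen's two-sample trimmed-mean statistic and its approximate degrees of freedom from the standard textbook definition. It does not reproduce the authors' sample size or cost formulas; the function names and the default 20% trimming proportion are assumptions chosen for illustration.

```python
import math


def trimmed_stats(x, trim=0.2):
    """Return (trimmed mean, winsorized variance, retained count h)."""
    xs = sorted(x)
    n = len(xs)
    g = int(math.floor(trim * n))   # observations trimmed from each tail
    h = n - 2 * g                   # observations kept after trimming
    tmean = sum(xs[g:n - g]) / h
    # Winsorize: pull each trimmed tail in to the nearest retained value.
    wins = [xs[g]] * g + xs[g:n - g] + [xs[n - g - 1]] * g
    wmean = sum(wins) / n
    wvar = sum((v - wmean) ** 2 for v in wins) / (n - 1)
    return tmean, wvar, h


def yuen_statistic(x, y, trim=0.2):
    """Yuen's t statistic and Welch-type degrees of freedom."""
    m1, v1, h1 = trimmed_stats(x, trim)
    m2, v2, h2 = trimmed_stats(y, trim)
    # Squared standard errors of the two trimmed means.
    d1 = (len(x) - 1) * v1 / (h1 * (h1 - 1))
    d2 = (len(y) - 1) * v2 / (h2 * (h2 - 1))
    t = (m1 - m2) / math.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    return t, df
```

With `trim=0.0` the statistic reduces to Welch's unequal-variance t test, which gives a quick sanity check. Where SciPy is available, `scipy.stats.ttest_ind` with its `trim` argument may be a more convenient route to the same test.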
A Danish diabetes risk score for targeted screening: the Inter99 study.

Science.gov (United States)

Glümer, Charlotte; Carstensen, Bendix; Sandbaek, Annelli; Lauritzen, Torsten; Jørgensen, Torben; Borch-Johnsen, Knut

2004-03-01

To develop a simple self-administered questionnaire identifying individuals with undiagnosed diabetes with a sensitivity of 75% and minimizing the high-risk group needing subsequent testing. A population-based sample (Inter99 study) of 6,784 individuals aged 30-60 years completed a questionnaire on diabetes-related symptoms and risk factors. The participants underwent an oral glucose tolerance test. The risk score was derived from the first half and validated on the second half of the study population. External validation was performed based on the Danish Anglo-Danish-Dutch Study of Intensive Treatment in People with Screen Detected Diabetes in Primary Care (ADDITION) pilot study. The risk score was developed by stepwise backward multiple logistic regression. The final risk score included age, sex, BMI, known hypertension, physical activity at leisure time, and family history of diabetes, items independently and significantly (Pscreening strategy for type 2 diabetes, decreasing the numbers of subsequent tests and thereby possibly minimizing the economical and personal costs of the screening strategy.

Further examination of embedded performance validity indicators for the Conners' Continuous Performance Test and Brief Test of Attention in a large outpatient clinical sample.

Science.gov (United States)

Sharland, Michael J; Waring, Stephen C; Johnson, Brian P; Taran, Allise M; Rusin, Travis A; Pattock, Andrew M; Palcher, Jeanette A

2018-01-01

Assessing test performance validity is a standard clinical practice and although studies have examined the utility of cognitive/memory measures, few have examined attention measures as indicators of performance validity beyond the Reliable Digit Span. The current study further investigates the classification probability of embedded Performance Validity Tests (PVTs) within the Brief Test of Attention (BTA) and the Conners' Continuous Performance Test (CPT-II), in a large clinical sample. This was a retrospective study of 615 patients consecutively referred for comprehensive outpatient neuropsychological evaluation. Non-credible performance was defined two ways: failure on one or more PVTs and failure on two or more PVTs. Classification probability of the BTA and CPT-II into non-credible groups was assessed. Sensitivity, specificity, positive predictive value, and negative predictive value were derived to identify clinically relevant cut-off scores. When using failure on two or more PVTs as the indicator for non-credible responding compared to failure on one or more PVTs, highest classification probability, or area under the curve (AUC), was achieved by the BTA (AUC = .87 vs. .79). CPT-II Omission, Commission, and Total Errors exhibited higher classification probability as well. Overall, these findings corroborate previous findings, extending them to a large clinical sample. BTA and CPT-II are useful embedded performance validity indicators within a clinical battery but should not be used in isolation without other performance validity indicators.
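The four classification measures named in the abstract above can be derived from a 2x2 table once a cut-off is fixed. The sketch below assumes that scores at or below the cut-off are flagged as non-credible; that direction, the function names, and any data passed in are hypothetical and are not taken from the study.

```python
def count_outcomes(scores, noncredible, cutoff):
    """Tally the 2x2 table when scores <= cutoff are flagged (assumed rule)."""
    tp = fp = fn = tn = 0
    for score, is_noncredible in zip(scores, noncredible):
        flagged = score <= cutoff
        if flagged and is_noncredible:
            tp += 1
        elif flagged:
            fp += 1
        elif is_noncredible:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    """Return num/den, or None when the denominator is zero."""
    return num / den if den else None


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # flagged among truly non-credible
        "specificity": _ratio(tn, tn + fp),  # passed among truly credible
        "ppv": _ratio(tp, tp + fp),          # truly non-credible among flagged
        "npv": _ratio(tn, tn + fn),          # truly credible among passed
    }
```

Evaluating `diagnostic_metrics(*count_outcomes(scores, labels, c))` over a range of candidate cut-offs `c` is one simple way to see how sensitivity trades against specificity.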
Tests on CANDU fuel elements sheath samples

International Nuclear Information System (INIS)

Ionescu, S.; Uta, O.; Mincu, M.; Prisecaru, I.

2016-01-01

This work is a study of the behavior of CANDU fuel elements after irradiation. The tests were made on ring samples taken from fuel cladding at INR Pitesti. This paper presents the results of examinations performed in the Post Irradiation Examination Laboratory. By metallographic and ceramographic examination we determined that the hydride precipitates are oriented parallel to the cladding surface. A hydrogen content of about 120 ppm was estimated. After the preliminary tests, ring samples were cut from the fuel rod and subjected to tensile testing on an INSTRON 5569 machine in order to evaluate the changes in their mechanical properties as a consequence of irradiation. Scanning electron microscopy was performed on a TESCAN MIRA II LMU CS microscope with Schottky FE emitter and variable pressure. The analysis shows that the central zone has deeper dimples, whereas in the outer zone the dimples are tilted and smaller. (authors)
Associations between cadmium exposure and neurocognitive test scores in a cross-sectional study of US adults.

Science.gov (United States)

Ciesielski, Timothy; Bellinger, David C; Schwartz, Joel; Hauser, Russ; Wright, Robert O

2013-02-05

Low-level environmental cadmium exposure and neurotoxicity has not been well studied in adults. Our goal was to evaluate associations between neurocognitive exam scores and a biomarker of cumulative cadmium exposure among adults in the Third National Health and Nutrition Examination Survey (NHANES III). NHANES III is a nationally representative cross-sectional survey of the U.S. population conducted between 1988 and 1994. We analyzed data from a subset of participants, age 20-59, who participated in a computer-based neurocognitive evaluation. There were four outcome measures: the Simple Reaction Time Test (SRTT: visual motor speed), the Symbol Digit Substitution Test (SDST: attention/perception), the Serial Digit Learning Test (SDLT) trials-to-criterion, and the SDLT total-error-score (SDLT-tests: learning recall/short-term memory). We fit multivariable-adjusted models to estimate associations between urinary cadmium concentrations and test scores. 5662 participants underwent neurocognitive screening, and 5572 (98%) of these had a urinary cadmium level available. Prior to multivariable-adjustment, higher urinary cadmium concentration was associated with worse performance in each of the 4 outcomes. After multivariable-adjustment most of these relationships were not significant, and age was the most influential variable in reducing the association magnitudes. However, among never-smokers with no known occupational cadmium exposure, the relationship between urinary cadmium and SDST score (attention/perception) was significant: a 1 μg/L increase in urinary cadmium corresponded to a 1.93% (95%CI: 0.05, 3.81) decrement in performance. These results suggest that higher cumulative cadmium exposure in adults may be related to subtly decreased performance in tasks requiring attention and perception, particularly among those adults whose cadmium exposure is primarily through diet (no smoking or work based cadmium exposure). This association was observed among exposure levels
COMPARISON BETWEEN WOOD DRYING DEFECT SCORES: SPECIMEN TESTING X ANALYSIS OF KILN-DRIED BOARDS

Directory of Open Access Journals (Sweden)

Djeison Cesar Batista

2015-04-01

It is important to develop drying technologies for Eucalyptus grandis lumber, which is one of the most planted species of this genus in Brazil and plays an important role as raw material for the wood industry. The general aim of this work was to assess the conventional kiln drying of juvenile wood of three clones of Eucalyptus grandis. The specific aims were to compare the behavior between: (i) drying defects indicated by tests with wood specimens and conventional kiln-dried boards; and (ii) physical properties and the drying quality. Five 11-year-old trees of each clone were felled, and only flatsawn boards of the first log were used. Basic density and total shrinkage were determined, and the drying test with wood specimens at 100 °C was carried out. Kiln drying of boards was performed, and initial and final moisture content, moisture gradient in thickness, drying stresses and drying defects were assessed. The defect scoring method was used to verify the behavior between the defects detected by specimen testing and the defects detected in kiln-dried boards. As main results, the drying schedule was too severe for the wood, resulting in a high level of boards with defects. According to the defect scoring method, the defects in the drying test with specimens did not correspond to the defects of the kiln-dried boards.
On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests

Directory of Open Access Journals (Sweden)

Aaditya Ramdas

2017-01-01

Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having been designed and analyzed, both for the unidimensional and the multivariate setting. In this short survey, we focus on test statistics that involve the Wasserstein distance. Using an entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ) plots and receiver operating characteristic or ordinal dominance (ROC/ODC) curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
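To make the idea of a Wasserstein-based two-sample test concrete, here is a minimal univariate sketch. It assumes equal sample sizes, in which case the 1-Wasserstein distance between the two empirical distributions is the mean absolute difference of the sorted values, and it calibrates the statistic with a permutation test. The permutation calibration, the function names, and the defaults are assumptions for illustration; the survey's entropic smoothing and multivariate connections are not implemented here.

```python
import random


def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equal-size empirical samples."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal sizes, the optimal coupling matches order statistics.
    return sum(abs(a - b) for a, b in zip(sorted(x), sorted(y))) / len(x)


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Permutation p-value for H0: both samples share one distribution."""
    rng = random.Random(seed)
    observed = wasserstein_1d(x, y)
    pooled = list(x) + list(y)
    n = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled data
        if wasserstein_1d(pooled[:n], pooled[n:]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

For unequal sample sizes or weighted samples, `scipy.stats.wasserstein_distance` computes the same univariate quantity from the empirical distribution functions.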
The effect of an intervention program on functional movement screen test scores in mixed martial arts athletes.

Science.gov (United States)

Bodden, Jamie G; Needham, Robert A; Chockalingam, Nachiappan

2015-01-01

This study assessed the basic fundamental movements of mixed martial arts (MMA) athletes using the functional movement screen (FMS) assessment and determined if an intervention program was successful at improving results. Participants were placed into 1 of 2 groups: intervention and control. The intervention group was required to complete a corrective exercise program 4 times per week, and all participants were asked to continue their usual MMA training routine. A mid-intervention FMS test was included to examine if successful results were noticed sooner than the 8-week period. Results highlighted differences in FMS test scores between the control group and intervention group (p = 0.006). Post hoc testing revealed a significant increase in the FMS score of the intervention group between weeks 0 and 8 (p = 0.00) and weeks 0 and 4 (p = 0.00) and no significant increase between weeks 4 and 8 (p = 1.00). A χ² analysis revealed that the intervention group participants were more likely to have an FMS score >14 than participants in the control group at week 4 (χ² = 7.29, p < 0.01) and week 8 (χ² = 5.2, p ≤ 0.05). Finally, a greater number of participants in the intervention group were free from asymmetry at week 4 and week 8 compared with the initial test period. The results of the study suggested that a 4-week intervention program was sufficient to improve FMS scores. Most, if not all, of the movements covered in the FMS relate to many aspects of MMA training. The knowledge that the FMS can identify movement dysfunctions and, furthermore, the fact that the issues can be improved through a standardized intervention program could be advantageous to MMA coaches, providing the opportunity to adapt and implement new additions to training programs.
Walk Score® and Transit Score® and Walking in the Multi-Ethnic Study of Atherosclerosis

Science.gov (United States)

Hirsch, Jana A.; Moore, Kari A.; Evenson, Kelly R.; Rodriguez, Daniel A; Diez Roux, Ana V.

2013-01-01

Background: Walk Score® and Transit Score® are open-source measures of the neighborhood built environment to support walking (“walkability”) and access to transportation. Purpose: To investigate associations of Street Smart Walk Score and Transit Score with self-reported transport and leisure walking using data from a large multi-city and diverse population-based sample of adults. Methods: Data from a sample of 4552 residents of Baltimore MD; Chicago IL; Forsyth County NC; Los Angeles CA; New York NY; and St. Paul MN from the Multi-Ethnic Study of Atherosclerosis (2010–2012) were linked to Walk Score and Transit Score (collected in 2012). Logistic and linear regression models estimated ORs of not walking and mean differences in minutes walked, respectively, associated with continuous and categoric Walk Score and Transit Score. All analyses were conducted in 2012. Results: After adjustment for site, key sociodemographic, and health variables, a higher Walk Score was associated with lower odds of not walking for transport and more minutes/week of transport walking. Compared to those in a “walker’s paradise,” lower categories of Walk Score were associated with a linear increase in odds of not transport walking and a decline in minutes of leisure walking. An increase in Transit Score was associated with lower odds of not transport walking or leisure walking, and additional minutes/week of leisure walking. Conclusions: Walk Score and Transit Score appear to be useful as measures of walkability in analyses of neighborhood effects. PMID:23867022
Increasing the reliability of the fluid/crystallized difference score from the Kaufman Adolescent and Adult Intelligence Test with reliable component analysis.

Science.gov (United States)

Caruso, J C

2001-06-01

The unreliability of difference scores is a well documented phenomenon in the social sciences and has led researchers and practitioners to interpret differences cautiously, if at all. In the case of the Kaufman Adolescent and Adult Intelligence Test (KAIT), the unreliability of the difference between the Fluid IQ and the Crystallized IQ is due to the high correlation between the two scales. The consequences of the lack of precision with which differences are identified are wide confidence intervals and unpowerful significance tests (i.e., large differences are required to be declared statistically significant). Reliable component analysis (RCA) was performed on the subtests of the KAIT in order to address these problems. RCA is a new data reduction technique that results in uncorrelated component scores with maximum proportions of reliable variance. Results indicate that the scores defined by RCA have discriminant and convergent validity (with respect to the equally weighted scores) and that differences between the scores, derived from a single testing session, were more reliable than differences derived from equal weighting for each age group (11-14 years, 15-34 years, 35-85+ years). This reliability advantage results in narrower confidence intervals around difference scores and smaller differences required for statistical significance.
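The link between a high inter-scale correlation and an unreliable difference score, which the abstract above takes as its starting point, follows from the classical test theory expression for the reliability of a difference D = X − Y. The formula below is the standard textbook result rather than anything specific to RCA or the KAIT, and the numerical values are hypothetical.

```latex
\rho_{DD'} =
  \frac{\sigma_X^2\,\rho_{XX'} + \sigma_Y^2\,\rho_{YY'} - 2\,\rho_{XY}\,\sigma_X\sigma_Y}
       {\sigma_X^2 + \sigma_Y^2 - 2\,\rho_{XY}\,\sigma_X\sigma_Y}
\qquad\text{and, when } \sigma_X = \sigma_Y,\qquad
\rho_{DD'} = \frac{\tfrac{1}{2}\left(\rho_{XX'} + \rho_{YY'}\right) - \rho_{XY}}{1 - \rho_{XY}}
```

For example, with hypothetical reliabilities of 0.90 for both scales and a correlation of 0.80 between them, the difference score has reliability (0.90 − 0.80) / (1 − 0.80) = 0.50, even though each scale on its own is highly reliable.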
Optimizing Scoring and Sampling Methods for Assessing Built Neighborhood Environment Quality in Residential Areas

Science.gov (United States)

Adu-Brimpong, Joel; Coffey, Nathan; Ayers, Colby; Berrigan, David; Yingling, Leah R.; Thomas, Samantha; Mitchell, Valerie; Ahuja, Chaarushi; Rivers, Joshua; Hartz, Jacob; Powell-Wiley, Tiffany M.

2017-01-01

Optimization of existing measurement tools is necessary to explore links between aspects of the neighborhood built environment and health behaviors or outcomes. We evaluate a scoring method for virtual neighborhood audits utilizing the Active Neighborhood Checklist (the Checklist), a neighborhood audit measure, and assess street segment representativeness in low-income neighborhoods. Eighty-two home neighborhoods of Washington, D.C. Cardiovascular Health/Needs Assessment (NCT01927783) participants were audited using Google Street View imagery and the Checklist (five sections with 89 total questions). Twelve street segments per home address were assessed for (1) Land-Use Type; (2) Public Transportation Availability; (3) Street Characteristics; (4) Environment Quality and (5) Sidewalks/Walking/Biking features. Checklist items were scored 0–2 points/question. A combinations algorithm was developed to assess street segments’ representativeness. Spearman correlations were calculated between built environment quality scores and Walk Score®, a validated neighborhood walkability measure. Street segment quality scores ranged 10–47 (Mean = 29.4 ± 6.9) and overall neighborhood quality scores, 172–475 (Mean = 352.3 ± 63.6). Walk scores® ranged 0–91 (Mean = 46.7 ± 26.3). Street segment combinations’ correlation coefficients ranged 0.75–1.0. Significant positive correlations were found between overall neighborhood quality scores, four of the five Checklist subsection scores, and Walk Scores® (r = 0.62, p < 0.001). This scoring method adequately captures neighborhood features in low-income, residential areas and may aid in delineating impact of specific built environment features on health behaviors and outcomes. PMID:28282878
MALDI-TOF mass spectrometry and high-consequence bacteria: safety and stability of biothreat bacterial sample testing in clinical diagnostic laboratories.

Science.gov (United States)

Tracz, Dobryan M; Tober, Ashley D; Antonation, Kym S; Corbett, Cindi R

2018-03-01

We considered the application of MALDI-TOF mass spectrometry for BSL-3 bacterial diagnostics, with a focus on the biosafety of live-culture direct-colony testing and the stability of stored extracts. Biosafety level 2 (BSL-2) bacterial species were used as surrogates for BSL-3 high-consequence pathogens in all live-culture MALDI-TOF experiments. Viable BSL-2 bacteria were isolated from MALDI-TOF mass spectrometry target plates after 'direct-colony' and 'on-plate' extraction testing, suggesting that the matrix chemicals alone cannot be considered sufficient to inactivate bacterial culture and spores in all samples. Sampling of the instrument interior after direct-colony analysis did not recover viable organisms, suggesting that any potential risks to the laboratory technician are associated with preparation of the MALDI-TOF target plate before or after testing. Secondly, a long-term stability study (3 years) of stored MALDI-TOF extracts showed that match scores can decrease below the threshold for reliable species identification (<1.7), which has implications for proficiency test panel item storage and distribution.
Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS.

Science.gov (United States)

Waldfogel, Jane; Zhai, Fuhua

2008-02-01

This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries -- Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S. -- using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and that preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children.
Optimizing Scoring and Sampling Methods for Assessing Built Neighborhood Environment Quality in Residential Areas

Directory of Open Access Journals (Sweden)

Joel Adu-Brimpong

2017-03-01

Optimization of existing measurement tools is necessary to explore links between aspects of the neighborhood built environment and health behaviors or outcomes. We evaluate a scoring method for virtual neighborhood audits utilizing the Active Neighborhood Checklist (the Checklist), a neighborhood audit measure, and assess street segment representativeness in low-income neighborhoods. Eighty-two home neighborhoods of Washington, D.C. Cardiovascular Health/Needs Assessment (NCT01927783) participants were audited using Google Street View imagery and the Checklist (five sections with 89 total questions). Twelve street segments per home address were assessed for (1) Land-Use Type; (2) Public Transportation Availability; (3) Street Characteristics; (4) Environment Quality and (5) Sidewalks/Walking/Biking features. Checklist items were scored 0–2 points/question. A combinations algorithm was developed to assess street segments’ representativeness. Spearman correlations were calculated between built environment quality scores and Walk Score®, a validated neighborhood walkability measure. Street segment quality scores ranged 10–47 (Mean = 29.4 ± 6.9) and overall neighborhood quality scores, 172–475 (Mean = 352.3 ± 63.6). Walk Scores® ranged 0–91 (Mean = 46.7 ± 26.3). Street segment combinations’ correlation coefficients ranged 0.75–1.0. Significant positive correlations were found between overall neighborhood quality scores, four of the five Checklist subsection scores, and Walk Scores® (r = 0.62, p < 0.001). This scoring method adequately captures neighborhood features in low-income, residential areas and may aid in delineating impact of specific built environment features on health behaviors and outcomes.
The Validity of Graduate Management Admission Test Scores: A Summary of Studies Conducted from 1997 to 2004

Science.gov (United States)

Talento-Miller, Eileen; Rudner, Lawrence M.

2008-01-01

The validity of Graduate Management Admission Test (GMAT) scores is examined by summarizing 273 studies conducted between 1997 and 2004. Each of the studies was conducted through the Validity Study Service of the test sponsor and contained identical variables and statistical methods. Validity coefficients from each of the studies were corrected…
Your move: The effect of chess on mathematics test scores.

Science.gov (United States)

Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla

2017-01-01

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
Estimation of sample size and testing power (Part 3).

Science.gov (United States)

Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

2011-12-01

This article introduces the definition and sample size estimation of three special tests (namely, non-inferiority test, equivalence test and superiority test) for qualitative data with the design of one factor with two levels having a binary response variable. Non-inferiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is not clinically inferior to that of the positive control drug. Equivalence test refers to the research design of which the objective is to verify that the experimental drug and the control drug have clinically equivalent efficacy. Superiority test refers to the research design of which the objective is to verify that the efficacy of the experimental drug is clinically superior to that of the control drug. By specific examples, this article introduces formulas of sample size estimation for the three special tests, and their SAS realization in detail.
Group SkSP-R sampling plan for accelerated life tests

Indian Academy of Sciences (India)

Muhammad Aslam

2017-09-15

Sep 15, 2017 ... SkSP-R sampling; life test; Weibull distribution; producer's risk; ... designed a sampling plan under a time-truncated life test .... adjusted using an acceleration factor. ... where P is the probability of lot acceptance for a single.
Sway Area and Velocity Correlated With MobileMat Balance Error Scoring System (BESS) Scores.

Science.gov (United States)

Caccese, Jaclyn B; Buckley, Thomas A; Kaminski, Thomas W

2016-08-01

The Balance Error Scoring System (BESS) is often used for sport-related concussion balance assessment. However, moderate intratester and intertester reliability may cause low initial sensitivity, suggesting that a more objective balance assessment method is needed. The MobileMat BESS was designed for objective BESS scoring, but the outcome measures must be validated with reliable balance measures. Thus, the purpose of this investigation was to compare MobileMat BESS scores to linear and nonlinear measures of balance. Eighty-eight healthy collegiate student-athletes (age: 20.0 ± 1.4 y, height: 177.7 ± 10.7 cm, mass: 74.8 ± 13.7 kg) completed the MobileMat BESS. MobileMat BESS scores were compared with 95% area, sway velocity, approximate entropy, and sample entropy. MobileMat BESS scores were significantly correlated with 95% area for single-leg (r = .332) and tandem firm (r = .474), and double-leg foam (r = .660); and with sway velocity for single-leg (r = .406) and tandem firm (r = .601), and double-leg (r = .575) and single-leg foam (r = .434). MobileMat BESS scores were not correlated with approximate or sample entropy. MobileMat BESS scores were low to moderately correlated with linear measures, suggesting the ability to identify changes in the center of mass-center of pressure relationship, but not higher-order processing associated with nonlinear measures. These results suggest that the MobileMat BESS may be a clinically-useful tool that provides objective linear balance measures.
Adaptive designs for the one-sample log-rank test.

Science.gov (United States)

Schmidt, Rene; Faldum, Andreas; Kwiecien, Robert

2017-09-22

Traditional designs in phase IIa cancer trials are single-arm designs with a binary outcome, for example, tumor response. In some settings, however, a time-to-event endpoint might appear more appropriate, particularly in the presence of loss to follow-up. Then the one-sample log-rank test might be the method of choice. It allows the survival curve of the patients under treatment to be compared with a prespecified reference survival curve. The reference curve usually represents the expected survival under the standard of care. In this work, convergence of the one-sample log-rank statistic to Brownian motion is proven using Rebolledo's martingale central limit theorem while accounting for staggered entry times of the patients. On this basis, a confirmatory adaptive one-sample log-rank test is proposed where provision is made for data-dependent sample size reassessment. The focus is on applying the inverse normal method. This is done in two different directions. The first strategy exploits the independent increments property of the one-sample log-rank statistic. The second strategy is based on the patient-wise separation principle. It is shown by simulation that the proposed adaptive test might help to rescue an underpowered trial and at the same time lowers the average sample number (ASN) under the null hypothesis as compared to a single-stage fixed sample design. © 2017, The International Biometric Society.
Speech-discrimination scores modeled as a binomial variable.

Science.gov (United States)

Thornton, A R; Raffin, M J

1978-09-01

Many studies have reported variability data for tests of speech discrimination, and the disparate results of these studies have not been given a simple explanation. Arguments over the relative merits of 25- vs 50-word tests have ignored the basic mathematical properties inherent in the use of percentage scores. The present study models performance on clinical tests of speech discrimination as a binomial variable. A binomial model was developed, and some of its characteristics were tested against data from 4120 scores obtained on the CID Auditory Test W-22. A table for determining significant deviations between scores was generated and compared to observed differences in half-list scores for the W-22 tests. Good agreement was found between predicted and observed values. Implications of the binomial characteristics of speech-discrimination scores are discussed.
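As a rough illustration of why list length matters under a binomial model, the standard deviation of a percent-correct score with true proportion p on an n-word list is 100·sqrt(p(1 − p)/n). The sketch below is not the authors' published table; the function name and the example proportion of 0.5 are assumptions made here for illustration.

```python
import math


def binomial_sd_percent(p: float, n_items: int) -> float:
    """Standard deviation, in percentage points, of a percent-correct score
    under a binomial model with true proportion p and n_items test words."""
    return 100.0 * math.sqrt(p * (1.0 - p) / n_items)


# Compare a 25-word and a 50-word list at an assumed true proportion of 0.5.
for n_items in (25, 50):
    sd = binomial_sd_percent(0.5, n_items)
    print(f"{n_items}-word list: SD = {sd:.2f} percentage points")
```

Under these assumptions, halving the list from 50 to 25 words widens the spread of scores by a factor of sqrt(2), which is one way to read the 25- vs 50-word debate mentioned in the abstract.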
A two-sample Bayesian t-test for microarray data

Directory of Open Access Journals (Sweden)

Dimmic Matthew W

2006-03-01

Background: Determining whether a gene is differentially expressed in two different samples remains an important statistical problem. Prior work in this area has featured the use of t-tests with pooled estimates of the sample variance based on similarly expressed genes. These methods do not display consistent behavior across the entire range of pooling and can be biased when the prior hyperparameters are specified heuristically. Results: A two-sample Bayesian t-test is proposed for use in determining whether a gene is differentially expressed in two different samples. The test method is an extension of earlier work that made use of point estimates for the variance. The method proposed here explicitly calculates in analytic form the marginal distribution for the difference in the mean expression of two samples, obviating the need for point estimates of the variance without recourse to posterior simulation. The prior distribution involves a single hyperparameter that can be calculated in a statistically rigorous manner, making clear the connection between the prior degrees of freedom and prior variance. Conclusion: The test is easy to understand and implement, and application to both real and simulated data shows that the method has equal or greater power compared to the previous method and demonstrates consistent Type I error rates. The test is generally applicable outside the microarray field to any situation where prior information about the variance is available and is not limited to cases where estimates of the variance are based on many similar observations.

Comparison of formula and number-right scoring in undergraduate medical training: a Rasch model analysis.

Science.gov (United States)

Cecilio-Fernandes, Dario; Medema, Harro; Collares, Carlos Fernando; Schuwirth, Lambert; Cohen-Schotanus, Janke; Tio, René A

2017-11-09

Progress testing is an assessment tool used to periodically assess all students at the end-of-curriculum level. Because students cannot know everything, it is important that they recognize their lack of knowledge. For that reason, the formula-scoring method has usually been used. However, where partial knowledge needs to be taken into account, the number-right scoring method is used. Research comparing both methods has yielded conflicting results. As far as we know, in all these studies, Classical Test Theory or Generalizability Theory was used to analyze the data. In contrast to these studies, we will explore the use of the Rasch model to compare both methods. A 2 × 2 crossover design was used in a study where 298 students from four medical schools participated. A sample of 200 previously used questions from the progress tests was selected. The data were analyzed using the Rasch model, which provides fit parameters, reliability coefficients, and response option analysis. The fit parameters were in the optimal interval ranging from 0.50 to 1.50, and the means were around 1.00. The person and item reliability coefficients were higher in the number-right condition than in the formula-scoring condition. The response option analysis showed that the majority of dysfunctional items emerged in the formula-scoring condition. The findings of this study support the use of number-right scoring over formula scoring. Rasch model analyses showed that tests with number-right scoring have better psychometric properties than formula scoring. However, choosing the appropriate scoring method should depend not only on psychometric properties but also on self-directed test-taking strategies and metacognitive skills.
Bovine milk sampling efficiency for pregnancy-associated glycoproteins (PAG) detection test

Energy Technology Data Exchange (ETDEWEB)

Silva, H. K. da; Cassoli, L.D.; Pantoja, J.F.C.; Cerqueira, P.H.R.; Coitinho, T.B.; Machado, P.F.

2016-07-01

Two experiments were conducted to verify whether the time of day at which a milk sample is collected and the possible carryover in the milking system may affect pregnancy-associated glycoproteins (PAG) levels and, consequently, the pregnancy test results in dairy cows. In experiment one, we evaluated the effect of time of day at which the milk sample is collected from 51 cows. In experiment two, which evaluated the possible occurrence of carryover in the milk meter milking system, milk samples from 94 cows belonging to two different farms were used. The samples were subjected to pregnancy test using ELISA methodology to measure PAG concentrations and to classify the samples as positive (pregnant), negative (nonpregnant), or suspicious (recheck). We found that the time of milking did not affect the PAG levels. As to the occurrence of carryover in the milk meter, the PAG levels of the samples collected from Farm-2 were heavily influenced by a carryover effect compared with the samples from Farm-1. Thus, milk samples submitted to a pregnancy test can be collected during the morning or the evening milking. When the sample is collected from the milk meters, periodic equipment maintenance should be noted, including whether the milk meter is totally drained between different animals’ milking and equipment cleaning between milking is performed correctly to minimize the occurrence of carryover, thereby avoiding the effect on PAG levels and, consequently, the pregnancy test results. Therefore, a single milk sample can be used for both milk quality tests and pregnancy test.
Bovine milk sampling efficiency for pregnancy-associated glycoproteins (PAG) detection test

International Nuclear Information System (INIS)

Silva, H. K. da; Cassoli, L.D.; Pantoja, J.F.C.; Cerqueira, P.H.R.; Coitinho, T.B.; Machado, P.F.

2016-01-01

Two experiments were conducted to verify whether the time of day at which a milk sample is collected and the possible carryover in the milking system may affect pregnancy-associated glycoproteins (PAG) levels and, consequently, the pregnancy test results in dairy cows. In experiment one, we evaluated the effect of time of day at which the milk sample is collected from 51 cows. In experiment two, which evaluated the possible occurrence of carryover in the milk meter milking system, milk samples from 94 cows belonging to two different farms were used. The samples were subjected to pregnancy test using ELISA methodology to measure PAG concentrations and to classify the samples as positive (pregnant), negative (nonpregnant), or suspicious (recheck). We found that the time of milking did not affect the PAG levels. As to the occurrence of carryover in the milk meter, the PAG levels of the samples collected from Farm-2 were heavily influenced by a carryover effect compared with the samples from Farm-1. Thus, milk samples submitted to a pregnancy test can be collected during the morning or the evening milking. When the sample is collected from the milk meters, periodic equipment maintenance should be noted, including whether the milk meter is totally drained between different animals’ milking and equipment cleaning between milking is performed correctly to minimize the occurrence of carryover, thereby avoiding the effect on PAG levels and, consequently, the pregnancy test results. Therefore, a single milk sample can be used for both milk quality tests and pregnancy test.
Quality standards for sample collection in coagulation testing.

Science.gov (United States)

Lippi, Giuseppe; Salvagno, Gian Luca; Montagnana, Martina; Lima-Oliveira, Gabriel; Guidi, Gian Cesare; Favaloro, Emmanuel J

2012-09-01

Preanalytical activities, especially those directly connected with blood sample collection and handling, are the most vulnerable steps throughout the testing process. The receipt of unsuitable samples is commonplace in laboratory practice and represents a serious problem, given that the reliability of test results can be compromised following analysis of these specimens. The basic criteria for an appropriate and safe venipuncture are nearly identical to those used for collecting blood for clinical chemistry and immunochemistry testing, and entail proper patient identification, use of the correct technique, as well as appropriate devices and needles. There are, however, some peculiar aspects, which are deemed to be particularly critical when collecting quality specimens for clot-based tests, and these require clearer recognition. These include prevention of prolonged venous stasis, collection of nonhemolyzed specimens, order of draw, and appropriate filling and mixing of the primary collection tubes. All of these important preanalytical issues are discussed in this article, and evidence-based suggestions as well as recommendations on how to obtain a high-quality sample for coagulation testing are also illustrated. We have also performed an investigation aimed at identifying variation of test results due to underfilling of primary blood tubes, and have identified a clinically significant bias in test results when tubes are drawn at less than 89% of total fill for activated partial thromboplastin time, less than 78% for fibrinogen, and less than 67% for coagulation factor VIII, whereas prothrombin time and activated protein C resistance remain relatively reliable even in tubes drawn at 67% of the nominal volume. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
The Sinonasal Outcome Test 22 score in persons without chronic rhinosinusitis

DEFF Research Database (Denmark)

Lange, Bibi; Thilsing, T; Baelum, J

2016-01-01

-67 with a mean score of 10.5 (CI: 9.1 - 11.9) and the median score was 7. Persons with allergic rhinitis and blue collar workers had a significantly higher score. CONCLUSION: The median value of 7 is taken as the normal SNOT 22 score in persons without CRS and can be used as a reference in clinical settings...... and research. Allergic rhinitis and occupation affect SNOT 22 in persons without CRS. This article is protected by copyright. All rights reserved....
Mineralogic and petrologic investigation of post-test core samples from the Spent Fuel Test - Climax

International Nuclear Information System (INIS)

Ryerson, F.J.; Beiriger, J.

1985-02-01

We have characterized a suite of samples taken subsequent to the end of the Spent Fuel Test - Climax by petrographic and microanalytical techniques and determined their mineral assemblage, modal properties, and mineral chemistry. The samples were obtained immediately adjacent to the canister borehole at a variety of depths and positions within the canister drift, as well as radially outward from each canister hole. This method of sampling allows variations in post-test mineralogic properties to be evaluated on the basis of (1) depth along a particular canister hole and (2) position within the canister drift, with respect to the heat and radiation sources, and with respect to the pre-test samples. In no case did we find any significant correlation between the mineralogical properties and the variables listed above. In short, the Spent Fuel Test - Climax has produced no identifiable mineralogical response in the Climax quartz monzonite. 12 refs., 11 figs., 5 tabs
ISOLOK VALVE ACCEPTANCE TESTING FOR DWPF SME SAMPLING PROCESS

Energy Technology Data Exchange (ETDEWEB)

Edwards, T.; Hera, K.; Coleman, C.; Jones, M.; Wiedenman, B.

2011-12-05

Evaluation of the Defense Waste Processing Facility (DWPF) Chemical Process Cell (CPC) cycle time identified several opportunities to improve the CPC processing time. Of the opportunities, a focus area related to optimizing the equipment and efficiency of the sample turnaround time for DWPF Analytical Laboratory was identified. The Mechanical Systems & Custom Equipment Development (MS&CED) Section of the Savannah River National Laboratory (SRNL) evaluated the possibility of using an Isolok® sampling valve as an alternative to the Hydragard® valve for taking process samples. Previous viability testing was conducted with favorable results using the Isolok sampler and reported in SRNL-STI-2010-00749 (1). This task has the potential to improve operability, reduce maintenance time and decrease CPC cycle time. This report summarizes the results from acceptance testing which was requested in Task Technical Request (TTR) HLW-DWPF-TTR-2010-0036 (2) and which was conducted as outlined in Task Technical and Quality Assurance Plan (TTQAP) SRNL-RP-2011-00145 (3). The Isolok to be tested is the same model which was tested, qualified, and installed in the Sludge Receipt Adjustment Tank (SRAT) sample system. RW-0333P QA requirements apply to this task. This task was to qualify the Isolok sampler for use in the DWPF Slurry Mix Evaporator (SME) sampling process. The Hydragard, which is the current baseline sampling method, was used for comparison to the Isolok sampling data. The Isolok sampler is an air powered grab sampler used to 'pull' a sample volume from a process line. The operation of the sampler is shown in Figure 1. The image on the left shows the Isolok's spool extended into the process line and the image on the right shows the sampler retracted and then dispensing the liquid into the sampling container. To determine tank homogeneity, a Coliwasa sampler was used to grab samples at a high and low location within the mixing tank. Data from
Reassessing the "traditional background hypothesis" for elevated MMPI and MMPI-2 Lie-scale scores.

Science.gov (United States)

Rosen, Gerald M; Baldwin, Scott A; Smith, Ronald E

2016-10-01

The Lie (L) scale of the Minnesota Multiphasic Personality Inventory (MMPI) is widely regarded as a measure of conscious attempts to deny common human foibles and to present oneself in an unrealistically positive light. At the same time, the current MMPI-2 manual states that "traditional" and religious backgrounds can account for elevated L scale scores as high as 65T-79T, thereby tempering impression management interpretations for faith-based individuals. To assess the validity of the traditional background hypothesis, we reviewed 11 published studies that employed the original MMPI with religious samples and found that only 1 obtained an elevated mean L score. We then conducted a meta-analysis of 12 published MMPI-2 studies in which we compared L scores of religious samples to the test normative group. The meta-analysis revealed large between-study heterogeneity (I² = 87.1), L scale scores for religious samples that were somewhat higher but did not approach the upper limits specified in the MMPI-2 manual, and an overall moderate effect size (d̄ = 0.54, p < .001; 95% confidence interval [0.37, 0.70]). Our analyses indicated that religious-group membership accounts, on average, for elevations on L of about 5 T-score points. Whether these scores reflect conscious "fake good" impression management or religious-based virtuousness remains unanswered. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Classroom Organizational Structure in Fifth Grade Math Classrooms and the Effect on Standardized Test Scores

Science.gov (United States)

Lane, Dallas Marie

2017-01-01

The purpose of this study was to determine if there is a relationship between the classroom organizational structure and MCT2 test scores of fifth-grade math students. The researcher gained insight regarding which structure teachers believe is most beneficial to them and students, and whether or not their belief of classroom organizational…
A risk score for predicting coronary artery disease in women with angina pectoris and abnormal stress test finding.

Science.gov (United States)

Lo, Monica Y; Bonthala, Nirupama; Holper, Elizabeth M; Banks, Kamakki; Murphy, Sabina A; McGuire, Darren K; de Lemos, James A; Khera, Amit

2013-03-15

Women with angina pectoris and abnormal stress test findings commonly have no epicardial coronary artery disease (CAD) at catheterization. The aim of the present study was to develop a risk score to predict obstructive CAD in such patients. Data were analyzed from 337 consecutive women with angina pectoris and abnormal stress test findings who underwent cardiac catheterization at our center from 2003 to 2007. Forward selection multivariate logistic regression analysis was used to identify the independent predictors of CAD, defined by ≥50% diameter stenosis in ≥1 epicardial coronary artery. The independent predictors included age ≥55 years (odds ratio 2.3, 95% confidence interval 1.3 to 4.0), body mass index stress imaging (odds ratio 2.8, 95% confidence interval 1.5 to 5.5), and exercise capacity statistic of 0.745 (95% confidence interval 0.70 to 0.79), and an optimized cutpoint of a score of ≤2 included 62% of the subjects and had a negative predictive value of 80%. In conclusion, a simple clinical risk score of 7 characteristics can help differentiate those more or less likely to have CAD among women with angina pectoris and abnormal stress test findings. This tool, if validated, could help to guide testing strategies in women with angina pectoris. Copyright © 2013 Elsevier Inc. All rights reserved.
An Investigation of the Sampling Distributions of Equating Coefficients.

Science.gov (United States)

Baker, Frank B.

1996-01-01

Using the characteristic curve method for dichotomously scored test items, the sampling distributions of equating coefficients were examined. Simulations indicate that for the equating conditions studied, the sampling distributions of the equating coefficients appear to have acceptable characteristics, suggesting confidence in the values obtained…
Your move: The effect of chess on mathematics test scores

DEFF Research Database (Denmark)

Rosholm, Michael; Mikkelsen, Mai Bjørnskov; Gumede, Kamilla Trille

2017-01-01

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1–3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
Your move: The effect of chess on mathematics test scores.

Directory of Open Access Journals (Sweden)

Michael Rosholm

We analyse the effect of substituting a weekly mathematics lesson in primary school grades 1-3 with a lesson in mathematics based on chess instruction. We use data from the City of Aarhus in Denmark, combining test score data with a comprehensive data set obtained from administrative registers. We use two different methodological approaches to identify and estimate treatment effects and we tend to find positive effects, indicating that knowledge acquired through chess play can be transferred to the domain of mathematics. We also find larger impacts for unhappy children and children who are bored in school, perhaps because chess instruction facilitates learning by providing an alternative approach to mathematics for these children. The results are encouraging and suggest that chess may be an important and effective tool for improving mathematical capacity in young students.
A Simple Risk Score for Identifying Individuals with Impaired Fasting Glucose in the Southern Chinese Population

Directory of Open Access Journals (Sweden)

Hui Wang

2015-01-01

This study aimed to develop and validate a simple risk score for detecting individuals with impaired fasting glucose (IFG) among the Southern Chinese population. A sample of participants aged ≥20 years and without known diabetes from the 2006–2007 Guangzhou diabetes cross-sectional survey was used to develop separate risk scores for men and women. The participants completed a self-administered structured questionnaire and underwent simple clinical measurements. The risk scores were developed by multiple logistic regression analysis. External validation was performed based on three other studies: the 2007 Zhuhai rural population-based study, the 2008–2010 Guangzhou diabetes cross-sectional study and the 2007 Tibet population-based study. Performance of the scores was measured with the Hosmer-Lemeshow goodness-of-fit test and ROC c-statistic. Age, waist circumference, body mass index and family history of diabetes were included in the risk score for both men and women, with the additional factor of hypertension for men. The ROC c-statistic was 0.70 for both men and women in the derivation samples. Risk scores of ≥28 for men and ≥18 for women showed respective sensitivity, specificity, positive predictive value and negative predictive value of 56.6%, 71.7%, 13.0% and 96.0% for men and 68.7%, 60.2%, 11% and 96.0% for women in the derivation population. The scores performed comparably with the Zhuhai rural sample and the 2008–2010 Guangzhou urban samples but poorly in the Tibet sample. The performance of pre-existing USA, Shanghai, and Chengdu risk scores was poorer in our population than in their original study populations. The results suggest that the developed simple IFG risk scores can be generalized in Guangzhou city and nearby rural regions and may help primary health care workers to identify individuals with IFG in their practice.
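To make the relationship between the four reported screening measures concrete, the following minimal sketch derives positive and negative predictive values from sensitivity, specificity and prevalence using Bayes' rule. The sensitivity and specificity for men are taken from the abstract; the prevalence of 7% and the function name are hypothetical, chosen only for illustration, and the output is not a reproduction of the study's results.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a screening cut-off via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity for men come from the abstract;
# the 7% prevalence is a hypothetical value, not a figure from the study.
ppv, npv = predictive_values(0.566, 0.717, 0.07)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

When the condition is uncommon in the screened population, a low positive predictive value alongside a high negative predictive value is the expected pattern, even for a score with moderate sensitivity and specificity.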
Visual-Constructional Ability in Individuals with Severe Obesity: Rey Complex Figure Test Accuracy and the Q-Score

Directory of Open Access Journals (Sweden)

Hanna L. Sargénius

2017-09-01

The aims of this study were to investigate visual-construction and organizational strategy among individuals with severe obesity, as measured by the Rey Complex Figure Test (RCFT), and to examine the validity of the Q-score as a measure for the quality of performance on the RCFT. Ninety-six non-demented morbidly obese (MO) patients and 100 healthy controls (HC) completed the RCFT. Their performance was calculated by applying the standard scoring criteria. The quality of the copying process was evaluated per the directions of the Q-score scoring system. Results revealed that the MO did not perform significantly lower than the HC on Copy accuracy (mean difference −0.302, CI −1.374 to 0.769, p = 0.579). In contrast, the groups did statistically differ from each other, with MO performing poorer than the HC on the Q-score (mean −1.784, CI −3.237 to −0.331, p = 0.016) and the Unit points (mean −1.409, CI −2.291 to −0.528, p = 0.002), but not on the Order points score (mean −0.351, CI −0.994 to 0.293, p = 0.284). Differences on the Unit score and the Q-score were slightly reduced when adjusting for gender, age, and education. This study presents evidence supporting the presence of inefficiency in visuospatial constructional ability among MO patients. We believe we have found an indication that the Q-score captures a wider range of cognitive processes that are not described by traditional scoring methods. Rather than considering accuracy and placement of the different elements only, the Q-score focuses more on how the subject has approached the task.
CHARACTERIZATION AND ACTUAL WASTE TEST WITH TANK 5F SAMPLES

International Nuclear Information System (INIS)

Fletcher, D.

2007-01-01

The initial phase of bulk waste removal operations was recently completed in Tank 5F. Video inspection of the tank indicates several mounds of sludge still remain in the tank. Additionally, a mound of white solids was observed under Riser 5. In support of chemical cleaning and heel removal programs, samples of the sludge and the mound of white solids were obtained from the tank for characterization and testing. A core sample of the sludge and Super Snapper sample of the white solids were characterized. A supernate dip sample from Tank 7F was also characterized. A portion of the sludge was used in two tank cleaning tests using oxalic acid at 50 C and 75 C. The filtered oxalic acid from the tank cleaning tests was subsequently neutralized by addition to a simulated Tank 7F supernate. Solids and liquid samples from the tank cleaning test and neutralization test were characterized. A separate report documents the results of the gas generation from the tank cleaning test using oxalic acid and Tank 5F sludge. The characterization results for the Tank 5F sludge sample (FTF-05-06-55) appear quite good with respect to the tight precision of the sample replicates, good results for the glass standards, and minimal contamination found in the blanks and glass standards. The aqua regia and sodium peroxide fusion data also show good agreement between the two dissolution methods. Iron dominates the sludge composition with other major contributors being uranium, manganese, nickel, sodium, aluminum, and silicon. The low sodium value for the sludge reflects the absence of supernate present in the sample due to the core sampler employed for obtaining the sample. The XRD and CSEM results for the Super Snapper salt sample (i.e., white solids) from Tank 5F (FTF-05-07-1) indicate the material contains hydrated sodium carbonate and bicarbonate salts along with some aluminum hydroxide. These compounds likely precipitated from the supernate in the tank. A solubility test showed the material
Systematic review of the evidence for Trails B cut-off scores in assessing fitness-to-drive.

Science.gov (United States)

Roy, Mononita; Molnar, Frank

2013-01-01

Fitness-to-drive guidelines recommend employing the Trail Making B Test (a.k.a. Trails B), but do not provide guidance regarding cut-off scores. There is ongoing debate regarding the optimal cut-off score on the Trails B test. The objective of this study was to address this controversy by systematically reviewing the evidence for specific Trails B cut-off scores (e.g., cut-offs in both time to completion and number of errors) with respect to fitness-to-drive. Systematic review of all prospective cohort, retrospective cohort, case-control, correlation, and cross-sectional studies reporting the ability of the Trails B to predict driving safety that were published in English-language, peer-reviewed journals. Forty-seven articles were reviewed. None of the articles justified sample sizes via formal calculations. Cut-off scores reported based on research include: 90 seconds, 133 seconds, 147 seconds, 180 seconds, and Trails B cut-offs of 3 minutes or 3 errors (the '3 or 3 rule'). Major methodological limitations of this body of research were uncovered including (1) lack of justification of sample size leaving studies open to Type II error (i.e., false negative findings), and (2) excessive focus on associations rather than clinically useful cut-off scores.
Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

Science.gov (United States)

Haberman, Shelby J.

2011-01-01

Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…
Detection of acute deterioration in health status visit among COPD patients by monitoring COPD assessment test score

Directory of Open Access Journals (Sweden)

Pothirat C

2015-02-01

Chaicharn Pothirat, Warawut Chaiwong, Atikun Limsukon, Athavudh Deesomchok, Chalerm Liwsrisakun, Chaiwat Bumroongkit, Theerakorn Theerakittikul, Nittaya Phetsuk. Division of Pulmonary, Critical Care and Allergy, Department of Internal Medicine, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand. Background: The Chronic Obstructive Pulmonary Disease Assessment Test (CAT) could play a role in detecting acute deterioration in health status during monitoring visits in routine clinical practice. Objective: To evaluate the discriminative property of a change in CAT score from a stable baseline visit for detecting acute deterioration in health status visits of chronic obstructive pulmonary disease (COPD) patients. Methods: The CAT questionnaire was administered to stable COPD patients routinely attending the chest clinic of Chiang Mai University Hospital who were monitored using the CAT score every 1–3 months for 15 months. Acute deterioration in health status was defined as worsening or exacerbation. CAT scores at baseline and at subsequent visits with acute deterioration in health status were analyzed using the t-test. Receiver operating characteristic curve analysis was performed to evaluate the discriminative property of change in CAT score for detecting acute deterioration during a health status visit. Results: A total of 354 follow-up visits were made by 140 patients, aged 71.1±8.4 years, with a forced expiratory volume in 1 second of 47.49%±18.2% predicted, who were monitored for 15 months. The mean CAT score changes between stable baseline visits, by patients' and physicians' global assessments, were 0.05 (95% confidence interval [CI], -0.37–0.46) and 0.18 (95% CI, -0.23–0.60), respectively. At worsening visits, as assessed by patients, there was a significant increase in CAT score (6.07; 95% CI, 4.95–7.19). There were also significant increases in CAT scores at visits with mild and moderate exacerbation (5.51 [95% CI, 4.39–6
IMPACT OF SHOTS ON FINAL SCORE OF A FOOTBALL MATCH

Directory of Open Access Journals (Sweden)

Miroslav Radoman

2008-08-01

The research was carried out on a sample of 64 games played at the 2006 FIFA World Cup in Germany, giving 128 game results divided into three groups according to the outcome (win, defeat, or draw). The analysis was based on the total number of shots during the game. The data were analyzed with multivariate analysis of variance (MANOVA) and discriminant analysis, which showed significant differences in the shot elements between game outcomes (win, defeat, or draw). Even though the observed differences are not equally pronounced, the results indicate that there are significant differences in the observed elements of the football game. The further analyses (Roy's test and t-test) confirmed that every analyzed shot element differs statistically significantly by game outcome (win, defeat, or draw), and that the differences in shot elements result from different choices of tactics and techniques, as well as from the ability to carry them out in the attack and defense phases.

Test plan for evaluating the performance of the in-tank fluidic sampling system

International Nuclear Information System (INIS)

BOGER, R.M.

1999-01-01

The PHMC will provide Low Activity Wastes (LAW) tank wastes for final treatment by a privatization contractor from double-shell feed tanks 241-AP-102 and 241-AP-104. Concerns about the inability of the baseline "grab" sampling to provide large volume samples within time constraints have led to the development of a conceptual sampling system that would be deployed in a feed tank riser. This sampling system will provide large volume, representative samples without the environmental, radiation exposure, and sample volume impacts of the current baseline "grab" sampling method. This test plan identifies "proof-of-principle" cold tests for the conceptual sampling system using simulant materials. The need for additional testing was identified as a result of completing tests described in the revision test plan document. Revision 1 outlines tests that will evaluate the performance and ability to provide samples that are representative of a tank's content within a 95 percent confidence interval, to recover from plugging, to sample supernatant wastes with over 25 wt% solids content, and to evaluate the impact of sampling at different heights within the feed tank. The test plan also identifies operating parameters that will optimize the performance of the sampling system.
The NeBoP score - a clinical prediction test for evaluation of children with Lyme Neuroborreliosis in Europe.

Science.gov (United States)

Skogman, Barbro H; Sjöwall, Johanna; Lindgren, Per-Eric

2015-12-17

The diagnosis of Lyme neuroborreliosis (LNB) in Europe is based on clinical symptoms and laboratory data, such as pleocytosis and anti-Borrelia antibodies in serum and CSF according to guidelines. However, the decision to start antibiotic treatment on admission cannot be based on Borrelia serology since results are not available at the time of lumbar puncture. Therefore, an early prediction test would be useful in clinical practice. The aim of the study was to develop and evaluate a clinical prediction test for children with LNB in a relevant European setting. Clinical and laboratory data were collected retrospectively from a cohort of children being evaluated for LNB in Southeast Sweden. A clinical neuroborreliosis prediction test, the NeBoP score, was designed to differentiate between a high and a low risk of having LNB. The NeBoP score was then prospectively validated in a cohort of children being evaluated for LNB in Central and Southeast Sweden (n = 190) and controls with other specific diagnoses (n = 49). The sensitivity of the NeBoP score was 90 % (CI 95 %; 82-99 %) and the specificity was 90 % (CI 95 %; 85-96 %). Thus, the diagnostic accuracy (i.e. how the test correctly discriminates patients from controls) was 90 % and the area under the curve in a ROC analysis was 0.95. The positive predictive value (PPV) was 0.83 (CI 95 %; 0.75-0.93) and the negative predictive value (NPV) was 0.95 (CI 95 %; 0.90-0.99). The overall diagnostic performance of the NeBoP score is high (90 %) and the test is suggested to be useful for decision-making about early antibiotic treatment in children being evaluated for LNB in European Lyme endemic areas.
Semiparametric copula models for biometric score level fusion

NARCIS (Netherlands)

Susyanto, N.

2016-01-01

In biometric recognition, biometric samples (images of faces, fingerprints, voices, gaits, etc.) of people are compared and matchers (classifiers) indicate the level of similarity between any pair of samples by a score. If we model the joint distribution of all scores by a (semiparametric) Gaussian
A study of low scores in Canadian children and adolescents on the Wechsler Intelligence Scale For Children, Fourth Edition (WISC-IV).

Science.gov (United States)

Brooks, Brian L

2011-01-01

Knowing the prevalence of low neurocognitive scores for the WISC-IV Canadian normative sample (WISC-IV(CDN)) is an important supplement for clinical interpretation of test performance. On the WISC-IV(CDN), it is uncommon for children and adolescents to have 4 or more subtest scores or 2 or more Index scores ≤ 9th percentile when all scores on the battery are considered simultaneously. As the level of the child's intelligence increases or the number of years of parental education increases, the prevalence of low scores decreases. These results are consistent with existing studies of the base rates of low scores in children and adolescents on pediatric cognitive batteries, including the WISC-IV American normative sample. Tables provided are ready for clinical use.
Test Score Gaps between Private and Government Sector Students at School Entry Age in India

Science.gov (United States)

Singh, Abhijeet

2014-01-01

Various studies have noted that students enrolled in private schools in India perform better on average than students in government schools. In this paper, I show that large gaps in the test scores of children in private and public sector education are evident even at the point of initial enrollment in formal schooling and are associated with…
How different from random are docking predictions when ranked by scoring functions?

DEFF Research Database (Denmark)

Feliu, Elisenda; Oliva, Baldomero

2010-01-01

on the number of near-native structures in the sampling. We studied the effect of filtering out redundant structures and tested the use of pair-potentials derived using ZDock and ZRank. Our results show that for many targets, it is not possible to determine when a successful reranking performed by scoring...... functions results merely from random choice. This analysis reveals that changes should be made in the design of the CAPRI scoring experiment. We propose including the statistical assessment in this experiment either at the preprocessing or the evaluation step....
Factor Analysis of Temperament Category Scores in a Sample of Nursery School Children.

Science.gov (United States)

Simonds, John F.; Simonds, M. Patricia

1982-01-01

Mothers of children attending nursery schools completed the Behavior Style Questionnaire (BSQ) from which scores for nine temperament categories were derived. Found membership in groups based on factor scores independent of sex, socioeconomic class, age but not ordinal birth position. (Author)
NCACO-score: An effective main-chain dependent scoring function for structure modeling

Directory of Open Access Journals (Sweden)

Dong Xiaoxi

2011-05-01

Background: Development of effective scoring functions is a critical component to the success of protein structure modeling. Previously, many efforts have been dedicated to the development of scoring functions. Despite these efforts, development of an effective scoring function that can achieve both good accuracy and fast speed still presents a grand challenge. Results: Based on a coarse-grained representation of a protein structure by using only four main-chain atoms: N, Cα, C and O, we develop a knowledge-based scoring function, called NCACO-score, that integrates different structural information to rapidly model protein structure from sequence. In testing on the Decoys'R'Us sets, we found that NCACO-score can effectively recognize native conformers from their decoys. Furthermore, we demonstrate that NCACO-score can effectively guide fragment assembly for protein structure prediction, which has achieved a good performance in building the structure models for hard targets from CASP8 in terms of both accuracy and speed. Conclusions: Although NCACO-score is developed based on a coarse-grained model, it is able to discriminate native conformers from decoy conformers with high accuracy. NCACO is a very effective scoring function for structure modeling.
Translation and Adaptation of Knee Injury and Osteoarthritis Outcome Score (KOOS) into Persian and Testing Persian Version Reliability Among Iranians with Osteoarthritis

Directory of Open Access Journals (Sweden)

Solaleh Saraei-Pour

2007-04-01

Objective: To obtain a reliable tool for measuring health-related quality of life among Iranians with knee osteoarthritis by translating and culturally adapting the Knee injury and Osteoarthritis Outcome Score (KOOS) into Persian and testing the reliability and internal consistency of the Iranian version. Materials & Methods: This was a non-experimental methodological study. The KOOS was translated and culturally adapted to the Persian language and culture in three phases, following the IQOLA project. To examine test-retest reliability, the Iranian version of the KOOS was completed twice, at an interval of at least two days and at most one week, by 30 Iranian people with knee OA who had been referred by physicians with a physiotherapy order to Municipality and 110 physiotherapy clinics of Tehran. Convenience (non-probability) sampling was used. Psychometric evaluation: The data collected from the questionnaires were scored and analyzed with SPSS software for test-retest reliability, absolute reliability, and subscale and item internal consistency. Results: Internal consistency, calculated with Cronbach's α, was high for all subscales (at least 0.76) except the "symptom" subscale, for which it was moderate, showing that the items of each subscale measured the same construct. Item internal consistency after correction for overlap was above the optimal value (0.4) for all items except those of the "symptom" subscale, demonstrating good item internal consistency. The SEM and ICC, used to evaluate absolute and test-retest reliability respectively, showed that all subscales had good test-retest reliability (0.7) and that absolute reliability was also very good: the highest SEM calculated for the Persian version was 7.44, which is less than the Minimal Perceptible Clinical Improvement (MPCI), estimated at 8 to 10 for the KOOS questionnaire. Conclusion: With the Persian
Effects of Public Preschool Expenditures on the Test Scores of 4th Graders: Evidence from TIMSS

Science.gov (United States)

Waldfogel, Jane; Zhai, Fuhua

2011-01-01

This study examines the effects of public preschool expenditures on the math and science scores of 4th graders, holding constant child, family, and school characteristics, other relevant social expenditures, and country and year effects, in seven Organization for Economic Co-operation and Development (OECD) countries (Australia, Japan, Netherlands, New Zealand, Norway, U.K., and U.S.) using data from the 1995 and 2003 Trends in International Mathematics and Science Study (TIMSS). Our results indicate that there are small but significant positive effects of public preschool expenditures on the math and science scores of 4th graders and that preschool expenditures reduce the risk of children scoring at the low level of proficiency. We also find some evidence that children from low-resource homes and homes where the test language is not always spoken may tend to gain more from increased public preschool expenditures than other children. PMID: 21442008
Are WISC IQ scores in children with mathematical learning disabilities underestimated? The influence of a specialized intervention on test performance.

Science.gov (United States)

Lambert, Katharina; Spinath, Birgit

2018-01-01

Intelligence measures play a pivotal role in the diagnosis of mathematical learning disabilities (MLD). Probably as a result of math-related material in IQ tests, children with MLD often display reduced IQ scores. However, it remains unclear whether the effects of math remediation extend to IQ scores. The present study investigated the impact of a special remediation program compared to a control group receiving private tutoring (PT) on the WISC IQ scores of children with MLD. We included N=45 MLD children (7-12 years) in a study with a pre- and post-test control group design. Children received remediation for two years on average. The analyses revealed significantly greater improvements in the experimental group on the Full-Scale IQ, and the Verbal Comprehension, Perceptual Reasoning, and Working Memory indices, but not Processing Speed, compared to the PT group. Children in the experimental group showed an average WISC IQ gain of more than ten points. Results indicate that the WISC IQ scores of MLD children might be underestimated and that an effective math intervention can improve WISC IQ test performance. Taking limitations into account, we discuss the use of IQ measures more generally for defining MLD in research and practice. Copyright © 2017 Elsevier Ltd. All rights reserved.
Prediction of mortality using on-line, self-reported health data: empirical test of the RealAge score.

Directory of Open Access Journals (Sweden)

William R Hobbs

OBJECTIVE: We validate an online, personalized mortality risk measure called "RealAge" assigned to 30 million individuals over the past 10 years. METHODS: 188,698 RealAge survey respondents were linked to California Department of Public Health death records using a one-way cryptographic hash of first name, last name, and date of birth. 1,046 were identified as deceased. We used Cox proportional hazards models and receiver operating characteristic (ROC) curves to estimate the relative scales and predictive accuracies of chronological age, the RealAge score, and the Framingham ATP-III score for hard coronary heart disease (HCHD) in this data. To address concerns about selection and to examine possible heterogeneity, we compared the results by time to death at registration, underlying cause of death, and relative health among users. RESULTS: The RealAge score is accurately scaled (hazard ratios: age 1.076; RealAge-age 1.084) and more accurate than chronological age (age c-statistic: 0.748; RealAge c-statistic: 0.847) in predicting mortality from hard coronary heart disease following survey completion. The score is more accurate than the Framingham ATP-III score for hard coronary heart disease (c-statistic: 0.814), perhaps because self-reported cholesterol levels are relatively uninformative in the RealAge user sample. RealAge predicts deaths from malignant neoplasms, heart disease, and external causes. The score does not predict malignant neoplasm deaths when restricted to users with no smoking history, no prior cancer diagnosis, and no indicated health interest in cancer (p-value 0.820). CONCLUSION: The RealAge score is a valid measure of mortality risk in its user population.
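The record linkage step described in the methods can be sketched roughly as follows. The abstract does not name the hash function or how the fields were normalized, so SHA-256, the lower-casing and trimming, the `|` separator, and the function name `linkage_key` are all assumptions made for illustration, and the names and dates are invented.

```python
import hashlib


def linkage_key(first_name, last_name, date_of_birth):
    """Hypothetical linkage key: SHA-256 over normalized identifying fields.

    Normalization (trim + lower-case) is an assumption; it makes records
    match despite differences in case or surrounding whitespace.
    """
    normalized = "|".join(
        part.strip().lower() for part in (first_name, last_name, date_of_birth)
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Invented example data: one survey respondent, one death record.
survey_index = {linkage_key("Ada", "Lovelace", "1815-12-10"): "respondent-1"}
death_record_key = linkage_key(" ada", "LOVELACE ", "1815-12-10")

# Records are compared by hash only, so raw identifiers need not be exchanged.
matched = survey_index.get(death_record_key)
print(matched)
```

An unsalted hash of names and birth dates can still be reversed by enumerating plausible inputs, so a keyed hash would be a more cautious choice in practice.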
The Impact of Scholastic Instrumental Music and Scholastic Chess Study on the Standardized Test Scores of Students in Grades Three, Four, and Five

Science.gov (United States)

Martinez, Edwin E.

2012-01-01

This study examines the impact of instrumental music study and group chess lessons on the standardized test scores of suburban elementary public school students (grades three through five) in Levittown, New York. The study divides the students into the following groups and compares the standardized test scores of each: a) instrumental music…
Questionnaire design: carry-over effects of overall acceptance question placement and pre-evaluation instructions on overall acceptance scores in central location tests.

Science.gov (United States)

Bastian, Mauresa; Eggett, Dennis L; Jefferies, Laura K

2015-02-01

Question placement and usage of pre-evaluation instructions (PEI) in questionnaires for food sensory analysis may bias consumers' scores via carry-over effects. Data from consumer sensory panels previously conducted at a central location, spanning 11 years and covering a broad range of food product categories, were compiled. Overall acceptance (OA) question placement was studied with categories designated as first (the first evaluation question following demographic questions), after nongustation questions (immediately following questions that do not require panelists to taste the product), and later (following all other hedonic and just-about-right [JAR] questions, but occasionally before ranking, open-ended comments, and/or intent to purchase questions). Each panel was categorized as having or not having PEI in the questionnaire; PEI are instructions that appear immediately before the first evaluation question and show panelists all attributes they will evaluate before receiving test samples. Postpanel surveys were administered regarding the self-reported effect of PEI on panelists' evaluation experience. OA scores were analyzed and compared (1) between OA question placement categories and (2) between panels with and without PEI. For most product categories, OA scores tended to be lower when asked later in the questionnaire, suggesting evidence of a carry-over effect. Usage of PEI increased OA scores by 0.10 of a 9-point hedonic scale point, which is not practically significant. Postpanel survey data showed that presence of PEI typically improved the panelists' experience. Using PEI does not appear to introduce a meaningful carry-over effect. © 2015 Institute of Food Technologists®
Accuracy of a pediatric early warning score in the recognition of clinical deterioration

Directory of Open Access Journals (Sweden)

Juliana de Oliveira Freitas Miranda

Full Text Available ABSTRACT Objective: to evaluate the accuracy of the version of the Brighton Pediatric Early Warning Score translated and adapted for the Brazilian context, in the recognition of clinical deterioration. Method: a diagnostic test study to measure the accuracy of the Brighton Pediatric Early Warning Score for the Brazilian context, in relation to a reference standard. The sample consisted of 271 children, aged 0 to 10 years, blindly evaluated by a nurse and a physician, specialists in pediatrics, with interval of 5 to 10 minutes between the evaluations, for the application of the Brighton Pediatric Early Warning Score for the Brazilian context and of the reference standard. The data were processed and analyzed using the Statistical Package for the Social Sciences and VassarStats.net programs. The performance of the Brighton Pediatric Early Warning Score for the Brazilian context was evaluated through the indicators of sensitivity, specificity, predictive values, area under the ROC curve, likelihood ratios and post-test probability. Results: the Brighton Pediatric Early Warning Score for the Brazilian context showed sensitivity of 73.9%, specificity of 95.5%, positive predictive value of 73.3%, negative predictive value of 94.7%, area under Receiver Operating Characteristic Curve of 91.9% and the positive post-test probability was 80%. Conclusion: the Brighton Pediatric Early Warning Score for the Brazilian context, presented good performance, considered valid for the recognition of clinical deterioration warning signs of the children studied.
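The likelihood ratios and post-test probability mentioned above follow from sensitivity and specificity by standard formulas, sketched below in Python. The sensitivity and specificity are the values reported in the abstract. The pre-test probability of 0.20 is a hypothetical input, because the abstract does not state the value the authors used, and the function names are illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous diagnostic test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Sensitivity and specificity as reported in the abstract.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.739, specificity=0.955)

# Hypothetical pre-test probability (an assumption, not from the study).
p_post = post_test_probability(pre_test_probability=0.20, likelihood_ratio=lr_pos)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}, post-test = {p_post:.3f}")
```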
A United States forensic sample for the Gudjonsson Suggestibility Scales.

Science.gov (United States)

Frumkin, I Bruce; Lally, Stephen J; Sexton, James E

2012-01-01

The Gudjonsson Suggestibility Scales (GSS) is a valuable test to use as part of a comprehensive assessment of psychological and interrogative factors relevant to a defendant's vulnerability to giving a false or involuntary confession. One limitation of the test is that the manual only provides information for samples from Iceland and Great Britain. This report describes the results of 334 individuals in the United States, who were administered the tests as part of an evaluation to assess confession-related issues in a forensic context (i.e., capacity to waive Miranda rights or vulnerability in providing a false or involuntary confession). This forensic sample includes both juveniles and adults. Results are consistent with Gudjonsson's British and Icelandic samples, in which the Yield 1 score is more affected by intellectual and cognitive variables, but Shift and, to a lesser extent, Yield 2 scores are more related to emotional and personality characteristics. Copyright © 2012 John Wiley & Sons, Ltd.
Use of Automated Scoring in Spoken Language Assessments for Test Takers with Speech Impairments. Research Report. ETS RR-17-42

Science.gov (United States)

Loukina, Anastassia; Buzick, Heather

2017-01-01

This study is an evaluation of the performance of automated speech scoring for speakers with documented or suspected speech impairments. Given that the use of automated scoring of open-ended spoken responses is relatively nascent and there is little research to date that includes test takers with disabilities, this small exploratory study focuses…
A novel PMT test system based on waveform sampling

Science.gov (United States)

Yin, S.; Ma, L.; Ning, Z.; Qian, S.; Wang, Y.; Jiang, X.; Wang, Z.; Yu, B.; Gao, F.; Zhu, Y.; Wang, Z.

2018-01-01

Compared with the traditional test system based on a QDC, TDC and scaler, a test system based on waveform sampling is constructed for signal sampling of the 8" R5912 and the 20" R12860 Hamamatsu PMTs in different energy states, from single to multiple photoelectrons. In order to achieve high throughput and to reduce the dead time in data processing, data acquisition software based on LabVIEW is developed and runs with a parallel mechanism. The analysis algorithm is realized in LabVIEW, and the spectra of charge, amplitude, signal width and rise time are analyzed offline. The results from the Charge-to-Digital Converter, the Time-to-Digital Converter and waveform sampling are compared in detail.
Scoring in genetically modified organism proficiency tests based on log-transformed results.

Science.gov (United States)

Thompson, Michael; Ellison, Stephen L R; Owen, Linda; Mathieson, Kenneth; Powell, Joanne; Key, Pauline; Wood, Roger; Damant, Andrew P

2006-01-01

The study considers data from 2 UK-based proficiency schemes and includes data from a total of 29 rounds and 43 test materials over a period of 3 years. The results from the 2 schemes are similar and reinforce each other. The amplification process used in quantitative polymerase chain reaction determinations predicts a mixture of normal, binomial, and lognormal distributions dominated by the latter 2. As predicted, the study results consistently follow a positively skewed distribution. Log-transformation prior to calculating z-scores is effective in establishing near-symmetric distributions that are sufficiently close to normal to justify interpretation on the basis of the normal distribution.
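A minimal sketch of scoring on log-transformed results, assuming a base-10 logarithm and a fixed target standard deviation on the log scale. The abstract specifies neither, so `sigma_log`, the example values, and the function name `log_z_score` are hypothetical.

```python
import math


def log_z_score(result, assigned_value, sigma_log):
    """z-score computed on the log10 scale.

    result, assigned_value: positive concentrations (e.g. % GMO content).
    sigma_log: target standard deviation in log10 units (assumed value).
    """
    if result <= 0 or assigned_value <= 0:
        raise ValueError("log-transformation requires positive values")
    return (math.log10(result) - math.log10(assigned_value)) / sigma_log


# Hypothetical round: assigned value 1.0 %, target SD 0.2 log10 units.
z_high = log_z_score(result=2.0, assigned_value=1.0, sigma_log=0.2)
z_low = log_z_score(result=0.5, assigned_value=1.0, sigma_log=0.2)
print(f"z(2.0) = {z_high:.3f}, z(0.5) = {z_low:.3f}")
```

On the log scale, a result twice the assigned value and a result half the assigned value receive z-scores of equal magnitude and opposite sign, which is the symmetry that the transformation aims for with positively skewed data.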
Parametric analyses of summative scores may lead to conflicting inferences when comparing groups: A simulation study.

Science.gov (United States)

Khan, Asaduzzaman; Chien, Chi-Wen; Bagraith, Karl S

2015-04-01

To investigate whether using a parametric statistic in comparing groups leads to different conclusions when using summative scores from rating scales compared with using their corresponding Rasch-based measures. A Monte Carlo simulation study was designed to examine between-group differences in the change scores derived from summative scores from rating scales, and those derived from their corresponding Rasch-based measures, using 1-way analysis of variance. The degree of inconsistency between the 2 scoring approaches (i.e. summative and Rasch-based) was examined, using varying sample sizes, scale difficulties and person ability conditions. This simulation study revealed scaling artefacts that could arise from using summative scores rather than Rasch-based measures for determining the changes between groups. The group differences in the change scores were statistically significant for summative scores under all test conditions and sample size scenarios. However, none of the group differences in the change scores were significant when using the corresponding Rasch-based measures. This study raises questions about the validity of the inference on group differences of summative score changes in parametric analyses. Moreover, it provides a rationale for the use of Rasch-based measures, which can allow valid parametric analyses of rating scale data.

Bayesian posterior sampling via stochastic gradient Fisher scoring

NARCIS (Netherlands)

Ahn, S.; Korattikara, A.; Welling, M.; Langford, J.; Pineau, J.

2012-01-01

In this paper we address the following question: "Can we approximately sample from a Bayesian posterior distribution if we are only allowed to touch a small mini-batch of data-items for every sample we generate?". An algorithm based on the Langevin equation with stochastic gradients (SGLD) was
Commutability of food microbiology proficiency testing samples.

Science.gov (United States)

Abdelmassih, M; Polet, M; Goffaux, M-J; Planchon, V; Dierick, K; Mahillon, J

2014-03-01

Food microbiology proficiency testing (PT) is a useful tool to assess the analytical performances among laboratories. PT items should be close to routine samples to accurately evaluate the acceptability of the methods. However, most PT providers distribute exclusively artificial samples such as reference materials or irradiated foods. This raises the issue of the suitability of these samples because the equivalence-or 'commutability'-between results obtained on artificial vs. authentic food samples has not been demonstrated. In the clinical field, the use of noncommutable PT samples has led to erroneous evaluation of the performances when different analytical methods were used. This study aimed to provide a first assessment of the commutability of samples distributed in food microbiology PT. REQUASUD and IPH organized 13 food microbiology PTs including 10-28 participants. Three types of PT items were used: genuine food samples, sterile food samples and reference materials. The commutability of the artificial samples (reference material or sterile samples) was assessed by plotting the distribution of the results on natural and artificial PT samples. This comparison highlighted matrix-correlated issues when nonfood matrices, such as reference materials, were used. Artificially inoculated food samples, on the other hand, raised only isolated commutability issues. In the organization of a PT-scheme, authentic or artificially inoculated food samples are necessary to accurately evaluate the analytical performances. Reference materials, used as PT items because of their convenience, may present commutability issues leading to inaccurate penalizing conclusions for methods that would have provided accurate results on food samples. For the first time, the commutability of food microbiology PT samples was investigated. The nature of the samples provided by the organizer turned out to be an important factor because matrix effects can impact on the analytical results. © 2013
Interpreting force concept inventory scores: Normalized gain and SAT scores

Directory of Open Access Journals (Sweden)

Jeffrey J. Steinert

2007-05-01

Preinstruction SAT scores and normalized gains (G) on the force concept inventory (FCI) were examined for individual students in interactive engagement (IE) courses in introductory mechanics at one high school (N=335) and one university (N=292), and strong, positive correlations were found for both populations (r=0.57 and r=0.46, respectively). These correlations are likely due to the importance of cognitive skills and abstract reasoning in learning physics. The larger correlation coefficient for the high school population may be a result of the much shorter time interval between taking the SAT and studying mechanics, because the SAT may provide a more current measure of abilities when high school students begin the study of mechanics than it does for college students, who begin mechanics years after the test is taken. In prior research a strong correlation between FCI G and scores on Lawson’s Classroom Test of Scientific Reasoning for students from the same two schools was observed. Our results suggest that, when interpreting class average normalized FCI gains and comparing different classes, it is important to take into account the variation of students’ cognitive skills, as measured either by the SAT or by Lawson’s test. While Lawson’s test is not commonly given to students in most introductory mechanics courses, SAT scores provide a readily available alternative means of taking account of students’ reasoning abilities. Knowing the students’ cognitive level before instruction also allows one to alter instruction or to use an intervention designed to improve students’ cognitive level.
Interpreting force concept inventory scores: Normalized gain and SAT scores

Directory of Open Access Journals (Sweden)

Vincent P. Coletta

2007-05-01

Preinstruction SAT scores and normalized gains (G) on the force concept inventory (FCI) were examined for individual students in interactive engagement (IE) courses in introductory mechanics at one high school (N=335) and one university (N=292), and strong, positive correlations were found for both populations (r=0.57 and r=0.46, respectively). These correlations are likely due to the importance of cognitive skills and abstract reasoning in learning physics. The larger correlation coefficient for the high school population may be a result of the much shorter time interval between taking the SAT and studying mechanics, because the SAT may provide a more current measure of abilities when high school students begin the study of mechanics than it does for college students, who begin mechanics years after the test is taken. In prior research a strong correlation between FCI G and scores on Lawson’s Classroom Test of Scientific Reasoning for students from the same two schools was observed. Our results suggest that, when interpreting class average normalized FCI gains and comparing different classes, it is important to take into account the variation of students’ cognitive skills, as measured either by the SAT or by Lawson’s test. While Lawson’s test is not commonly given to students in most introductory mechanics courses, SAT scores provide a readily available alternative means of taking account of students’ reasoning abilities. Knowing the students’ cognitive level before instruction also allows one to alter instruction or to use an intervention designed to improve students’ cognitive level.
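
The abstract above relies on the normalized gain G without restating how it is computed. The following is a minimal sketch that assumes the commonly used definition, G = (post - pre) / (100 - pre) with scores in percent. The function name and the example scores are hypothetical and are not taken from the paper.

```python
def normalized_gain(pre_percent: float, post_percent: float) -> float:
    """Normalized gain G = (post - pre) / (100 - pre), scores given in percent.

    Assumed definition: the fraction of the possible improvement
    that a student actually achieved between pre-test and post-test.
    """
    if not 0.0 <= pre_percent < 100.0:
        raise ValueError("pre-test score must lie in [0, 100)")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical student: 40% before instruction, 70% after.
print(normalized_gain(40.0, 70.0))  # 0.5
```

A student who starts at 40% and ends at 70% has realized half of the available improvement, hence G = 0.5.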
REIMEP-22 inter-laboratory comparison. "U Age Dating - determination of the production date of a uranium certified test sample"

Energy Technology Data Exchange (ETDEWEB)

Venchiarutti, Celia; Richter, Stephan; Jakopic, Rozle; Aregbe, Yetunde [European Commission, Joint Research Centre (JRC), Geel (Belgium). Institute for Reference Materials and Measurements (IRMM); Varga, Zsolt; Mayer, Klaus [European Commission, Joint Research Centre (JRC), Karlsruhe (Germany). Institute for Transuranium Elements (ITU)

2015-07-01

The REIMEP-22 inter-laboratory comparison aimed at determining the production date of a uranium certified test sample (i.e. the last chemical separation date of the material). Participants in REIMEP-22 on "U Age Dating - Determination of the production date of a uranium certified test sample" received one low-enriched 20 mg uranium sample for mass spectrometry measurements and/or one 50 mg uranium sample for α-spectrometry measurements, with an undisclosed value for the production date. They were asked to report the isotope amount ratios n(²³⁰Th)/n(²³⁴U) for the 20 mg uranium sample and/or the activity ratios A(²³⁰Th)/A(²³⁴U) for the 50 mg uranium sample in addition to the calculated production date of the certified test samples with its uncertainty. Reporting of the ²³¹Pa/²³⁵U ratio and the respective calculated production date was optional. Eleven laboratories reported results in REIMEP-22. Two of them reported results for both the 20 mg and 50 mg uranium certified test samples. The measurement capability of the participants was assessed against the independent REIMEP-22 reference value by means of z- and zeta-scores in compliance with ISO 13528:2005. Furthermore a performance assessment criterion for acceptable uncertainty was applied to evaluate the participants' results. In general, the REIMEP-22 participants' results were satisfactory. This confirms the analytical capabilities of laboratories to determine accurately the age of uranium materials with low amount of ingrown thorium (young certified test sample). The Joint Research Centre of the European Commission (EC-JRC) organised REIMEP-22 in parallel to the preparation and certification of a uranium reference material certified for the production date (IRMM-1000a and IRMM-1000b).
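
The z- and zeta-scores mentioned in the abstract above can be illustrated with a short sketch. It assumes the usual proficiency-testing forms: z divides the deviation from the reference value by a standard deviation for proficiency assessment, and zeta divides it by the combined standard uncertainty of the reported and reference values. The thresholds of 2 and 3 are the customary convention, and all numbers are made up for illustration; none come from REIMEP-22.

```python
import math


def z_score(x: float, x_ref: float, sigma_pt: float) -> float:
    """Deviation from the reference value in units of sigma_pt (assumed form)."""
    return (x - x_ref) / sigma_pt


def zeta_score(x: float, u_x: float, x_ref: float, u_ref: float) -> float:
    """Deviation in units of the combined standard uncertainty (assumed form)."""
    return (x - x_ref) / math.sqrt(u_x ** 2 + u_ref ** 2)


def classify(score: float) -> str:
    """Customary interpretation: |s| <= 2 satisfactory, |s| >= 3 unsatisfactory."""
    magnitude = abs(score)
    if magnitude <= 2.0:
        return "satisfactory"
    if magnitude < 3.0:
        return "questionable"
    return "unsatisfactory"


# Hypothetical laboratory result against a hypothetical reference value.
z = z_score(x=10.3, x_ref=10.0, sigma_pt=0.2)
zeta = zeta_score(x=10.3, u_x=0.3, x_ref=10.0, u_ref=0.4)
print(classify(z), classify(zeta))
```

The zeta-score additionally checks whether a laboratory's stated uncertainty is realistic: a small deviation can still give a large zeta if the reported uncertainty is too optimistic.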
Polytrauma Defined by the New Berlin Definition: A Validation Test Based on Propensity-Score Matching Approach.

Science.gov (United States)

Rau, Cheng-Shyuan; Wu, Shao-Chun; Kuo, Pao-Jen; Chen, Yi-Chun; Chien, Peng-Chen; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua

2017-09-11

Background: Polytrauma patients are expected to have a higher risk of mortality than that obtained by the summation of expected mortality owing to their individual injuries. This study was designed to investigate the outcome of patients with polytrauma, which was defined using the new Berlin definition, as cases with an Abbreviated Injury Scale (AIS) ≥ 3 for two or more different body regions and one or more additional variables from five physiologic parameters (hypotension [systolic blood pressure ≤ 90 mmHg], unconsciousness [Glasgow Coma Scale score ≤ 8], acidosis [base excess ≤ -6.0], coagulopathy [partial thromboplastin time ≥ 40 s or international normalized ratio ≥ 1.4], and age [≥70 years]). Methods: We retrieved detailed data on 369 polytrauma patients and 1260 non-polytrauma patients with an overall Injury Severity Score (ISS) ≥ 18 who were hospitalized between 1 January 2009 and 31 December 2015 for the treatment of all traumatic injuries, from the Trauma Registry System at a level I trauma center. Patients with burn injury or incomplete registered data were excluded. Categorical data were compared with two-sided Fisher exact or Pearson chi-square tests. The unpaired Student t-test and the Mann-Whitney U-test were used to analyze normally distributed continuous data and non-normally distributed data, respectively. A propensity-score matched cohort in a 1:1 ratio was allocated using the NCSS software with logistic regression to evaluate the effect of polytrauma on patient outcomes. Results: The polytrauma patients had a significantly higher ISS than non-polytrauma patients (median (interquartile range Q1-Q3), 29 (22-36) vs. 24 (20-25), respectively; p Polytrauma patients had a 1.9-fold higher odds of mortality than non-polytrauma patients (95% CI 1.38-2.49; p polytrauma patients, polytrauma patients had a substantially longer hospital length of stay (LOS). In addition, a higher proportion of polytrauma patients were admitted to the intensive
Predisposing factors of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy: comparison between CT emphysema score and pulmonary function test

Energy Technology Data Exchange (ETDEWEB)

Lee, Chang Ho; Park, Kyung Joo; Park, Dong Won; Jung, Kyung Il; Suh, Jung Ho [Ajou Univ. College of Medicine, Seoul (Korea, Republic of)

1997-11-01

To compare the CT emphysema score with various factors of pulmonary function test by simple spirometry and to use the result as a predictor of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy. The CT scans of 106 patients who had undergone percutaneous transthoracic fine needle aspiration biopsy of lung lesions within the previous 18 months were retrospectively reviewed. In 75 of these 106 cases, the results of the pulmonary function test were also reviewed. On plain chest radiography, pneumothorax was noted in 20 cases (19%). Emphysema was blindly evaluated. We divided each lung into four segments and determined the severity and involved volume of emphysema, as seen on CT. Severity was classified as one of four grades, as follows: absence of emphysema=0; low attenuation area of less than 5mm=1; low attenuation area of more than 5mm, and vascular pruning with normal lung intervening=2; and diffuse low attenuation without intervening normal lung, and larger confluent low attenuation with vascular pruning and distortion of branching pattern occupying all or almost all the involved parenchyma=3. The involved area was also classified as one of four grades: less than 25%=1; 25 - 49%=2; 51 - 74%=3; and more than 75%=4. The CT emphysema score was defined as the average of the grade of severity multiplied by the grade of involved area. Pulmonary function tests, consisting of simple spirometry and a pulmonologist's interpretation, were evaluated. We also evaluated depth and size of lesion as known predisposing factors in postbioptic pneumothorax. Statistical analysis was performed using the chi-square test, Wilcoxon rank sum W test and the Student t test. A comparison between the two groups of occurrence (with or without pneumothorax) showed the emphysema scores to be 1.69±2.0 and 1.11±2.9, respectively; there was thus no significant difference between the two groups (z= - 0.048, p>0.10). Nor were differences revealed by the
Predisposing factors of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy: comparison between CT emphysema score and pulmonary function test

International Nuclear Information System (INIS)

Lee, Chang Ho; Park, Kyung Joo; Park, Dong Won; Jung, Kyung Il; Suh, Jung Ho

1997-01-01

To compare the CT emphysema score with various factors of pulmonary function test by simple spirometry and to use the result as a predictor of pneumothorax in percutaneous transthoracic fine needle aspiration biopsy. The CT scans of 106 patients who had undergone percutaneous transthoracic fine needle aspiration biopsy of lung lesions within the previous 18 months were retrospectively reviewed. In 75 of these 106 cases, the results of the pulmonary function test were also reviewed. On plain chest radiography, pneumothorax was noted in 20 cases (19%). Emphysema was blindly evaluated. We divided each lung into four segments and determined the severity and involved volume of emphysema, as seen on CT. Severity was classified as one of four grades, as follows: absence of emphysema=0; low attenuation area of less than 5mm=1; low attenuation area of more than 5mm, and vascular pruning with normal lung intervening=2; and diffuse low attenuation without intervening normal lung, and larger confluent low attenuation with vascular pruning and distortion of branching pattern occupying all or almost all the involved parenchyma=3. The involved area was also classified as one of four grades: less than 25%=1; 25 - 49%=2; 51 - 74%=3; and more than 75%=4. The CT emphysema score was defined as the average of the grade of severity multiplied by the grade of involved area. Pulmonary function tests, consisting of simple spirometry and a pulmonologist's interpretation, were evaluated. We also evaluated depth and size of lesion as known predisposing factors in postbioptic pneumothorax. Statistical analysis was performed using the chi-square test, Wilcoxon rank sum W test and the Student t test. A comparison between the two groups of occurrence (with or without pneumothorax) showed the emphysema scores to be 1.69±2.0 and 1.11±2.9, respectively; there was thus no significant difference between the two groups (z= - 0.048, p>0.10). Nor were differences revealed by the pulmonary
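
The CT emphysema score described in this abstract can be sketched as follows. This is one reading of the definition, namely the mean over lung segments of (severity grade × involved-area grade), and it assumes eight segments in total (four per lung). The function name, the input layout and the example grades are hypothetical.

```python
from statistics import mean


def ct_emphysema_score(segments):
    """Mean of severity * area grade over all lung segments.

    `segments` is a list of (severity, area_grade) pairs, one per segment,
    with severity in 0..3 and area_grade in 1..4 as graded in the abstract.
    A segment without emphysema (severity 0) contributes 0 to the mean.
    """
    for severity, area_grade in segments:
        if severity not in range(0, 4) or area_grade not in range(1, 5):
            raise ValueError("severity must be 0-3 and area grade 1-4")
    return mean(severity * area_grade for severity, area_grade in segments)


# Hypothetical patient: four clear segments and four affected ones.
patient = [(0, 1)] * 4 + [(1, 1), (2, 2), (3, 4), (1, 2)]
print(ct_emphysema_score(patient))  # 2.375
```

The products for the example are 0, 0, 0, 0, 1, 4, 12 and 2, which sum to 19 and average to 2.375 over eight segments.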
Predicting Freshman Grade Point Average From College Admissions Test Scores and State High School Test Scores

Directory of Open Access Journals (Sweden)

Daniel Koretz

2016-09-01

The current focus on assessing “college and career readiness” raises an empirical question: How do high school tests compare with college admissions tests in predicting performance in college? We explored this using data from the City University of New York and public colleges in Kentucky. These two systems differ in the choice of college admissions test, the stakes for students on the high school test, and demographics. We predicted freshman grade point average (FGPA) from high school GPA and both college admissions and high school tests in mathematics and English. In both systems, the choice of tests had only trivial effects on the aggregate prediction of FGPA. Adding either test to an equation that included the other had only trivial effects on prediction. Although the findings suggest that the choice of test might advantage or disadvantage different students, it had no substantial effect on the over- and underprediction of FGPA for students classified by race-ethnicity or poverty.
Tracer gas diffusion sampling test plan

International Nuclear Information System (INIS)

Rohay, V.J.

1993-01-01

Efforts are under way to employ active and passive vapor extraction to remove carbon tetrachloride from the soil in the 200 West Area at the Hanford Site as part of the 200 West Area Carbon Tetrachloride Expedited Response Action. In the active approach, a vacuum is applied to a well, which causes soil gas surrounding the well to be drawn up to the surface. The contaminated air is cleaned by passage through a granular activated carbon bed. There are questions concerning the radius of influence associated with application of the vacuum system and related uncertainties about the soil-gas diffusion rates with and without the vacuum system present. To address these questions, a series of tracer gas diffusion sampling tests is proposed in which an inert, nontoxic tracer gas, sulfur hexafluoride (SF6), will be injected into a well, and the rates of SF6 diffusion through the surrounding soil horizon will be measured by sampling in nearby wells. Tracer gas tests will be conducted at sites very near the active vacuum extraction system and also at sites beyond the radius of influence of the active vacuum system. In the passive vapor extraction approach, barometric pressure fluctuations cause soil gas to be drawn to the surface through the well. At the passive sites, the effects of barometric "pumping" due to changes in atmospheric pressure will be investigated. Application of tracer gas testing to both the active and passive vapor extraction methods is described in the wellfield enhancement work plan (Rohay and Cameron 1993).
'Neknomination': Predictors in a sample of UK university students.

Science.gov (United States)

Moss, Antony C; Spada, Marcantonio M; Harkin, Jamila; Albery, Ian P; Rycroft, Nicola; Nikčević, Ana V

2015-06-01

To identify prevalence and predictors of participation in the online drinking game 'neknomination' amongst university students. A convenience sample of 145 university students participated in a study about drinking behaviours, completing a questionnaire about their participation in neknomination, the Alcohol Use Disorders Identification Test, and the Resistance to Peer Influence Scale. Out of 145 students sampled, 54% took part in neknomination in the previous month. Mann-Whitney U tests revealed significantly higher scores on the Alcohol Use Disorders Identification Test, and significantly lower scores on the Resistance to Peer Influence Scale, for those who had participated in neknomination. A significant correlation was also shown between specific peer pressure to neknominate, and engagement in neknomination. A logistic regression analysis indicated that scores on the Alcohol Use Disorders Identification Test, but not the Resistance to Peer Influence Scale, predicted classification as an individual who participated in neknomination. We found that over half of respondents had participated in a neknomination game in the past month, with almost all male respondents having done so. Participation in neknomination was strongly associated with general hazardous drinking behaviour but not with resistance to peer influence. Further research is needed to understand the role of engagement with social media in drinking games and risky drinking.
Test-retest variability of Randot stereoacuity measures gathered in an unselected sample of UK primary school children.

Science.gov (United States)

Adler, Paul; Scally, Andrew J; Barrett, Brendan T

2012-05-01

To determine the test-retest reliability of the Randot stereoacuity test when used as part of vision screening in schools. Randot stereoacuity (graded-circles) and logMAR visual acuity measures were gathered in an unselected sample of 139 children (aged 4-12, mean 8.1±2.1 years) in two schools. Randot testing was repeated on two occasions (average interval between successive tests 8 days, range: 1-21 days). Three Randot scores were obtained in 97.8% of children. Randot stereoacuity improved by an average of one plate (ie, one test level) on repeat testing but was little changed when tested on the third occasion. Within-subject variability was up to three test levels on repeat testing. When stereoacuity was categorised as 'fine', 'intermediate' or 'coarse', the greatest variability was found among younger children who exhibited 'intermediate' or 'coarse'/nil stereopsis on initial testing. Whereas 90.8% of children with 'fine' stereopsis (≤50 arc-seconds) on the first test exhibited 'fine' stereopsis on both subsequent tests, only ∼16% of children with 'intermediate' (>50 but ≤140 arc-seconds) or 'coarse'/nil (≥200 arc-seconds) stereoacuity on initial testing exhibited stable test results on repeat testing. Children exhibiting abnormal stereoacuity on initial testing are very likely to exhibit a normal result when retested. The value of a single, abnormal Randot graded-circles stereoacuity measure from school screening is therefore questionable.
Timed up & go test score in patients with hip fracture is related to the type of walking aid

DEFF Research Database (Denmark)

Kristensen, Morten T; Bandholm, Thomas; Holm, Bente

2009-01-01

Kristensen MT, Bandholm T, Holm B, Ekdahl C, Kehlet H. Timed Up & Go test score in patients with hip fracture is related to the type of walking aid. OBJECTIVE: To determine the relationship between Timed Up & Go (TUG) test scores and type of walking aid used during the test, and to determine...... the feasibility of using the rollator as a standardized walking aid during the TUG in patients with hip fracture who were allowed full weight-bearing (FWB). DESIGN: Prospective methodological study. SETTING: An acute orthopedic hip fracture unit at a university hospital. PARTICIPANTS: Patients (N=126; 90 women......, 36 men) with hip fracture with a mean age +/- SD of 74.8+/-12.7 years performed the TUG the day before discharge from the orthopedic ward. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: The TUG was performed with the walking aid the patient was to be discharged with: a walker (n=88) or elbow...
Zero Calcium Score as a Filter for Further Testing in Patients Admitted to the Coronary Care Unit with Chest Pain.

Science.gov (United States)

Correia, Luis Cláudio Lemos; Esteves, Fábio P; Carvalhal, Manuela; Souza, Thiago Menezes Barbosa de; Sá, Nicole de; Correia, Vitor Calixto de Almeida; Alexandre, Felipe Kalil Beirão; Lopes, Fernanda; Ferreira, Felipe; Noya-Rabelo, Márcia

2017-06-12

The accuracy of zero coronary calcium score as a filter in patients with chest pain has been demonstrated at the emergency room and outpatient clinics, populations with low prevalence of coronary artery disease (CAD). To test the gatekeeping role of zero calcium score in patients with chest pain admitted to the coronary care unit (CCU), where the pretest probability of CAD is higher than that of other populations. Patients underwent computed tomography for calcium scoring, and obstructive CAD was defined by a minimum 70% stenosis on invasive angiography. In 146 patients studied, the prevalence of CAD was 41%. A zero calcium score was present in 35% of the patients. The sensitivity and specificity of zero calcium score yielded a negative likelihood ratio of 0.16. After logistic regression adjustment for pretest probability, zero calcium score was independently associated with lower odds of CAD (OR = 0.12, 95%CI = 0.04-0.36), increasing the area under the ROC curve of the clinical model from 0.76 to 0.82 (p = 0.006). Zero calcium score provided a net reclassification improvement of 0.20 (p = 0.0018) over the clinical model when using a pretest probability threshold of 10% for discharging without further testing. In patients with pretest probability zero calcium score had a negative predictive value of 95% (95%CI = 83%-99%), with a number needed to test of 2.1 for obtaining one additional discharge. Zero calcium score substantially reduces the pretest probability of obstructive CAD in patients admitted to the CCU with acute chest pain. (Arq Bras Cardiol. 2017; [online].ahead print, PP.0-0).
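
To make the role of the negative likelihood ratio concrete, the sketch below applies the standard odds form of Bayes' rule: post-test odds = pretest odds × likelihood ratio. As an illustration only, it treats the cohort prevalence of 41% as the pretest probability and uses the reported negative likelihood ratio of 0.16. The function names are hypothetical, and the calculation is not a reproduction of the authors' analysis.

```python
def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Illustration: 41% pretest probability and LR- = 0.16 (figures from the abstract).
probability = post_test_probability(0.41, 0.16)
print(f"{probability:.3f}")  # 0.100
```

Under these assumptions a zero calcium score lowers the probability of obstructive CAD from 41% to roughly 10%, which happens to coincide with the discharge threshold mentioned in the abstract.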
Women’s experience with home-based self-sampling for human papillomavirus testing

International Nuclear Information System (INIS)

Sultana, Farhana; Mullins, Robyn; English, Dallas R.; Simpson, Julie A.; Drennan, Kelly T.; Heley, Stella; Wrede, C. David; Brotherton, Julia M. L.; Saville, Marion; Gertig, Dorota M.

2015-01-01

Increasing cervical screening coverage by reaching inadequately screened groups is essential for improving the effectiveness of cervical screening programs. Offering HPV self-sampling to women who are never or under-screened can improve screening participation, however participation varies widely between settings. Information on women’s experience with self-sampling and preferences for future self-sampling screening is essential for programs to optimize participation. The survey was conducted as part of a larger trial (“iPap”) investigating the effect of HPV self-sampling on participation of never and under-screened women in Victoria, Australia. Questionnaires were mailed to a) most women who participated in the self-sampling to document their experience with and preference for self-sampling in future, and b) a sample of the women who did not participate asking reasons for non-participation and suggestions for enabling participation. Reasons for not having a previous Pap test were also explored. About half the women who collected a self sample for the iPap trial returned the subsequent questionnaire (746/1521). Common reasons for not having cervical screening were that having Pap test performed by a doctor was embarrassing (18 %), not having the time (14 %), or that a Pap test was painful and uncomfortable (11 %). Most (94 %) found the home-based self-sampling less embarrassing, less uncomfortable (90 %) and more convenient (98 %) compared with their last Pap test experience (if they had one); however, many were unsure about the test accuracy (57 %). Women who self-sampled thought the instructions were clear (98 %), it was easy to use the swab (95 %), and were generally confident that they did the test correctly (81 %). Most preferred to take the self-sample at home in the future (88 %) because it was simple and did not require a doctor’s appointment. Few women (126/1946, 7 %) who did not return a self-sample in the iPap trial returned the questionnaire
Extension of the lod score: the mod score.

Science.gov (United States)

Clerget-Darpoux, F

2001-01-01

In 1955 Morton proposed the lod score method both for testing linkage between loci and for estimating the recombination fraction between them. If a disease is controlled by a gene at one of these loci, the lod score computation requires the prior specification of an underlying model that assigns the probabilities of genotypes from the observed phenotypes. To address the case of linkage studies for diseases with unknown mode of inheritance, we suggested (Clerget-Darpoux et al., 1986) extending the lod score function to a so-called mod score function. In this function, the variables are both the recombination fraction and the disease model parameters. Maximizing the mod score function over all these parameters amounts to maximizing the probability of marker data conditional on the disease status. Under the absence of linkage, the mod score conforms to a chi-square distribution, with extra degrees of freedom in comparison to the lod score function (MacLean et al., 1993). The mod score is asymptotically maximum for the true disease model (Clerget-Darpoux and Bonaïti-Pellié, 1992; Hodge and Elston, 1994). Consequently, the power to detect linkage through mod score will be highest when the space of models where the maximization is performed includes the true model. On the other hand, one must avoid overparametrization of the model space. For example, when the approach is applied to affected sibpairs, only two constrained disease model parameters should be used (Knapp et al., 1994) for the mod score maximization. It is also important to emphasize the existence of a strong correlation between the disease gene location and the disease model. Consequently, there is poor resolution of the location of the susceptibility locus when the disease model at this locus is unknown. Of course, this is true regardless of the statistics used. The mod score may also be applied in a candidate gene strategy to model the potential effect of this gene in the disease. Since, however, it
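
For readers unfamiliar with the lod score that the mod score extends, here is a minimal sketch for the simplest textbook case: phase-known, fully informative meioses, where the likelihood is θ^k (1-θ)^(n-k) for k recombinants in n meioses and the lod score is log10 of L(θ)/L(1/2). The function name and the counts are hypothetical. The mod score discussed above would additionally maximize over disease-model parameters, which this sketch does not attempt.

```python
import math


def lod_score(theta: float, recombinants: int, meioses: int) -> float:
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_likelihood = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_likelihood_null = n * math.log10(0.5)
    return log_likelihood - log_likelihood_null


# Hypothetical data: 1 recombinant among 10 meioses.
grid = [i / 100 for i in range(1, 51)]
best_theta = max(grid, key=lambda t: lod_score(t, 1, 10))
print(best_theta, round(lod_score(best_theta, 1, 10), 3))
```

On this grid the maximum falls at θ = 0.1, the observed recombination fraction k/n, and the lod score there is about 1.6. At θ = 0.5 the score is zero by construction.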
Evaluation of Two Methods for Modeling Measurement Errors When Testing Interaction Effects with Observed Composite Scores

Science.gov (United States)

Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C.

2018-01-01

Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…
Test plan for the Sample Transfer Canister system

International Nuclear Information System (INIS)

Flanagan, B.D.

1998-01-01

The Sample Transfer Canister will be used by the Waste Receiving and Processing Facility (WRAP) for the transport of small quantity liquid samples that meet the definition of a limited quantity radioactive material, and may also be corrosive and/or flammable. These samples will be packaged and shipped in accordance with the US Department of Transportation (DOT) regulation 49 CFR 173.4, "Exceptions for small quantities." The Sample Transfer Canister is of a "French Can" design, intended to be mated with a glove box for loading/unloading. Transport will typically take place north of the Wye Barricade between WRAP and the 222-S Laboratory. The Sample Transfer Canister will be shipped in an insulated ice chest, but the ice chest will not be a part of the small quantity package during prototype testing.
The Implementation of Role-Playing Model in Principles of Finance Accounting Learning to Improve Students’ Enjoyment and Students’ Test Scores

Directory of Open Access Journals (Sweden)

L. Saptono

2010-01-01

This study is classroom action research. Its goal was to improve students’ enjoyment level and their test scores by implementing the role-playing method. The research was conducted in the Accounting Education Study Program of Sanata Dharma University in the odd semester of the 2010/2011 academic year. The participants were divided into two classes. The first class received the treatment, while the second class served as the control. The results showed an improvement in students’ enjoyment level and test scores in the class that implemented the role-playing method.
Benefits of expressive writing in reducing test anxiety: A randomized controlled trial in Chinese samples.

Science.gov (United States)

Shen, Lujun; Yang, Lei; Zhang, Jing; Zhang, Meng

2018-01-01

To explore the effect of expressive writing of positive emotions on test anxiety among senior-high-school students. The Test Anxiety Scale (TAS) was used to assess the anxiety level of 200 senior-high-school students. Seventy-five students with high anxiety were recruited and divided randomly into experimental and control groups. Each day for 30 days, the experimental group engaged in 20 minutes of expressive writing of positive emotions, while the control group was asked to merely write down their daily events. A second test was given after the month-long experiment to analyze whether there had been a reduction in anxiety among the sample. Quantitative data was obtained from TAS scores. The NVivo10.0 software program was used to examine the frequency of particular word categories used in participants' writing manuscripts. Senior-high-school students indicated moderate to high test anxiety. There was a significant difference in post-test results (P 0.05). Students' writing manuscripts were mainly encoded on five code categories: cause, anxiety manifestation, positive emotion, insight and evaluation. There was a negative relation between positive emotion, insight codes and test anxiety. There were significant differences in the positive emotion, anxiety manifestation, and insight code categories between the first 10 days' manuscripts and the last 10 days' ones. Long-term expressive writing of positive emotions appears to help reduce test anxiety by using insight and positive emotion words for Chinese students. Efficient and effective intervention programs to ease test anxiety can be designed based on this study.

Benefits of expressive writing in reducing test anxiety: A randomized controlled trial in Chinese samples

Science.gov (United States)

Zhang, Jing; Zhang, Meng

2018-01-01

Purpose To explore the effect of expressive writing of positive emotions on test anxiety among senior-high-school students. Methods The Test Anxiety Scale (TAS) was used to assess the anxiety level of 200 senior-high-school students. Seventy-five students with high anxiety were recruited and divided randomly into experimental and control groups. Each day for 30 days, the experimental group engaged in 20 minutes of expressive writing of positive emotions, while the control group was asked to merely write down their daily events. A second test was given after the month-long experiment to analyze whether there had been a reduction in anxiety among the sample. Quantitative data was obtained from TAS scores. The NVivo10.0 software program was used to examine the frequency of particular word categories used in participants’ writing manuscripts. Results Senior-high-school students indicated moderate to high test anxiety. There was a significant difference in post-test results (P 0.05). Students’ writing manuscripts were mainly encoded on five code categories: cause, anxiety manifestation, positive emotion, insight and evaluation. There was a negative relation between positive emotion, insight codes and test anxiety. There were significant differences in the positive emotion, anxiety manifestation, and insight code categories between the first 10 days’ manuscripts and the last 10 days’ ones. Conclusions Long-term expressive writing of positive emotions appears to help reduce test anxiety by using insight and positive emotion words for Chinese students. Efficient and effective intervention programs to ease test anxiety can be designed based on this study. PMID:29401473
A comparative study of general intelligence in Spanish and Moroccan samples.

Science.gov (United States)

Diaz, Amelia; Sellami, Khadija; Infanzón, Eugenia; Lanzón, Teresa; Lynn, Richard

2012-07-01

The aim of this study is to fill a gap in intelligence research by presenting data for the average IQ in Morocco and for a comparable sample in Spain. Adult samples were administered the Raven Standard Progressive Matrices (SPM) (Raven, Court, & Raven, 2001) and scored for the total test and for the three sub-factors of gestalt continuation, verbal-analytical reasoning and visuospatial ability identified by Lynn, Allik, and Irwing (2004). The total test and the three factors showed satisfactory reliability. Our results for the Moroccan sample show significant relationships of the general intelligence factor, gestalt continuation and visuospatial ability with education level and income. Conversely, these variables were independent in the Spanish sample. The Spanish sample obtained significantly higher scores than the Moroccan one on the four factors assessed. These differences were also found when comparing samples with the same education levels. Finally, the percentage of errors was higher for Moroccans than for Spaniards on all items, suggesting that the level of difficulty was higher for the Moroccan sample.
EPS-LASSO: Test for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits.

Science.gov (United States)

Xu, Chao; Fang, Jian; Shen, Hui; Wang, Yu-Ping; Deng, Hong-Wen

2018-01-25

Extreme phenotype sampling (EPS) is a broadly used design to identify candidate genetic factors contributing to the variation of quantitative traits. By enriching the signals in extreme phenotypic samples, EPS can boost the association power compared to random sampling. Most existing statistical methods for EPS examine the genetic factors individually, although many quantitative traits have multiple genetic factors underlying their variation. It is desirable to model the joint effects of genetic factors, which may increase the power and identify novel quantitative trait loci under EPS. The joint analysis of genetic data in high-dimensional situations requires specialized techniques, e.g., the least absolute shrinkage and selection operator (LASSO). Although there is extensive research and application related to LASSO, statistical inference and testing for the sparse model under EPS remain unknown. We propose a novel sparse model (EPS-LASSO) with a hypothesis test for high-dimensional regression under EPS based on a decorrelated score function. A comprehensive simulation shows that EPS-LASSO outperforms existing methods with stable type I error and FDR control. EPS-LASSO can provide consistent power for both low- and high-dimensional situations compared with the other methods dealing with high-dimensional situations. The power of EPS-LASSO is close to other low-dimensional methods when the causal effect sizes are small and is superior when the effects are large. Applying EPS-LASSO to a transcriptome-wide gene expression study for obesity reveals 10 significant body mass index associated genes. Our results indicate that EPS-LASSO is an effective method for EPS data analysis, which can account for correlated predictors. The source code is available at https://github.com/xu1912/EPSLASSO. Contact: hdeng2@tulane.edu. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press. All rights reserved.
Heart valve surgery: EuroSCORE vs. EuroSCORE II vs. Society of Thoracic Surgeons score

Directory of Open Access Journals (Sweden)

Muhammad Sharoz Rabbani

2014-12-01

Background This is a validation study comparing the European System for Cardiac Operative Risk Evaluation (EuroSCORE) II with the previous additive (AES) and logistic (LES) EuroSCORE and the Society of Thoracic Surgeons’ (STS) risk prediction algorithm, for patients undergoing valve replacement with or without bypass in Pakistan. Patients and Methods Clinical data of 576 patients undergoing valve replacement surgery between 2006 and 2013 were retrospectively collected and individual expected risks of death were calculated by all four risk prediction algorithms. Performance of these risk algorithms was evaluated in terms of discrimination and calibration. Results There were 28 deaths (4.8%) among 576 patients, which was lower than the predicted mortality of 5.16%, 6.96% and 4.94% by AES, LES and EuroSCORE II but was higher than the 2.13% predicted by the STS scoring system. For single and double valve replacement procedures, EuroSCORE II was the best predictor of mortality, with the highest Hosmer and Lemeshow (H-L) test p value (0.346 to 0.689) and area under the receiver operating characteristic (ROC) curve (0.637 to 0.898). For valve plus concomitant coronary artery bypass grafting (CABG) patients, actual mortality was 1.88%. The STS calculator was the best predictor of mortality for this subgroup, with H-L p value (0.480 to 0.884) and ROC (0.657 to 0.775). Conclusions For the Pakistani population, EuroSCORE II is an accurate predictor of individual operative risk in patients undergoing isolated valve surgery, whereas STS performs better in the valve plus CABG group.
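The discrimination measure reported in the abstract above (area under the ROC curve) can be illustrated with a minimal sketch. It uses the pairwise-comparison form of the AUC: the fraction of (death, survivor) pairs in which the patient who died was assigned the higher predicted risk, counting ties as one half. The function name `roc_auc` and all risk values and outcomes are hypothetical and are not taken from the study.

```python
def roc_auc(predicted_risk, died):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [p for p, d in zip(predicted_risk, died) if d]
    neg = [p for p, d in zip(predicted_risk, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # death ranked above survivor
            elif p == n:
                wins += 0.5      # tie
    return wins / (len(pos) * len(neg))


# Hypothetical predicted mortality risks and observed outcomes (1 = died).
risk = [0.02, 0.05, 0.10, 0.03, 0.20, 0.12]
died = [0, 0, 1, 0, 1, 0]
auc = roc_auc(risk, died)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. Calibration (for example the Hosmer and Lemeshow test named in the abstract) is a separate property and is not covered by this sketch.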
Pattern analysis of total item score and item response of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative sample of US adults

Directory of Open Access Journals (Sweden)

Shinichiro Tomitaka

2017-02-01

Background Several recent studies have shown that total scores on depressive symptom measures in a general population approximate an exponential pattern except for the lower end of the distribution. Furthermore, we confirmed that the exponential pattern is present for the individual item responses on the Center for Epidemiologic Studies Depression Scale (CES-D). To confirm the reproducibility of such findings, we investigated the total score distribution and item responses of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative study. Methods Data were drawn from the National Survey of Midlife Development in the United States (MIDUS), which comprises four subsamples: (1) a national random digit dialing (RDD) sample, (2) oversamples from five metropolitan areas, (3) siblings of individuals from the RDD sample, and (4) a national RDD sample of twin pairs. K6 items are scored using a 5-point scale: “none of the time,” “a little of the time,” “some of the time,” “most of the time,” and “all of the time.” The pattern of total score distribution and item responses were analyzed using graphical analysis and an exponential regression model. Results The total score distributions of the four subsamples exhibited an exponential pattern with similar rate parameters. The item responses of the K6 approximated a linear pattern from “a little of the time” to “all of the time” on log-normal scales, while “none of the time” response was not related to this exponential pattern. Discussion The total score distribution and item responses of the K6 showed exponential patterns, consistent with other depressive symptom scales.
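One simple way to estimate the rate parameter of an exponential score distribution like the one described above is to fit a straight line to the logarithm of the frequencies. The sketch below does this with ordinary least squares; it is a stand-in for illustration and may differ from the exponential regression model the authors used. The function name `fit_exponential_rate` and the counts are hypothetical.

```python
import math


def fit_exponential_rate(scores, counts):
    """Least-squares fit of log(count) = intercept - rate * score.

    Scores with a zero count are skipped because log(0) is undefined.
    Returns (rate, intercept).
    """
    pairs = [(float(s), math.log(c)) for s, c in zip(scores, counts) if c > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


# Hypothetical counts that decay exactly exponentially with rate 0.3.
# The lower end of the score range is left out, as the abstract notes
# that the exponential pattern does not hold there.
scores = list(range(5, 25))
counts = [1000.0 * math.exp(-0.3 * s) for s in scores]
rate, intercept = fit_exponential_rate(scores, counts)
```

On real survey data the points would scatter around the line, and comparing the fitted rates across subsamples is one way to check whether they are similar.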
Credit concession through credit scoring: Analysis and application proposal

Directory of Open Access Journals (Sweden)

Oriol Amat

2017-01-01

Purpose: The study herein develops and tests a credit scoring model which can help financial institutions in assessing credit requests. Design/methodology/approach: The empirical study has the objective of answering two questions: (1) Which ratios better discriminate the companies based on their being solvent or insolvent? and (2) What is the relative importance of these ratios? To do this, several statistical techniques with a multifactorial focus have been used (Multivariate Analysis of Variance, Linear Discriminant Analysis, Logit and Probit Models). Several samples of companies have been used in order to obtain and to test the model. Findings: Through the application of several statistical techniques, the credit scoring model has been proved to be effective in discriminating between good and bad creditors. Research limitations: This study focuses on manufacturing, commercial and services companies of all sizes in Spain; therefore, the conclusions may differ for other geographical locations. Practical implications: Because credit is one of the main drivers of growth, a solid credit scoring model can help financial institutions decide to whom to grant credit and to whom not to grant credit. Social implications: Because of the growing importance of credit for our society and the fear of granting it due to the latest financial turmoil, a solid credit scoring model can strengthen trust in financial institutions’ assessments. Originality/value: There is already a stream of literature related to credit scoring. However, this paper focuses on Spanish firms and proves the results of our model based on real data. The application of the model to detect the probability of default in loans is original.
Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing.

Science.gov (United States)

Cai, Li

2015-06-01

Lord and Wingersky's (Appl Psychol Meas 8:453-461, 1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined on a grid formed by direct products of quadrature points. However, the increase in computational burden remains exponential in the number of dimensions, making the implementation of the recursive algorithm cumbersome for truly high-dimensional models. In this paper, a dimension reduction method that is specific to the Lord-Wingersky recursions is developed. This method can take advantage of the restrictions implied by hierarchical item factor models, e.g., the bifactor model, the testlet model, or the two-tier model, such that a version of the Lord-Wingersky recursive algorithm can operate on a dramatically reduced set of quadrature points. For instance, in a bifactor model, the dimension of integration is always equal to 2, regardless of the number of factors. The new algorithm not only provides an effective mechanism to produce summed score to IRT scaled score translation tables properly adjusted for residual dependence, but leads to new applications in test scoring, linking, and model fit checking as well. Simulated and empirical examples are used to illustrate the new applications.
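For orientation, the original unidimensional recursion that the abstract above builds on can be sketched in a few lines. This is only the classic Lord-Wingersky recursion for dichotomous items at a single ability value, not the dimension-reduced version developed in the paper. The two-parameter logistic item model, the function names, and the item parameters are assumptions made for illustration.

```python
import math


def two_pl_probability(theta, discrimination, difficulty):
    """Probability of a correct response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def summed_score_likelihoods(p_correct):
    """Lord-Wingersky recursion for dichotomous items at one ability value.

    p_correct[i] is the probability of answering item i correctly at a
    fixed theta. Returns a list L with L[s] = P(summed score = s | theta).
    """
    likelihoods = [1.0]  # with zero items, a score of 0 is certain
    for p in p_correct:
        updated = [0.0] * (len(likelihoods) + 1)
        for score, prob in enumerate(likelihoods):
            updated[score] += prob * (1.0 - p)  # item answered incorrectly
            updated[score + 1] += prob * p      # item answered correctly
        likelihoods = updated
    return likelihoods


# Hypothetical (discrimination, difficulty) pairs for three items.
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]
theta = 0.0
probs = [two_pl_probability(theta, a, b) for a, b in items]
likelihoods = summed_score_likelihoods(probs)
```

Repeating the recursion at each quadrature point and weighting by the prior gives the summed-score posteriors used for score translation tables. In the multidimensional case the grid of quadrature points grows exponentially with the number of dimensions, which is the burden the abstract's method is designed to reduce.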
The first DC performance test and analysis of CC conductor short sample at ASIPP conductor test facility

International Nuclear Information System (INIS)

Shi Yi; Wu Yu; Liu Huajun; Long Feng; Qian Li; Ren Zhibin; Li Shaolei; Liu Bo; Chen Jinglin

2012-01-01

Highlights: ► In this study the first DC performance experiments of an ITER correction coil conductor short sample have been carried out in the ASIPP test facility. ► A CC conductor short sample was fabricated and tested to confirm the capability of this test facility for qualification tests of CC conductors. ► There is no obvious impact of cycling on DC performance measurement. ► The measured results of current sharing temperature are in agreement with the expected results from strand scaling. Abstract: The first DC performance experiments of an ITER correction coil (CC) conductor short sample were carried out in the conductor test facility of the Institute of Plasma Physics, CAS (ASIPP) in January this year. The experiments aim to investigate the DC performance of the ITER CC conductor. The tested conductor short sample is bent into a half circle with a diameter of 270 mm to match the shape of the background magnetic field. The half-circle part of the sample is longer than the final twist pitch. The current sharing temperature (T_cs) in the 3.86 T external magnetic field (B_ex), ≤12 kA, could be measured, including the critical current (I_c) run. There is no obvious impact of 1000 cycles on DC performance. The measured T_cs results are in agreement with the expected results from strand scaling.
Predicting Freshman Grade Point Average From College Admissions Test Scores and State High School Test Scores

OpenAIRE

Koretz, Daniel; Yu, C; Mbekeani, Preeya Pandya; Langi, M.; Dhaliwal, Tasminda Kaur; Braslow, David Arthur

2016-01-01

The current focus on assessing “college and career readiness” raises an empirical question: How do high school tests compare with college admissions tests in predicting performance in college? We explored this using data from the City University of New York and public colleges in Kentucky. These two systems differ in the choice of college admissions test, the stakes for students on the high school test, and demographics. We predicted freshman grade point average (FGPA) from high school GPA an...
Laboratory Tests of Bitumen Samples Elasticity

Science.gov (United States)

Ziganshin, E. R.; Usmanov, S. A.; Khasanov, D. I.; Khamidullina, G. S.

2018-05-01

This paper is devoted to the study of the elastic and acoustic properties of bitumen core samples. The travel velocities of the ultrasonic P- and S-waves were determined under in-situ simulation conditions. The resulting data were then used to calculate dynamic Young's modulus and Poisson's ratio. The authors studied the correlation between the elasticity and the permeability and porosity. In addition, the tests looked into how the acoustic properties had changed with temperature rise.
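The step from wave velocities to dynamic elastic constants mentioned above follows the standard relations for an isotropic elastic medium. The sketch below applies them; it assumes the bulk density is known, which the abstract does not mention. The function name and the input values are hypothetical.

```python
def dynamic_elastic_moduli(vp, vs, density):
    """Dynamic Young's modulus and Poisson's ratio from wave velocities.

    vp, vs  : P- and S-wave velocities in m/s
    density : bulk density in kg/m^3 (assumed to be measured separately)
    Returns (young_modulus_pa, poisson_ratio) for an isotropic medium.
    """
    if not (vp > vs > 0):
        raise ValueError("expected vp > vs > 0")
    vp2, vs2 = vp * vp, vs * vs
    poisson = (vp2 - 2.0 * vs2) / (2.0 * (vp2 - vs2))
    young = density * vs2 * (3.0 * vp2 - 4.0 * vs2) / (vp2 - vs2)
    return young, poisson


# Hypothetical sample: Vs = 1000 m/s, Vp = sqrt(3) * Vs, density 2000 kg/m^3.
young, poisson = dynamic_elastic_moduli(3.0 ** 0.5 * 1000.0, 1000.0, 2000.0)
```

With Vp/Vs equal to the square root of 3, Poisson's ratio is 0.25, which makes a convenient sanity check for the formulas.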
Evaluation of the validity of osteoporosis and fracture risk assessment tools (IOF One Minute Test, SCORE, and FRAX) in postmenopausal Palestinian women.

Science.gov (United States)

Kharroubi, Akram; Saba, Elias; Ghannam, Ibrahim; Darwish, Hisham

2017-12-01

Simple self-assessment tools are needed to identify women at high risk of developing osteoporosis. In this study, tools like the IOF One Minute Test, Fracture Risk Assessment Tool (FRAX), and Simple Calculated Osteoporosis Risk Estimation (SCORE) were found to be valid for Palestinian women. The threshold for predicting women at risk for each tool was estimated. The purpose of this study is to evaluate the validity of the updated IOF (International Osteoporosis Foundation) One Minute Osteoporosis Risk Assessment Test, FRAX, and SCORE, as well as age alone, to detect the risk of developing osteoporosis in postmenopausal Palestinian women. Three hundred eighty-two women 45 years and older were recruited, including 131 women with osteoporosis and 251 controls following bone mineral density (BMD) measurement; 287 completed questionnaires of the different risk assessment tools. Receiver operating characteristic (ROC) curves were evaluated for each tool using BMD as the gold standard for osteoporosis. The area under the ROC curve (AUC) was the highest for FRAX calculated with BMD for predicting hip fractures (0.897) followed by FRAX for major fractures (0.826) with cut-off values >1.5 and >7.8%, respectively. The IOF One Minute Test AUC (0.629) was the lowest compared to other tested tools but with sufficient accuracy for predicting the risk of developing osteoporosis with a cut-off value of >4 total yes questions out of 18. The SCORE test and age alone were also good predictors of risk for developing osteoporosis. According to the ROC curve for age, women ≥64 years had a higher risk of developing osteoporosis. A higher percentage of women with low BMD (T-score ≤-1.5) or osteoporosis (T-score ≤-2.5) was found among women who were not exposed to the sun, who had menopause before the age of 45 years, or had lower body mass index (BMI) compared to controls. Women who often fall had lower BMI and approximately 27% of the recruited postmenopausal
Use of Prehire Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) Police Candidate Scores to Predict Supervisor Ratings of Posthire Performance.

Science.gov (United States)

Tarescavage, Anthony M; Brewster, JoAnne; Corey, David M; Ben-Porath, Yossef S

2015-08-01

We examined associations between prehire Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) scores and posthire performance ratings for a sample of 131 male police officers. Substantive scale scores in this sample were meaningfully lower than those obtained by the test's normative sample and substantially range restricted, but scores were consistent with those produced by members of the police candidate comparison group (Corey & Ben-Porath). After applying a statistical correction for range restriction, we found several associations between MMPI-2-RF substantive scale scores and supervisor ratings of job-related performance. Findings for scales from the emotional dysfunction and interpersonal functioning domains of the test were particularly strong. For example, scales assessing low positive emotions and social avoidance were associated with several criteria that may be affected by lack of engagement with one's environment and other people, including problems with routine task performance, decision making, assertiveness, conscientiousness, and social competence. Implications of these findings for assessment science and practice are discussed. © The Author(s) 2014.
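The abstract above does not say which range-restriction correction was applied. As an illustration only, the sketch below uses one widely taught formula, Thorndike's Case II correction for direct range restriction, which rescales an observed correlation by the ratio of the unrestricted to the restricted standard deviation of the predictor. The function name and the numbers are hypothetical and are not values from the study.

```python
import math


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction for direct range restriction.

    r_restricted    : correlation observed in the range-restricted sample
    sd_unrestricted : predictor SD in the reference (unrestricted) group
    sd_restricted   : predictor SD in the restricted sample
    """
    if sd_restricted <= 0 or sd_unrestricted <= 0:
        raise ValueError("standard deviations must be positive")
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1.0 + r_restricted ** 2 * (u ** 2 - 1.0))


# Hypothetical example: observed r = 0.20, predictor SD halved by selection.
corrected = correct_for_range_restriction(0.20, 10.0, 5.0)
```

When the two standard deviations are equal the correlation is returned unchanged. The more the selected sample's spread shrinks relative to the reference group, the larger the upward adjustment.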
Intelligence Test Scores and Birth Order among Young Norwegian Men (Conscripts) Analyzed within and between Families

Science.gov (United States)

Bjerkedal, Tor; Kristensen, Petter; Skjeret, Geir A.; Brevik, John I.

2007-01-01

The present paper reports the results of a within and between family analysis of the relation between birth order and intelligence. The material comprises more than a quarter of a million test scores for intellectual performance of Norwegian male conscripts recorded during 1984-2004. Conscripts, mostly 18-19 years of age, were born to women for…
Genome-Wide Polygenic Scores Predict Reading Performance Throughout the School Years.

Science.gov (United States)

Selzam, Saskia; Dale, Philip S; Wagner, Richard K; DeFries, John C; Cederlöf, Martin; O'Reilly, Paul F; Krapohl, Eva; Plomin, Robert

2017-07-04

It is now possible to create individual-specific genetic scores, called genome-wide polygenic scores (GPS). We used a GPS for years of education (EduYears) to predict reading performance assessed at UK National Curriculum Key Stages 1 (age 7), 2 (age 12) and 3 (age 14) and on reading tests administered at ages 7 and 12 in a UK sample of 5,825 unrelated individuals. EduYears GPS accounts for up to 5% of the variance in reading performance at age 14. GPS predictions remained significant after accounting for general cognitive ability and family socioeconomic status. Reading performance of children in the lowest and highest 12.5% of the EduYears GPS distribution differed by a mean growth in reading ability of approximately two school years. It seems certain that polygenic scores will be used to predict strengths and weaknesses in education.
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Science.gov (United States)

Uno, Yota; Mizukami, Hitomi; Ando, Masahiko; Yukihiro, Ryoji; Iwasaki, Yoko; Ozaki, Norio

2014-01-01

The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ... Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
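The post-test probability mentioned above follows from a likelihood ratio by converting the pre-test probability to odds, multiplying by the ratio, and converting back. The sketch below uses the two stratum-specific likelihood ratios reported in the abstract (13.8 and 0.1). The pre-test probability of 0.20 is a hypothetical value chosen for illustration, not a figure from the study.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


pre_test = 0.20  # hypothetical prevalence of a deficit in intellectual function

# Likelihood ratios reported in the abstract for the two BIQ strata.
p_low_stratum = post_test_probability(pre_test, 13.8)   # BIQ <= 65
p_high_stratum = post_test_probability(pre_test, 0.1)   # BIQ >= 76
```

Under this assumed pre-test probability, a score in the low stratum raises the probability to roughly 0.78, and a score in the high stratum lowers it to roughly 0.02, which matches the abstract's statement that the condition could be determined or ruled out.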
Transfer of test samples and wastes between post-irradiation test facilities (FMF, AGF, MMF)

International Nuclear Information System (INIS)

Ishida, Yasukazu; Suzuki, Kazuhisa; Ebihara, Hikoe; Matsushima, Yasuyoshi; Kashiwabara, Hidechiyo

1975-02-01

Wide review is given on the problems associated with the transfer of test samples and wastes between post-irradiation test facilities, FMF (Fuel Monitoring Facility), AGF (Alpha Gamma Facility), and MMF (Material Monitoring Facility) at the Oarai Engineering Center, PNC. The test facilities are connected with the JOYO plant, an experimental fast reactor being constructed at Oarai. As introductory remarks, some special features of transferring irradiated materials are described. In the second part, problems on the management of nuclear materials and radio isotopes are described item by item. In the third part, the specific materials that are envisaged to be transported between JOYO and the test facilities are listed together with their geometrical shapes, dimensions, etc. In the fourth part, various routes and methods of transportation are explained with many block charts and figures. Brief explanation with lists and drawings is also given to transportation casks and vessels. Finally, some future problems are discussed, such as the prevention of diffusive contamination, ease of decontamination, and the identification of test samples. (Aoki, K.)
In-Pipe Wireless Communication for Underground Sampling and Testing

NARCIS (Netherlands)

Nguyen, Nhan D.T.; Le, Duc V.; Meratnia, Nirvana; Havinga, Paul J.M.

2017-01-01

In this paper, we present an effective and low-cost wireless communication system for extremely long and narrow pipes that can replace the extant wired system in underground sensor network applications such as soil sampling and testing with the Cone Penetration Test (CPT), the most widely used
Test plan for K Basin Sludge Canister and Floor Sampling Device

International Nuclear Information System (INIS)

Meling, T.A.

1995-01-01

This document provides the test plan and procedure forms for conducting the functional and operational acceptance testing of the K Basin Sludge Canister and Floor Sampling Device(s). These samplers sample sludge off the floor of the 100K Basins and out of 100K fuel storage canisters.
The Bootstrap, the Jackknife, and the Randomization Test: A Sampling Taxonomy.

Science.gov (United States)

Rodgers, J L

1999-10-01

A simple sampling taxonomy is defined that shows the differences between and relationships among the bootstrap, the jackknife, and the randomization test. Each method has as its goal the creation of an empirical sampling distribution that can be used to test statistical hypotheses, estimate standard errors, and/or create confidence intervals. Distinctions between the methods can be made based on the sampling approach (with replacement versus without replacement) and the sample size (replacing the whole original sample versus replacing a subset of the original sample). The taxonomy is useful for teaching the goals and purposes of resampling schemes. An extension of the taxonomy implies other possible resampling approaches that have not previously been considered. Univariate and multivariate examples are presented.
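The two distinctions named in the abstract above (with versus without replacement, whole sample versus subset) can be made concrete with a minimal sketch. The mapping used here is a common reading of the three methods: the bootstrap resamples the whole sample with replacement, the jackknife takes subsets of size n - 1 without replacement, and the randomization test reshuffles the whole pooled sample without replacement. The function names and data are hypothetical.

```python
import random
from statistics import mean


def bootstrap_sample(data, rng):
    """With replacement, same size as the original sample."""
    return [rng.choice(data) for _ in data]


def jackknife_samples(data):
    """Without replacement, subsets of size n - 1 (leave one out)."""
    data = list(data)
    return [data[:i] + data[i + 1:] for i in range(len(data))]


def randomization_test(group_a, group_b, rng, n_permutations=2000):
    """Without replacement, whole sample: reshuffle the group labels.

    Returns a two-sided p-value for the difference in group means.
    """
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


rng = random.Random(0)          # fixed seed so the sketch is repeatable
a = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical scores, group A
b = [11.0, 12.0, 13.0, 14.0, 15.0]  # hypothetical scores, group B
boot = bootstrap_sample(a, rng)
jack = jackknife_samples(a)
p_value = randomization_test(a, b, rng)
```

Each function produces draws from which an empirical sampling distribution of a statistic can be built, which is the shared goal the abstract describes.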

Psychometric properties of the Hare Psychopathy Checklist-Revised (PCL-R) in a representative sample of Canadian federal offenders.

Science.gov (United States)

Storey, Jennifer E; Hart, Stephen D; Cooke, David J; Michie, Christine

2016-04-01

The Hare Psychopathy Checklist-Revised (PCL-R; Hare, 2003) is a commonly used psychological test for assessing traits of psychopathic personality disorder. Despite the abundance of research using the PCL-R, the vast majority of research used samples of convenience rather than systematic methods to minimize sampling bias and maximize the generalizability of findings. This potentially complicates the interpretation of test scores and research findings, including the "norms" for offenders from the United States and Canada included in the PCL-R manual. In the current study, we evaluated the psychometric properties of PCL-R scores for all male offenders admitted to a regional reception center of the Correctional Service of Canada during a 1-year period (n = 375). Because offenders were admitted for assessment prior to institutional classification, they comprise a sample that was heterogeneous with respect to correctional risks and needs yet representative of all offenders in that region of the service. We examined the distribution of PCL-R scores, classical test theory indices of its structural reliability, the factor structure of test items, and the external correlates of test scores. The findings were highly consistent with those typically reported in previous studies. We interpret these results as indicating it is unlikely any sampling limitations of past research using the PCL-R resulted in findings that were, overall, strongly biased or unrepresentative. (c) 2016 APA, all rights reserved.
Developing an international scoring system for a consensus-based social cognition measure: MSCEIT-managing emotions.

Science.gov (United States)

Hellemann, G S; Green, M F; Kern, R S; Sitarenios, G; Nuechterlein, K H

2017-10-01

Measures of social cognition are increasingly being applied to psychopathology, including studies of schizophrenia and other psychotic disorders. Tests of social cognition present unique challenges for international adaptations. The Mayer-Salovey-Caruso Emotional Intelligence Test, Managing Emotions Branch (MSCEIT-ME) is a commonly-used social cognition test that involves the evaluation of social scenarios presented in vignettes. This paper presents evaluations of translations of this test in six different languages based on representative samples from the relevant countries. The goal was to identify items from the MSCEIT-ME that show different response patterns across countries using indices of discrepancy and content validity criteria. An international version of the MSCEIT-ME scoring was developed that excludes items that showed undesirable properties across countries. We then confirmed that this new version had better performance (i.e. less discrepancy across regions) in international samples than the version based on the original norms. Additionally, it provides scores that are comparable to ratings based on local norms. This paper shows that it is possible to adapt complex social cognitive tasks so they can provide valid data across different cultural contexts.
Naive scoring of human sleep based on a hidden Markov model of the electroencephalogram.

Science.gov (United States)

Yaghouby, Farid; Modur, Pradeep; Sunderam, Sridhar

2014-01-01

Clinical sleep scoring involves tedious visual review of overnight polysomnograms by a human expert. Many attempts have been made to automate the process by training computer algorithms such as support vector machines and hidden Markov models (HMMs) to replicate human scoring. Such supervised classifiers are typically trained on scored data and then validated on scored out-of-sample data. Here we describe a methodology based on HMMs for scoring an overnight sleep recording without the benefit of a trained initial model. The number of states in the data is not known a priori and is optimized using a Bayes information criterion. When tested on a 22-subject database, this unsupervised classifier agreed well with human scores (mean of Cohen's kappa > 0.7). The HMM also outperformed other unsupervised classifiers (Gaussian mixture models, k-means, and linkage trees), that are capable of naive classification but do not model dynamics, by a significant margin (p < 0.05).
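Agreement between automated and human scoring in the abstract above is summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following minimal sketch computes it for two label sequences. The function name, the stage labels, and the epochs are hypothetical, not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement: product of the raters' marginal label proportions.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical epoch-by-epoch stages: W = wake, N = non-REM, R = REM.
human = ["W", "W", "N", "N", "R", "R", "N", "W"]
model = ["W", "W", "N", "N", "R", "N", "N", "W"]
kappa = cohens_kappa(human, model)
```

Here the raters agree on 7 of 8 epochs, and kappa comes out near 0.80 once chance agreement is removed. In the study, kappa was averaged across subjects rather than computed on a single recording.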
Post-bronchoscopy pneumonia in patients suffering from lung cancer: Development and validation of a risk prediction score.

Science.gov (United States)

Takiguchi, Hiroto; Hayama, Naoki; Oguma, Tsuyoshi; Harada, Kazuki; Sato, Masako; Horio, Yukihiro; Tanaka, Jun; Tomomatsu, Hiromi; Tomomatsu, Katsuyoshi; Takihara, Takahisa; Niimi, Kyoko; Nakagawa, Tomoki; Masuda, Ryota; Aoki, Takuya; Urano, Tetsuya; Iwazaki, Masayuki; Asano, Koichiro

2017-05-01

The incidence, risk factors, and consequences of pneumonia after flexible bronchoscopy in patients with lung cancer have not been studied in detail. We retrospectively analyzed the data from 237 patients with lung cancer who underwent diagnostic bronchoscopy between April 2012 and July 2013 (derivation sample) and 241 patients diagnosed between August 2013 and July 2014 (validation sample) in a tertiary referral hospital in Japan. A score predictive of post-bronchoscopy pneumonia was developed in the derivation sample and tested in the validation sample. Pneumonia developed after bronchoscopy in 6.3% and 4.1% of patients in the derivation and validation samples, respectively. Patients who developed post-bronchoscopy pneumonia needed to change or cancel their planned cancer therapy more frequently than those without pneumonia (56% vs. 6%, ppneumonia, which we added to develop our predictive score. The incidence of pneumonia associated with scores=0, 1, and ≥2 was 0, 3.7, and 13.4% respectively in the derivation sample (p=0.003), and 0, 2.9, and 9.7% respectively in the validation sample (p=0.016). The incidence of post-bronchoscopy pneumonia in patients with lung cancer was not rare and associated with adverse effects on the clinical course. A simple 3-point predictive score identified patients with lung cancer at high risk of post-bronchoscopy pneumonia prior to the procedure. Copyright © 2017 The Japanese Respiratory Society. Published by Elsevier B.V. All rights reserved.
Gas liquid sampling for closed canisters in KW Basin - test plan

International Nuclear Information System (INIS)

Pitkoff, C.C.

1995-01-01

Test procedures for the gas/liquid sampler. Characterization of the Spent Nuclear Fuel (SNF) sealed in canisters at KW Basin is needed to determine the state of the wet-stored SNF. Samples of the liquid and the gas in the closed canisters will be taken to gain characterization information. Sampling equipment has been designed to retrieve gas and liquid from the closed canisters in KW Basin. This plan is written to outline the test requirements for this developmental sampling equipment.
Investigating the Value of Section Scores for the "TOEFL iBT"® Test. "TOEFL iBT"® Research Report. TOEFL iBT-21. ETS Research Report RR-13-35

Science.gov (United States)

Sawaki, Yasuyo; Sinharay, Sandip

2013-01-01

This study investigates the value of reporting the reading, listening, speaking, and writing section scores for the "TOEFL iBT"® test, focusing on 4 related aspects of the psychometric quality of the TOEFL iBT section scores: reliability of the section scores, dimensionality of the test, presence of distinct score profiles, and the…
21 CFR 864.3260 - OTC test sample collection systems for drugs of abuse testing.

Science.gov (United States)

2010-04-01

... 21 Food and Drugs 8 2010-04-01 2010-04-01 false OTC test sample collection systems for drugs of abuse testing. 864.3260 Section 864.3260 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES HEMATOLOGY AND PATHOLOGY DEVICES Pathology...
Experimental Investigation Of Microbially Induced Corrosion Of Test Samples And Effect Of Self-assembled Hydrophobic Monolayers. Exposure Of Test Samples To Continuous Microbial Cultures, Chemical Analysis, And Biochemical Studies

CERN Document Server

Laurinavichius, K S

1998-01-01

Predicting Pre-Service Classroom Teachers' Civil Servant Recruitment Examination's Educational Sciences Test Scores Using Artificial Neural Networks

Science.gov (United States)

Demir, Metin

2015-01-01

This study predicts the number of correct answers given by pre-service classroom teachers in Civil Servant Recruitment Examination's (CSRE) educational sciences test based on their high school grade point averages, university entrance scores, and grades (mid-term and final exams) from their undergraduate educational courses. This study was…
[Propensity score matching in SPSS].

Science.gov (United States)

Huang, Fuqiang; DU, Chunlin; Sun, Menghui; Ning, Bing; Luo, Ying; An, Shengli

2015-11-01

To implement propensity score matching with the PS Matching module of SPSS and to interpret the analysis results. R and the plug-in that links it to the corresponding version of SPSS were installed, together with the propensity score matching package. A PS Matching module was added to the SPSS interface, and its use was demonstrated with test data. Score estimation and nearest neighbor matching were achieved with the PS Matching module, and the results of qualitative and quantitative statistical description and evaluation were presented as matching graphs. Propensity score matching can be accomplished conveniently using SPSS software.
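The nearest neighbor matching step mentioned in this abstract can be sketched independently of SPSS. The following is a minimal illustration, assuming propensity scores have already been estimated; the function name `nearest_neighbor_match`, the caliper value, and the subject identifiers are hypothetical, and the greedy 1:1 scheme without replacement is one common variant rather than the specific algorithm of the SPSS module.

```python
def nearest_neighbor_match(treated, controls, caliper=None):
    """Greedy 1:1 nearest neighbor matching without replacement.

    treated, controls: dicts mapping subject id -> estimated propensity score.
    caliper: optional maximum allowed score distance for a valid match.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs = []
    for t_id, t_score in treated.items():
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is None or abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores (illustrative values only).
treated = {"t1": 0.62, "t2": 0.35, "t3": 0.90}
controls = {"c1": 0.60, "c2": 0.30, "c3": 0.45, "c4": 0.10}
pairs = nearest_neighbor_match(treated, controls, caliper=0.1)
print(pairs)
```

With a caliper, treated subjects that have no sufficiently close control (here `t3`) remain unmatched, trading sample size for better covariate balance.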
[Prognostic scores for pulmonary embolism].

Science.gov (United States)

Junod, Alain

2016-03-23

Nine prognostic scores for pulmonary embolism (PE), based on retrospective and prospective studies published between 2000 and 2014, have been analyzed and compared. Most of them aim to identify low-risk PE cases in order to validate their ambulatory care. The scores differ substantially in the outcomes considered (global mortality, PE-specific mortality, other complications) and in the sizes of the low-risk groups. The most popular score appears to be the PESI and its simplified version. Few good-quality studies have tested the applicability of these scores to PE outpatient care, although this approach is already becoming widespread in medical practice.
Two-Sample Statistics for Testing the Equality of Survival Functions Against Improper Semi-parametric Accelerated Failure Time Alternatives: An Application to the Analysis of a Breast Cancer Clinical Trial

Science.gov (United States)

BROËT, PHILIPPE; TSODIKOV, ALEXANDER; DE RYCKE, YANN; MOREAU, THIERRY

2010-01-01

This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests. PMID:15293627
Two-sample statistics for testing the equality of survival functions against improper semi-parametric accelerated failure time alternatives: an application to the analysis of a breast cancer clinical trial.

Science.gov (United States)

Broët, Philippe; Tsodikov, Alexander; De Rycke, Yann; Moreau, Thierry

2004-06-01

This paper presents two-sample statistics suited for testing equality of survival functions against improper semi-parametric accelerated failure time alternatives. These tests are designed for comparing either the short- or the long-term effect of a prognostic factor, or both. These statistics are obtained as partial likelihood score statistics from a time-dependent Cox model. As a consequence, the proposed tests can be very easily implemented using widely available software. A breast cancer clinical trial is presented as an example to demonstrate the utility of the proposed tests.
Conceptual Scoring and Classification Accuracy of Vocabulary Testing in Bilingual Children

Science.gov (United States)

Anaya, Jissel B.; Peña, Elizabeth D.; Bedore, Lisa M.

2018-01-01

Purpose: This study examined the effects of single-language and conceptual scoring on the vocabulary performance of bilingual children with and without specific language impairment. We assessed classification accuracy across 3 scoring methods. Method: Participants included Spanish-English bilingual children (N = 247) aged 5;1 (years;months) to…
Differences in physical-fitness test scores between actively and passively recruited older adults : Consequences for norm-based classification

NARCIS (Netherlands)

van Heuvelen, M.J.G.; Stevens, M.; Kempen, G.I.J.M.

This study investigated differences in physical-fitness test scores between actively and passively recruited older adults and the consequences thereof for norm-based classification of individuals. Walking endurance, grip strength, hip flexibility, balance, manual dexterity, and reaction time were
Acceptability of self-collected vaginal samples for HPV testing in an ...

African Journals Online (AJOL)

Objective: To evaluate the acceptability of self-collected vaginal samples for HPV testing in women living in rural and urban areas of ... Conclusion: Acceptability of self-sampling for HPV testing was similarly excellent in both groups despite their difference in terms ... cancer is the leading cause of death caused by cancer in.
Validation of the Six Sigma Z-score for the quality assessment of clinical laboratory timeliness.

Science.gov (United States)

Ialongo, Cristiano; Bernardini, Sergio

2018-03-28

The International Federation of Clinical Chemistry and Laboratory Medicine has recently introduced the turnaround time (TAT) as a mandatory quality indicator for the postanalytical phase. Classic TAT indicators, namely, average, median, 90th percentile and proportion of acceptable tests (PAT), have been in use for almost 40 years and to date represent the mainstay for gauging laboratory timeliness. In this study, we investigated the performance of the Six Sigma Z-score, which was previously introduced as a device for the quantitative assessment of timeliness. A numerical simulation was obtained by modeling the actual TAT data set using the log-logistic probability density function (PDF). Five thousand replicates for each size of the artificial TAT random sample (n=20, 50, 250 and 1000) were generated, and different laboratory conditions were simulated by manipulating the PDF in order to generate more or less variable data. The Z-score and the classic TAT indicators were assessed for precision (%CV), robustness toward right-tailing (precision at different sample variability), sensitivity and specificity. The Z-score showed sensitivity and specificity comparable to PAT (≈80% with n≥250), but superior precision, which remained within 20% with moderately small samples (n≥50); furthermore, the Z-score was less affected by the value of the cutoff used for setting the acceptable TAT, as well as by the sample variability reflected in the magnitude of right-tailing. The Z-score was a valid indicator of laboratory timeliness and a suitable device to improve as well as to maintain the achieved quality level.
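The simulation design described in this abstract (log-logistic TAT samples summarized by the classic indicators) can be sketched as follows. This is an illustration under stated assumptions: the scale, shape, and cutoff values are invented, the function names are hypothetical, and the Six Sigma Z-score itself is not reproduced because the abstract does not give its formula.

```python
import random
import statistics


def loglogistic_quantile(u, scale, shape):
    """Inverse CDF of the log-logistic law, F(x) = 1 / (1 + (x / scale) ** -shape)."""
    return scale * (u / (1.0 - u)) ** (1.0 / shape)


def simulate_tat(n, scale, shape, rng):
    """Draw n turnaround times by inverse-transform sampling."""
    return [loglogistic_quantile(rng.random(), scale, shape) for _ in range(n)]


def classic_indicators(tat, cutoff):
    """Mean, median, 90th percentile, and proportion of acceptable tests (PAT)."""
    return {
        "mean": statistics.fmean(tat),
        "median": statistics.median(tat),
        "p90": statistics.quantiles(tat, n=10)[8],  # ninth decile cut point
        "pat": sum(t <= cutoff for t in tat) / len(tat),
    }


# Assumed parameters: median TAT of 45 min, acceptable TAT cutoff of 60 min.
rng = random.Random(0)
sample = simulate_tat(250, scale=45.0, shape=4.0, rng=rng)
indicators = classic_indicators(sample, cutoff=60.0)
print(indicators)
```

Lowering the shape parameter produces a heavier right tail, which is one way to emulate the "more variable" laboratory conditions the study refers to.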
Approaches to Learning and Hispanic Children's Math Scores: The Moderating Role of English Proficiency

Science.gov (United States)

Bumgarner, Erin; Martin, Anne; Brooks-Gunn, Jeanne

2013-01-01

Accumulating evidence suggests that children's approaches to learning (ATL) at kindergarten entry predict their academic achievement years later. However, the gains associated with ATL may be diminished for Hispanic immigrant children, many of whom are English language learners (ELLs). We tested whether ATL predicted math scores in a sample of…
Fatigue testing on samples from Zircaloy-4 tubes type SEU-43

International Nuclear Information System (INIS)

Olaru, V.; Ionescu, V.; Nitu, A.; Ionescu, D.; Voicu, F.

2016-01-01

The paper presents the testing of samples machined from Zircaloy-4 tubes (in the as-received metallurgical state) used in the CANDU SEU-43 fuel bundle. These tests are intended to simulate the behaviour of the tubes during power cycling inside the reactor. The testing process is of the low cycle fatigue type, carried out outside the reactor on "C-ring" samples cut along the transverse direction. These samples are tested at 1%, 2% and 3% deformation amplitude, at room temperature. The calibration curves for both types of tube (small and large diameter) are determined by finite element analysis with the ANSYS computer code. The cycling test results are in the form of a fatigue life curve (N-e) for the Zircaloy-4 used in the SEU-43 fuel bundle. The curve is determined by the experimental dependency between the number of cycles to fracture and the deformation amplitude. The low cycle fatigue mechanical tests done at room temperature, together with electron microscopy analyses, reflected the characteristic behaviour of Zircaloy-4 under the given environmental conditions. (authors)
Modelling the predictive performance of credit scoring

Directory of Open Access Journals (Sweden)

Shi-Wei Shen

2013-07-01

Research purpose: The purpose of this empirical paper was to examine the predictive performance of credit scoring systems in Taiwan. Motivation for the study: Corporate lending remains a major business line for financial institutions. However, in light of the recent global financial crises, it has become extremely important for financial institutions to implement rigorous means of assessing clients seeking access to credit facilities. Research design, approach and method: Using a data sample of 10 349 observations drawn between 1992 and 2010, logistic regression models were utilised to examine the predictive performance of credit scoring systems. Main findings: A goodness-of-fit test demonstrated that credit scoring models that incorporated the Taiwan Corporate Credit Risk Index (TCRI) together with micro- and macroeconomic variables possessed greater predictive power. This suggests that macroeconomic variables do have explanatory power for default credit risk. Practical/managerial implications: The originality of the study was that three models were developed to predict corporate defaults based on different microeconomic and macroeconomic factors such as the TCRI, asset growth rates, stock index and gross domestic product. Contribution/value-add: The study utilises different goodness-of-fit measures and receiver operating characteristics to examine the robustness of the predictive power of these factors.

Relationship Between Broiler Body Weights, Eimeria maxima Gross Lesion Scores, and Microscores in Three Anticoccidial Sensitivity Tests.

Science.gov (United States)

Barrios, Miguel A; Da Costa, Manuel; Kimminau, Emily; Fuller, Lorraine; Clark, Steven; Pesti, Gene; Beckstead, Robert

2017-06-01

Anticoccidial sensitivity tests (ASTs) serve to determine the efficacy of anticoccidial drugs against Eimeria field isolates in a controlled laboratory setting. The most commonly measured parameters are body weight gain, feed conversion ratio, gross intestinal lesion scores, and mortality. Due to the difficulty in reliably scoring gross lesion scores of Eimeria maxima , microscopic analysis of intestinal scrapings (microscores) can be used in the field to indicate the presence of this particular Eimeria. The goal of this study was to determine the relationship between E. maxima microscores and broiler body weights and gross E. maxima lesion scores in three ASTs. Day-old broiler chicks were raised for 12 days on a standard corn-soy diet. On Day 12, chicks were placed in Petersime batteries and treatment diets were provided. There were six birds per pen, four pens per treatment, and 12 treatments, for a total of 288 chicks per AST. The treatments were as follows: 1) nonmedicated, noninfected; 2) nonmedicated, infected; 3) lasalocid, infected; 4) salinomycin, infected; 5) diclazuril, infected; 6) monensin, infected; 7) decoquinate, infected; 8) narasin + nicarbazin, infected; 9) narasin, infected; 10) nicarbazin, infected; 11) robenidine, infected; and 12) zoalene, infected. On Day 14, chicks were challenged with an Eimeria field isolate by oral gavage. On Day 20, broilers were weighed, and gross lesion scores and microscores were classified from 0 to 4 depending on the severity of the gross lesion scores and E. maxima microscores. Data from three trials using different field isolates were statistically analyzed using a logarithmic regression model. There was no relationship (P = 0.1224) between microscores and body weight gain. There was a positive relationship between microscores and gross lesion scores (P = 0.004). However, there was also an interaction between isolate and treatment (P Eimeria or the amount of E. maxima in the inoculum.
42 CFR 493.901 - Approval of proficiency testing programs.

Science.gov (United States)

2010-10-01

...) Distribute the samples, using rigorous quality control to assure that samples mimic actual patient specimens... gynecologic cytology and on individual laboratory performance on testing events, cumulative reports and scores...
Calculating a Continuous Metabolic Syndrome Score Using Nationally Representative Reference Values.

Science.gov (United States)

Guseman, Emily Hill; Eisenmann, Joey C; Laurson, Kelly R; Cook, Stephen R; Stratbucker, William

2018-02-26

The prevalence of metabolic syndrome in youth varies on the basis of the classification system used, prompting implementation of continuous scores; however, the use of these scores is limited to the sample from which they were derived. We sought to describe the derivation of the continuous metabolic syndrome score using nationally representative reference values in a sample of obese adolescents and a national sample obtained from National Health and Nutrition Examination Survey (NHANES) 2011-2012. Clinical data were collected from 50 adolescents seeking obesity treatment at a stage 3 weight management center. A second analysis relied on data from adolescents included in NHANES 2011-2012, performed for illustrative purposes. The continuous metabolic syndrome score was calculated by regressing individual values onto nationally representative age- and sex-specific standards (NHANES III). Resultant z scores were summed to create a total score. The final sample included 42 obese adolescents (15 male and 35 female subjects; mean age, 14.8 ± 1.9 years) and an additional 445 participants from NHANES 2011-2012. Among the clinical sample, the mean continuous metabolic syndrome score was 4.16 ± 4.30, while the NHANES sample mean was quite a bit lower, at -0.24 ± 2.8. We provide a method to calculate the continuous metabolic syndrome by comparing individual risk factor values to age- and sex-specific percentiles from a nationally representative sample. Copyright © 2018 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
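The scoring step described in this abstract (standardize each risk factor against reference values, then sum the z scores) can be sketched as below. Everything specific here is an assumption for illustration: the component list, the reference means and standard deviations, and the sign inversion for HDL cholesterol are placeholders and are not the NHANES III standards used by the authors.

```python
# Hypothetical reference (mean, SD) per component; NOT actual NHANES III values.
REFERENCE = {
    "waist_cm": (80.0, 10.0),
    "map_mmhg": (80.0, 8.0),
    "hdl_mgdl": (50.0, 10.0),
    "tg_mgdl": (90.0, 30.0),
    "glucose_mgdl": (90.0, 10.0),
}

# Assumption: higher HDL is protective, so its z score enters with a negative sign.
PROTECTIVE = {"hdl_mgdl"}


def continuous_mets_score(values, reference=REFERENCE):
    """Sum of z scores of each risk factor relative to the reference values."""
    total = 0.0
    for name, (mean, sd) in reference.items():
        z = (values[name] - mean) / sd
        total += -z if name in PROTECTIVE else z
    return total


# Hypothetical adolescent (illustrative values only).
patient = {
    "waist_cm": 100.0,
    "map_mmhg": 88.0,
    "hdl_mgdl": 40.0,
    "tg_mgdl": 150.0,
    "glucose_mgdl": 95.0,
}
score = continuous_mets_score(patient)
print(score)
```

Because the z scores are anchored to external reference values rather than to the study sample, scores computed this way are comparable across samples, which is the point the abstract makes.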
Solubility testing of actinides on breathing-zone and area air samples

International Nuclear Information System (INIS)

Metzger, R.L.; Jessop, B.H.; McDowell, B.L.

1996-02-01

A solubility testing method for several common actinides has been developed with sufficient sensitivity to allow profiles to be determined from routine breathing zone and area air samples in the workplace. Air samples are covered with a clean filter to form a filter-sample-filter sandwich which is immersed in an extracellular lung serum simulant solution. The sample is moved to a fresh beaker of the lung fluid simulant each day for one week, and then weekly until the end of the 28 day test period. The soak solutions are wet ashed with nitric acid and hydrogen peroxide to destroy the organic components of the lung simulant solution prior to extraction of the nuclides of interest directly into an extractive scintillator for subsequent counting on a Photon-Electron Rejecting Alpha Liquid Scintillation (PERALS®) spectrometer. Solvent extraction methods utilizing the extractive scintillators have been developed for the isotopes of uranium, plutonium, and curium. The procedures normally produce an isotopic recovery greater than 95% and have been used to develop solubility profiles from air samples with 40 pCi or less of U₃O₈. Profiles developed for U₃O₈ samples show good agreement with in vitro and in vivo tests performed by other investigators on samples from the same uranium mills.
Compressive and Flexural Tests on Adobe Samples Reinforced with Wire Mesh

Science.gov (United States)

Jokhio, G. A.; Al-Tawil, Y. M. Y.; Syed Mohsin, S. M.; Gul, Y.; Ramli, N. I.

2018-03-01

Adobe is an economical, naturally available, and environmentally friendly construction material that offers excellent thermal and sound insulation as well as indoor air quality. It is important to understand and enhance the mechanical properties of this material, for which a high degree of variation is reported in the literature owing to a lack of research and standardization in this field. The present paper focuses first on understanding the mechanical behaviour of adobe subjected to compressive stresses and flexure, and then on enhancing it with steel wire mesh reinforcement. A total of 22 samples were tested, of which 12 cube samples were tested for compressive strength and 10 beam samples were tested for modulus of rupture. Half of the samples in each category were control samples, i.e. without wire mesh reinforcement, whereas the remaining half were reinforced with a single layer of wire mesh per sample. It was found that the compressive strength of adobe increases by about 43% after adding a single layer of wire mesh reinforcement. The flexural response of adobe also improved with the addition of wire mesh reinforcement.
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Directory of Open Access Journals (Sweden)

Yota Uno

OBJECTIVE: The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, an intelligence test that can be administered to groups within a short period of time. METHODS: The new Tanaka B Intelligence Scale and the Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. RESULTS: The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. CONCLUSION: The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
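The post-test probability mentioned in this abstract follows from a likelihood ratio by converting probability to odds and back. A minimal sketch is given below; the two likelihood ratios (13.8 and 0.1) are taken from the abstract, while the 20% pre-test probability and the function name are assumptions chosen purely for illustration.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio must be non-negative")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Assumed pre-test probability of intellectual disability (hypothetical).
pre_test = 0.20
p_low_stratum = post_test_probability(pre_test, 13.8)   # BIQ <= 65 stratum
p_high_stratum = post_test_probability(pre_test, 0.1)   # BIQ >= 76 stratum
print(round(p_low_stratum, 3), round(p_high_stratum, 3))
```

Under this assumed prevalence, a score in the lowest stratum raises the probability substantially and a score in the highest stratum lowers it to a few percent, which is the sense in which the abstract says the condition can be "ruled out or determined".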
Scoring system for differentiating perforated and non-perforated pediatric appendicitis.

Science.gov (United States)

Blumfield, Einat; Yang, Daniel; Grossman, Joshua

2017-10-01

Appendicitis is the most common indication for emergency pediatric surgery and its most significant complication is perforation. Perforated appendicitis (PA) may be managed conservatively, whereas non-perforated appendicitis (NP) is managed surgically. Recent studies have shown that ultrasound (US) is effective for differentiating between PA and NP, and does not expose pediatric patients to ionizing radiation. The purpose of this study is to enhance the accuracy of differentiation with a novel scoring system based on clinical, laboratory, and US findings. This retrospective study included 243 patients aged 2-17 years who presented between 2006 and 2013 with surgically proven appendicitis, of whom 60 had perforation. Clinical and laboratory data were collected and US images evaluated by a pediatric radiologist. To create the scoring system, point values were assigned to each parameter. A randomly selected training sample of 137 subjects was used to create a scoring prediction model. The model was tested on the remaining 106 patients. Scores of ≥6, ≥11, and ≥15 yielded specificities of 64, 91, and 99%, and sensitivities of 96, 61, and 29%, respectively (p < 0.001). We have designed a scoring system incorporating clinical, laboratory, and sonographic findings which can differentiate PA from NP with high specificity.
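The trade-off reported in this abstract, where raising the score cutoff increases specificity and lowers sensitivity, can be illustrated with a short calculation. The scores and outcomes below are invented for the example and are unrelated to the study data; the function name is hypothetical.

```python
def sensitivity_specificity(scores, perforated, cutoff):
    """Sensitivity and specificity when score >= cutoff is called 'perforated'."""
    tp = fn = tn = fp = 0
    for score, is_perforated in zip(scores, perforated):
        predicted = score >= cutoff
        if is_perforated and predicted:
            tp += 1
        elif is_perforated:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one perforated and one non-perforated case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test set: six non-perforated cases, then four perforated cases.
scores = [2, 4, 5, 7, 9, 12, 6, 11, 14, 16]
perforated = [False] * 6 + [True] * 4

for cutoff in (6, 11, 15):
    sens, spec = sensitivity_specificity(scores, perforated, cutoff)
    print(cutoff, round(sens, 2), round(spec, 2))
```

Reporting several cutoffs, as the authors do, lets a clinician choose between a sensitive threshold for screening and a specific threshold for confirming perforation.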
Preliminary testing of the reliability and feasibility of SAGE: a system to measure and score engagement with and use of research in health policies and programs.

Science.gov (United States)

Makkar, Steve R; Williamson, Anna; D'Este, Catherine; Redman, Sally

2017-12-19

Few measures of research use in health policymaking are available, and the reliability of such measures has yet to be evaluated. A new measure called the Staff Assessment of Engagement with Evidence (SAGE) incorporates an interview that explores policymakers' research use within discrete policy documents and a scoring tool that quantifies the extent of policymakers' research use based on the interview transcript and analysis of the policy document itself. We aimed to conduct a preliminary investigation of the usability, sensitivity, and reliability of the scoring tool in measuring research use by policymakers. Nine experts in health policy research and two independent coders were recruited. Each expert used the scoring tool to rate a random selection of 20 interview transcripts, and each independent coder rated 60 transcripts. The distribution of scores among experts was examined, and interrater reliability was then tested within and between the experts and independent coders. Average- and single-measure reliability coefficients were computed for each SAGE subscale. Experts' scores ranged from the limited to the extensive scoring bracket for all subscales. Experts as a group also exhibited at least a fair level of interrater agreement across all subscales. Single-measure reliability was at least fair except for three subscales: Relevance Appraisal, Conceptual Use, and Instrumental Use. Average- and single-measure reliability among independent coders was good to excellent for all subscales. Finally, reliability between experts and independent coders was fair to excellent for all subscales. Among experts, the scoring tool was comprehensible, usable, and sensitive enough to discriminate between documents with varying degrees of research use. In addition, the scoring tool yielded scores with good reliability among the independent coders. There was greater variability among experts, although as a group their ratings were fairly reliable. The alignment between experts' and independent
Associations between MMPI-2-RF validity scale scores and extra-test measures of personality and psychopathology.

Science.gov (United States)

Forbey, Johnathan D; Lee, Tayla T C; Ben-Porath, Yossef S; Arbisi, Paul A; Gartland, Diane

2013-08-01

The current study explored associations between two potentially invalidating self-report styles detected by the Validity scales of the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF), over-reporting and under-reporting, and scores on the MMPI-2-RF substantive, as well as eight collateral self-report measures administered either at the same time or within 1 to 10 days of MMPI-2-RF administration. Analyses were conducted with data provided by college students, male prisoners, and male psychiatric outpatients from a Veterans Administration facility. Results indicated that if either an over- or under-reporting response style was suggested by the MMPI-2-RF Validity scales, scores on the majority of the MMPI-2-RF substantive scales, as well as a number of collateral measures, were significantly affected in all three groups in the expected directions. Test takers who were identified as potentially engaging in an over- or under-reporting response style by the MMPI-2-RF Validity scales appeared to approach extra-test measures similarly regardless of when these measures were administered in relation to the MMPI-2-RF. Limitations and suggestions for future study are discussed.
Inert medium (helium) irradiation testing of pressure tube samples

International Nuclear Information System (INIS)

Ancuta, M.; Radu, V.; Stefan, V.; Preda, M.

2001-01-01

Irradiation tests currently performed in the C-5 capsule aim at obtaining data and information on the behavior under irradiation of the pressure tubes of CANDU-type fuel channels, in order to identify the factors limiting the operating life span. A calculation code for the analysis and prediction of pressure tube behavior should be based upon periodic inspection results, post-irradiation examination of pressure tubes removed from the reactor, and experimental results obtained with materials subjected to irradiation conditions identical to the operational ones. Mechanical behavior analysis should focus on both complex thermal-mechanical stresses and the alteration of mechanical properties under irradiation. The experimental results should be applied: - to evaluate the irradiation effects upon the mechanical properties of Zr-2.5% Nb exposed to fluences up to 10²¹ n·cm⁻²; - to gather data on the real stress / real deformation characteristic, from which characteristic quantities can be deduced (for instance, elasticity modulus, plasticity modulus, and the exponent of the stress term in the Tsu-Berteles relation) for use within the CANTUP simulation code describing pressure tube behavior, currently developed at INR Pitesti; - to develop methods for predicting pressure tube behavior and merge them with the in-service inspection procedure in order to forecast the life span and the proper timing for replacement before major failures occur. The samples irradiated in the C-5 capsule were extracted from the ends of Zr-2.5% Nb pressure tubes originating from Cernavoda NPP Unit 1. The samples for tensile tests were extracted along the longitudinal and transverse directions of the pressure tube. The tests were carried out under the following conditions: - test environment temperature, 260 - 280 deg.C; - testing medium, helium at 1 - 6 bar pressure; - neutron flux (Eₙ > 1 MeV), 1 - 2 · 10¹³ n·cm⁻²·s⁻¹; - neutron fluence (Eₙ > 1 MeV), 4 · 10²⁰ n·cm⁻². The following characteristics were obtained from tensile
Differential Predictive Validity of High School GPA and College Entrance Test Scores for University Students in Yemen

Science.gov (United States)

Al-Hattami, Abdulghani Ali Dawod

2012-01-01

High school grade point average and college entrance test scores are two admission criteria that are currently used by most colleges in Yemen to select their prospective students. Given their widespread use, it is important to investigate their predictive validity to ensure the accuracy of the admission decisions in these institutions. This study…
Chronic obstructive pulmonary disease (COPD) assessment test scores corresponding to modified Medical Research Council grades among COPD patients.

Science.gov (United States)

Lee, Chang-Hoon; Lee, Jinwoo; Park, Young Sik; Lee, Sang-Min; Yim, Jae-Joon; Kim, Young Whan; Han, Sung Koo; Yoo, Chul-Gyu

2015-09-01

In assigning patients with chronic obstructive pulmonary disease (COPD) to subgroups according to the updated guidelines of the Global Initiative for Chronic Obstructive Lung Disease, discrepancies have been noted between the COPD assessment test (CAT) criteria and modified Medical Research Council (mMRC) criteria. We investigated the determinants of symptom and risk groups and sought to identify a better CAT criterion. This retrospective study included COPD patients seen between June 20, 2012, and December 5, 2012. The CAT score that can accurately predict an mMRC grade ≥ 2 versus COPD patients, the percentages of patients classified into subgroups A, B, C, and D were 24.5%, 47.2%, 4.2%, and 24.1% based on CAT criteria and 49.3%, 22.4%, 8.9%, and 19.4% based on mMRC criteria, respectively. More than 90% of the patients who met the mMRC criteria for the 'more symptoms group' also met the CAT criteria. AUROC and CART analyses suggested that a CAT score ≥ 15 predicted an mMRC grade ≥ 2 more accurately than the current CAT score criterion. During follow-up, patients with CAT scores of 10 to 14 did not have a different risk of exacerbation versus those with CAT scores COPD patients.
Development of an objective gene expression panel as an alternative to self-reported symptom scores in human influenza challenge trials.

Science.gov (United States)

Muller, Julius; Parizotto, Eneida; Antrobus, Richard; Francis, James; Bunce, Campbell; Stranks, Amanda; Nichols, Marshall; McClain, Micah; Hill, Adrian V S; Ramasamy, Adaikalavan; Gilbert, Sarah C

2017-06-08

Influenza challenge trials are important for vaccine efficacy testing. Currently, disease severity is determined by self-reported scores to a list of symptoms which can be highly subjective. A more objective measure would allow for improved data analysis. Twenty-one volunteers participated in an influenza challenge trial. We calculated the daily sum of scores (DSS) for a list of 16 influenza symptoms. Whole blood collected at baseline and 24, 48, 72 and 96 h post challenge was profiled on Illumina HT12v4 microarrays. Changes in gene expression most strongly correlated with DSS were selected to train a Random Forest model and tested on two independent test sets consisting of 41 individuals profiled on a different microarray platform and 33 volunteers assayed by qRT-PCR. 1456 probes are significantly associated with DSS at 1% false discovery rate. We selected 19 genes with the largest fold change to train a random forest model. We observed good concordance between predicted and actual scores in the first test set (r = 0.57; RMSE = -16.1%) with the greatest agreement achieved on samples collected approximately 72 h post challenge. Therefore, we assayed samples collected at baseline and 72 h post challenge in the second test set by qRT-PCR and observed good concordance (r = 0.81; RMSE = -36.1%). We developed a 19-gene qRT-PCR panel to predict DSS, validated on two independent datasets. A transcriptomics based panel could provide a more objective measure of symptom scoring in future influenza challenge studies. Trial registration Samples were obtained from a clinical trial with the ClinicalTrials.gov Identifier: NCT02014870, first registered on December 5, 2013.
An Argument against Using Standardized Test Scores for Placement of International Undergraduate Students in English as a Second Language (ESL) Courses

Science.gov (United States)

Kokhan, Kateryna

2013-01-01

Development and administration of institutional ESL placement tests require a great deal of financial and human resources. Due to a steady increase in the number of international students studying in the United States, some US universities have started to consider using standardized test scores for ESL placement. The English Placement Test (EPT)…
CASP10-BCL::Fold efficiently samples topologies of large proteins.

Science.gov (United States)

Heinze, Sten; Putnam, Daniel K; Fischer, Axel W; Kohlmann, Tim; Weiner, Brian E; Meiler, Jens

2015-03-01

During CASP10 in summer 2012, we tested BCL::Fold for prediction of free modeling (FM) and template-based modeling (TBM) targets. BCL::Fold assembles the tertiary structure of a protein from predicted secondary structure elements (SSEs) omitting more flexible loop regions early on. This approach enables the sampling of conformational space for larger proteins with more complex topologies. In preparation of CASP11, we analyzed the quality of CASP10 models throughout the prediction pipeline to understand BCL::Fold's ability to sample the native topology, identify native-like models by scoring and/or clustering approaches, and our ability to add loop regions and side chains to initial SSE-only models. The standout observation is that BCL::Fold sampled topologies with a GDT_TS score > 33% for 12 of 18 and with a topology score > 0.8 for 11 of 18 test cases de novo. Despite the sampling success of BCL::Fold, significant challenges still exist in clustering and loop generation stages of the pipeline. The clustering approach employed for model selection often failed to identify the most native-like assembly of SSEs for further refinement and submission. It was also observed that for some β-strand proteins model refinement failed as β-strands were not properly aligned to form hydrogen bonds removing otherwise accurate models from the pool. Further, BCL::Fold samples frequently non-natural topologies that require loop regions to pass through the center of the protein. © 2015 Wiley Periodicals, Inc.
Estimation of sample size and testing power (Part 4).

Science.gov (United States)

Hu, Liang-ping; Bao, Xiao-lei; Guan, Xue; Zhou, Shi-guo

2012-01-01

Sample size estimation is necessary for any experimental or survey research. An appropriate estimation of sample size based on known information and statistical knowledge is of great significance. This article introduces methods of sample size estimation for difference tests on data from a design with one factor at two levels, including sample size estimation formulas and their implementation, based on the formulas and on the POWER procedure of SAS software, for both quantitative and qualitative data. In addition, this article presents worked examples, which can guide researchers in implementing the repetition principle during the research design phase.
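As a rough illustration of the kind of calculation this record describes for quantitative data in a one-factor, two-level design, the sketch below applies the standard normal-approximation formula for comparing two means, n per group = 2 (z_{1-α/2} + z_{1-β})² σ² / δ². The function name, the default α and power, and the example inputs are assumptions made for illustration. They are not taken from the article, which works with the SAS POWER procedure and may use different formulas.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided comparison of two
    means (normal approximation, equal group sizes, common SD sigma)."""
    if delta == 0:
        raise ValueError("delta (the difference to detect) must be non-zero")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the target power
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)                          # round up to whole subjects


# Hypothetical inputs: detect a 5-point difference when the SD is 10.
print(n_per_group(delta=5, sigma=10))  # 63
```

Because this uses the normal rather than the t distribution, it tends to give a slightly smaller n than exact power software, so the result is best read as a lower bound.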
Less Truth Than Error: Massachusetts Teacher Tests

Directory of Open Access Journals (Sweden)

Walt Haney

1999-02-01

Scores on the Massachusetts Teacher Tests (MTT) of reading and writing are highly unreliable. The tests' margin of error is roughly double to triple the range found on well-developed tests. A person retaking the MTT several times could have huge fluctuations in their scores even if their skill level did not change significantly. In fact, the 9 to 17 point margin of error calculated for the tests represents more than 10 percent of the grading scale (assumed to be 0 to 100). The large margin of error means there is both a high false-pass rate and a high false-failure rate. For example, a person who received a score of 72 on the writing test could have scored an 89 or a 55 simply because of the unreliability of the test. Since adults' reading and writing skills do not change a great deal over several months, this range of scores on the same test should not be possible. While this test is being touted as an accurate assessment of a person's fitness to be a teacher, one would expect the scores to accurately reflect a test-taker's verbal ability level. In addition to the large margin of error, the MTT contain questionable content that makes them poor tools for measuring test-takers' reading and writing skills. The content and the lack of correlation between the reading and writing scores reduce the meaningfulness, or validity, of the tests. The validity is affected not just by the content, but by a host of factors, such as the conditions under which tests were administered and how they were scored. Interviews with a small sample of test-takers confirmed published reports concerning problems with the content and administration.
Specificity and false positive rates of the Test of Memory Malingering, Rey 15-item Test, and Rey Word Recognition Test among forensic inpatients with intellectual disabilities.

Science.gov (United States)

Love, Christopher M; Glassmire, David M; Zanolini, Shanna Jordan; Wolf, Amanda

2014-10-01

This study evaluated the specificity and false positive (FP) rates of the Rey 15-Item Test (FIT), Word Recognition Test (WRT), and Test of Memory Malingering (TOMM) in a sample of 21 forensic inpatients with mild intellectual disability (ID). The FIT demonstrated an FP rate of 23.8% with the standard quantitative cutoff score. Certain qualitative error types on the FIT showed promise and had low FP rates. The WRT obtained an FP rate of 0.0% with previously reported cutoff scores. Finally, the TOMM demonstrated low FP rates of 4.8% and 0.0% on Trial 2 and the Retention Trial, respectively, when applying the standard cutoff score. FP rates are reported for a range of cutoff scores and compared with published research on individuals diagnosed with ID. Results indicated that although the quantitative variables on the FIT had unacceptably high FP rates, the TOMM and WRT had low FP rates, increasing the confidence clinicians can place in scores reflecting poor effort on these measures during ID evaluations. © The Author(s) 2014.
Microbiological analyses of samples from the H-Area injection well test site

International Nuclear Information System (INIS)

Wilde, E.W.; Franck, M.M.

1997-01-01

Microbial populations in well water from monitoring wells at the test site were one to three orders of magnitude higher than in well water from the Cretaceous aquifer (used as dilution water for the tests) or from a control well adjacent to the test site facility. Coupon samples placed in monitoring and control wells demonstrated progressive adhesion by microbes to materials used in well construction. Samples of material scraped from test well components during abandonment of the test site project revealed the presence of a variety of attached microbes, including iron bacteria. Although the injection wells at the actual remediation facility for the F- and H-Area seepage basins remediation project are expected to be subjected to somewhat different conditions (e.g. considerably lower iron concentrations) than was the case at the test site, the potential for microbiologically mediated clogging and fouling within the process should be considered. A sampling program that includes microbiological testing is highly recommended.
Genetic risk scores link body fat distribution with specific cardiometabolic profiles

DEFF Research Database (Denmark)

Svendstrup, Mathilde; Sandholt, Camilla H; Andersson Galijatovic, Ehm Astrid

2016-01-01

OBJECTIVE: Forty-nine known single nucleotide polymorphisms (SNPs) associating with body mass index (BMI)-adjusted waist-hip-ratio (WHR) (WHRadjBMI) were recently suggested to cluster into three groups with different associations to cardiometabolic traits. Genetic risk scores of the clusters...... risk scores and anthropometry and blood samples at fasting and during an oral glucose tolerance test were tested. Analyses were adjusted for age, sex, and BMI. RESULTS: Cluster 1 associated with an increased risk of diabetes (HR = 1.05, P = 2.74 × 10^-4) and with a poor metabolic profile, including fasting serum triglyceride (β = 0.98% mmol/L, P = 3.33 × 10^-8) and Matsuda index (β = -0.74%, P = 1.29 × 10^-4). No similar associations for Clusters 2 and 3 were found. The three clusters showed different patterns of association with waist circumference, hip circumference, and height......

Soetomo score: score model in early identification of acute haemorrhagic stroke

Directory of Open Access Journals (Sweden)

Moh Hasan Machfoed

2016-06-01

Aim of the study: Where financial or facility constraints limit brain imaging, a score model can be used to predict the occurrence of acute haemorrhagic stroke. Accordingly, this study attempts to develop a new score model, called the Soetomo score. Material and methods: The researchers performed a cross-sectional study of 176 acute stroke patients with onset of ≤24 hours who visited the emergency unit of Dr. Soetomo Hospital from July 14th to December 14th, 2014. The diagnosis of haemorrhagic stroke was confirmed by head computed tomography scan. Seven predictors of haemorrhagic stroke were analysed using bivariate and multivariate analyses. A multiple discriminant analysis then yielded the equation of the Soetomo score model. The receiver operating characteristic procedure yielded the area under the curve and the intersection point identifying haemorrhagic stroke. Afterward, the diagnostic test value was determined. Results: The equation of the Soetomo score model was (3 × loss of consciousness) + (3.5 × headache) + (4 × vomiting) − 4.5. The area under the curve for this score was 88.5% (95% confidence interval = 83.3–93.7%). At a Soetomo score of ≥−0.75, the model reached a sensitivity of 82.9%, specificity of 83%, positive predictive value of 78.8%, negative predictive value of 86.5%, positive likelihood ratio of 4.88, negative likelihood ratio of 0.21, false negative rate of 17.1%, false positive rate of 17%, and accuracy of 83%. Conclusions: A Soetomo score of ≥−0.75 can properly identify acute haemorrhagic stroke where financial or facility constraints limit brain imaging.
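The scoring rule reported in this abstract can be sketched in a few lines. The sketch reads the equation as (3 × loss of consciousness) + (3.5 × headache) + (4 × vomiting) − 4.5 and assumes each predictor is coded 1 when present and 0 when absent. That coding, and all function names, are assumptions for illustration and are not stated in the abstract. The second function shows how the reported likelihood ratios follow from the reported sensitivity and specificity.

```python
CUTOFF = -0.75  # threshold reported in the abstract


def soetomo_score(loss_of_consciousness, headache, vomiting):
    """Soetomo score; each argument is assumed to be 1 (present) or 0 (absent)."""
    return 3 * loss_of_consciousness + 3.5 * headache + 4 * vomiting - 4.5


def suggests_haemorrhagic(score):
    """True when the score meets the reported cutoff of >= -0.75."""
    return score >= CUTOFF


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical patient: headache and vomiting, no loss of consciousness.
score = soetomo_score(0, 1, 1)
print(score, suggests_haemorrhagic(score))  # 3.0 True

# Reported sensitivity 82.9% and specificity 83%.
lr_pos, lr_neg = likelihood_ratios(0.829, 0.83)
print(round(lr_pos, 2), round(lr_neg, 2))  # 4.88 0.21
```

Under this coding, vomiting alone (score −0.5) crosses the cutoff, whereas headache alone (−1.0) or loss of consciousness alone (−1.5) does not.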
Differences in Neuropsychological Functioning Between Homicidal and Nonviolent Schizophrenia Samples.

Science.gov (United States)

Stratton, John; Cobia, Derin J; Reilly, James; Brook, Michael; Hanlon, Robert E

2018-02-07

Few studies have compared performance on neurocognitive measures between violent and nonviolent schizophrenia samples. A better understanding of neurocognitive dysfunction in violent individuals with schizophrenia could increase the efficacy of violence reduction strategies and aid in risk assessment and adjudication processes. This study aimed to compare neuropsychological performance between 25 homicide offenders with schizophrenia and 25 nonviolent schizophrenia controls. The groups were matched for age, race, sex, and handedness. Independent t-tests and Mann-Whitney U-tests were used to compare the schizophrenia groups' performance on measures of cognition, including composite scores assessing domain level functioning and individual neuropsychological tests. Results indicated the violent schizophrenia group performed worse on measures of memory and executive functioning, and the Intellectual Functioning composite score, when compared to the nonviolent schizophrenia sample. These findings replicate previous research documenting neuropsychological deficits specific to violent individuals with schizophrenia and support research implicating fronto-limbic dysfunction among violent offenders with schizophrenia. © 2018 American Academy of Forensic Sciences.
Correlating continuous assessment scores to junior secondary ...

African Journals Online (AJOL)

This study investigated the relationship between continuous assessment scores and junior secondary school certificate examination (JSCE) final scores in Imo State. A sample of four hundred students was purposively selected from thirty-eight thousand students who took the 1997 JSCE in Imo State. The data used were ...
Assessing working memory in children with ADHD: Minor administration and scoring changes may improve digit span backward's construct validity.

Science.gov (United States)

Wells, Erica L; Kofler, Michael J; Soto, Elia F; Schaefer, Hillary S; Sarver, Dustin E

2018-01-01

Pediatric ADHD is associated with impairments in working memory, but these deficits often go undetected when using clinic-based tests such as digit span backward. The current study pilot-tested minor administration/scoring modifications to improve digit span backward's construct and predictive validities in a well-characterized sample of children with ADHD. WISC-IV digit span was modified to administer all trials (i.e., ignore discontinue rule) and count digits rather than trials correct. Traditional and modified scores were compared to a battery of criterion working memory (construct validity) and academic achievement tests (predictive validity) for 34 children with ADHD ages 8-13 (M=10.41; 11 girls). Traditional digit span backward scores failed to predict working memory or KTEA-2 achievement (all ns). Alternate administration/scoring of digit span backward significantly improved its associations with working memory reordering (r=.58), working memory dual-processing (r=.53), working memory updating (r=.28), and KTEA-2 achievement (r=.49). Consistent with prior work, these findings urge caution when interpreting digit span performance. Minor test modifications may address test validity concerns, and should be considered in future test revisions. Digit span backward becomes a valid measure of working memory at exactly the point that testing is traditionally discontinued. Copyright © 2017 Elsevier Ltd. All rights reserved.
External Validation of the Simple Clinical Score and the HOTEL Score, Two Scores for Predicting Short-Term Mortality after Admission to an Acute Medical Unit

DEFF Research Database (Denmark)

Stræde, Mia; Brabrand, Mikkel

2014-01-01

with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. METHODS: Pre-planned prospective observational cohort study. SETTING: Danish 460.......932 to 0.988) for 24-hours mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ2 = 2.68 (10 degrees of freedom), P = 0.998 and χ2 = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95......% CI, 0.901-0.962) for 24-hours mortality and goodness-of-fit test, χ2 = 5.56 (10 degrees of freedom), P = 0.234. CONCLUSION: We find that both the SCS and HOTEL scores showed an excellent to outstanding ability in identifying patients at high risk of dying with good or acceptable precision....
A Case for Adjusting Subjectively Rated Scores in the Advanced Placement Tests. Program Statistics Research. Technical Report No. 94-5.

Science.gov (United States)

Longford, Nicholas T.

A case is presented for adjusting the scores for free response items in the Advanced Placement (AP) tests. Using information about the rating process from the reliability studies, administrations of the AP test for three subject areas, psychology, computer science, and English language and composition, are analyzed. In the reliability studies, 299…
Attributes of diagnostic tests to increase uptake of dual testing for syphilis and HIV in Port-au-Prince, Haiti.

Science.gov (United States)

Bristow, Claire C; Lee, Sung-Jae; Severe, Linda; William Pape, Jean; Javanbakht, Marjan; Scott Comulada, Warren; Klausner, Jeffrey D

2017-03-01

Introduction Syphilis and HIV screening is highly recommended for pregnant women and those at risk for infection. We used conjoint analysis to identify factors associated with testing preferences for HIV and syphilis infection. Methods We recruited 298 men and women 18 years and over seeking testing or care at GHESKIO (Haitian Study Group for Kaposi's Sarcoma and Opportunistic Infections) clinics. We created eight hypothetical dual HIV-syphilis test profiles varying across six dichotomous attributes. Participants were asked to rate each profile using Likert preference scales. An impact score was generated for each attribute by taking the difference between the preference scores for the preferred and non-preferred level of each attribute. Two-sided one-sample t-test was used to generate p values. Results Of 298 study participants, 61 (20.5%) were male. Of 237 females, 49 (20.7%) were pregnant. Cost (free vs. US$4; p syphilis testing preferences for this study sample in Port-au-Prince prioritized cost, single fingerprick, laboratory-based testing and timeliness.
Pre-season adductor squeeze test and HAGOS function sport and recreation subscale scores predict groin injury in Gaelic football players.

Science.gov (United States)

Delahunt, Eamonn; Fitzpatrick, Helen; Blake, Catherine

2017-01-01

To determine if pre-season adductor squeeze test and HAGOS function, sport and recreation subscale scores can identify Gaelic football players at risk of developing groin injury. Prospective study. Senior inter-county Gaelic football team. Fifty-five male elite Gaelic football players (age = 24.0 ± 2.8 years, body mass = 84.48 ± 7.67 kg, height = 1.85 ± 0.06 m, BMI = 24.70 ± 1.77 kg/m²) from a single senior inter-county Gaelic football team. Occurrence of groin injury during the season. Ten time-loss groin injuries were registered representing 13% of all injuries. The odds ratio for sustaining a groin injury if pre-season adductor squeeze test score was below 225 mmHg, was 7.78. The odds ratio for sustaining a groin injury if pre-season HAGOS function, sport and recreation subscale score was football players at risk of developing groin injury. Copyright © 2016 Elsevier Ltd. All rights reserved.
Operability test report for core sample truck number one flammable gas modifications

International Nuclear Information System (INIS)

Akers, J.C.

1997-01-01

This report primarily consists of the original test procedure used for the Operability Testing of the flammable gas modifications to Core Sample Truck No. One. Included are exceptions, resolutions, comments, and test results. This report consists of the original, completed, test procedure used for the Operability Testing of the flammable gas modifications to the Push Mode Core Sample Truck No. 1. Prior to the Acceptance/Operability test the truck No. 1 operations procedure (TO-080-503) was revised to be more consistent with the other core sample truck procedures and to include operational steps/instructions for the SR weather cover pressurization system. A draft copy of the operations procedure was used to perform the Operability Test Procedure (OTP). A Document Acceptance Review Form is included with this report (last page) indicating the draft status of the operations procedure during the OTP. During the OTP 11 test exceptions were encountered. Of these exceptions four were determined to affect Acceptance Criteria as listed in the OTP, Section 4.7 ACCEPTANCE CRITERIA
Page sample size in web accessibility testing: how many pages is enough?

NARCIS (Netherlands)

Velleman, Eric Martin; van der Geest, Thea

2013-01-01

Various countries and organizations use a different sampling approach and sample size of web pages in accessibility conformance tests. We are conducting a systematic analysis to determine how many pages is enough for testing whether a website is compliant with standard accessibility guidelines. This
Different goodness of fit tests for Rayleigh distribution in ranked set sampling

Directory of Open Access Journals (Sweden)

Amer Al-Omari

2016-03-01

In this paper, different goodness of fit tests for the Rayleigh distribution are considered based on simple random sampling (SRS) and ranked set sampling (RSS) techniques. The performance of the suggested estimators is evaluated in terms of the power of the tests by using Monte Carlo simulation. It is found that the suggested RSS tests perform better than their counterparts in SRS.
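To make the two sampling schemes compared in this record concrete, the sketch below draws Rayleigh data by simple random sampling and by ranked set sampling. It is an illustrative assumption rather than the paper's procedure: it assumes perfect ranking (units are ranked on their actual values), uses inverse-transform sampling for the Rayleigh draws, and does not implement any of the goodness of fit tests themselves. All names and parameter values are hypothetical.

```python
import math
import random


def rayleigh(rng, sigma=1.0):
    """One Rayleigh(sigma) draw by inverse-transform sampling."""
    u = rng.random()  # uniform on [0, 1)
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))


def srs(rng, n, sigma=1.0):
    """Simple random sample of size n."""
    return [rayleigh(rng, sigma) for _ in range(n)]


def rss(rng, set_size, cycles=1, sigma=1.0):
    """Ranked set sample of size set_size * cycles.

    In each cycle, set_size independent sets of set_size units are drawn,
    each set is ranked, and the i-th smallest unit of the i-th set is kept.
    """
    sample = []
    for _ in range(cycles):
        for i in range(set_size):
            ranked = sorted(rayleigh(rng, sigma) for _ in range(set_size))
            sample.append(ranked[i])
    return sample


rng = random.Random(2016)  # fixed seed so a run can be repeated
print(len(srs(rng, 12)), len(rss(rng, set_size=3, cycles=4)))  # 12 12
```

In a Monte Carlo power study, samples such as these would be generated repeatedly under an alternative distribution and the rejection rate of each test recorded.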
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.

Science.gov (United States)

Lin, Johnny; Bentler, Peter M

2012-01-01

Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.
Performance on large-scale science tests: Item attributes that may impact achievement scores

Science.gov (United States)

Gordon, Janet Victoria

Significant differences in achievement among ethnic groups persist on the eighth-grade science Washington Assessment of Student Learning (WASL). The WASL measures academic performance in science using both scenario and stand-alone question types. Previous research suggests that presenting target items connected to an authentic context, like scenario question types, can increase science achievement scores especially in underrepresented groups and thus help to close the achievement gap. The purpose of this study was to identify significant differences in performance between gender and ethnic subgroups by question type on the 2005 eighth-grade science WASL. MANOVA and ANOVA were used to examine relationships between gender and ethnic subgroups as independent variables with achievement scores on scenario and stand-alone question types as dependent variables. MANOVA revealed no significant effects for gender, suggesting that the 2005 eighth-grade science WASL was gender neutral. However, there were significant effects for ethnicity. ANOVA revealed significant effects for ethnicity and ethnicity by gender interaction in both question types. Effect sizes were negligible for the ethnicity by gender interaction. Large effect sizes between ethnicities on scenario question types became moderate to small effect sizes on stand-alone question types. This indicates the score advantage the higher performing subgroups had over the lower performing subgroups was not as large on stand-alone question types compared to scenario question types. A further comparison examined performance on multiple-choice items only within both question types. Similar achievement patterns between ethnicities emerged; however, achievement patterns between genders changed in boys' favor. Scenario question types appeared to register differences between ethnic groups to a greater degree than stand-alone question types. These differences may be attributable to individual differences in cognition
Research on test of product based on spatial sampling criteria and variable step sampling mechanism

Science.gov (United States)

Li, Ruihong; Han, Yueping

2014-09-01

This paper presents an effective approach for online testing of the assembly structures inside products, using a multiple-view technique and an X-ray digital radiography system based on spatial sampling criteria and a variable step-size sampling mechanism. Although a product may contain several objects to be tested, each object has a maximal rotary step within which the smallest structural size to be tested remains predictable. In the offline learning process, the object is rotated by this step and imaged repeatedly until a full cycle is completed, yielding an image sequence that includes the full structural information for recognition. The maximal rotary step is restricted by the smallest structural size and the inherent resolution of the imaging system. During online inspection, the program first finds the optimum solutions for all the different target parts in the standard sequence, i.e., finds their exact angles in one cycle. Because most other targets in the product are larger than the smallest structure, the paper adopts a variable step-size sampling mechanism that rotates the product through specific angles, with different steps for the different objects inside the product, and performs matching. Experimental results show that the variable step-size method saves considerable time compared with the traditional fixed-step inspection method while maintaining recognition accuracy.
Do medical students’ scores using different assessment instruments predict their scores in clinical reasoning using a computer-based simulation?

Directory of Open Access Journals (Sweden)

Fida M

2015-02-01

Mariam Fida (Department of Molecular Medicine, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Bahrain); Salah Eldin Kassab (Department of Medical Education, Faculty of Medicine, Suez Canal University, Ismailia, Egypt). Purpose: The development of clinical problem-solving skills evolves over time and requires structured training and background knowledge. Computer-based case simulations (CCS) have been used for teaching and assessment of clinical reasoning skills. However, previous studies examining the psychometric properties of CCS as an assessment tool have been controversial. Furthermore, studies reporting the integration of CCS into problem-based medical curricula have been limited. Methods: This study examined the psychometric properties of using CCS software (DxR Clinician) for assessment of medical students (n=130) studying in a problem-based, integrated multisystem module (Unit IX) during the academic year 2011–2012. Internal consistency reliability of CCS scores was calculated using Cronbach's alpha statistics. The relationships between students' scores in CCS components (clinical reasoning, diagnostic performance, and patient management) and their scores in other examination tools at the end of the unit, including multiple-choice questions, short-answer questions, objective structured clinical examination (OSCE), and real patient encounters, were analyzed using stepwise hierarchical linear regression. Results: Internal consistency reliability of CCS scores was high (α=0.862). Inter-item correlations between students' scores in different CCS components and their scores in CCS and other test items were statistically significant. Regression analysis indicated that OSCE scores predicted 32.7% and 35.1% of the variance in clinical reasoning and patient management scores, respectively (P<0.01). Multiple-choice question scores, however, predicted only 15.4% of the variance in diagnostic performance scores (P<0.01), while
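The internal consistency statistic named in this record, Cronbach's alpha, can be sketched as follows using the usual formula α = k/(k−1) × (1 − Σ item variances / variance of total scores). The function name and the toy scores are assumptions for illustration. They are not the study's data, and the study's own software may handle missing values or weighting differently.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of k lists; each inner list holds one item's (or component's)
    scores for the same n examinees, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score per examinee, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Toy data: two items scored for four hypothetical examinees.
print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```

For these toy scores the result is 0.75 up to floating-point rounding. The function fails with a division by zero when every examinee has the same total score, since alpha is undefined in that case.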
A closer look at the effect of preliminary goodness-of-fit testing for normality for the one-sample t-test.

Science.gov (United States)

Rochon, Justine; Kieser, Meinhard

2011-11-01

Student's one-sample t-test is a commonly used method when inference about the population mean is made. As advocated in textbooks and articles, the assumption of normality is often checked by a preliminary goodness-of-fit (GOF) test. In a paper recently published by Schucany and Ng it was shown that, for the uniform distribution, screening of samples by a pretest for normality leads to a more conservative conditional Type I error rate than application of the one-sample t-test without preliminary GOF test. In contrast, for the exponential distribution, the conditional level is even more elevated than the Type I error rate of the t-test without pretest. We examine the reasons behind these characteristics. In a simulation study, samples drawn from the exponential, lognormal, uniform, Student's t-distribution with 2 degrees of freedom (t(2)) and the standard normal distribution that had passed normality screening, as well as the ingredients of the test statistics calculated from these samples, are investigated. For non-normal distributions, we found that preliminary testing for normality may change the distribution of means and standard deviations of the selected samples as well as the correlation between them (if the underlying distribution is non-symmetric), thus leading to altered distributions of the resulting test statistics. It is shown that for skewed distributions the excess in Type I error rate may be even more pronounced when testing one-sided hypotheses. ©2010 The British Psychological Society.
The Veterans Affairs Cardiac Risk Score: Recalibrating the Atherosclerotic Cardiovascular Disease Score for Applied Use.

Science.gov (United States)

Sussman, Jeremy B; Wiitala, Wyndy L; Zawistowski, Matthew; Hofer, Timothy P; Bentley, Douglas; Hayward, Rodney A

2017-09-01

Accurately estimating cardiovascular risk is fundamental to good decision-making in cardiovascular disease (CVD) prevention, but risk scores developed in one population often perform poorly in dissimilar populations. We sought to examine whether a large integrated health system can use their electronic health data to better predict individual patients' risk of developing CVD. We created a cohort using all patients ages 45-80 who used Department of Veterans Affairs (VA) ambulatory care services in 2006 with no history of CVD, heart failure, or loop diuretics. Our outcome variable was new-onset CVD in 2007-2011. We then developed a series of recalibrated scores, including a fully refit "VA Risk Score-CVD (VARS-CVD)." We tested the different scores using standard measures of prediction quality. For the 1,512,092 patients in the study, the Atherosclerotic cardiovascular disease risk score had similar discrimination as the VARS-CVD (c-statistic of 0.66 in men and 0.73 in women), but the Atherosclerotic cardiovascular disease model had poor calibration, predicting 63% more events than observed. Calibration was excellent in the fully recalibrated VARS-CVD tool, but simpler techniques tested proved less reliable. We found that local electronic health record data can be used to estimate CVD better than an established risk score based on research populations. Recalibration improved estimates dramatically, and the type of recalibration was important. Such tools can also easily be integrated into health system's electronic health record and can be more readily updated.
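The calibration finding in the preceding abstract (a model "predicting 63% more events than observed") can be read as an expected-to-observed (E/O) ratio of about 1.63. The following is a minimal sketch of that ratio only, not the authors' recalibration procedure; the function name and the ten-patient cohort are hypothetical and chosen purely for illustration.

```python
def expected_observed_ratio(predicted_risks, outcomes):
    """Return expected events (sum of predicted risks) divided by observed events."""
    expected = sum(predicted_risks)
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed events; the ratio is undefined")
    return expected / observed


# Hypothetical cohort (not data from the study): ten patients with the
# same predicted risk, one of whom actually had an event.
risks = [0.163] * 10                       # the model expects 1.63 events
events = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]    # one event observed

ratio = expected_observed_ratio(risks, events)
print(f"E/O ratio = {ratio:.2f}")
```

A ratio above 1 indicates over-prediction and a ratio below 1 under-prediction; a well-calibrated score has a ratio close to 1.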
Performance on cognitive tests, instrumental activities of daily living and depressive symptoms of a community-based sample of elderly adults in Rio de Janeiro, Brazil

Science.gov (United States)

Lima, Christina Martins Borges; Alves, Heloisa Veiga Dias; Mograbi, Daniel Correa; Pereira, Flávia Furtado; Fernandez, Jesus Landeira; Charchat-Fichman, Helenice

2017-01-01

Objective: To describe the performance on basic cognitive tasks, instrumental activities of daily living, and depressive symptoms of a community-based sample of elderly adults in Rio de Janeiro (Brazil) who participated in multiple physical, social, and cognitive activities at government-run community centers. Methods: A total of 264 educated older adults (> 60 years of age of both genders) were evaluated by the Brief Cognitive Screening Battery (BCSB), Lawton's and Pfeffer's activities of daily living indexes, and the Geriatric Depressive Scale (GDS). Results: The mean age of the sample was 75.7 years. The participants had a mean of 9.3 years of formal education. With the exception of the Clock Drawing Test (CDT), mean scores on the cognitive tests were consistent with the values in the literature. Only 6.4% of the sample had some kind of dependence for activities of daily living. The results of the Geriatric Depression Scale (GDS-15) indicated mild symptoms of depression in 16.8% of the sample. Conclusion: This study provided important demographic, cognitive, and functional characteristics of a specific community-based sample of elderly adults in Rio de Janeiro, Brazil. PMID:29213494
Performance on cognitive tests, instrumental activities of daily living and depressive symptoms of a community-based sample of elderly adults in Rio de Janeiro, Brazil

Directory of Open Access Journals (Sweden)

Christina Martins Borges Lima

Objective: To describe the performance on basic cognitive tasks, instrumental activities of daily living, and depressive symptoms of a community-based sample of elderly adults in Rio de Janeiro (Brazil) who participated in multiple physical, social, and cognitive activities at government-run community centers. Methods: A total of 264 educated older adults (> 60 years of age of both genders) were evaluated by the Brief Cognitive Screening Battery (BCSB), Lawton's and Pfeffer's activities of daily living indexes, and the Geriatric Depressive Scale (GDS). Results: The mean age of the sample was 75.7 years. The participants had a mean of 9.3 years of formal education. With the exception of the Clock Drawing Test (CDT), mean scores on the cognitive tests were consistent with the values in the literature. Only 6.4% of the sample had some kind of dependence for activities of daily living. The results of the Geriatric Depression Scale (GDS-15) indicated mild symptoms of depression in 16.8% of the sample. Conclusion: This study provided important demographic, cognitive, and functional characteristics of a specific community-based sample of elderly adults in Rio de Janeiro, Brazil.
An immunohistochemical and fluorescence in situ hybridization-based comparison between the Oracle HER2 Bond Immunohistochemical System, Dako HercepTest, and Vysis PathVysion HER2 FISH using both commercially validated and modified ASCO/CAP and United Kingdom HER2 IHC scoring guidelines.

LENUS (Irish Health Repository)

O'Grady, Anthony

2010-12-01

Immunohistochemistry (IHC) is used as the frontline assay to determine HER2 status in invasive breast cancer patients. The aim of the study was to compare the performance of the Leica Oracle HER2 Bond IHC System (Oracle) with the current most readily accepted Dako HercepTest (HercepTest), using both commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. A total of 445 breast cancer samples from 3 international clinical HER2 referral centers were stained with the 2 test systems and scored in a blinded fashion by experienced pathologists. The overall agreement between the 2 tests in a 3×3 (negative, equivocal and positive) analysis shows a concordance of 86.7% and 86.3%, respectively when analyzed using commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. There is a good concordance between the Oracle and the HercepTest. The advantages of a complete fully automated test such as the Oracle include standardization of key analytical factors and improved turn around time. The implementation of the modified ASCO/CAP and UK HER2 IHC scoring guidelines has minimal effect on either assay interpretation, showing that Oracle can be used as a methodology for accurately determining HER2 IHC status in formalin fixed, paraffin-embedded breast cancer tissue.

An immunohistochemical and fluorescence in situ hybridization-based comparison between the Oracle HER2 Bond Immunohistochemical System, Dako HercepTest, and Vysis PathVysion HER2 FISH using both commercially validated and modified ASCO/CAP and United Kingdom HER2 IHC scoring guidelines.

Science.gov (United States)

O'Grady, Anthony; Allen, David; Happerfield, Lisa; Johnson, Nicola; Provenzano, Elena; Pinder, Sarah E; Tee, Lilian; Gu, Mai; Kay, Elaine W

2010-12-01

Immunohistochemistry (IHC) is used as the frontline assay to determine HER2 status in invasive breast cancer patients. The aim of the study was to compare the performance of the Leica Oracle HER2 Bond IHC System (Oracle) with the current most readily accepted Dako HercepTest (HercepTest), using both commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. A total of 445 breast cancer samples from 3 international clinical HER2 referral centers were stained with the 2 test systems and scored in a blinded fashion by experienced pathologists. The overall agreement between the 2 tests in a 3×3 (negative, equivocal and positive) analysis shows a concordance of 86.7% and 86.3%, respectively when analyzed using commercially validated and modified ASCO/CAP and UK HER2 IHC scoring guidelines. There is a good concordance between the Oracle and the HercepTest. The advantages of a complete fully automated test such as the Oracle include standardization of key analytical factors and improved turn around time. The implementation of the modified ASCO/CAP and UK HER2 IHC scoring guidelines has minimal effect on either assay interpretation, showing that Oracle can be used as a methodology for accurately determining HER2 IHC status in formalin fixed, paraffin-embedded breast cancer tissue.
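The "overall agreement in a 3×3 analysis" reported in the preceding abstract is, on the usual reading, the share of samples on the diagonal of the cross-classification table. Below is a minimal sketch of that calculation. The counts are hypothetical (100 samples, not the study's 445), since the abstract does not give the underlying table, and `overall_agreement` is an illustrative name.

```python
def overall_agreement(table):
    """Proportion of samples on which two assays give the same category.

    `table[i][j]` counts samples scored category i by assay A and
    category j by assay B; agreement is the diagonal over the grand total.
    """
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("empty table")
    agreeing = sum(table[i][i] for i in range(len(table)))
    return agreeing / total


# Hypothetical counts for illustration only (rows: assay A, columns: assay B;
# category order: negative, equivocal, positive).
counts = [
    [60, 5, 1],
    [4, 10, 2],
    [1, 2, 15],
]
print(f"overall agreement = {overall_agreement(counts):.1%}")
```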
The Machine Scoring of Writing

Science.gov (United States)

McCurry, Doug

2010-01-01

This article provides an introduction to the kind of computer software that is used to score student writing in some high stakes testing programs, and that is being promoted as a teaching and learning tool to schools. It sketches the state of play with machines for the scoring of writing, and describes how these machines work and what they do.…
On the matched pairs sign test using bivariate ranked set sampling ...

African Journals Online (AJOL)

BVRSS) is introduced and investigated. We show that this test is asymptotically more efficient than its counterpart sign test based on a bivariate simple random sample (BVSRS). The asymptotic null distribution and the efficiency of the test are derived.
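For orientation, the baseline that the preceding abstract compares against, the matched-pairs sign test on a simple random sample, can be sketched as follows. This is the textbook exact test, not the ranked-set-sampling version proposed in the paper; the function name and the example differences are hypothetical.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test p-value for matched-pair differences.

    Zero differences are dropped; under the null hypothesis the number of
    positive differences is Binomial(n, 1/2).
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("all differences are zero")
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)
    # Probability of a result at least as extreme in one tail, then doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical paired differences (illustrative only).
diffs = [1.2, 0.4, -0.3, 2.1, 0.9, 1.5, 0.0, 0.7, 1.1, -0.2]
print(f"p = {sign_test_p_value(diffs):.4f}")
```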
Empirical Sampling Distributions of Equating Coefficients for Graded and Nominal Response Instruments.

Science.gov (United States)

Baker, Frank B.

1997-01-01

Examined the sampling distributions of equating coefficients produced by the characteristic curve method for tests using graded and nominal response scoring using simulated data. For both models and across all three equating situations, the sampling distributions were generally bell-shaped and peaked, and occasionally had a small degree of…
Prognostic Accuracy of the GRACE Score in Octogenarians and Nonagenarians with Acute Coronary Syndromes

Directory of Open Access Journals (Sweden)

Antonio Mauricio dos Santos Cerqueira Junior

2018-02-01

Background: The GRACE Score was derived and validated from a cohort in which octogenarians and nonagenarians were poorly represented. Objective: To test the accuracy of the GRACE score in predicting in-hospital mortality of very elderly individuals with acute coronary syndromes (ACS). Methods: Prospective observational study conducted in the intensive coronary care unit of a tertiary center from September 2011 to August 2016. Patients consecutively admitted due to ACS were selected, and the very elderly group was defined by age ≥ 80 years. The GRACE Score was based on admission data and its accuracy was tested regarding prediction of in-hospital death. Statistical significance was defined by p value < 0.05. Results: A total of 994 individuals were studied, 57% male, 77% with non-ST elevation myocardial infarction and 173 (17%) very elderly patients. The mean age of the sample was 65 ± 13 years, and the mean age of the very elderly subgroup was 85 ± 3.7 years. The C-statistic of the GRACE Score in very elderly patients was 0.86 (95% CI = 0.78 - 0.93), with no difference when compared to the value for younger individuals, 0.83 (95% CI = 0.75 - 0.91), with p = 0.69. The calibration of the score in very elderly patients was described by χ2 test of Hosmer-Lemeshow = 2.2 (p = 0.98), while the remaining patients presented χ2 = 9.0 (p = 0.35). Logistic regression analysis for death prediction did not show interaction between GRACE Score and the very elderly variable (p = 0.25). Conclusion: The GRACE Score in very elderly patients is accurate in predicting in-hospital ACS mortality, similarly to younger patients.
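The C-statistic quoted in the preceding abstract is the probability that a randomly chosen patient who died had a higher score than a randomly chosen survivor. A minimal sketch of that pairwise definition is given below; the scores and outcomes are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def c_statistic(scores, died):
    """Concordance (area under the ROC curve) of a risk score.

    Counts, over all (death, survivor) pairs, how often the death has the
    higher score; ties count as half a concordant pair.
    """
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for x in deaths:
        for y in survivors:
            if x > y:
                concordant += 1.0
            elif x == y:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


# Hypothetical risk scores and in-hospital outcomes (1 = died), illustration only.
scores = [180, 150, 120, 160, 110, 140]
died = [1, 1, 0, 0, 0, 0]
print(f"C-statistic = {c_statistic(scores, died):.3f}")
```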
A Direct Comparison of the MM-GB/SA Scoring Procedure and Free-Energy Perturbation Calculations Using Carbonic Anhydrase as a Test Case: Strengths and Pitfalls of Each Approach.

Science.gov (United States)

Guimarães, Cristiano R W

2011-07-12

MM-GB/SA scoring and free energy perturbation (FEP) calculations have emerged as reliable methodologies to understand structural and energetic relationships to binding. In spite of successful applications to elucidate the structure-activity relationships for a few pairs of ligands, the reality is that the performance of FEP calculations has rarely been tested for more than a handful of compounds. In this work, a series of 13 benzene sulfonamide inhibitors of carbonic anhydrase with binding free energies determined by isothermal titration calorimetry was selected as a test case. R² values of 0.70, 0.71, and 0.49 with the experiment were obtained with MM-GB/SA and FEP simulations run with MCPRO+ and Desmond, respectively. All methods work well, but the results obtained with Desmond are inferior to MM-GB/SA and MCPRO+. The main contrast between the methods is the level of sampling, ranging from full to restricted flexibility to single conformation for the complexes in Desmond, MCPRO+, and MM-GB/SA, respectively. The current and historical results obtained with MM-GB/SA qualify this approach as a more attractive alternative for rank-ordering; it can achieve equivalent or superior predictive accuracy and handle more structurally dissimilar ligands at a fraction of the computational cost of the rigorous free-energy methods. As for the large theoretical dynamic range for the binding energies, that seems to be a direct result of the degree of sampling in the simulations since MCPRO+ as well as MM-GB/SA are plagued by this. Van't Hoff analysis for selected pairs of ligands suggests that the wider scoring spread is not only affected by missing entropic contributions due to restricted sampling but also exaggerated enthalpic separation between the weak and potent compounds caused by diminished shielding of electrostatic interactions, thermal effects, and protein relaxation/strain.
40 CFR 205.171-2 - Test exhaust system sample selection and preparation.

Science.gov (United States)

2010-07-01

... Systems § 205.171-2 Test exhaust system sample selection and preparation. (a)(1) Exhaust systems... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Test exhaust system sample selection and preparation. 205.171-2 Section 205.171-2 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY...
Detecting determinism with improved sensitivity in time series: rank-based nonlinear predictability score.

Science.gov (United States)

Naro, Daniel; Rummel, Christian; Schindler, Kaspar; Andrzejak, Ralph G

2014-09-01

The rank-based nonlinear predictability score was recently introduced as a test for determinism in point processes. We here adapt this measure to time series sampled from time-continuous flows. We use noisy Lorenz signals to compare this approach against a classical amplitude-based nonlinear prediction error. Both measures show an almost identical robustness against Gaussian white noise. In contrast, when the amplitude distribution of the noise has a narrower central peak and heavier tails than the normal distribution, the rank-based nonlinear predictability score outperforms the amplitude-based nonlinear prediction error. For this type of noise, the nonlinear predictability score has a higher sensitivity for deterministic structure in noisy signals. It also yields a higher statistical power in a surrogate test of the null hypothesis of linear stochastic correlated signals. We show the high relevance of this improved performance in an application to electroencephalographic (EEG) recordings from epilepsy patients. Here the nonlinear predictability score again appears of higher sensitivity to nonrandomness. Importantly, it yields an improved contrast between signals recorded from brain areas where the first ictal EEG signal changes were detected (focal EEG signals) versus signals recorded from brain areas that were not involved at seizure onset (nonfocal EEG signals).
Operability test report for rotary mode core sampling system number 3

International Nuclear Information System (INIS)

Corbett, J.E.

1996-01-01

This report documents the successful completion of operability testing for the Rotary Mode Core Sampling (RMCS) system number 3. The report includes the test procedure (WHC-SD-WM-OTP-174), exception resolutions, data sheets, and a test report summary.
TEST-RETEST RELIABILITY OF THE CLOSED KINETIC CHAIN UPPER EXTREMITY STABILITY TEST (CKCUEST) IN ADOLESCENTS: RELIABILITY OF CKCUEST IN ADOLESCENTS.

Science.gov (United States)

de Oliveira, Valéria M A; Pitangui, Ana C R; Nascimento, Vinícius Y S; da Silva, Hítalo A; Dos Passos, Muana H P; de Araújo, Rodrigo C

2017-02-01

The Closed Kinetic Chain Upper Extremity Stability Test (CKCUEST) has been proposed as an option to assess upper limb function and stability; however, there are few studies that support the use of this test in adolescents. The purpose of the present study was to investigate the intersession reliability and agreement of three CKCUEST scores in adolescents and establish clinimetric values for this test. Test-retest reliability. Twenty-five healthy adolescents of both sexes were evaluated. The subjects performed two CKCUEST with an interval of one week between the tests. An intraclass correlation coefficient (ICC 3,3 ) two-way mixed model with a 95% interval of confidence was utilized to determine intersession reliability. A Bland-Altman graph was plotted to analyze the agreement between assessments. The presence of systematic error was evaluated by a one-sample t test. The difference between the evaluation and reevaluation was observed using a paired-sample t test. The level of significance was set at 0.05. Standard error of measurements and minimum detectable changes were calculated. The intersession reliability of the average touches score, normalized score, and power score were 0.68, 0.68 and 0.87, the standard error of measurement were 2.17, 1.35 and 6.49, and the minimal detectable change was 6.01, 3.74 and 17.98, respectively. The presence of systematic error (p test with moderate to excellent reliability when used with adolescents. The CKCUEST is a measurement with moderate to excellent reliability for adolescents. 2b.
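The preceding abstract reports standard errors of measurement (SEM) and minimal detectable changes (MDC) without stating how the latter were obtained. The conventional 95% formula, MDC = 1.96 × √2 × SEM, reproduces the reported MDC values to within rounding, so it is sketched here as an assumption about the method rather than a confirmed description of it; the function name is illustrative.

```python
from math import sqrt


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%).

    The sqrt(2) factor reflects that a change score combines the
    measurement error of two sessions (test and retest).
    """
    return z * sqrt(2) * sem


# SEM values as reported in the abstract above.
for label, sem in [("average touches", 2.17),
                   ("normalized score", 1.35),
                   ("power score", 6.49)]:
    print(f"{label}: SEM = {sem}, MDC = {minimal_detectable_change(sem):.2f}")
```

With the rounded SEM values the power-score MDC comes out marginally above the published 17.98, which is consistent with the authors having used an unrounded SEM.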
Italian normative data and validation of two neuropsychological tests of face recognition: Benton Facial Recognition Test and Cambridge Face Memory Test.

Science.gov (United States)

Albonico, Andrea; Malaspina, Manuela; Daini, Roberta

2017-09-01

The Benton Facial Recognition Test (BFRT) and Cambridge Face Memory Test (CFMT) are two of the most common tests used to assess face discrimination and recognition abilities and to identify individuals with prosopagnosia. However, recent studies highlighted that participant-stimulus match ethnicity, as much as gender, has to be taken into account in interpreting results from these tests. Here, in order to obtain more appropriate normative data for an Italian sample, the CFMT and BFRT were administered to a large cohort of young adults. We found that scores from the BFRT are not affected by participants' gender and are only slightly affected by participant-stimulus ethnicity match, whereas both these factors seem to influence the scores of the CFMT. Moreover, the inclusion of a sample of individuals with suspected face recognition impairment allowed us to show that the use of more appropriate normative data can increase the BFRT efficacy in identifying individuals with face discrimination impairments; by contrast, the efficacy of the CFMT in classifying individuals with a face recognition deficit was confirmed. Finally, our data show that the lack of inversion effect (the difference between the total score of the upright and inverted versions of the CFMT) could be used as further index to assess congenital prosopagnosia. Overall, our results confirm the importance of having norms derived from controls with a similar experience of faces as the "potential" prosopagnosic individuals when assessing face recognition abilities.
Development and validation of a composite scoring system for robot-assisted surgical training--the Robotic Skills Assessment Score.

Science.gov (United States)

Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A

2013-12-01

A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in operative field, critical error, economy, bimanual dexterity, and time. Subsequently, the RSA-Score was further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
Evaluation of Approaches to Analyzing Continuous Correlated Eye Data When Sample Size Is Small.

Science.gov (United States)

Huang, Jing; Huang, Jiayan; Chen, Yong; Ying, Gui-Shuang

2018-02-01

To evaluate the performance of commonly used statistical methods for analyzing continuous correlated eye data when sample size is small. We simulated correlated continuous data from two designs: (1) two eyes of a subject in two comparison groups; (2) two eyes of a subject in the same comparison group, under various sample size (5-50), inter-eye correlation (0-0.75) and effect size (0-0.8). Simulated data were analyzed using paired t-test, two sample t-test, Wald test and score test using the generalized estimating equations (GEE) and F-test using linear mixed effects model (LMM). We compared type I error rates and statistical powers, and demonstrated analysis approaches through analyzing two real datasets. In design 1, paired t-test and LMM perform better than GEE, with nominal type 1 error rate and higher statistical power. In design 2, no test performs uniformly well: two sample t-test (average of two eyes or a random eye) achieves better control of type I error but yields lower statistical power. In both designs, the GEE Wald test inflates type I error rate and GEE score test has lower power. When sample size is small, some commonly used statistical methods do not perform well. Paired t-test and LMM perform best when two eyes of a subject are in two different comparison groups, and t-test using the average of two eyes performs best when the two eyes are in the same comparison group. When selecting the appropriate analysis approach the study design should be considered.
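For design 1 in the preceding abstract (the two eyes of each subject fall in different comparison groups), the paired t-test works on within-subject differences, which is how it accounts for inter-eye correlation. A minimal sketch of the statistic follows; the measurements are hypothetical and the function name is illustrative. The p-value step is omitted because the Student t distribution is not available in the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first_eye, second_eye):
    """Return (t, degrees of freedom) for a paired t-test on two eyes per subject."""
    if len(first_eye) != len(second_eye):
        raise ValueError("each subject needs a measurement for both eyes")
    diffs = [a - b for a, b in zip(first_eye, second_eye)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two subjects")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical measurements for five subjects (illustration only).
treated_eye = [5, 7, 6, 9, 8]
control_eye = [4, 5, 6, 6, 6]
t, df = paired_t_statistic(treated_eye, control_eye)
print(f"t = {t:.3f} on {df} degrees of freedom")
```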
Characterization of electron microscopes with binary pseudo-random multilayer test samples

Science.gov (United States)

Yashchuk, Valeriy V.; Conley, Raymond; Anderson, Erik H.; Barber, Samuel K.; Bouet, Nathalie; McKinney, Wayne R.; Takacs, Peter Z.; Voronov, Dmitriy L.

2011-09-01

Verification of the reliability of metrology data from high quality X-ray optics requires that adequate methods for test and calibration of the instruments be developed. For such verification for optical surface profilometers in the spatial frequency domain, a modulation transfer function (MTF) calibration method based on binary pseudo-random (BPR) gratings and arrays has been suggested [1,2] and proven to be an effective calibration method for a number of interferometric microscopes, a phase shifting Fizeau interferometer, and a scatterometer [5]. Here we describe the details of development of binary pseudo-random multilayer (BPRML) test samples suitable for characterization of scanning (SEM) and transmission (TEM) electron microscopes. We discuss the results of TEM measurements with the BPRML test samples fabricated from a WiSi2/Si multilayer coating with pseudo-randomly distributed layers. In particular, we demonstrate that significant information about the metrological reliability of the TEM measurements can be extracted even when the fundamental frequency of the BPRML sample is smaller than the Nyquist frequency of the measurements. The measurements demonstrate a number of problems related to the interpretation of the SEM and TEM data. Note that similar BPRML test samples can be used to characterize X-ray microscopes. Corresponding work with X-ray microscopes is in progress.
Characterization of electron microscopes with binary pseudo-random multilayer test samples

International Nuclear Information System (INIS)

Yashchuk, Valeriy V.; Conley, Raymond; Anderson, Erik H.; Barber, Samuel K.; Bouet, Nathalie; McKinney, Wayne R.; Takacs, Peter Z.; Voronov, Dmitriy L.

2011-01-01

Verification of the reliability of metrology data from high quality X-ray optics requires that adequate methods for test and calibration of the instruments be developed. For such verification for optical surface profilometers in the spatial frequency domain, a modulation transfer function (MTF) calibration method based on binary pseudo-random (BPR) gratings and arrays has been suggested and proven to be an effective calibration method for a number of interferometric microscopes, a phase shifting Fizeau interferometer, and a scatterometer [5]. Here we describe the details of development of binary pseudo-random multilayer (BPRML) test samples suitable for characterization of scanning (SEM) and transmission (TEM) electron microscopes. We discuss the results of TEM measurements with the BPRML test samples fabricated from a WiSi2/Si multilayer coating with pseudo-randomly distributed layers. In particular, we demonstrate that significant information about the metrological reliability of the TEM measurements can be extracted even when the fundamental frequency of the BPRML sample is smaller than the Nyquist frequency of the measurements. The measurements demonstrate a number of problems related to the interpretation of the SEM and TEM data. Note that similar BPRML test samples can be used to characterize X-ray microscopes. Corresponding work with X-ray microscopes is in progress.
Might the Rorschach be a projective test after all? Social projection of an undesired trait alters Rorschach Oral Dependency scores.

Science.gov (United States)

Bornstein, Robert F

2007-06-01

The degree to which projection plays a role in Rorschach (Rorschach, 1921/1942) responding remains controversial, in part because extant data have yielded inconclusive results. In this investigation, I examined the impact of social projection on Rorschach Oral Dependency (ROD) scores using methods adapted from social cognition research. In Study 1, I prescreened 85 college students (40 women and 45 men) with the ROD scale and a widely used self-report measure of dependency, the Interpersonal Dependency Inventory (IDI; Hirschfeld et al., 1977). Results show that informing participants who scored low on the IDI that they were in fact highly dependent led to significant increases in ROD scores; I did not obtain parallel ROD increases for participants who scored high on the IDI or for participants who received low-dependent feedback. In Study 2, I examined a separate sample of 80 prescreened college students (40 women and 40 men) and showed that providing low self-report participants an opportunity to attribute dependency to a fictional target person prior to Rorschach responding attenuated the impact of high-dependent feedback on ROD scores. These results suggest that projection played a role in at least one domain of Rorschach responding. I discuss theoretical, clinical, and empirical implications of these results.
9 CFR 147.8 - Procedures for preparing egg yolk samples for diagnostic tests.

Science.gov (United States)

2010-01-01

... samples for diagnostic tests. 147.8 Section 147.8 Animals and Animal Products ANIMAL AND PLANT HEALTH... IMPROVEMENT PLAN Blood Testing Procedures § 147.8 Procedures for preparing egg yolk samples for diagnostic... for diagnostic testing. (b) The authorized laboratory must identify each egg as to the breeding flock...
The soluble transcobalamin receptor (sCD320) in relation to Alzheimer's disease and cognitive scores

DEFF Research Database (Denmark)

Abuyaman, Omar; Combrinck, Marc; Smith, A David

2017-01-01

The soluble transcobalamin receptor (sCD320) is present in cerebrospinal fluid and correlates with the dementia-related biomarkers phospho-tau and total-tau. Here we present data on the relation of sCD320 to Alzheimer's disease and scores of cognitive tests. Lumbar cerebrospinal fluid samples from...... 42 pathologically-confirmed cases of Alzheimer's disease and 25 non-demented controls were analyzed for sCD320 employing an in-house ELISA. The participants' cognitive functions were tested using the Cambridge Cognition Examination (CAMCOG) and the Mini-Mental State Examination (MMSE...... be employed as a biomarker for differentiating Alzheimer dementia patients from controls. Further studies are warranted to explore the non-linear correlations between sCD320 and scores of cognitive function....
Survival analysis of colorectal cancer patients with tumor recurrence using global score test methodology

Energy Technology Data Exchange (ETDEWEB)

Zain, Zakiyah, E-mail: zac@uum.edu.my; Ahmad, Yuhaniz, E-mail: yuhaniz@uum.edu.my [School of Quantitative Sciences, Universiti Utara Malaysia, UUM Sintok 06010, Kedah (Malaysia)]; Azwan, Zairul, E-mail: zairulazwan@gmail.com; Raduan, Farhana, E-mail: farhanaraduan@gmail.com; Sagap, Ismail, E-mail: drisagap@yahoo.com [Surgery Department, Universiti Kebangsaan Malaysia Medical Centre, Jalan Yaacob Latif, 56000 Bandar Tun Razak, Kuala Lumpur (Malaysia)]; Aziz, Nazrina, E-mail: nazrina@uum.edu.my

2014-12-04

Colorectal cancer is the third and the second most common cancer worldwide in men and women respectively, and the second in Malaysia for both genders. Surgery, chemotherapy and radiotherapy are among the options available for treatment of patients with colorectal cancer. In clinical trials, the main purpose is often to compare efficacy between experimental and control treatments. Treatment comparisons often involve several responses or endpoints, and this situation complicates the analysis. In the case of colorectal cancer, sets of responses concerned with survival times include: times from tumor removal until the first, the second and the third tumor recurrences, and time to death. For a patient, the time to recurrence is correlated with the overall survival. In this study, global score test methodology is used to combine the univariate score statistics for comparing treatments with respect to each survival endpoint into a single statistic. The data on tumor recurrence and overall survival of colorectal cancer patients are taken from a Malaysian hospital. The results are found to be similar to those computed using the established Wei, Lin and Weissfeld method. Key factors such as ethnicity, gender, age and stage at diagnosis are also reported.
Derivation and Cross-Validation of Cutoff Scores for Patients With Schizophrenia Spectrum Disorders on WAIS-IV Digit Span-Based Performance Validity Measures.

Science.gov (United States)

Glassmire, David M; Toofanian Ross, Parnian; Kinney, Dominique I; Nitch, Stephen R

2016-06-01

Two studies were conducted to identify and cross-validate cutoff scores on the Wechsler Adult Intelligence Scale-Fourth Edition Digit Span-based embedded performance validity (PV) measures for individuals with schizophrenia spectrum disorders. In Study 1, normative scores were identified on Digit Span-embedded PV measures among a sample of patients (n = 84) with schizophrenia spectrum diagnoses who had no known incentive to perform poorly and who put forth valid effort on external PV tests. Previously identified cutoff scores resulted in unacceptable false positive rates and lower cutoff scores were adopted to maintain specificity levels ≥90%. In Study 2, the revised cutoff scores were cross-validated within a sample of schizophrenia spectrum patients (n = 96) committed as incompetent to stand trial. Performance on Digit Span PV measures was significantly related to Full Scale IQ in both studies, indicating the need to consider the intellectual functioning of examinees with psychotic spectrum disorders when interpreting scores on Digit Span PV measures. © The Author(s) 2015.

Scoring Strategies for the TOEFL iBT A Complete Guide

CERN Document Server

Stirling, Bruce

2012-01-01

TOEFL students all ask: How can I get a high TOEFL iBT score? Answer: Learn argument scoring strategies. Why? Because the TOEFL iBT recycles opinion-based and fact-based arguments for testing purposes from start to finish. In other words, the TOEFL iBT is all arguments. That's right, all arguments. If you want a high score, you need essential argument scoring strategies. That is what Scoring Strategies for the TOEFL iBT gives you, and more! TEST-PROVEN STRATEGIES. Learn essential TOEFL iBT scoring strategies developed in American university classrooms and proven successful on the TOEFL iBT. R
How to calculate an MMSE score from a MODA score (and vice versa) in patients with Alzheimer's disease.

Science.gov (United States)

Cazzaniga, R; Francescani, A; Saetti, C; Spinnler, H

2003-11-01

The aim of the present study was to provide a statistically sound way of reciprocally converting scores of the mini-mental state examination (MMSE) and the Milan overall dementia assessment (MODA). A consecutive series of 182 patients with "probable" Alzheimer's disease was examined with both tests. MODA and MMSE scores proved to be highly correlated. A formula for converting between MODA and MMSE scores was generated.
Sample test cases using the environmental computer code NECTAR

International Nuclear Information System (INIS)

Ponting, A.C.

1984-06-01

This note demonstrates a few of the many different ways in which the environmental computer code NECTAR may be used. Four sample test cases are presented and described to show how NECTAR input data are structured. Edited output is also presented to illustrate the format of the results. Two test cases demonstrate how NECTAR may be used to study radio-isotopes not explicitly included in the code. (U.K.)
A Method for Choosing the Best Samples for Mars Sample Return.

Science.gov (United States)

Gordon, Peter R; Sephton, Mark A

2018-05-01

Success of a future Mars Sample Return mission will depend on the correct choice of samples. Pyrolysis-FTIR can be employed as a triage instrument for Mars Sample Return. The technique can thermally dissociate minerals and organic matter for detection. Identification of certain mineral types can determine the habitability of the depositional environment, past or present, while detection of organic matter may suggest past or present habitation. In Mars' history, the Theiikian era represents an attractive target for life search missions and the acquisition of samples. The acidic and increasingly dry Theiikian may have been habitable and followed a lengthy neutral and wet period in Mars' history during which life could have originated and proliferated to achieve relatively abundant levels of biomass with a wide distribution. Moreover, the sulfate minerals produced in the Theiikian are also known to be good preservers of organic matter. We have used pyrolysis-FTIR and samples from a Mars analog ferrous acid stream with a thriving ecosystem to test the triage concept. Pyrolysis-FTIR identified those samples with the greatest probability of habitability and habitation. A three-tier scoring system was developed based on the detection of (i) organic signals, (ii) carbon dioxide and water, and (iii) sulfur dioxide. The presence of each component was given a score of A, B, or C depending on whether the substance had been detected, tentatively detected, or not detected, respectively. Single-step (for greatest possible sensitivity) or multistep (for more diagnostic data) pyrolysis-FTIR methods informed the assignments. The system allowed the highest-priority samples to be categorized as AAA (or A*AA if the organic signal was complex), while the lowest-priority samples could be categorized as CCC. Our methods provide a mechanism with which to rank samples and identify those that should take the highest priority for return to Earth during a Mars Sample Return mission.
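The three-tier codes described in this abstract (AAA, A*AA, ..., CCC) lend themselves to a simple ranking routine. The following is only a minimal sketch: the function names, the sample labels, and the lexicographic ordering (organic signal first, then carbon dioxide and water, then sulfur dioxide) are assumptions made for illustration, not the authors' published procedure.

```python
# Hypothetical illustration of the A/B/C triage codes described above.
# Lower number = higher priority; "A*" marks a complex organic signal.
GRADE_ORDER = {"A*": 0, "A": 1, "B": 2, "C": 3}


def triage_code(organic, co2_water, so2, complex_organic=False):
    """Build a three-tier code such as 'AAA', 'A*AA' or 'CCC'."""
    first = "A*" if (organic == "A" and complex_organic) else organic
    return first + co2_water + so2


def priority_key(code):
    """Turn a code into a sortable tuple (assumed lexicographic priority)."""
    if code.startswith("A*"):
        grades = ["A*", code[2], code[3]]
    else:
        grades = list(code)
    return tuple(GRADE_ORDER[g] for g in grades)


def rank_samples(samples):
    """Return sample names ordered from highest to lowest priority."""
    return sorted(samples, key=lambda name: priority_key(samples[name]))


if __name__ == "__main__":
    # Invented sample labels and grades, for demonstration only.
    demo = {"s1": "CCC", "s2": "AAA", "s3": "A*AA", "s4": "ABC"}
    print(rank_samples(demo))
```

With the invented codes in the demo, the sample with a complex organic signal is ranked first and the CCC sample last.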
Salmonella testing of pooled pre-enrichment broth cultures for screening multiple food samples.

Science.gov (United States)

Price, W R; Olsen, R A; Hunter, J E

1972-04-01

A method has been described for testing multiple food samples for Salmonella without loss in sensitivity. The method pools multiple pre-enrichment broth cultures into single enrichment broths. The subsequent stages of the Salmonella analysis are not altered. The method was found applicable to several dry food materials including nonfat dry milk, dried egg albumin, cocoa, cottonseed flour, wheat flour, and shredded coconut. As many as 25 pre-enrichment broth cultures were pooled without apparent loss in the sensitivity of Salmonella detection as compared to individual sample analysis. The procedure offers a simple, yet effective, way to increase sample capacity in the Salmonella testing of foods, particularly where a large proportion of samples ordinarily is negative. It also permits small portions of pre-enrichment broth cultures to be retained for subsequent individual analysis if positive tests are found. Salmonella testing of pooled pre-enrichment broths provides increased consumer protection for a given amount of analytical effort as compared to individual sample analysis.
The accuracy of Internet search engines to predict diagnoses from symptoms can be assessed with a validated scoring system.

Science.gov (United States)

Shenker, Bennett S

2014-02-01

To validate a scoring system that evaluates the ability of Internet search engines to correctly predict diagnoses when symptoms are used as search terms. We developed a five point scoring system to evaluate the diagnostic accuracy of Internet search engines. We identified twenty diagnoses common to a primary care setting to validate the scoring system. One investigator entered the symptoms for each diagnosis into three Internet search engines (Google, Bing, and Ask) and saved the first five webpages from each search. Other investigators reviewed the webpages and assigned a diagnostic accuracy score. They rescored a random sample of webpages two weeks later. To validate the five point scoring system, we calculated convergent validity and test-retest reliability using Kendall's W and Spearman's rho, respectively. We used the Kruskal-Wallis test to look for differences in accuracy scores for the three Internet search engines. A total of 600 webpages were reviewed. Kendall's W for the raters was 0.71 (p [...] search engines is a valid and reliable instrument. The scoring system may be used in future Internet research. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
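Kendall's W, which this abstract uses to measure agreement among raters, can be computed from rank sums. The sketch below uses the textbook formula without a tie correction; a five point scale such as the one described would usually produce tied scores, so a real analysis would need the tie-corrected form. The function name and the example data are illustrative assumptions.

```python
def kendalls_w(ratings):
    """Kendall's coefficient of concordance W (no tie correction).

    ratings: list of m raters, each a list of n scores for the same n items.
    Assumes no tied scores within a rater.
    """
    m = len(ratings)
    n = len(ratings[0])

    # Convert each rater's scores to ranks 1..n.
    rank_rows = []
    for row in ratings:
        order = sorted(range(n), key=lambda i: row[i])
        ranks = [0] * n
        for rank, idx in enumerate(order, start=1):
            ranks[idx] = rank
        rank_rows.append(ranks)

    # Sum of ranks per item and squared deviations from the mean rank sum.
    totals = [sum(r[i] for r in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)

    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Two raters in full agreement, then two raters in full disagreement.
    print(kendalls_w([[1, 2, 3], [1, 2, 3]]))
    print(kendalls_w([[1, 2, 3], [3, 2, 1]]))
```

W ranges from 0 (no agreement in the rank sums) to 1 (identical rankings by all raters).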
Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model

Science.gov (United States)

Kim, Kyung Yong; Lee, Won-Chan

2018-01-01

Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Tank 241-AZ-101 Mixer Pump Test Vapor Sampling and Analysis Plan

International Nuclear Information System (INIS)

TEMPLETON, A.M.

2000-01-01

This sampling and analysis plan (SAP) identifies characterization objectives pertaining to sample collection, laboratory analytical evaluation, and reporting requirements for vapor samples obtained during the operation of mixer pumps in tank 241-AZ-101. The primary purpose of the mixer pump test (MPT) is to demonstrate that the two 300 horsepower mixer pumps installed in tank 241-AZ-101 can mobilize the settled sludge so that it can be retrieved for treatment and vitrification. Sampling will be performed in accordance with Tank 241-AZ-101 Mixer Pump Test Data Quality Objective (Banning 1999) and Data Quality Objectives for Regulatory Requirements for Hazardous and Radioactive Air Emissions Sampling and Analysis (Mulkey 1999). The sampling will verify whether current air emission estimates used in the permit application are correct and provide information for future air permit applications.
Automated Scoring for the "TOEFL Junior"® Comprehensive Writing and Speaking Test. Research Report. ETS RR-15-09

Science.gov (United States)

Evanini, Keelan; Heilman, Michael; Wang, Xinhao; Blanchard, Daniel

2015-01-01

This report describes the initial automated scoring results that were obtained using the constructed responses from the Writing and Speaking sections of the pilot forms of the "TOEFL Junior"® Comprehensive test administered in late 2011. For all of the items except one (the edit item in the Writing section), existing automated scoring…
Acceptance sampling for attributes via hypothesis testing and the hypergeometric distribution

Science.gov (United States)

Samohyl, Robert Wayne

2017-10-01

This paper questions some aspects of attribute acceptance sampling in light of the original concepts of hypothesis testing from Neyman and Pearson (NP). Attribute acceptance sampling in industry, as developed by Dodge and Romig (DR), generally follows the international standards of ISO 2859, and similarly the Brazilian standards NBR 5425 to NBR 5427 and the United States Standards ANSI/ASQC Z1.4. The paper evaluates and extends the area of acceptance sampling in two directions. First, it suggests the use of the hypergeometric distribution to calculate the parameters of sampling plans, avoiding the unnecessary use of approximations such as the binomial or Poisson distributions. We show that, under usual conditions, discrepancies can be large. The conclusion is that the hypergeometric distribution, ubiquitously available in commonly used software, is more appropriate than other distributions for acceptance sampling. Second, and more importantly, we elaborate the theory of acceptance sampling in terms of hypothesis testing rigorously following the original concepts of NP. By offering a common theoretical structure, hypothesis testing from NP can produce a better understanding of applications even beyond the usual areas of industry and commerce such as public health and political polling. With the new procedures, both sample size and sample error can be reduced. What is unclear in traditional acceptance sampling is the necessity of linking the acceptable quality limit (AQL) exclusively to the producer and the lot tolerance percent defective (LTPD) exclusively to the consumer. In reality, the consumer should also be preoccupied with a value of AQL, as should the producer with LTPD. Furthermore, we can also question why type I error is always uniquely associated with the producer as producer risk, and likewise, the same question arises with consumer risk which is necessarily associated with type II error. The resolution of these questions is new to the literature. The
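The contrast this abstract draws between the exact hypergeometric calculation and the binomial approximation can be sketched as follows. This is an illustrative comparison only: the function names and the lot size, defective count, sample size and acceptance number in the demo are hypothetical values, not figures from the paper.

```python
from math import comb


def accept_prob_hypergeom(lot_size, defectives, sample_size, accept_number):
    """Exact P(accept): at most accept_number defectives in a sample drawn
    without replacement from a finite lot."""
    upper = min(accept_number, defectives, sample_size)
    favourable = sum(
        comb(defectives, k) * comb(lot_size - defectives, sample_size - k)
        for k in range(upper + 1)
    )
    return favourable / comb(lot_size, sample_size)


def accept_prob_binom(p_defective, sample_size, accept_number):
    """Binomial approximation: treats draws as independent with fixed p."""
    return sum(
        comb(sample_size, k)
        * p_defective ** k
        * (1 - p_defective) ** (sample_size - k)
        for k in range(accept_number + 1)
    )


if __name__ == "__main__":
    # Hypothetical plan: lot of 10 with 2 defectives, sample 2, accept on 0.
    exact = accept_prob_hypergeom(10, 2, 2, 0)
    approx = accept_prob_binom(2 / 10, 2, 0)
    print(exact, approx)
```

In this small invented lot the exact acceptance probability is 28/45, while the binomial approximation gives 0.64; the gap shrinks as the lot grows relative to the sample.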
A diagnostic scoring system for myxedema coma.

Science.gov (United States)

Popoveniuc, Geanina; Chandra, Tanu; Sud, Anchal; Sharma, Meeta; Blackman, Marc R; Burman, Kenneth D; Mete, Mihriye; Desale, Sameer; Wartofsky, Leonard

2014-08-01

To develop diagnostic criteria for myxedema coma (MC), a decompensated state of extreme hypothyroidism with a high mortality rate if untreated, in order to facilitate its early recognition and treatment. The frequencies of characteristics associated with MC were assessed retrospectively in patients from our institutions in order to derive a semiquantitative diagnostic point scale that was further applied on selected patients whose data were retrieved from the literature. Logistic regression analysis was used to test the predictive power of the score. Receiver operating characteristic (ROC) curve analysis was performed to test the discriminative power of the score. Of the 21 patients examined, 7 were reclassified as not having MC (non-MC), and they were used as controls. The scoring system included a composite of alterations of thermoregulatory, central nervous, cardiovascular, gastrointestinal, and metabolic systems, and presence or absence of a precipitating event. All 14 of our MC patients had a score of ≥60, whereas 6 of 7 non-MC patients had scores of 25 to 50. A total of 16 of 22 MC patients whose data were retrieved from the literature had a score ≥60, and 6 of 22 of these patients scored between 45 and 55. The odds ratio per each score unit increase as a continuum was 1.09 (95% confidence interval [CI], 1.01 to 1.16; P = .019); a score of 60 identified coma, with an odds ratio of 1.22. The area under the ROC curve was 0.88 (95% CI, 0.65 to 1.00), and the score of 60 had 100% sensitivity and 85.71% specificity. A score ≥60 in the proposed scoring system is potentially diagnostic for MC, whereas scores between 45 and 59 could classify patients at risk for MC.
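The reported sensitivity and specificity at the cutoff of 60 can be reproduced from the counts given in this abstract. In the sketch below, the single false positive among the 7 non-MC patients is an assumption inferred from the reported 85.71% specificity (6 of 7 scored 25 to 50); the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of diseased patients flagged by the cutoff."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of non-diseased patients not flagged by the cutoff."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # From the abstract: all 14 MC patients scored >= 60.
    sens = sensitivity(true_pos=14, false_neg=0)
    # 6 of 7 non-MC patients scored below 60; the seventh is assumed
    # to fall at or above the cutoff.
    spec = specificity(true_neg=6, false_pos=1)
    print(round(100 * sens, 2), round(100 * spec, 2))
```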
Agreement and conversion formula between mini-mental state examination and montreal cognitive assessment in an outpatient sample.

Science.gov (United States)

Helmi, Luqman; Meagher, David; O'Mahony, Edmond; O'Neill, Donagh; Mulligan, Owen; Murthy, Sutha; McCarthy, Geraldine; Adamis, Dimitrios

2016-09-22

To explore the agreement between the mini-mental state examination (MMSE) and montreal cognitive assessment (MoCA) within community dwelling older patients attending an old age psychiatry service and to derive and test a conversion formula between the two scales. Prospective study of consecutive patients attending outpatient services. Both tests were administered by the same researcher on the same day in random order. The total sample (n = 135) was randomly divided into two groups. One to derive a conversion rule (n = 70), and a second (n = 65) in which this rule was tested. The agreement (Pearson's r) of MMSE and MoCA was 0.86 (P < 0.001), and Lin's concordance correlation coefficient (CCC) was 0.57 (95%CI: 0.45-0.66). In the second sample MoCA scores were converted to MMSE scores according to a conversion rule from the first sample which achieved agreement with the original MMSE scores of 0.89 (Pearson's r, P < 0.001) and CCC of 0.88 (95%CI: 0.82-0.92). Although the two scales overlap considerably, the agreement is modest. The conversion rule derived herein demonstrated promising accuracy and warrants further testing in other populations.
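The gap this abstract reports between Pearson's r (0.86) and Lin's concordance correlation coefficient (0.57) arises because r ignores systematic shifts between two scales while the CCC penalizes them. A minimal sketch, using the standard CCC formula with population (divide by n) moments and invented scores rather than study data:

```python
from math import sqrt


def _moments(x, y):
    """Means, population variances and covariance of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation: insensitive to a constant offset between scales."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / sqrt(sxx * syy)


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient: penalizes offset and scale."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


if __name__ == "__main__":
    # Invented scores: the second scale reads a constant 2 points higher.
    scale_a = [1, 2, 3]
    scale_b = [3, 4, 5]
    print(pearson_r(scale_a, scale_b), lins_ccc(scale_a, scale_b))
```

In the invented example the two scales are perfectly correlated, yet the CCC is only 0.25 because of the constant offset, which is the kind of discrepancy a conversion rule is meant to remove.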
Normative Data for the Balance Error Scoring System in Adults

Directory of Open Access Journals (Sweden)

Grant L. Iverson

2013-01-01

Background. The balance error scoring system (BESS) is a brief, easily administered test of static balance. The purpose of this study is to develop normative data for this test. Study Design. Cross-sectional, descriptive, and cohort design. Methods. The sample was drawn from a population of clients taking part in a comprehensive preventive health screen at a multidisciplinary healthcare center. Community-dwelling adults aged 20–69 (N=1,236) were administered the BESS within the context of a fitness evaluation. They did not have significant medical, neurological, or lower extremity problems that might have an adverse effect on balance. Results. There was a significant positive correlation between BESS scores and age (r=.34). BESS performance was similar for participants between the ages of 20 and 49 and significantly declined between ages 50 and 69. Men performed slightly better than women on the BESS. Women who were overweight performed significantly more poorly on the test compared to women who were not overweight (P<.0001; Cohen's d=.62). The BESS normative data are stratified by age and sex. Conclusions. These normative data provide a frame of reference for interpreting BESS performance in adults who sustain traumatic brain injuries and adults with diverse neurological or vestibular problems.
Robustness to non-normality of various tests for the one-sample location problem

Directory of Open Access Journals (Sweden)

Michelle K. McDougall

2004-01-01

This paper studies the effect of the normal distribution assumption on the power and size of the sign test, Wilcoxon's signed rank test and the t-test when used in one-sample location problems. Power functions for these tests under various skewness and kurtosis conditions are produced for several sample sizes from simulated data using the g-and-k distribution of MacGillivray and Cannon [5].
External validation of the simple clinical score and the HOTEL score, two scores for predicting short-term mortality after admission to an acute medical unit.

Science.gov (United States)

Stræde, Mia; Brabrand, Mikkel

2014-01-01

Clinical scores can help predict early mortality after admission to a medical admission unit. A newly developed scoring system needs to be externally validated to minimise the risk that its discriminatory power and calibration are falsely elevated. We performed the present study with the objective of validating the Simple Clinical Score (SCS) and the HOTEL score, two existing risk stratification systems that predict mortality for medical patients based solely on clinical information, but not only vital signs. Pre-planned prospective observational cohort study. Danish 460-bed regional teaching hospital. We included 3046 consecutive patients from 2 October 2008 until 19 February 2009. 26 (0.9%) died within one calendar day and 196 (6.4%) died within 30 days. We calculated SCS for 1080 patients. We found an AUROC of 0.960 (95% confidence interval [CI], 0.932 to 0.988) for 24-hour mortality and 0.826 (95% CI, 0.774-0.879) for 30-day mortality, and goodness-of-fit test, χ² = 2.68 (10 degrees of freedom), P = 0.998 and χ² = 4.00, P = 0.947, respectively. We included 1470 patients when calculating the HOTEL score. Discriminatory power (AUROC) was 0.931 (95% CI, 0.901-0.962) for 24-hour mortality and goodness-of-fit test, χ² = 5.56 (10 degrees of freedom), P = 0.234. We find that both the SCS and HOTEL scores showed an excellent to outstanding ability in identifying patients at high risk of dying with good or acceptable precision.
An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

Science.gov (United States)

Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

2014-05-01

Recent guidelines recommend that sports medicine professionals use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (i.e., a scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores was determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
Micronucleus test for radiation biodosimetry in mass casualty events: Evaluation of visual and automated scoring

Energy Technology Data Exchange (ETDEWEB)

Bolognesi, Claudia [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy)]; Balia, Cristina; Roggieri, Paola [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy)]; Cardinale, Francesco [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Department of Health Sciences, University of Genoa, Genoa (Italy)]; Bruzzi, Paolo [Clinical Epidemiology Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy)]; Sorcinelli, Francesca [Environmental Carcinogenesis Unit, National Cancer Research Institute, Largo R. Benzi 10, 16132 Genoa (Italy); Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy)]; Lista, Florigio [Laboratory of Genetics, Histology and Molecular Biology Section, Army Medical and Veterinary, Research Center, Via Santo Stefano Rotondo 4, 00184 Roma (Italy)]; D'Amelio, Raffaele [Sapienza, Universita di Roma II Facolta di Medicina e Chirurgia and Ministero della Difesa, Direzione Generale Sanita Militare (Italy)]; Righi, Enzo [Frascati National Laboratories, National Institute of Nuclear Physics, Via Enrico Fermi 40, 00044 Frascati, Rome (Italy)]

2011-02-15

In the case of a large-scale nuclear or radiological incident, a reliable estimate of dose is an essential tool for providing timely assessment of radiation exposure and for making life-saving medical decisions. Cytogenetics is considered the 'gold standard' for biodosimetry. The dicentric analysis (DA) represents the most specific cytogenetic bioassay. The micronucleus test (MN) applied in interphase in peripheral lymphocytes is an alternative and simpler approach. A dose-effect calibration curve for the MN frequency in peripheral lymphocytes from 27 adult donors was established after in vitro irradiation at a dose range of 0.15-8 Gy of ¹³⁷Cs gamma rays (dose rate 6 Gy min⁻¹). Dose prediction by visual scoring in a dose-blinded study (0.15-4.0 Gy) revealed a high level of accuracy (R = 0.89). The scoring of MN is time consuming and requires adequate skills and expertise. Automated image analysis is a feasible approach that reduces the time required and increases the accuracy of the dose estimation by decreasing the variability due to subjective evaluation. A good correlation (R = 0.705) between visual and automated scoring with visual correction was observed over the dose range 0-2 Gy. Almost perfect discrimination power for exposure to 1-2 Gy, and a satisfactory power for 0.6 Gy, were detected. This threshold level can be considered sufficient for identification of sublethally exposed individuals by automated CBMN assay.
Harmonisation of microbial sampling and testing methods for distillate fuels

Energy Technology Data Exchange (ETDEWEB)

Hill, G.C.; Hill, E.C. [ECHA Microbiology Ltd., Cardiff (United Kingdom)

1995-05-01

Increased incidence of microbial infection in distillate fuels has led to a demand for organisations such as the Institute of Petroleum to propose standards for microbiological quality, based on numbers of viable microbial colony forming units. Variations in quality requirements and in the spoilage significance of contaminating microbes, plus a tendency for temporal and spatial changes in the distribution of microbes, make such standards difficult to implement. The problem is compounded by a diversity in the procedures employed for sampling and testing for microbial contamination and in the interpretation of the data obtained. The following paper reviews these problems and describes the efforts of The Institute of Petroleum Microbiology Fuels Group to address these issues and in particular to bring about harmonisation of sampling and testing methods. The benefits and drawbacks of available test methods, both laboratory based and on-site, are discussed.
Propensity-score matching in the cardiovascular surgery literature from 2004 to 2006: a systematic review and suggestions for improvement.

Science.gov (United States)

Austin, Peter C

2007-11-01

I conducted a systematic review of the use of propensity score matching in the cardiovascular surgery literature. I examined the adequacy of reporting and whether appropriate statistical methods were used. I examined 60 articles published in the Annals of Thoracic Surgery, European Journal of Cardio-thoracic Surgery, Journal of Cardiovascular Surgery, and the Journal of Thoracic and Cardiovascular Surgery between January 1, 2004, and December 31, 2006. Thirty-one of the 60 studies did not provide adequate information on how the propensity score-matched pairs were formed. Eleven (18%) of studies did not report on whether matching on the propensity score balanced baseline characteristics between treated and untreated subjects in the matched sample. No studies used appropriate methods to compare baseline characteristics between treated and untreated subjects in the propensity score-matched sample. Eight (13%) of the 60 studies explicitly used statistical methods appropriate for the analysis of matched data when estimating the effect of treatment on the outcomes. Two studies used appropriate methods for some outcomes, but not for all outcomes. Thirty-nine (65%) studies explicitly used statistical methods that were inappropriate for matched-pairs data when estimating the effect of treatment on outcomes. Eleven studies did not report the statistical tests that were used to assess the statistical significance of the treatment effect. Analysis of propensity score-matched samples tended to be poor in the cardiovascular surgery literature. Most statistical analyses ignored the matched nature of the sample. I provide suggestions for improving the reporting and analysis of studies that use propensity score matching.
Comparison of the Abbott RealTime High Risk HPV test and the Roche cobas 4800 HPV test using urine samples.

Science.gov (United States)

Lim, Myong Cheol; Lee, Do-Hoon; Hwang, Sang-Hyun; Hwang, Na Rae; Lee, Bomyee; Shin, Hye Young; Jun, Jae Kwan; Yoo, Chong Woo; Lee, Dong Ock; Seo, Sang-Soo; Park, Sang-Yoon; Joo, Jungnam

2017-05-01

Human papillomavirus (HPV) testing based on cervical samples is important for use in cervical cancer screening. However, cervical sampling is invasive. Therefore, non-invasive methods for detecting HPV, such as urine samples, are needed. For HPV detection in urine samples, two real-time PCR (RQ-PCR) tests, Roche cobas 4800 test (Roche_HPV; Roche Molecular Diagnostics) and Abbott RealTime High Risk HPV test (Abbott_HPV; Abbott Laboratories) were compared to standard cervical samples. The performance of Roche_HPV and Abbott_HPV for HPV detection was evaluated at the National Cancer Center using 100 paired cervical and urine samples. The tests were also compared using urine samples stored at various temperatures and for a range of durations. The overall agreement between the Roche_HPV and Abbott_HPV tests using urine samples for any hrHPV type was substantial (86.0% with a kappa value of 0.7173), and that for HPV 16/18 was nearly perfect (99.0% with a kappa value of 0.9668). The relative sensitivities (based on cervical samples) for HPV 16/18 detection using Roche_HPV and Abbott_HPV with urine samples were 79.2% (95% CI; 57.9-92.9%) and 81.8% (95% CI; 59.7-94.8%), respectively. When the cut-off Ct value for Abbott_HPV was extended to 40 for urine samples, the relative sensitivity of Abbott_HPV increased to 91.7% from 81.8% for HPV16/18 detection and to 87.0% from 68.5% for other hrHPV detection. The specificity was not affected by the change in the Ct threshold. Roche_HPV and Abbott_HPV showed high concordance. However, HPV DNA detection using urine samples was inferior to HPV DNA detection using cervical samples. Interestingly, when the cut-off Ct value was set to 40, Abbott_HPV using urine samples showed high sensitivity and specificity, comparable to those obtained using cervical samples. Fully automated DNA extraction and detection systems, such as Roche_HPV and Abbott_HPV, could reduce the variability in HPV detection and accelerate the standardization of HPV
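The agreement figures quoted in this abstract pair a raw percent agreement with a kappa value, which corrects for agreement expected by chance. A minimal sketch of Cohen's kappa for a square agreement table follows; the 2x2 counts in the demo are invented for illustration and are not the study's data.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of samples rated category i by test 1
    and category j by test 2.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: share of samples on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n

    # Chance agreement from the marginal totals of each test.
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2

    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical counts: rows = test 1 (positive, negative),
    # columns = test 2 (positive, negative).
    demo = [[40, 10], [5, 45]]
    print(cohens_kappa(demo))
```

In the invented table the raw agreement is 85% but the chance-corrected kappa is 0.7, which shows why the two numbers are reported together.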

Characterization of electron microscopes with binary pseudo-random multilayer test samples

International Nuclear Information System (INIS)

Yashchuk, Valeriy V.; Conley, Raymond; Anderson, Erik H.; Barber, Samuel K.; Bouet, Nathalie; McKinney, Wayne R.; Takacs, Peter Z.; Voronov, Dmitriy L.

2010-01-01

We discuss the results of SEM and TEM measurements with the BPRML test samples fabricated from a BPRML (WSi2/Si with fundamental layer thickness of 3 nm) with a Dual Beam FIB (focused ion beam)/SEM technique. In particular, we demonstrate that significant information about the metrological reliability of the TEM measurements can be extracted even when the fundamental frequency of the BPRML sample is smaller than the Nyquist frequency of the measurements. The measurements demonstrate a number of problems related to the interpretation of the SEM and TEM data. Note that similar BPRML test samples can be used to characterize x-ray microscopes. Corresponding work with x-ray microscopes is in progress.
Linkage between company scores and stock returns

Directory of Open Access Journals (Sweden)

Saban Celik

2017-12-01

Previous studies on company scores conducted at firm level generally concluded that there exists a positive relation between company scores and stock returns. Motivated by these studies, this study examines the relationship between company scores (Corporate Governance Score, Economic Score, Environmental Score, and Social Score) and stock returns, using both portfolio-level analysis and firm-level cross-sectional regressions. In portfolio-level analysis, stocks are sorted based on each company score and quintile portfolios are formed with different levels of company scores. Then, the existence and significance of raw return and risk-adjusted return differences between portfolios with the extreme company scores (portfolio 10 and portfolio 1) are tested. In addition, firm-level cross-sectional regression is performed to examine the significance of company score effects with control variables. While portfolio-level analysis results indicate that there is no significant relation between company scores and stock returns, firm-level analysis indicates that economic, environmental, and social scores have an effect on stock returns; however, the significance and direction of these effects change depending on the control variables included in the cross-sectional regression.
Internal consistency, test-retest reliability and measurement error of the self-report version of the social skills rating system in a sample of Australian adolescents.

Directory of Open Access Journals (Sweden)

Sharmila Vaz

The social skills rating system (SSRS) is used to assess social skills and competence in children and adolescents. While its characteristics based on United States samples (US) are published, corresponding Australian figures are unavailable. Using a 4-week retest design, we examined the internal consistency, retest reliability and measurement error (ME) of the SSRS secondary student form (SSF) in a sample of Year 7 students (N = 187), from five randomly selected public schools in Perth, western Australia. Internal consistency (IC) of the total scale and most subscale scores (except empathy) on the frequency rating scale was adequate to permit independent use. On the importance rating scale, most IC estimates for girls fell below the benchmark. Test-retest estimates of the total scale and subscales were insufficient to permit reliable use. ME of the total scale score (frequency rating) for boys was equivalent to the US estimate, while that for girls was lower than the US error. ME of the total scale score (importance rating) was larger than the error using the frequency rating scale. The study finding supports the idea of using multiple informants (e.g. teacher and parent reports), not just student as recommended in the manual. Future research needs to substantiate the clinical meaningfulness of the MEs calculated in this study by corroborating them against the respective Minimum Clinically Important Difference (MCID).
Internal consistency, test-retest reliability and measurement error of the self-report version of the social skills rating system in a sample of Australian adolescents.

Science.gov (United States)

Vaz, Sharmila; Parsons, Richard; Passmore, Anne Elizabeth; Andreou, Pantelis; Falkmer, Torbjörn

2013-01-01

The social skills rating system (SSRS) is used to assess social skills and competence in children and adolescents. While its characteristics based on United States samples (US) are published, corresponding Australian figures are unavailable. Using a 4-week retest design, we examined the internal consistency, retest reliability and measurement error (ME) of the SSRS secondary student form (SSF) in a sample of Year 7 students (N = 187), from five randomly selected public schools in Perth, western Australia. Internal consistency (IC) of the total scale and most subscale scores (except empathy) on the frequency rating scale was adequate to permit independent use. On the importance rating scale, most IC estimates for girls fell below the benchmark. Test-retest estimates of the total scale and subscales were insufficient to permit reliable use. ME of the total scale score (frequency rating) for boys was equivalent to the US estimate, while that for girls was lower than the US error. ME of the total scale score (importance rating) was larger than the error using the frequency rating scale. The study finding supports the idea of using multiple informants (e.g. teacher and parent reports), not just student as recommended in the manual. Future research needs to substantiate the clinical meaningfulness of the MEs calculated in this study by corroborating them against the respective Minimum Clinically Important Difference (MCID).
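A measurement error figure like the one discussed in this record is often derived with the classical test theory formula SEM = SD × sqrt(1 − r), where r is a reliability coefficient. The sketch below illustrates that formula only; whether the study computed its ME exactly this way is an assumption, and the function name and input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability).

    Hypothetical helper for illustration; not taken from the study.
    """
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


# Made-up inputs: a scale with SD 10 and reliability 0.75.
print(standard_error_of_measurement(sd=10.0, reliability=0.75))  # 5.0
```

Lower reliability gives a larger SEM for the same score spread.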
ISSUE PAPER: What Do Test Scores in Texas Tell Us?

National Research Council Canada - National Science Library

Klein, Stephen

2000-01-01

...) about possible unintended consequences of these programs. We conducted several analyses to examine the issue of whether TAAS scores can be trusted to provide an accurate index of student skills and abilities...
Test-retest reliability and predictive validity of the Implicit Association Test in children.

Science.gov (United States)

Rae, James R; Olson, Kristina R

2018-02-01

The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many factors simultaneously (lag-time between testing administrations, domain, etc.), it is difficult to discern what factors may explain variability in existing test-retest reliability and predictive validity estimates. Across five studies (total N = 519; ages 6- to 11-years-old), we manipulated two factors that have varied in previous developmental research: lag-time and domain. An internal meta-analysis of these studies revealed that, across three different methods of analyzing the data, mean test-retest (rs of .48, .38, and .34) and predictive validity (rs of .46, .20, and .10) effect sizes were significantly greater than zero. While lag-time did not moderate the magnitude of test-retest coefficients, whether we observed domain differences in test-retest reliability and predictive validity estimates was contingent on other factors, such as how we scored the IAT or whether we included estimates from a unique sample (i.e., a sample containing gender typical and gender diverse children). Recommendations are made for developmental researchers that utilize the IAT in their research. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Filtration and Leach Testing for REDOX Sludge and S-Saltcake Actual Waste Sample Composites

Energy Technology Data Exchange (ETDEWEB)

Shimskey, Rick W.; Billing, Justin M.; Buck, Edgar C.; Daniel, Richard C.; Draper, Kathryn E.; Edwards, Matthew K.; Geeting, John GH; Hallen, Richard T.; Jenson, Evan D.; Kozelisky, Anne E.; MacFarlan, Paul J.; Peterson, Reid A.; Snow, Lanee A.; Swoboda, Robert G.

2009-02-20

A testing program evaluating actual tank waste was developed in response to Task 4 from the M-12 External Flowsheet Review Team (EFRT) issue response plan. The test program was subdivided into logical increments. The bulk water-insoluble solid wastes that are anticipated to be delivered to the Waste Treatment and Immobilization Plant (WTP) were identified according to type such that the actual waste testing could be targeted to the relevant categories. Under test plan TP-RPP-WTP-467, eight broad waste groupings were defined. Samples available from the 222S archive were identified and obtained for testing. Under this test plan, a waste-testing program was implemented that included:
• Homogenizing the archive samples by group as defined in the test plan
• Characterizing the homogenized sample groups
• Performing parametric leaching testing on each group for compounds of interest
• Performing bench-top filtration/leaching tests in the hot cell for each group to simulate filtration and leaching activities if they occurred in the UFP2 vessel of the WTP Pretreatment Facility.
This report focuses on filtration/leaching tests performed on two of the eight waste composite samples and follow-on parametric tests to support aluminum leaching results from those tests.
Science Scores in Title I Elementary Schools in North Georgia: A Project Study

Science.gov (United States)

Frias, Ramon

The No Child Left Behind Act (NCLB)'s emphasis on reading, language arts, and mathematics (RLA&M) and its de-emphasis of science have been a source of great concern among educators. Through an objectivist and constructionist framework, this study explored the unforeseen effects of the NCLB on public science education among Title I (TI) and non-Title I (NTI) students. The research questions focused on the effects of NCLB on Criterion Referenced Competency Test (CRCT) scores in the high-stakes subjects of reading, language arts, and mathematics and the low-stakes subject of science among TI and NTI 3rd, 4th, and 5th grade students in a north Georgia county during the 2010/2011 school year. This study also compared the instructional time TI and NTI teachers dedicated to science. A causal-comparative quantitative methodology was used to analyze Georgia's public domain CRCT scores. Three independent-samples t tests showed that TI schools exhibited significantly lower Science CRCT scores than did NTI students at all grade levels (p < … need students with strong science foundations. Further study is recommended to analyze the factors associated with this science gap between TI and NTI students.
High-Throughput Scoring of Seed Germination.

Science.gov (United States)

Ligterink, Wilco; Hilhorst, Henk W M

2017-01-01

High-throughput analysis of seed germination for phenotyping large genetic populations or mutant collections is very labor intensive and would highly benefit from an automated setup. Although very often used, the total germination percentage after a nominated period of time is not very informative as it lacks information about start, rate, and uniformity of germination, which are highly indicative of such traits as dormancy, stress tolerance, and seed longevity. The calculation of cumulative germination curves requires information about germination percentage at various time points. We developed the GERMINATOR package: a simple, highly cost-efficient, and flexible procedure for high-throughput automatic scoring and evaluation of germination that can be implemented without the use of complex robotics. The GERMINATOR package contains three modules: (I) design of experimental setup with various options to replicate and randomize samples; (II) automatic scoring of germination based on the color contrast between the protruding radicle and seed coat on a single image; and (III) curve fitting of cumulative germination data and the extraction, recap, and visualization of the various germination parameters. GERMINATOR is a freely available package that allows the monitoring and analysis of several thousands of germination tests, several times a day by a single person.
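The point about cumulative germination curves can be made concrete with a small sketch. It builds a cumulative germination percentage from per-interval counts and estimates the time to 50% germination by linear interpolation. Linear interpolation is swapped in here for simplicity and is not the curve-fitting module of the GERMINATOR package; the function names and data are hypothetical, and germination is assumed to be 0% at time 0.

```python
from typing import List, Optional, Sequence


def cumulative_percent(new_counts: Sequence[int], total_seeds: int) -> List[float]:
    """Cumulative germination (%) from newly germinated counts per scoring time."""
    running = 0
    result = []
    for count in new_counts:
        running += count
        result.append(100.0 * running / total_seeds)
    return result


def time_to_percent(times: Sequence[float], cum_percent: Sequence[float],
                    target: float = 50.0) -> Optional[float]:
    """First time the cumulative curve reaches `target`, by linear interpolation.

    Assumes 0% germination at time 0. Returns None if the target is never reached.
    """
    prev_t, prev_p = 0.0, 0.0
    for t, p in zip(times, cum_percent):
        if p >= target:
            if p == prev_p:
                return t
            return prev_t + (target - prev_p) * (t - prev_t) / (p - prev_p)
        prev_t, prev_p = t, p
    return None


# Made-up experiment: 50 seeds scored at 24, 48, 72 and 96 hours.
hours = [24, 48, 72, 96]
curve = cumulative_percent([5, 15, 20, 5], total_seeds=50)
print(curve)                          # [10.0, 40.0, 80.0, 90.0]
print(time_to_percent(hours, curve))  # 54.0
```

Two seed lots with the same final percentage can differ in this time-to-50% value, which is the kind of information a single end-point percentage does not carry.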
Hematoma Shape, Hematoma Size, Glasgow Coma Scale Score and ICH Score: Which Predicts the 30-Day Mortality Better for Intracerebral Hematoma?

Science.gov (United States)

Wang, Chih-Wei; Liu, Yi-Jui; Lee, Yi-Hsiung; Hueng, Dueng-Yuan; Fan, Hueng-Chuen; Yang, Fu-Chi; Hsueh, Chun-Jen; Kao, Hung-Wen; Juan, Chun-Jung; Hsu, Hsian-He

2014-01-01

Purpose: To investigate the performance of hematoma shape, hematoma size, Glasgow coma scale (GCS) score, and intracerebral hematoma (ICH) score in predicting the 30-day mortality for ICH patients, and to examine the influence of the estimation error of hematoma size on the prediction of 30-day mortality. Materials and Methods: This retrospective study, approved by a local institutional review board with written informed consent waived, recruited 106 patients diagnosed with ICH by non-enhanced computed tomography. The hemorrhagic shape, hematoma size measured by computer-assisted volumetric analysis (CAVA) and estimated by the ABC/2 formula, ICH score, and GCS score were examined. The performance of these variables in predicting 30-day mortality was evaluated. Statistical analysis was performed using Kolmogorov-Smirnov tests, paired t test, nonparametric test, linear regression analysis, and binary logistic regression. The receiver operating characteristic curves were plotted and areas under the curve (AUC) were calculated for 30-day mortality. A P value less than 0.05 was considered statistically significant. Results: The overall 30-day mortality rate of ICH patients was 15.1%. The hematoma shape, hematoma size, ICH score, and GCS score all significantly predicted the 30-day mortality for ICH patients, with an AUC of 0.692 (P = 0.0018), 0.715 (P = 0.0008) (by ABC/2) to 0.738 (P = 0.0002) (by CAVA), 0.877 (P < … hematoma shape, hematoma size, ICH scores and GCS score all significantly predict the 30-day mortality in an increasing order of AUC. The effect of overestimation of hematoma size by ABC/2 formula in predicting the 30-day mortality could be remedied by using ICH score. PMID:25029592
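For readers unfamiliar with the ABC/2 estimate mentioned in this record, the sketch below shows the arithmetic. It assumes the usual reading of the formula, in which A, B and C are three mutually perpendicular hematoma diameters in centimetres, and compares the result with the exact volume of an ellipsoid with the same diameters. The function names and input values are hypothetical and are not taken from the study.

```python
import math


def abc2_volume(a_cm: float, b_cm: float, c_cm: float) -> float:
    """ABC/2 estimate of volume (cm^3) from three perpendicular diameters."""
    return a_cm * b_cm * c_cm / 2.0


def ellipsoid_volume(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Exact ellipsoid volume (cm^3): (4/3) * pi * (A/2) * (B/2) * (C/2)."""
    return math.pi * a_cm * b_cm * c_cm / 6.0


# Made-up diameters: 4 cm x 3 cm x 2 cm.
print(abc2_volume(4.0, 3.0, 2.0))       # 12.0
print(ellipsoid_volume(4.0, 3.0, 2.0))  # about 12.57
```

ABC/2 amounts to replacing π/6 with 1/2, so for an ideal ellipsoid the two values differ only by a factor of 3/π. Larger discrepancies against volumetric measurement would therefore come from irregular shapes or from how the diameters are measured.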
Direct concurrent comparison of multiple pediatric acute asthma scoring instruments.

Science.gov (United States)

Johnson, Michael D; Nkoy, Flory L; Sheng, Xiaoming; Greene, Tom; Stone, Bryan L; Garvin, Jennifer

2017-09-01

Appropriate delivery of Emergency Department (ED) treatment to children with acute asthma requires clinician assessment of acute asthma severity. Various clinical scoring instruments exist to standardize assessment of acute asthma severity in the ED, but their selection remains arbitrary due to few published direct comparisons of their properties. Our objective was to test the feasibility of directly comparing properties of multiple scoring instruments in a pediatric ED. Using a novel approach supported by a composite data collection form, clinicians categorized elements of five scoring instruments before and after initial treatment for 48 patients 2-18 years of age with acute asthma seen at the ED of a tertiary care pediatric hospital from August to December 2014. Scoring instruments were compared for inter-rater reliability between clinician types and their ability to predict hospitalization. Inter-rater reliability between clinician types was not different between instruments at any point and was lower (weighted kappa range 0.21-0.55) than values reported elsewhere. Predictive ability of most instruments for hospitalization was higher after treatment than before treatment (p < 0.05) and may vary between instruments after treatment (p = 0.054). We demonstrate the feasibility of comparing multiple clinical scoring instruments simultaneously in ED clinical practice. Scoring instruments had higher predictive ability for hospitalization after treatment than before treatment and may differ in their predictive ability after initial treatment. Definitive conclusions about the best instrument or meaningful comparison between instruments will require a study with a larger sample size.
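The weighted kappa statistic reported in this record can be illustrated with a short sketch. It assumes linear disagreement weights and ordered severity categories; the record does not state which weighting scheme the authors used, and the function name and ratings below are hypothetical.

```python
from typing import Hashable, Sequence


def linear_weighted_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable],
                          categories: Sequence[Hashable]) -> float:
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    `categories` lists the ordered rating levels (at least two).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (rater A level, rater B level).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed vs. chance-expected disagreement.
    obs_dis = exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]
    if exp_dis == 0.0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - obs_dis / exp_dis


# Made-up severity ratings by two clinicians.
levels = ["mild", "moderate", "severe"]
nurse = ["mild", "moderate", "severe", "mild", "moderate", "severe"]
physician = ["mild", "moderate", "severe", "moderate", "moderate", "moderate"]
print(linear_weighted_kappa(nurse, physician, levels))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance; with linear weights, a mild-versus-severe disagreement counts twice as much as a disagreement between adjacent levels.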
Post-Decontamination Vapor Sampling and Analytical Test Methods

Science.gov (United States)

2015-08-12

is decontaminated that could pose an exposure hazard to unprotected personnel. The chemical contaminants may include chemical warfare agents (CWAs... decontamination process. Chemical contaminants can include chemical warfare agents (CWAs) or their simulants, nontraditional agents (NTAs), toxic industrial...a range of test articles from coupons, panels, and small fielded equipment items. 15. SUBJECT TERMS Vapor hazard; vapor sampling; chemical warfare
A test of Hartnett's revisions to the pubic symphysis and fourth rib methods on a modern sample.

Science.gov (United States)

Merritt, Catherine E

2014-05-01

Estimating age at death is one of the most important aspects of creating a biological profile. Most adult age estimation methods were developed on North American skeletal collections from the early to mid-20th century, and their applicability to modern populations has been questioned. In 2010, Hartnett used a modern skeletal collection from the Maricopa County Forensic Science Centre to revise the Suchey-Brooks pubic symphysis method and the İşcan et al. fourth rib methods. The current study tests Hartnett's revised methods as well as the original Suchey-Brooks and İşcan et al. methods on a modern sample from the William Bass Skeletal Collection (N = 313, mean age = 58.5, range 19-92). Results show that the Suchey-Brooks and İşcan et al. methods assign individuals to the correct phase 70.8% and 57.5% of the time compared with Hartnett's revised methods at 58.1% and 29.7%, respectively, with correctness scores based on one standard deviation of the mean rather than the entire age range. Accuracy and bias scores are significantly improved for Hartnett's revised pubic symphysis method and marginally better for Hartnett's revised fourth rib method, suggesting that the revised mean ages at death of Hartnett's phases better reflect this modern population. Overall, both Hartnett's revised methods are reliable age estimation methods. For the pubic symphysis, there are significant improvements in accuracy and bias scores, especially for older individuals; however, for the fourth rib, the results are comparable to the original İşcan et al. methods, with some improvement for older individuals. © 2014 American Academy of Forensic Sciences.
Design of sample analysis device for iodine adsorption efficiency test in NPPs

International Nuclear Information System (INIS)

Ji Jinnan

2015-01-01

In nuclear power plants, the iodine adsorption efficiency test is used to check the efficiency of the iodine adsorber. The adsorption efficiency is calculated by analyzing the test sample, and the result determines whether the adsorber's performance meets the requirements for equipment operation and emissions. Considering the test process and practical needs, this paper presents the design of a dedicated device for analyzing this kind of test sample. Application shows that the device is convenient to operate, highly reliable, and accurate in its calculations, and that it improves experimental efficiency and reduces experimental risk. (author)
More Issues in Observed-Score Equating

Science.gov (United States)

van der Linden, Wim J.

2013-01-01

This article is a response to the commentaries on the position paper on observed-score equating by van der Linden (this issue). The response focuses on the more general issues in these commentaries, such as the nature of the observed scores that are equated, the importance of test-theory assumptions in equating, the necessity to use multiple…
An Integrated Model of Academic Self-Concept Development: Academic Self-Concept, Grades, Test Scores, and Tracking over 6 Years

Science.gov (United States)

Marsh, Herbert W.; Pekrun, Reinhard; Murayama, Kou; Arens, A. Katrin; Parker, Philip D.; Guo, Jiesi; Dicke, Theresa

2018-01-01

Our newly proposed integrated academic self-concept model integrates 3 major theories of academic self-concept formation and developmental perspectives into a unified conceptual and methodological framework. Relations among math self-concept (MSC), school grades, test scores, and school-level contextual effects over 6 years, from the end of…
Family Functioning and Child Psychopathology: Individual Versus Composite Family Scores.

Science.gov (United States)

Mathijssen, Jolanda J. J. P.; Koot, Hans M.; Verhulst, Frank C.; De Bruyn, Eric E. J.; Oud, Johan H. L.

1997-01-01

Examines the relationship of individual family members' perceptions and family mean and discrepancy scores of cohesion and adaptability with child psychopathology in a sample of 138 families. Results indicate that family mean scores, contrary to family discrepancy scores, explain more of the variance in parent-reported child psychopathology than…
Respondent-Driven Sampling – Testing Assumptions: Sampling with Replacement

Directory of Open Access Journals (Sweden)

Barash Vladimir D.

2016-03-01

Classical Respondent-Driven Sampling (RDS) estimators are based on a Markov Process model in which sampling occurs with replacement. Given that respondents generally cannot be interviewed more than once, this assumption is counterfactual. We join recent work by Gile and Handcock in exploring the implications of the sampling-with-replacement assumption for bias of RDS estimators. We differ from previous studies in examining a wider range of sampling fractions and in using not only simulations but also formal proofs. One key finding is that RDS estimates are surprisingly stable even in the presence of substantial sampling fractions. Our analyses show that the sampling-with-replacement assumption is a minor contributor to bias for sampling fractions under 40%, and bias is negligible for the 20% or smaller sampling fractions typical of field applications of RDS.
tscvh R Package: Computational of the two samples test on microarray-sequencing data

Science.gov (United States)

Fajriyah, Rohmatul; Rosadi, Dedi

2017-12-01

We present a new R package, tscvh (two samples cross-variance homogeneity). The package implements the cross-variance statistical test proposed and introduced by Fajriyah ([3] and [4]), which is based on the cross-variance concept. The test can be used as an alternative test for a significant difference between two means when the sample size is small, a situation that often arises in bioinformatics research. Based on its statistical distribution, the p-value can also be provided. The package assumes homogeneity of variance between samples.
Preoptometry and optometry school grade point average and optometry admissions test scores as predictors of performance on the national board of examiners in optometry part I (basic science) examination.

Science.gov (United States)

Bailey, J E; Yackle, K A; Yuen, M T; Voorhees, L I

2000-04-01

To evaluate preoptometry and optometry school grade point averages and Optometry Admission Test (OAT) scores as predictors of performance on the National Board of Examiners in Optometry (NBEO) Part I (Basic Science) (NBEOPI) examination. Simple and multiple correlation coefficients were computed from data obtained from a sample of three consecutive classes of optometry students (1995-1997; n = 278) at Southern California College of Optometry. The GPA after year two of optometry school had the highest correlation (r = 0.75) among all predictor variables; the average of all scores on the OAT had the highest correlation among preoptometry predictor variables (r = 0.46). Stepwise regression analysis indicated that a combination of the optometry GPA, the OAT Academic Average, and the GPA in certain optometry curricular tracks resulted in an improved correlation (multiple r = 0.81). Predicted NBEOPI scores were computed from the regression equation and then analyzed by receiver operating characteristic (ROC) and statistic of agreement (kappa) methods. From this analysis, we identified the predicted score that maximized identification of true and false NBEOPI failures (71% and 10%, respectively). Cross validation of this result on a separate class of optometry students resulted in a slightly lower correlation between actual and predicted NBEOPI scores (r = 0.77) but showed the criterion-predicted score to be somewhat lax. The optometry school GPA after 2 years is a reasonably good predictor of performance on the full NBEOPI examination, but the prediction is enhanced by adding the Academic Average OAT score. However, predicting performance in certain subject areas of the NBEOPI examination, for example Psychology and Ocular/Visual Biology, was rather insubstantial. Nevertheless, predicting NBEOPI performance from the best combination of year two optometry GPAs and preoptometry variables is better than has been shown in previous studies predicting optometry GPA from the best
