total score reliability: Topics by WorldWideScience.org

Sample records for total score reliability

Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.

Science.gov (United States)

Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A

2016-03-01

Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
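For readers who want to see what a two-way intraclass correlation involves, the following is a minimal sketch rather than the authors' SPSS procedure. It computes the standard absolute-agreement and consistency ICCs from the ANOVA mean squares of a subjects-by-raters table. The function name `icc_two_way` and the dictionary keys are illustrative assumptions, and the abstract does not say whether its "average" coefficients are average-measures ICCs.

```python
import numpy as np

def icc_two_way(ratings):
    """Two-way ICCs for an (n_subjects x k_raters) table of scores.

    Hypothetical helper for illustration; not the SPSS routine used in the study.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), raters (columns) and residual error
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-subjects mean square
    msc = ss_cols / (k - 1)              # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return {
        # Absolute agreement: systematic rater differences count as error
        "absolute_single": (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n),
        "absolute_average": (msr - mse) / (msr + (msc - mse) / n),
        # Consistency: systematic rater differences are ignored
        "consistency_single": (msr - mse) / (msr + (k - 1) * mse),
        "consistency_average": (msr - mse) / msr,
    }
```

When every rater differs from the others only by a constant offset, the consistency coefficients stay at 1 while the absolute-agreement coefficients fall below 1, which is why the two kinds of coefficient are reported separately.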
Reliability and validation of the Dutch Achilles tendon Total Rupture Score.

Science.gov (United States)

Opdam, K T M; Zwiers, R; Wiegerinck, J I; Kleipool, A E B; Haverlag, R; Goslings, J C; van Dijk, C N

2018-03-01

Patient-reported outcome measures (PROMs) have become a cornerstone for the evaluation of the effectiveness of treatment. The Achilles tendon Total Rupture Score (ATRS) is a PROM for outcome and assessment of an Achilles tendon rupture. The aim of this study was to translate the ATRS to Dutch and evaluate its reliability and validity in the Dutch population. A forward-backward translation procedure was performed according to the guidelines for the cross-cultural adaptation process. The Dutch ATRS was evaluated for reliability and validity in patients treated for a total Achilles tendon rupture from 1 January 2012 to 31 December 2014 in one teaching hospital and one academic hospital. Reliability was assessed by the intraclass correlation coefficients (ICC), Cronbach's alpha and minimal detectable change (MDC). We assessed construct validity by calculation of Spearman's rho correlation coefficient with domains of the Foot and Ankle Outcome Score (FAOS), Victorian Institute of Sports Assessment-Achilles questionnaire (VISA-A) and Numeric Rating Scale (NRS) for pain at rest and during running. The Dutch ATRS had good test-retest reliability (ICC = 0.852) and high internal consistency (Cronbach's alpha = 0.96). MDC was 30.2 at individual level and 3.5 at group level. Construct validity was supported by 75% of the hypothesized correlations. The Dutch ATRS had a strong correlation with NRS for pain during running (r = -0.746) and all five subscales of the Dutch FAOS (r = 0.724-0.867). There was a moderate correlation with the VISA-A-NL (r = 0.691) and NRS for pain at rest (r = -0.580). The Dutch ATRS shows adequate reliability and validity and can be used in the Dutch population for measuring the outcome of treatment of a total Achilles tendon rupture and for research purposes. Diagnostic study, Level I.
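Internal consistency figures such as the Cronbach's alpha reported here are computed from the item variances and the variance of the summed score. The sketch below shows the usual formula under the assumption of a complete respondents-by-items matrix with no missing answers; the function name `cronbach_alpha` is illustrative and the study's own software is not stated in the abstract.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) score matrix.

    Illustrative helper; assumes no missing values.
    """
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    # Sum of the sample variances of the individual items
    item_var_sum = x.var(axis=0, ddof=1).sum()
    # Sample variance of the total (summed) score
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)
```

Alpha approaches 1 when items rise and fall together across respondents, so the total-score variance is much larger than the sum of the item variances.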
Validity and reliability of the Achilles tendon total rupture score.

Science.gov (United States)

Ganestam, Ann; Barfod, Kristoffer; Klit, Jakob; Troelsen, Anders

2013-01-01

The best treatment of acute Achilles tendon rupture remains debated. Patient-reported outcome measures have become cornerstones in treatment evaluations. The Achilles tendon total rupture score (ATRS) has been developed for this purpose but requires additional validation. The purpose of the present study was to validate a Danish translation of the ATRS. The ATRS was translated into Danish according to internationally adopted standards. Of 142 patients, 90 with previous rupture of the Achilles tendon participated in the validity study and 52 in the reliability study. The ATRS showed moderately strong correlations with the physical subscores of the Medical Outcomes Study 36-item Short-Form Health Survey (r = .70 to .75; p ... questionnaire (r = .71; p ... validity. For study and follow-up purposes, the ATRS seems reliable for comparisons of groups of patients. Its usability is limited for repeated assessment of individual patients. The development of analysis guidelines would be desirable. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Lower bounds to the reliabilities of factor score estimators

NARCIS (Netherlands)

Hessen, D.J.

2017-01-01

Under the general common factor model, the reliabilities of factor score estimators might be of more interest than the reliability of the total score (the unweighted sum of item scores). In this paper, lower bounds to the reliabilities of Thurstone’s factor score estimators, Bartlett’s factor score
Validity and Reliability of the Achilles Tendon Total Rupture Score

DEFF Research Database (Denmark)

Ganestam, Ann; Barfod, Kristoffer; Klit, Jakob

2013-01-01

The best treatment of acute Achilles tendon rupture remains debated. Patient-reported outcome measures have become cornerstones in treatment evaluations. The Achilles tendon total rupture score (ATRS) has been developed for this purpose but requires additional validation. The purpose of the present study was to validate a Danish translation of the ATRS. The ATRS was translated into Danish according to internationally adopted standards. Of 142 patients, 90 with previous rupture of the Achilles tendon participated in the validity study and 52 in the reliability study. The ATRS showed moderately...... = .07). The limits of agreement were ±18.53. A strong correlation was found between test and retest (intercorrelation coefficient .908); the standard error of measurement was 6.7, and the minimal detectable change was 18.5. The Danish version of the ATRS showed moderately strong criterion validity...
Lower Bounds to the Reliabilities of Factor Score Estimators.

Science.gov (United States)

Hessen, David J

2016-10-06

Under the general common factor model, the reliabilities of factor score estimators might be of more interest than the reliability of the total score (the unweighted sum of item scores). In this paper, lower bounds to the reliabilities of Thurstone's factor score estimators, Bartlett's factor score estimators, and McDonald's factor score estimators are derived and conditions are given under which these lower bounds are equal. The relative performance of the derived lower bounds is studied using classic example data sets. The results show that estimates of the lower bounds to the reliabilities of Thurstone's factor score estimators are greater than or equal to the estimates of the lower bounds to the reliabilities of Bartlett's and McDonald's factor score estimators.
Intra- and inter-rater reliability of the Knee Society Knee Score when used by two physiotherapists in patients post total knee arthroplasty

Directory of Open Access Journals (Sweden)

S. Gopal

2010-01-01

Background and Purpose: It has yet to be shown whether routine physiotherapy plays a role in the rehabilitation of patients post total knee arthroplasty (Rajan et al 2004). Physiotherapists should be using valid outcome measures to provide evidence of the benefit of their intervention. The aim of this study was to establish the intra- and inter-rater reliability of the Knee Society Knee Score (KSKS), a scoring system developed by Insall et al (1989). The Knee Society Knee Score can be used to assess the integrity of the knee joint of patients undergoing total knee arthroplasty. Since the score involves clinical testing, the intra-rater reliability of the clinician should be established prior to using the scores as data in clinical research. Where multiple clinicians are involved, inter-rater reliability should also be established. Design: This was a correlation study. Subjects: A sample of thirty patients post total knee arthroplasty attending the arthroplasty clinic at Johannesburg Hospital between six weeks and twelve months postoperatively. Method: Recruited patients were evaluated twice with a time interval of one hour between each assessment. Statistical Analysis: The intra- and inter-rater reliability were estimated using the Intraclass Correlation Coefficient (ICC). Results: The intra-rater reliability showed excellent reliability (h = 0.95) for Examiner A and good reliability (h = 0.71) for Examiner B. The inter-rater reliability showed moderate reliability (h = 0.67 during test one and h = 0.66 during test two). Conclusion: The KSKS has good intra-rater reliability when tested within a period of one hour. The KSKS demonstrated moderate agreement for inter-rater reliability.
Cross-cultural adaptation and validation of Persian Achilles tendon Total Rupture Score.

Science.gov (United States)

Ansari, Noureddin Nakhostin; Naghdi, Soofia; Hasanvand, Sahar; Fakhari, Zahra; Kordi, Ramin; Nilsson-Helander, Katarina

2016-04-01

To cross-culturally adapt the Achilles tendon Total Rupture Score (ATRS) to the Persian language and to preliminarily evaluate the reliability and validity of a Persian ATRS. A cross-sectional and prospective cohort study was conducted to translate and cross-culturally adapt the ATRS to the Persian language (ATRS-Persian) following steps described in guidelines. Thirty patients with total Achilles tendon rupture and 30 healthy subjects participated in this study. Psychometric properties of floor/ceiling effects (responsiveness), internal consistency reliability, test-retest reliability, standard error of measurement (SEM), smallest detectable change (SDC), construct validity, and discriminant validity were tested. Factor analysis was performed to determine the ATRS-Persian structure. There were no floor or ceiling effects that indicate the content and responsiveness of ATRS-Persian. Internal consistency was high (Cronbach's α 0.95). Item-total correlations exceeded the acceptable standard of 0.3 for all items (0.58-0.95). The test-retest reliability was excellent [(ICC)agreement 0.98]. SEM and SDC were 3.57 and 9.9, respectively. Construct validity was supported by a significant correlation between the ATRS-Persian total score and the Persian Foot and Ankle Outcome Score (PFAOS) total score and PFAOS subscales (r = 0.55-0.83). The ATRS-Persian significantly discriminated between patients and healthy subjects. Explanatory factor analysis revealed 1 component. The ATRS was cross-culturally adapted to Persian and demonstrated to be a reliable and valid instrument to measure functional outcomes in Persian patients with Achilles tendon rupture. II.
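The abstract does not state how the SEM and SDC were derived. A common convention, assumed here, is SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM at the individual level, and the reported pair (3.57 and 9.9) is consistent with that convention. The function names below are illustrative, and the standard deviation passed to `sem_from_icc` is a hypothetical input because the abstract gives none.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

def sdc_from_sem(sem, z=1.96):
    """Smallest detectable change for one individual at the confidence level implied by z."""
    # sqrt(2) reflects measurement error in both the first and the second measurement
    return z * math.sqrt(2.0) * sem

# Reported SEM of 3.57 reproduces the reported SDC
print(round(sdc_from_sem(3.57), 1))  # 9.9
```

A change in one patient's score smaller than the SDC cannot be distinguished from measurement error under these assumptions.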
Clinical use of the ABO-Scoring Index: reliability and subtraction frequency.

Science.gov (United States)

Lieber, William S; Carlson, Sean K; Baumrind, Sheldon; Poulton, Donald R

2003-10-01

This study tested the reliability and subtraction frequency of the study model-scoring system of the American Board of Orthodontists (ABO). We used a sample of 36 posttreatment study models that were selected randomly from six different orthodontic offices. Intrajudge and interjudge reliability was calculated using nonparametric statistics (Spearman rank coefficient, Wilcoxon, Kruskal-Wallis, and Mann-Whitney tests). We found differences ranging from 3 to 6 subtraction points (total score) for intrajudge scoring between two sessions. For overall total ABO score, the average correlation was .77. Intrajudge correlation was greatest for occlusal relationships and least for interproximal contacts. Interjudge correlation for ABO score averaged r = .85. Correlation was greatest for buccolingual inclination and least for overjet. The data show that some judges, on average, were much more lenient than others and that this resulted in a range of total scores between 19.7 and 27.5. Most of the deductions were found in the buccal segments and most were related to the second molars. We present these findings in the context of clinicians preparing for the ABO phase III examination and for orthodontists in their ongoing evaluation of clinical results.
Examining the reliability of ADAS-Cog change scores.

Science.gov (United States)

Grochowalski, Joseph H; Liu, Ying; Siedlecki, Karen L

2016-09-01

The purpose of this study was to estimate and examine ways to improve the reliability of change scores on the Alzheimer's Disease Assessment Scale, Cognitive Subtest (ADAS-Cog). The sample, provided by the Alzheimer's Disease Neuroimaging Initiative, included individuals with Alzheimer's disease (AD) (n = 153) and individuals with mild cognitive impairment (MCI) (n = 352). All participants were administered the ADAS-Cog at baseline and 1 year, and change scores were calculated as the difference in scores over the 1-year period. Three types of change score reliabilities were estimated using multivariate generalizability. Two methods to increase change score reliability were evaluated: reweighting the subtests of the scale and adding more subtests. Reliability of ADAS-Cog change scores over 1 year was low for both the AD sample (ranging from .53 to .64) and the MCI sample (.39 to .61). Reweighting the change scores from the AD sample improved reliability (.68 to .76), but lengthening provided no useful improvement for either sample. The MCI change scores had low reliability, even with reweighting and adding additional subtests. The ADAS-Cog scores had low reliability for measuring change. Researchers using the ADAS-Cog should estimate and report reliability for their use of the change scores. The ADAS-Cog change scores are not recommended for assessment of meaningful clinical change.
Test-retest reliability at the item level and total score level of the Norwegian version of the Spinal Cord Injury Falls Concern Scale (SCI-FCS).

Science.gov (United States)

Roaldsen, Kirsti Skavberg; Måøy, Åsa Blad; Jørgensen, Vivien; Stanghelle, Johan Kvalvik

2016-05-01

Translation of the Spinal Cord Injury Falls Concern Scale (SCI-FCS), and investigation of test-retest reliability on item-level and total-score-level. Translation, adaptation and test-retest study. A specialized rehabilitation setting in Norway. Fifty-four wheelchair users with a spinal cord injury. The median age of the cohort was 49 years, and the median number of years after injury was 13. Interventions/measurements: The SCI-FCS was translated and back-translated according to guidelines. Individuals answered the SCI-FCS twice over the course of one week. We investigated item-level test-retest reliability using Svensson's rank-based statistical method for disagreement analysis of paired ordinal data. At the total-score level, we assessed relative test-retest reliability with intraclass correlation coefficients (ICC2.1), absolute reliability (measurement error) with the standard error of measurement (SEM) and the smallest detectable change (SDC), and internal consistency with Cronbach's alpha. All items showed satisfactory percentage agreement (≥69%) between test and retest. There were small but non-negligible systematic disagreements for three items; we found an 11-13% higher chance for a lower second score. There was no disagreement due to random variance. The test-retest agreement (ICC2.1) was excellent (0.83). The SEM was 2.6 (12%), and the SDC was 7.1 (32%). The Cronbach's alpha was high (0.88). The Norwegian SCI-FCS is highly reliable for wheelchair users with chronic spinal cord injuries.
Reliable scar scoring system to assess photographs of burn patients.

Science.gov (United States)

Mecott, Gabriel A; Finnerty, Celeste C; Herndon, David N; Al-Mousawi, Ahmed M; Branski, Ludwik K; Hegde, Sachin; Kraft, Robert; Williams, Felicia N; Maldonado, Susana A; Rivero, Haidy G; Rodriguez-Escobar, Noe; Jeschke, Marc G

2015-12-01

Several scar-scoring scales exist to clinically monitor burn scar development and maturation. Although scoring scars through direct clinical examination is ideal, scars must sometimes be scored from photographs. No scar scale currently exists for the latter purpose. We modified a previously described scar scale (Yeong et al., J Burn Care Rehabil 1997) and tested the reliability of this new scale in assessing burn scars from photographs. The new scale consisted of three parameters as follows: scar height, surface appearance, and color mismatch. Each parameter was assigned a score of 1 (best) to 4 (worst), generating a total score of 3-12. Five physicians with burns training scored 120 representative photographs using the original and modified scales. Reliability was analyzed using coefficient of agreement, Cronbach alpha, intraclass correlation coefficient, variance, and coefficient of variance. Analysis of variance was performed using the Kruskal-Wallis test. Color mismatch and scar height scores were validated by analyzing actual height and color differences. The intraclass correlation coefficient, the coefficient of agreement, and Cronbach alpha were higher for the modified scale than those of the original scale. The original scale produced more variance than that in the modified scale. Subanalysis demonstrated that, for all categories, the modified scale had greater correlation and reliability than the original scale. The correlation between color mismatch scores and actual color differences was 0.84 and between scar height scores and actual height was 0.81. The modified scar scale is a simple, reliable, and useful scale for evaluating photographs of burn patients. Copyright © 2015 Elsevier Inc. All rights reserved.
Good validity and reliability of the forgotten joint score in evaluating the outcome of total knee arthroplasty

DEFF Research Database (Denmark)

Thomsen, Morten G; Latifi, Roshan; Kallemose, Thomas

2016-01-01

. We investigated the validity and reliability of the FJS. Patients and methods - A Danish version of the FJS questionnaire was created according to internationally accepted standards. 360 participants who underwent primary TKA were invited to participate in the study. Of these, 315 were included...... in a validity study and 150 in a reliability study. Correlation between the Oxford knee score (OKS) and the FJS was examined and test-retest evaluation was performed. A ceiling effect was defined as participants reaching a score within 15% of the maximum achievable score. Results - The validity study revealed...... of the FJS (ICC? 0.79). We found a high level of internal consistency (Cronbach's α = 0.96). The ceiling effect for the FJS was 16%, as compared to 37% for the OKS. Interpretation - The FJS showed good construct validity and test-retest reliability. It had a lower ceiling effect than the OKS. The FJS appears...
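The ceiling-effect definition in this record (a score within 15% of the maximum achievable score) translates directly into a proportion. The sketch below assumes a scale whose minimum is 0, so that "within 15% of the maximum" means at or above 85% of the maximum; the function name `ceiling_effect` and the example scores are hypothetical.

```python
def ceiling_effect(scores, max_score, margin=0.15):
    """Share of participants scoring within `margin` of the maximum achievable score.

    Assumes the scale starts at 0; hypothetical helper for illustration.
    """
    threshold = max_score * (1.0 - margin)
    at_ceiling = sum(1 for s in scores if s >= threshold)
    return at_ceiling / len(scores)

# Made-up scores on a 0-100 scale: three of five participants are at the ceiling
print(ceiling_effect([100, 90, 86, 80, 50], max_score=100))  # 0.6
```

A lower proportion means the instrument can still register improvement among the best-functioning patients.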
Reliability Generalization: Exploring Variation of Reliability Coefficients of MMPI Clinical Scales Scores.

Science.gov (United States)

Vacha-Haase, Tammi; Kogan, Lori R.; Tani, Crystal R.; Woodall, Renee A.

2001-01-01

Used reliability generalization to explore the variance of scores on 10 Minnesota Multiphasic Personality Inventory (MMPI) clinical scales drawing on 1,972 articles in the literature on the MMPI. Results highlight the premise that scores, not tests, are reliable or unreliable, and they show that study characteristics do influence scores on the…
The scoring of arousal in sleep: reliability, validity, and alternatives.

Science.gov (United States)

Bonnet, Michael H; Doghramji, Karl; Roehrs, Timothy; Stepanski, Edward J; Sheldon, Stephen H; Walters, Arthur S; Wise, Merrill; Chesson, Andrew L

2007-03-15

The reliability and validity of EEG arousals and other types of arousal are reviewed. Brief arousals during sleep had been observed for many years, but the evolution of sleep medicine in the 1980s directed new attention to these events. Early studies at that time in animals and humans linked brief EEG arousals and associated fragmentation of sleep to daytime sleepiness and degraded performance. Increasing interest in scoring of EEG arousals led the ASDA to publish a scoring manual in 1992. The current review summarizes numerous studies that have examined scoring reliability for these EEG arousals. Validity of EEG arousals was explored by review of studies that empirically varied arousals and found deficits similar to those found after total sleep deprivation depending upon the rate and extent of sleep fragmentation. Additional data from patients with clinical sleep disorders prior to and after effective treatment has also shown a continuing relationship between reduction in pathology-related arousals and improved sleep and daytime function. Finally, many suggestions have been made to refine arousal scoring to include additional elements (e.g., CAP), change the time frame, or focus on other physiological responses such as heart rate or blood pressure changes. Evidence to support the reliability and validity of these measures is presented. It was concluded that the scoring of EEG arousals has added much to our understanding of the sleep process but that significant work on the neurophysiology of arousal needs to be done. Additional refinement of arousal scoring will provide improved insight into sleep pathology and recovery.
How reliable are Functional Movement Screening scores? A systematic review of rater reliability.

Science.gov (United States)

Moran, Robert W; Schneiders, Anthony G; Major, Katherine M; Sullivan, S John

2016-05-01

Several physical assessment protocols to identify intrinsic risk factors for injury aetiology related to movement quality have been described. The Functional Movement Screen (FMS) is a standardised, field-expedient test battery intended to assess movement quality and has been used clinically in preparticipation screening and in sports injury research. To critically appraise and summarise research investigating the reliability of scores obtained using the FMS battery. Systematic literature review. Systematic search of Google Scholar, Scopus (including ScienceDirect and PubMed), EBSCO (including Academic Search Complete, AMED, CINAHL, Health Source: Nursing/Academic Edition), MEDLINE and SPORTDiscus. Studies meeting eligibility criteria were assessed by 2 reviewers for risk of bias using the Quality Appraisal of Reliability Studies checklist. Overall quality of evidence was determined using van Tulder's levels of evidence approach. 12 studies were appraised. Overall, there was a 'moderate' level of evidence in favour of 'acceptable' (intraclass correlation coefficient ≥0.6) inter-rater and intra-rater reliability for composite scores derived from live scoring. For inter-rater reliability of composite scores derived from video recordings there was 'conflicting' evidence, and 'limited' evidence for intra-rater reliability. For inter-rater reliability based on live scoring of individual subtests there was 'moderate' evidence of 'acceptable' reliability (κ≥0.4) for 4 subtests (Deep Squat, Shoulder Mobility, Active Straight-leg Raise, Trunk Stability Push-up) and 'conflicting' evidence for the remaining 3 (Hurdle Step, In-line Lunge, Rotary Stability). This review found 'moderate' evidence that raters can achieve acceptable levels of inter-rater and intra-rater reliability of composite FMS scores when using live ratings. Overall, there were few high-quality studies, and the quality of several studies was impacted by poor study reporting particularly in relation to
Inter-expert and intra-expert reliability in sleep spindle scoring

DEFF Research Database (Denmark)

Wendt, Sabrina Lyngbye; Welinder, Peter; Sørensen, Helge Bjarup Dissing

2015-01-01

Objectives To measure the inter-expert and intra-expert agreement in sleep spindle scoring, and to quantify how many experts are needed to build a reliable dataset of sleep spindle scorings. Methods The EEG dataset was comprised of 400 randomly selected 115 s segments of stage 2 sleep from 110...... with higher reliability than the estimation of spindle duration. Reliability of sleep spindle scoring can be improved by using qualitative confidence scores, rather than a dichotomous yes/no scoring system. Conclusions We estimate that 2–3 experts are needed to build a spindle scoring dataset...... with ‘substantial’ reliability (κ: 0.61–0.8), and 4 or more experts are needed to build a dataset with ‘almost perfect’ reliability (κ: 0.81–1). Significance Spindle scoring is a critical part of sleep staging, and spindles are believed to play an important role in development, aging, and diseases of the nervous...
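The κ bands quoted in this record ('substantial' 0.61–0.8, 'almost perfect' 0.81–1) can be applied to any chance-corrected agreement statistic. As an illustration only, the sketch below computes Cohen's kappa for two scorers who label the same segments; the abstract does not say which kappa variant the authors used, and the function names and the label for values below 0.61 are assumptions.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    n = len(a)
    # Observed proportion of segments on which the two scorers agree
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each scorer's label frequencies
    ca, cb = Counter(a), Counter(b)
    p_exp = sum(ca[c] * cb[c] for c in set(a) | set(b)) / (n * n)
    return (p_obs - p_exp) / (1.0 - p_exp)

def agreement_band(kappa):
    """Map kappa to the two bands named in the abstract."""
    if kappa >= 0.81:
        return "almost perfect"
    if kappa >= 0.61:
        return "substantial"
    return "below substantial"
```

For example, two scorers who agree on 9 of 10 segments, with spindles marked in 3 and 2 segments respectively, obtain a kappa of about 0.74, which falls in the 'substantial' band.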
Development and Reliability of a Preliminary Foot Osteoarthritis Magnetic Resonance Imaging Score.

Science.gov (United States)

Halstead, Jill; Martín-Hervás, Carmen; Hensor, Elizabeth M A; McGonagle, Dennis; Keenan, Anne-Maree; Redmond, Anthony C; Conaghan, Philip G

2017-08-01

Foot osteoarthritis (OA) is a very common but underinvestigated musculoskeletal condition, and there is little consensus as to common magnetic resonance imaging (MRI) features. The aim of this study was to develop a preliminary foot OA MRI score (FOAMRIS) and evaluate its reliability. This preliminary semiquantitative score included the hindfoot, midfoot, and metatarsophalangeal joints. Joints were scored for joint space narrowing (JSN; 0-3), osteophytes (0-3), joint effusion/synovitis, and bone cysts (present/absent). Erosions and bone marrow lesions (BML) were scored (0-3) and BML were evaluated adjacent to entheses and at sub-tendon sites (present/absent). Additionally, tenosynovitis (0-3) and midfoot ligament pathology (present/absent) were scored. Reliability was evaluated in 15 people with foot pain and MRI-detected OA using 3.0T MRI multi-sequence protocols, and assessed using ICC as an overall score and per anatomical site. Intrareader agreement (ICC) was generally good to excellent across the foot in joint features (JSN 0.90, osteophytes 0.90, effusion/synovitis 0.46, cysts 0.87), bone features (BML 0.83, erosion 0.66, BML entheses 0.66, BML sub-tendon 0.60) and soft tissue features (tenosynovitis 0.83, ligaments 0.77). Interreader agreement was lower for joint features (JSN 0.43, osteophytes 0.27, effusion/synovitis 0.02, cysts 0.48), bone features (BML 0.68, erosion 0.00, BML entheses 0.34, BML sub-tendon 0.13), and soft tissue features (tenosynovitis 0.35, ligaments 0.33). This preliminary FOAMRIS demonstrated good intrareader reliability and fair interreader reliability when assessing the total feature scores. Further development is required in cohorts with a range of pathologies and to assess the psychometric measurement properties.
Reliability, Validity, and Responsiveness of InFLUenza Patient-Reported Outcome (FLU-PRO©) Scores in Influenza-Positive Patients.

Science.gov (United States)

Powers, John H; Bacci, Elizabeth D; Guerrero, M Lourdes; Leidy, Nancy Kline; Stringer, Sonja; Kim, Katherine; Memoli, Matthew J; Han, Alison; Fairchok, Mary P; Chen, Wei-Ju; Arnold, John C; Danaher, Patrick J; Lalani, Tahaniyat; Ridoré, Michelande; Burgess, Timothy H; Millar, Eugene V; Hernández, Andrés; Rodríguez-Zulueta, Patricia; Smolskis, Mary C; Ortega-Gallegos, Hilda; Pett, Sarah; Fischer, William; Gillor, Daniel; Macias, Laura Moreno; DuVal, Anna; Rothman, Richard; Dugas, Andrea; Ruiz-Palacios, Guillermo M

2018-02-01

To assess the reliability, validity, and responsiveness of InFLUenza Patient-Reported Outcome (FLU-PRO©) scores for quantifying the presence and severity of influenza symptoms. An observational prospective cohort study of adults (≥18 years) with influenza-like illness in the United States, the United Kingdom, Mexico, and South America was conducted. Participants completed the 37-item draft FLU-PRO daily for up to 14 days. Item-level and factor analyses were used to remove items and determine factor structure. Reliability of the final tool was estimated using Cronbach α and intraclass correlation coefficients (2-day reliability). Convergent and known-groups validity and responsiveness were assessed using global assessments of influenza severity and return to usual health. Of the 536 patients enrolled, 221 influenza-positive subjects comprised the analytical sample. The mean age of the patients was 40.7 years, 60.2% were women, and 59.7% were white. The final 32-item measure has six factors/domains (nose, throat, eyes, chest/respiratory, gastrointestinal, and body/systemic), with a higher order factor representing symptom severity overall (comparative fit index = 0.92; root mean square error of approximation = 0.06). Cronbach α was high (total = 0.92; domain range = 0.71-0.87); test-retest reliability (intraclass correlation coefficient, day 1-day 2) was 0.83 for total scores and 0.57 to 0.79 for domains. Day 1 FLU-PRO domain and total scores were moderately to highly correlated (≥0.30) with Patient Global Rating of Flu Severity (except nose and throat). Consistent with known-groups validity, scores differentiated severity groups on the basis of global rating (total: F = 57.2, P ... FLU-PRO score improvement by day 7 than did those who did not, suggesting score responsiveness. Results suggest that FLU-PRO scores are reliable, valid, and responsive to change in influenza-positive adults. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes
Revised scoring and improved reliability for the Communication Patterns Questionnaire.

Science.gov (United States)

Crenshaw, Alexander O; Christensen, Andrew; Baucom, Donald H; Epstein, Norman B; Baucom, Brian R W

2017-07-01

The Communication Patterns Questionnaire (CPQ; Christensen, 1987) is a widely used self-report measure of couple communication behavior and is well validated for assessing the demand/withdraw interaction pattern, which is a robust predictor of poor relationship and individual outcomes (Schrodt, Witt, & Shimkowski, 2014). However, no studies have examined the CPQ's factor structure using analytic techniques sufficient by modern standards, nor have any studies replicated the factor structure using additional samples. Further, the current scoring system uses fewer than half of the total items for its 4 subscales, despite the existence of unused items that have content conceptually consistent with those subscales. These characteristics of the CPQ have likely contributed to findings that subscale scores are often troubled by suboptimal psychometric properties such as low internal reliability (e.g., Christensen, Eldridge, Catta-Preta, Lim, & Santagata, 2006). The present study uses exploratory and confirmatory factor analyses on 4 samples to reexamine the factor structure of the CPQ to improve scale score reliability and to determine if including more items in the subscales is warranted. Results indicate that a 3-factor solution (constructive communication and 2 demand/withdraw scales) provides the best fit for the data. That factor structure was confirmed in the replication samples. Compared with the original scales, the revised scales include additional items that expand the conceptual range of the constructs, substantially improve reliability of scale scores, and demonstrate stronger associations with relationship satisfaction and sensitivity to change in therapy. Implications for research and treatment are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

Cross-cultural adaptation, reliability and validity of the Turkish version of the Hospital for Special Surgery (HSS) Knee Score.

Science.gov (United States)

Narin, Selnur; Unver, Bayram; Bakırhan, Serkan; Bozan, Ozgür; Karatosun, Vasfi

2014-01-01

The purpose of this study was to adapt the English version of the Hospital for Special Surgery (HSS) knee score for use in a Turkish population and to evaluate its validity, reliability and cultural adaptation. Standard forward-back translation of the HSS knee score was performed and the Turkish version was applied in 73 patients. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), Mini-Mental State Examination and sit-to-stand test were also performed and analyzed. Internal consistency reliability was tested using Cronbach's alpha. The intraclass correlation coefficient (ICC) was used to calculate the test-retest reliability at one-week intervals. Validity was assessed by calculating the Pearson correlation between the HSS, WOMAC and sit-to-stand test scores. The ICC ranged from 0.98 to 0.99 with high internal consistency (Cronbach's alpha: 0.87). The WOMAC score correlated with total HSS score (r: -0.80, p<0.001) and sit-to-stand score (r: 0.12, p: 0.312). The Turkish version of the HSS knee score is reliable and valid in evaluating the total knee arthroplasty in Turkish patients.
Spinal appearance questionnaire: factor analysis, scoring, reliability, and validity testing.

Science.gov (United States)

Carreon, Leah Y; Sanders, James O; Polly, David W; Sucato, Daniel J; Parent, Stefan; Roy-Beaudry, Marjolaine; Hopkins, Jeffrey; McClung, Anna; Bratcher, Kelly R; Diamond, Beverly E

2011-08-15

Cross sectional. This study presents the factor analysis of the Spinal Appearance Questionnaire (SAQ) and its psychometric properties. Although the SAQ has been administered to a large sample of patients with adolescent idiopathic scoliosis (AIS) treated surgically, its psychometric properties have not been fully evaluated. This study presents the factor analysis and scoring of the SAQ and evaluates its psychometric properties. The SAQ and the Scoliosis Research Society-22 (SRS-22) were administered to AIS patients who were being observed, braced or scheduled for surgery. Standard demographic data and radiographic measures including Lenke type and curve magnitude were also collected. Of the 1802 patients, 83% were female; with a mean age of 14.8 years and mean initial Cobb angle of 55.8° (range, 0°-123°). From the 32 items of the SAQ, 15 loaded on two factors with consistent and significant correlations across all Lenke types. There is an Appearance (items 1-10) and an Expectations factor (items 12-15). Responses are summed giving a range of 5 to 50 for the Appearance domain and 5 to 20 for the Expectations domain. The Cronbach's α was 0.88 for both domains and Total score with a test-retest reliability of 0.81 for Appearance and 0.91 for Expectations. Correlations with major curve magnitude were higher for the SAQ Appearance and SAQ Total scores compared to correlations between the SRS Appearance and SRS Total scores. The SAQ and SRS-22 Scores were statistically significantly different in patients who were scheduled for surgery compared to those who were observed or braced. The SAQ is a valid measure of self-image in patients with AIS with greater correlation to curve magnitude than SRS Appearance and Total score. It also discriminates between patients who require surgery from those who do not.
Reliability and Validity of Composite Scores from the NIH Toolbox Cognition Battery in Adults

Science.gov (United States)

Heaton, Robert K.; Akshoomoff, Natacha; Tulsky, David; Mungas, Dan; Weintraub, Sandra; Dikmen, Sureyya; Beaumont, Jennifer; Casaletto, Kaitlin B.; Conway, Kevin; Slotkin, Jerry; Gershon, Richard

2014-01-01

This study describes psychometric properties of the NIH Toolbox Cognition Battery (NIHTB-CB) Composite Scores in an adult sample. The NIHTB-CB was designed for use in epidemiologic studies and clinical trials for ages 3 to 85. A total of 268 self-described healthy adults were recruited at four university-based sites, using stratified sampling guidelines to target demographic variability for age (20–85 years), gender, education, and ethnicity. The NIHTB-CB contains seven computer-based instruments assessing five cognitive sub-domains: Language, Executive Function, Episodic Memory, Processing Speed, and Working Memory. Participants completed the NIHTB-CB, corresponding gold standard validation measures selected to tap the same cognitive abilities, and sociodemographic questionnaires. Three Composite Scores were derived for both the NIHTB-CB and gold standard batteries: “Crystallized Cognition Composite,” “Fluid Cognition Composite,” and “Total Cognition Composite” scores. NIHTB Composite Scores showed acceptable internal consistency (Cronbach’s alphas = 0.84 Crystallized, 0.83 Fluid, 0.77 Total), excellent test–retest reliability (r: 0.86–0.92), strong convergent (r: 0.78–0.90) and discriminant (r: 0.19–0.39) validities versus gold standard composites, and expected age effects (r = 0.18 crystallized, r = − 0.68 fluid, r = − 0.26 total). Significant relationships with self-reported prior school difficulties and current health status, employment, and presence of a disability provided evidence of external validity. The NIH Toolbox Cognition Battery Composite Scores have excellent reliability and validity, suggesting they can be used effectively in epidemiologic and clinical studies. PMID:24960398
Validity and reliability of Nintendo Wii Fit balance scores.

Science.gov (United States)

Wikstrom, Erik A

2012-01-01

Context: Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. Objective: To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Design: Descriptive laboratory study. Setting: Sports medicine research laboratory. Patients or Other Participants: Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Intervention(s): Participants completed a single-limb-stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Main Outcome Measure(s): Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. Results: All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r […]). Intrasession reliability of Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). Conclusions: Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT
How reliable are Psychopathy Checklist-Revised scores in Canadian criminal trials? A case law review.

Science.gov (United States)

Edens, John F; Cox, Jennifer; Smith, Shannon Toney; DeMatteo, David; Sörman, Karolina

2015-06-01

The Psychopathy Checklist-Revised (PCL-R; Hare, 2003) is a professional rating scale that enjoys widespread use in forensic and correctional settings, primarily as a tool to inform risk assessments in a variety of types of cases (e.g., parole determinations, sexually violent predator [SVP] civil commitment). Although widely described as "reliable and valid" in research reports, several recent field studies have suggested that PCL-R scores provided by examiners in forensic cases are significantly less reliable than the interrater reliability values reported in research studies. Most of these field studies, however, have had small samples and only examined SVP civil commitment cases. This study builds on existing research by examining the reliability of PCL-R scores provided by forensic examiners in a much more extensive sample of Canadian criminal cases. Using the LexisNexis database, we identified 102 cases in which at least 2 scores were reported (of 257 total PCL-R scores). The single-rater intraclass correlation coefficient (ICC(A1)) was .59, indicating that a large percentage of the variance in individual scores was attributable to some form of error. ICC values were somewhat higher for sexual offending cases (.66) than they were for nonsexual offending cases (.46), indicating that poor interrater reliability was not restricted specifically to the assessment of sexual offenders. These and earlier findings concerning field reliability in legal cases suggest that the standard error of measurement for PCL-R scores that are provided to the courts is likely to be much larger than the value of 2.90 reported in the instrument's manual. (c) 2015 APA, all rights reserved.
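The closing point of this abstract, that lower reliability implies a larger standard error of measurement (SEM), follows from the classical test theory relation SEM = SD × sqrt(1 − reliability). Below is a minimal Python sketch of that relation. The standard deviation of 8.0 and the reliability of .90 are purely hypothetical values chosen for illustration; the abstract reports neither the score SD nor the reliability that underlies the manual's SEM.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability).

    Illustrative helper (hypothetical name), not taken from the study.
    """
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical SD of 8.0 points; compare an assumed research-grade
# reliability (.90) with the field ICC values quoted in the abstract.
ASSUMED_SD = 8.0
for r in (0.90, 0.66, 0.59, 0.46):
    print(f"reliability={r:.2f}  SEM={standard_error_of_measurement(ASSUMED_SD, r):.2f}")
```

Because the SEM scales with sqrt(1 − reliability), the same score SD yields a wider error band around any single reported score as reliability drops.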
The Danish Prostatic Symptom Score (DAN-PSS-1) questionnaire is reliable in stroke patients

DEFF Research Database (Denmark)

Tibaek, Sigrid; Jensen, Rigmor; Klarskov, Peter

2006-01-01

. The questionnaire consists of 12 questions related to lower urinary tract symptoms (LUTS). The participants were asked to state the frequency and severity of their symptoms (symptom score) and its impact on their daily life (bother score). Seventy-one stroke patients were included and 59 (83%) answered...... the questionnaire twice. The reliability test was done in two aspects: (a) detecting the frequency of each symptom and its bother factor, the scores were reduced to a two-category scale (=0, >0) and simple kappa statistics was used; (b) detecting the severity of each symptom and its bother factor, the total scale...... (kappa(w) = 0.48) to good (kappa(w) = 0.68). CONCLUSIONS: The DAN-PSS-1 questionnaire had acceptable test-retest reliability and may be suitable for measuring the frequency and severity of LUTS and its bother factor in stroke patients....
Reliability of a consensus-based ultrasound score for tenosynovitis in rheumatoid arthritis

DEFF Research Database (Denmark)

Naredo, Esperanza; D'Agostino, Maria Antonietta; Wakefield, Richard J

2013-01-01

OBJECTIVE: To produce consensus-based scoring systems for ultrasound (US) tenosynovitis and to assess the intraobserver and interobserver reliability of these scoring systems in rheumatoid arthritis (RA). METHODS: We undertook a Delphi process on US-defined tenosynovitis and US scoring system...... recruited. Ten rheumatologists expert in MSUS blindly, independently and consecutively scored for tenosynovitis in B-mode and PD mode three wrist extensor compartments, two finger flexor tendons and two ankle tendons of each patient in two rounds in a blinded fashion. Intraobserver reliability was assessed...... Doppler signal within the synovial sheath. The intraobserver reliability for tenosynovitis scoring on B-mode and PD mode was good (κ value 0.72 for B-mode; κ value 0.78 for PD mode). Interobserver reliability assessment showed good κ values for PD tenosynovitis scoring (first round, 0.64; second round, 0...
Chest computed tomography-based scoring of thoracic sarcoidosis: Inter-rater reliability of CT abnormalities

Energy Technology Data Exchange (ETDEWEB)

Heuvel, D.A.V. den; Es, H.W. van; Heesewijk, J.P. van; Spee, M. [St. Antonius Hospital Nieuwegein, Department of Radiology, Nieuwegein (Netherlands); Jong, P.A. de [University Medical Center Utrecht, Department of Radiology, Utrecht (Netherlands); Zanen, P.; Grutters, J.C. [University Medical Center Utrecht, Division Heart and Lungs, Utrecht (Netherlands); St. Antonius Hospital Nieuwegein, Center of Interstitial Lung Diseases, Department of Pulmonology, Nieuwegein (Netherlands)

2015-09-15

To determine inter-rater reliability of sarcoidosis-related computed tomography (CT) findings that can be used for scoring of thoracic sarcoidosis. CT images of 51 patients with sarcoidosis were scored by five chest radiologists for various abnormal CT findings (22 in total) encountered in thoracic sarcoidosis. Using intra-class correlation coefficient (ICC) analysis, inter-rater reliability was analysed and reported according to the Guidelines for Reporting Reliability and Agreement Studies (GRRAS) criteria. A pre-specified sub-analysis was performed to investigate the effect of training. Scoring was trained in a distinct set of 15 scans in which all abnormal CT findings were represented. Median age of the 51 patients (36 men, 70 %) was 43 years (range 26 - 64 years). All radiographic stages were present in this group. ICC ranged from 0.91 for honeycombing to 0.11 for nodular margin (sharp versus ill-defined). The ICC was above 0.60 in 13 of the 22 abnormal findings. Sub-analysis for the best-trained observers demonstrated an ICC improvement for all abnormal findings and values above 0.60 for 16 of the 22 abnormalities. In our cohort, reliability between raters was acceptable for 16 thoracic sarcoidosis-related abnormal CT findings. (orig.)
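When every rater scores every subject, as the five radiologists did here, inter-rater reliability is often summarized with a two-way random-effects, absolute-agreement, single-rater ICC, written ICC(2,1) in Shrout and Fleiss notation. The abstract does not state which ICC form was used, so the choice of ICC(2,1) in this Python sketch is an assumption. The function name is hypothetical, and the small ratings table is the textbook example from Shrout and Fleiss (1979), not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of n subjects, each a list of k rater scores.
    Illustrative helper (hypothetical name).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least two raters and equal-length rows")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                # between-subjects mean square
    jms = ss_cols / (k - 1)                # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Textbook example: 6 subjects rated by 4 judges.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

The absolute-agreement form penalizes systematic differences between raters (a rater who scores consistently higher lowers the coefficient), whereas a consistency form would ignore them.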
The Pooling-score (P-score): inter- and intra-rater reliability in endoscopic assessment of the severity of dysphagia.

Science.gov (United States)

Farneti, D; Fattori, B; Nacci, A; Mancini, V; Simonelli, M; Ruoppolo, G; Genovese, E

2014-04-01

This study evaluated the intra- and inter-rater reliability of the Pooling score (P-score) in clinical endoscopic evaluation of severity of swallowing disorder, considering excess residue in the pharynx and larynx. The score (minimum 4 - maximum 11) is obtained by the sum of the scores given to the site of the bolus, the amount and ability to control residue/bolus pooling, the latter assessed on the basis of cough, raclage, number of dry voluntary or reflex swallowing acts ( 5). Four judges evaluated 30 short films of pharyngeal transit of 10 solid (1/4 of a cracker), 11 creamy (1 tablespoon of jam) and 9 liquid (1 tablespoon of 5 cc of water coloured with methylene blue, 1 ml in 100 ml) boluses in 23 subjects (10 M/13 F, age from 31 to 76 yrs, mean age 58.56±11.76 years) with different pathologies. The films were randomly distributed on two CDs, which differed in terms of the sequence of the films, and were given to judges (after an explanatory session) at time 0, 24 hours later (time 1) and after 7 days (time 2). The inter- and intra-rater reliability of the P-score was calculated using the intra-class correlation coefficient (ICC; 3,k). The possibility that consistency of boluses could affect the scoring of the films was considered. The ICC for site, amount, management and the P-score total was found to be, respectively, 0.999, 0.997, 1.00 and 0.999. Clinical evaluation of a criterion of severity of a swallowing disorder remains a crucial point in the management of patients with pathologies that predispose to complications. The P-score, derived from static and dynamic parameters, yielded a very high correlation among the scores attributed by the four judges during observations carried out at different times. Bolus consistencies did not affect the outcome of the test: the analysis of variance, performed to verify if the scores attributed by the four judges to the parameters selected, might be influenced by the different consistencies of the boluses, was not
Processes and Procedures for Estimating Score Reliability and Precision

Science.gov (United States)

Bardhoshi, Gerta; Erford, Bradley T.

2017-01-01

Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Does Changing Examiner Stations During UK Postgraduate Surgery Objective Structured Clinical Examinations Influence Examination Reliability and Candidates' Scores?

Science.gov (United States)

Brennan, Peter A; Croke, David T; Reed, Malcolm; Smith, Lee; Munro, Euan; Foulkes, John; Arnett, Richard

2016-01-01

Objective structured clinical examinations (OSCE) are widely used for summative assessment in surgery. Despite standardizing these as much as possible, variation, including examiner scoring, can occur which may affect reliability. In a study of a high-stakes UK postgraduate surgical OSCE, we investigated whether examiners changing stations once during a long examining day affected marking, reliability, and overall candidates' scores compared with examiners who examined the same scenario all day. An observational study of 18,262 examiner-candidate interactions from the UK Membership of the Royal College of Surgeons examination was carried out at 3 Surgical Colleges across the United Kingdom. Scores between examiners were compared using analysis of variance. Examination reliability was assessed with Cronbach's alpha, and the comparative distribution of total candidates' scores for each day was evaluated using t-tests of unit-weighted z scores. A significant difference was found in absolute score differences awarded in the morning and afternoon sessions between examiners who changed stations at lunchtime and those who did not (p […]). […] design and examiner experience in surgical OSCEs and beyond. Copyright © 2016 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
A Latent Class Approach to Estimating Test-Score Reliability

Science.gov (United States)

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas

2011-01-01

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…
A generic method for assignment of reliability scores applied to solvent accessibility predictions

Directory of Open Access Journals (Sweden)

Nielsen Morten

2009-07-01

Background: Estimation of the reliability of specific real value predictions is nontrivial and the efficacy of this is often questionable. It is important to know if you can trust a given prediction and therefore the best methods associate a prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability for each prediction, in the form of a Z-score. Results: An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures where reliabilities are obtained by post-processing the output. Conclusion: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best public available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones in the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0
Reliability of scored patient generated subjective global assessment ...

African Journals Online (AJOL)

Objective: Establish the reliability of the scored Patient Generated-Subjective Global Assessment (PG-SGA) in determining nutritional status among Antiretroviral Therapy (ART) naive HIV-infected adults. Methods: A descriptive, cross sectional study among outpatient medical clinics, in The AIDS Support Organization ...
Interrater reliability of Violence Risk Appraisal Guide scores provided in Canadian criminal proceedings.

Science.gov (United States)

Edens, John F; Penson, Brittany N; Ruchensky, Jared R; Cox, Jennifer; Smith, Shannon Toney

2016-12-01

Published research suggests that most violence risk assessment tools have relatively high levels of interrater reliability, but recent evidence of inconsistent scores among forensic examiners in adversarial settings raises concerns about the "field reliability" of such measures. This study specifically examined the reliability of Violence Risk Appraisal Guide (VRAG) scores in Canadian criminal cases identified in the legal database, LexisNexis. Over 250 reported cases were located that made mention of the VRAG, with 42 of these cases containing 2 or more scores that could be submitted to interrater reliability analyses. Overall, scores were skewed toward higher risk categories. The intraclass correlation (ICCA1) was .66, with pairs of forensic examiners placing defendants into the same VRAG risk "bin" in 68% of the cases. For categorical risk statements (i.e., low, moderate, high), examiners provided converging assessment results in most instances (86%). In terms of potential predictors of rater disagreement, there was no evidence for adversarial allegiance in our sample. Rater disagreement in the scoring of 1 VRAG item (Psychopathy Checklist-Revised; Hare, 2003), however, strongly predicted rater disagreement in the scoring of the VRAG (r = .58). (PsycINFO Database Record (c) 2016 APA, all rights reserved).
a locally adapted functional outcome measurement score for total

African Journals Online (AJOL)

Results and success of total hip arthroplasty are often measured using a functional outcome scoring system. Most current scores were developed in Europe and North America (1-3). During the evaluation of a Total Hip Replacement (THR) project in Ouagadougou, Burkina Faso (4) it was felt that these scores were not ...
Validity and reliability of a novel immunosuppressive adverse effects scoring system in renal transplant recipients.

Science.gov (United States)

Meaney, Calvin J; Arabi, Ziad; Venuto, Rocco C; Consiglio, Joseph D; Wilding, Gregory E; Tornatore, Kathleen M

2014-06-12

After renal transplantation, many patients experience adverse effects from maintenance immunosuppressive drugs. When these adverse effects occur, patient adherence with immunosuppression may be reduced and impact allograft survival. If these adverse effects could be prospectively monitored in an objective manner and possibly prevented, adherence to immunosuppressive regimens could be optimized and allograft survival improved. Prospective, standardized clinical approaches to assess immunosuppressive adverse effects by health care providers are limited. Therefore, we developed and evaluated the application, reliability and validity of a novel adverse effects scoring system in renal transplant recipients receiving calcineurin inhibitor (cyclosporine or tacrolimus) and mycophenolic acid based immunosuppressive therapy. The scoring system included 18 non-renal adverse effects organized into gastrointestinal, central nervous system and aesthetic domains developed by a multidisciplinary physician group. Nephrologists employed this standardized adverse effect evaluation in stable renal transplant patients using physical exam, review of systems, recent laboratory results, and medication adherence assessment during a clinic visit. Stable renal transplant recipients in two clinical studies were evaluated and received immunosuppressive regimens comprised of either cyclosporine or tacrolimus with mycophenolic acid. Face, content, and construct validity were assessed to document these adverse effect evaluations. Inter-rater reliability was determined using the Kappa statistic and intra-class correlation. A total of 58 renal transplant recipients were assessed using the adverse effects scoring system confirming face validity. Nephrologists (subject matter experts) rated the 18 adverse effects as: 3.1 ± 0.75 out of 4 (maximum) regarding clinical importance to verify content validity. The adverse effects scoring system distinguished 1.75-fold increased gastrointestinal adverse
Increasing the reliability of the fluid/crystallized difference score from the Kaufman Adolescent and Adult Intelligence Test with reliable component analysis.

Science.gov (United States)

Caruso, J C

2001-06-01

The unreliability of difference scores is a well documented phenomenon in the social sciences and has led researchers and practitioners to interpret differences cautiously, if at all. In the case of the Kaufman Adolescent and Adult Intelligence Test (KAIT), the unreliability of the difference between the Fluid IQ and the Crystallized IQ is due to the high correlation between the two scales. The consequences of the lack of precision with which differences are identified are wide confidence intervals and low-powered significance tests (i.e., large differences are required to be declared statistically significant). Reliable component analysis (RCA) was performed on the subtests of the KAIT in order to address these problems. RCA is a new data reduction technique that results in uncorrelated component scores with maximum proportions of reliable variance. Results indicate that the scores defined by RCA have discriminant and convergent validity (with respect to the equally weighted scores) and that differences between the scores, derived from a single testing session, were more reliable than differences derived from equal weighting for each age group (11-14 years, 15-34 years, 35-85+ years). This reliability advantage results in narrower confidence intervals around difference scores and smaller differences required for statistical significance.
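The abstract's claim that a high correlation between two scales makes their difference unreliable can be seen in the classical formula for the reliability of a difference score. For two scales with equal variances it is r_dd = ((r_xx + r_yy)/2 − r_xy) / (1 − r_xy). The Python sketch below evaluates that formula. All numeric inputs are hypothetical and are not KAIT values, and the function name is an illustrative assumption.

```python
def difference_score_reliability(r_xx, r_yy, r_xy):
    """Reliability of X - Y under classical test theory, assuming equal variances.

    r_xx, r_yy: reliabilities of the two scales; r_xy: their correlation.
    Illustrative helper (hypothetical name).
    """
    if r_xy >= 1.0:
        raise ValueError("r_xy must be below 1")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


# Hypothetical scales, each with reliability .90, at increasing intercorrelation.
for r_xy in (0.2, 0.5, 0.8):
    print(f"r_xy={r_xy:.1f}  r_dd={difference_score_reliability(0.9, 0.9, r_xy):.3f}")
```

With both scale reliabilities held fixed, the difference score becomes less reliable as the correlation between the scales rises, which is the problem that uncorrelated component scores are meant to avoid.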

Scoring sacroiliac joints by magnetic resonance imaging. A Multiple-reader reliability experiment

DEFF Research Database (Denmark)

Landewé, RB; Hermann, KG; van der Heijde, DM

2005-01-01

Magnetic resonance imaging (MRI) of the sacroiliac (SI) joints and the spine is increasingly important in the assessment of inflammatory activity and structural damage in clinical trials with patients with ankylosing spondylitis (AS). We investigated inter-reader reliability and sensitivity...... for 'depth' and 'intensity,' and the fifth method included the SPARCC slice with the maximum score. Inter-reader reliability was investigated by calculating intraclass correlation coefficients (ICC) for all readers together and for all possible reader pairs. Sensitivity to change was investigated...... values close to zero (no agreement) and highest observed values over 0.80 (excellent agreement). In general, agreement of status scores was somewhat better than agreement of change scores, and agreement of the comprehensive SPARCC scoring system was somewhat better than agreement of the more condensed...
Feasibility and reliability of a newly developed antenatal risk score card in routine care

NARCIS (Netherlands)

E. Birnie; E.A.P. Steegers; Drs. H.W. Torij; M.J. Veen; J. Poeran; G.J. Bonsel

2015-01-01

A population-based cross-sectional study (feasibility) and a cohort study (inter-rater reliability) to study in routine care the feasibility and inter-rater reliability of the Rotterdam Reproductive Risk Reduction risk score card (R4U), a new semi-quantitative score card for use during the antenatal
The Reliability and Validity of Zimbardo Time Perspective Inventory Scores in Academically Talented Adolescents

Science.gov (United States)

Worrell, Frank C.; Mello, Zena R.

2007-01-01

In this study, the authors examined the reliability, structural validity, and concurrent validity of Zimbardo Time Perspective Inventory (ZTPI) scores in a group of 815 academically talented adolescents. Reliability estimates of the purported factors' scores were in the low to moderate range. Exploratory factor analysis supported a five-factor…
Reliability of the CMT neuropathy score (second version) in Charcot-Marie-Tooth disease.

LENUS (Irish Health Repository)

Murphy, Sinéad M

2011-09-01

The Charcot-Marie-Tooth neuropathy score (CMTNS) is a reliable and valid composite score comprising symptoms, signs, and neurophysiological tests, which has been used in natural history studies of CMT1A and CMT1X and as an outcome measure in treatment trials of CMT1A. Following an international workshop on outcome measures in Charcot-Marie-Tooth disease (CMT), the CMTNS was modified to attempt to reduce floor and ceiling effects and to standardize patient assessment, aiming to improve its sensitivity for detecting change over time and the effect of an intervention. After agreeing on the modifications made to the CMTNS (CMTNS2), three examiners evaluated 16 patients to determine inter-rater reliability; one examiner evaluated 18 patients twice within 8 weeks to determine intra-rater reliability. Three examiners evaluated 63 patients using the CMTNS and the CMTNS2 to determine how the modifications altered scoring. For inter- and intra-rater reliability, intra-class correlation coefficients (ICCs) were ≥0.96 for the CMT symptom score and the CMT examination score. There were small but significant differences in some of the individual components of the CMTNS compared with the CMTNS2, mainly in the components that had been modified the most. A longitudinal study is in progress to determine whether the CMTNS2 is more sensitive than the CMTNS for detecting change over time.
Effects of Analytical and Holistic Scoring Patterns on Scorer Reliability in Biology Essay Tests

Science.gov (United States)

Ebuoh, Casmir N.

2018-01-01

Literature revealed that the patterns/methods of scoring essay tests had been criticized for not being reliable, and that this unreliability is likely to be greater in internal examinations than in external examinations. The purpose of this study is to find out the effects of analytical and holistic scoring patterns on scorer reliability in…
Validity and reliability of grade scoring in the diagnosis of exercise-induced laryngeal obstruction

DEFF Research Database (Denmark)

Walsted, Emil Schwarz; Hull, James H; Hvedstrup, Jeppe

2017-01-01

The current gold-standard method for diagnosing exercise-induced laryngeal obstruction (EILO) is continuous laryngoscopy during exercise (CLE), with severity classified by a visual grade scoring system. We evaluated the precision of this approach, by evaluating test-retest reliability of CLE...... grade scoring system does not appear to be a robust means for reliably classifying severity of EILO....
Examiner Reliability of Fluorosis Scoring: A Comparison of Photographic and Clinical Examination Findings

Science.gov (United States)

Cruz-Orcutt, Noemi; Warren, John J.; Broffitt, Barbara; Levy, Steven M.; Weber-Gasparoni, Karin

2012-01-01

Objective: To assess and compare examiner reliability of clinical and photographic fluorosis examinations using the Fluorosis Risk Index (FRI) among children in the Iowa Fluoride Study (IFS). Methods: The IFS examined 538 children for fluorosis and dental caries at age 13 and obtained intra-oral photographs from nearly all of them. To assess examiner reliability, duplicate clinical examinations were conducted for 40 of the subjects. In addition, 200 of the photographs were scored independently for fluorosis by two examiners in a standardized manner. Fluorosis data were compared between examiners for the clinical exams and separately for the photographic exams, and a comparison was made between clinical and photographic exams. For all 3 comparisons, examiner reliability was assessed using kappa statistics at the tooth level. Results: Inter-examiner reliability for the duplicate clinical exams on the sample of 40 subjects as measured by kappa was 0.59, while the repeat exams of the 200 photographs yielded a kappa of 0.64. For the comparison of photographic and clinical exams, inter-examiner reliability, as measured by weighted kappa, was 0.46. FRI scores obtained using the photographs were higher on average than those obtained from the clinical exams. Fluorosis prevalence was higher for photographs (33%) than found for clinical exam (18%). Conclusion: Results suggest inter-examiner reliability is greater and fluorosis scores higher when using photographic compared to clinical examinations. PMID:22316120
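The kappa statistics reported in the record above correct raw agreement for the agreement expected by chance. A minimal sketch of unweighted Cohen's kappa follows; the function name and the example scores are hypothetical illustrations, not data or code from the study, and the weighted kappa that the record also reports is not covered here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical tooth-level scores (1 = fluorosis present, 0 = absent).
examiner_1 = [1, 1, 0, 0]
examiner_2 = [1, 0, 0, 0]
kappa = cohen_kappa(examiner_1, examiner_2)
```

Because kappa depends on the marginal frequencies, two comparisons with the same raw agreement can yield different kappa values.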
Knee injury and Osteoarthritis Outcome Score (KOOS) – validation and comparison to the WOMAC in total knee replacement

Directory of Open Access Journals (Sweden)

Roos Ewa M

2003-05-01

Background: The Knee injury and Osteoarthritis Outcome Score (KOOS) is an extension of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), the most commonly used outcome instrument for assessment of patient-relevant treatment effects in osteoarthritis. KOOS was developed for younger and/or more active patients with knee injury and knee osteoarthritis and has in previous studies on these groups been the more responsive instrument compared to the WOMAC. Some patients eligible for total knee replacement have expectations of more demanding physical functions than required for daily living. This encouraged us to study the use of the Knee injury and Osteoarthritis Outcome Score (KOOS) to assess the outcome of total knee replacement. Methods: We studied the test-retest reliability, validity and responsiveness of the Swedish version LK 1.0 of the KOOS when used to prospectively evaluate the outcome of 105 patients (mean age 71.3, 66 women) after total knee replacement. The follow-up rates at 6 and 12 months were 92% and 86%, respectively. Results: The intraclass correlation coefficients were over 0.75 for all subscales indicating sufficient test-retest reliability. Bland-Altman plots confirmed this finding. Over 90% of the patients regarded improvement in the subscales Pain, Symptoms, Activities of Daily Living, and knee-related Quality of Life to be extremely or very important when deciding to have their knee operated on indicating good content validity. The correlations found in comparison to the SF-36 indicated the KOOS measured expected constructs. The most responsive subscale was knee-related Quality of Life. The effect sizes of the five KOOS subscales at 12 months ranged from 1.08 to 3.54 and for the WOMAC from 1.65 to 2.56. Conclusion: The Knee injury and Osteoarthritis Outcome Score (KOOS) is a valid, reliable, and responsive outcome measure in total joint replacement. In comparison to the WOMAC, the KOOS improved validity
High inter-rater reliability, agreement, and convergent validity of Constant score in patients with clavicle fractures

DEFF Research Database (Denmark)

Ban, Ilija; Troelsen, Anders; Kristensen, Morten Tange

2016-01-01

BACKGROUND: The Constant score (CS) has been the primary endpoint in most studies on clavicle fractures. However, the CS was not developed to assess patients with clavicle fractures. Our aim was to examine inter-rater reliability and agreement of the CS in patients with clavicle fractures...... standardized CS assessment at a mean of 6.8 weeks (SD, 1.0 weeks) after injury. Reliability and agreement of the CS were determined by 2 raters. The intraclass correlation coefficient (ICC2,1), standard error of measurement, minimal detectable change, Cronbach α coefficient, and Pearson correlation coefficient...... were estimated. RESULTS: Inter-rater reliability of the total CS was excellent (intraclass correlation coefficient, 0.94; 95% confidence interval, 0.88-0.97), with no systematic difference between the 2 raters (P = .75). The standard error of measurement (measurement error at the group level) was 4...
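The ICC2,1 named in the record above is conventionally the two-way random-effects, absolute-agreement, single-rater coefficient. The sketch below computes it from the mean squares of a subjects-by-raters table; the function name and the example scores are hypothetical, and this is an illustration of the standard formula rather than the authors' analysis.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per subject, each row holding one score per rater.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total scores for five patients, each assessed by two raters.
scores = [[60, 62], [45, 44], [70, 73], [55, 54], [80, 78]]
icc = icc_2_1(scores)
```

Because the rater mean square enters the denominator, a systematic offset between raters lowers this coefficient, which is why it is described as measuring absolute agreement rather than consistency.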
Estimating the Reliability of Aggregated and Within-Person Centered Scores in Ecological Momentary Assessment

Science.gov (United States)

Huang, Po-Hsien; Weng, Li-Jen

2012-01-01

A procedure for estimating the reliability of test scores in the context of ecological momentary assessment (EMA) was proposed to take into account the characteristics of EMA measures. Two commonly used test scores in EMA were considered: the aggregated score (AGGS) and the within-person centered score (WPCS). Conceptually, AGGS and WPCS represent…
CERAD Neuropsychological Total Scores Reflect Cortical Thinning in Prodromal Alzheimer's Disease

Directory of Open Access Journals (Sweden)

T. Paajanen

2013-11-01

Background: Sensitive cognitive global scores are beneficial in screening and monitoring for prodromal Alzheimer's disease (AD). Early cortical changes provide a novel opportunity for validating established cognitive total scores against the biological disease markers. Methods: We examined how two different total scores of the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) battery and the Mini-Mental State Examination (MMSE) are associated with cortical thickness (CTH) in mild cognitive impairment (MCI) and prodromal AD. Cognitive and magnetic resonance imaging (MRI) data of 22 progressive MCI, 78 stable MCI, and 98 control subjects, and MRI data of 103 AD patients of the prospective multicenter study were analyzed. Results: CERAD total scores correlated with mean CTH more strongly (r = 0.34-0.38, p Conclusion: CERAD total scores are sensitive to the CTH signature of prodromal AD, which supports their biological validity in detecting early disease-related cognitive changes.
High inter-tester reliability of the new mobility score in patients with hip fracture

DEFF Research Database (Denmark)

Kristensen, M.T.; Bandholm, T.; Foss, N.B.

2008-01-01

OBJECTIVE: To assess the inter-tester reliability of the New Mobility Score in patients with acute hip fracture. DESIGN: An inter-tester reliability study. SUBJECTS: Forty-eight consecutive patients with acute hip fracture at a median age of 84 (interquartile range, 76-89) years; 40 admitted from...... their own home and 8 from nursing homes to an acute orthopaedic hip fracture unit at a university hospital. METHODS: The New Mobility Score, which evaluates the prefracture functional level with a score from 0 (not able to walk at all) to 9 (fully independent), was assessed by 2 independent physiotherapists...... the prefracture functional level in patients with acute hip fracture Publication date: 2008/7...
Reliability, validity and sensitivity to change of neurogenic bowel dysfunction score in patients with spinal cord injury

DEFF Research Database (Denmark)

Erdem, D.; Hava, D.; Keskinoglu, P.

2017-01-01

cord injury (SCI). The reliability of NBD score was assessed by test-retest reliability and internal consistency. Cronbach's alpha coefficient was calculated to determine internal consistency. The construct validity was evaluated by exploring correlations between the NBD score and SF-36 scales, patient...... assessment of impact of NBD on quality of life (QoL) and the physician global assessment (PGA). The Global Rating of Change (GRC) scale was used to assess the change of NBD to investigate the sensitivity of the score to change. Results: Cronbach's alpha coefficient was 0.547. In test-retest reliability...
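Cronbach's alpha, which the record above uses for internal consistency, compares the sum of the item variances with the variance of the total score. A minimal sketch follows; the function name and the item scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha from per-item score lists.

    items: one list of scores per questionnaire item, with every list
    covering the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: two items answered by four respondents.
item_scores = [[1, 2, 3, 4], [2, 1, 4, 3]]
alpha = cronbach_alpha(item_scores)
```

Alpha rises when items covary strongly relative to their individual variances, so a low value such as the one reported in the record suggests that the items do not move together closely.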
High intertester reliability of the cumulated ambulation score for the evaluation of basic mobility in patients with hip fracture

DEFF Research Database (Denmark)

Kristensen, Morten Tange; Andersen, Lene; Bech-Jensen, Rie

2009-01-01

OBJECTIVE: To examine the intertester reliability of the three activities of the Cumulated Ambulation Score (CAS) and the total CAS, and to define limits for the smallest change in basic mobility that indicates a real change in patients with hip fracture. DESIGN: An intertester reliability study....... SETTING: An acute 20-bed orthopaedic hip fracture unit. SUBJECTS: Fifty consecutive patients with a median age of 83 (25-75% quartile, 68-86) years. INTERVENTIONS: The CAS, which describes the patient's independency in three activities - (1) getting in and out of bed, (2) sit to stand from a chair, and (3...
Validation and Reliability of a Smartphone Application for the International Prostate Symptom Score Questionnaire: A Randomized Repeated Measures Crossover Study

Science.gov (United States)

Shim, Sung Ryul; Sun, Hwa Yeon; Ko, Young Myoung; Chun, Dong-Il; Yang, Won Jae

2014-01-01

Background: Smartphone-based assessment may be a useful diagnostic and monitoring tool for patients. There have been many attempts to create a smartphone diagnostic tool for clinical use in various medical fields but few have demonstrated scientific validity. Objective: The purpose of this study was to develop a smartphone application of the International Prostate Symptom Score (IPSS) and to demonstrate its validity and reliability. Methods: From June 2012 to May 2013, a total of 1581 male participants (≥40 years old), with or without lower urinary tract symptoms (LUTS), visited our urology clinic via the health improvement center at Soonchunhyang University Hospital (Republic of Korea) and were enrolled in this study. A randomized repeated measures crossover design was employed using a smartphone application of the IPSS and the conventional paper form of the IPSS. A paired t test was conducted under a non-inferiority hypothesis. For the reliability test, the intraclass correlation coefficient (ICC) was measured. Results: The total score of the IPSS (P=.289) and each item of the IPSS (P=.157-1.000) showed no differences between the paper version and the smartphone version of the IPSS. The mild, moderate, and severe LUTS groups showed no differences between the two versions of the IPSS. A significant correlation was noted in the total group (ICC=.935, Psmartphones could participate. Conclusions: The validity and reliability of the smartphone application version were comparable to the conventional paper version of the IPSS. The smartphone application of the IPSS could be an effective method for measuring lower urinary tract symptoms. PMID:24513507
Product analysis and initial reliability testing of the total mesorectal excision-quality assessment instrument.

Science.gov (United States)

Simunovic, Marko R; DeNardi, Franco G; Coates, Angela J; Szalay, David A; Eva, Kevin W

2014-07-01

Product analysis of rectal cancer resection specimens before specimen fixation may provide an immediate and relevant evaluation of surgical performance. We tested the interrater reliability (IRR) of a product analysis tool called the Total Mesorectal Excision-Quality Assessment Instrument (TME-QA). Participants included two gold standard raters, five pathology assistants, and eight pathologists. Domains of the TME-QA reflect total mesorectal excision principles including: (1) completeness of mesorectal margin; (2) completeness of mesorectum; (3) coning of distal mesorectum; (4) physical defects; and (5) overall specimen quality. Specimens were scored independently. We used the generalizability theory to assess the tool's internal consistency and IRR. There were 39 specimens and 120 ratings. Mean overall specimen quality scores for the gold standard raters, pathologists, and assistants were 4.43, 4.43, and 4.50, respectively (p > 0.85). IRR for the first nine items was 0.68 for the full sample, 0.62 for assistants alone, 0.63 for pathologists alone, and 0.74 for gold standard raters alone. IRR for the item overall specimen quality was 0.67 for the full sample, 0.45 for assistants, 0.80 for pathologists, and 0.86 for gold standard raters. IRR increased for all groups when scores were averaged across two raters. Assessment of surgical specimens using the TME-QA may provide rapid and relevant feedback to surgeons about their technical performance. Our results show good internal consistency and IRR when the TME-QA is used by pathologists. However, for pathology assistants, multiple ratings with the averaging of scores may be needed.
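The record above notes that IRR rose when scores were averaged across two raters but gives no averaged values. One way to project such a gain is the Spearman-Brown formula, which matches the generalizability-theory projection only under the simplifying assumption of a single rater facet; the study's own model may differ. The function name is hypothetical, and treating the reported 0.45 for pathology assistants as a single-rater coefficient is an assumption.

```python
def spearman_brown(single_rater_reliability, n_raters):
    """Projected reliability of the mean of n_raters parallel ratings."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Assumption: 0.45 is the single-rater IRR reported for pathology
# assistants on the overall specimen quality item.
projected_two_raters = spearman_brown(0.45, 2)
```

Under these assumptions, averaging two assistants' ratings would bring the projected coefficient to roughly 0.62.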
Using Generalizability Theory to Assess the Score Reliability of Communication Skills of Dentistry Students

Science.gov (United States)

Uzun, N. Bilge; Aktas, Mehtap; Asiret, Semih; Yormaz, Seha

2018-01-01

The goal of this study is to determine the reliability of the performance points of dentistry students regarding communication skills and to examine the scoring reliability by generalizability theory in balanced random and fixed facet (mixed design) data, considering also the interactions of student, rater and duty. The study group of the research…
The longitudinal reliability and responsiveness of the OMERACT Hand Osteoarthritis Magnetic Resonance Imaging Scoring System (HOAMRIS)

DEFF Research Database (Denmark)

Haugen, Ida K.; Eshed, Iris; Gandjbakhch, Frederique

2015-01-01

Objective. To evaluate the interreader reliability of change scores and the responsiveness of the OMERACT Hand Osteoarthritis (OA) Magnetic Resonance Image (MRI) Scoring System (HOAMRIS). Methods. Paired MRI (baseline and 5-yr followup) from 20 patients with hand OA were scored with known time se...
Inter-device reliability of an automatic-scoring actigraph for measuring sleep in healthy adults

Directory of Open Access Journals (Sweden)

Matthew Driller

2016-07-01

Actigraphy has become a common method of measuring sleep due to its non-invasive, cost-effective nature. An actigraph (Readiband™) that utilizes automatic scoring algorithms has been used in research but has yet to be evaluated for its inter-device reliability. A total of 77 nights of sleep data from 11 healthy adult participants was collected while participants were concomitantly wearing two Readiband™ actigraphs attached together (ACT1 and ACT2). Sleep indices including total sleep time (TST), sleep latency (SL), sleep efficiency (SE%), wake after sleep onset (WASO), total time in bed (TTB), wake episodes per night (WE), sleep onset variance (SOV) and wake variance (WV) were assessed between the two devices using mean differences, 95% levels of agreement, intraclass correlation coefficients (ICC), typical error of measurement (TEM) and coefficient of variation (CV%) analysis. There were no significant differences between devices for any of the measured sleep variables (p>0.05). TST, SE, SL, TTB, SOV and WV all resulted in very high ICCs (>0.90), with WASO and WE resulting in high ICCs between devices (0.85 and 0.80, respectively). Mean differences of −2.1 and 0.2 min for TST and SL were associated with a low TEM between devices (9.5 and 3.8 min, respectively). SE resulted in a 0.3% mean difference between devices. The Readiband™ is a reliable tool for researchers using multiple devices of this brand in sleep studies to assess basic measures of sleep quality and quantity in healthy adult populations.

The Portuguese version of the Outcome Questionnaire (OQ-45): Normative data, reliability, and clinical significance cut-offs scores.

Science.gov (United States)

Machado, Paulo P P; Fassnacht, Daniel B

2015-12-01

The Outcome Questionnaire (OQ-45) is one of the most extensively used standardized self-report instruments to monitor psychotherapy outcomes. The questionnaire is designed specifically for the assessment of change during psychotherapy treatments. Therefore, it is crucial to provide norms and clinical cut-off values for clinicians and researchers. The current study provides norms, reliability indices, and clinical cut-off values for the Portuguese version of the scale. Data from two large non-clinical samples (high school/university, N = 1,669; community, N = 879) and one clinical sample (n = 201) were used to investigate psychometric properties and derive normative data for all OQ-45 subscales and the total score. Significant and substantial differences were found for all subscales between the clinical and non-clinical sample. The Portuguese version also showed adequate reliabilities (internal consistency, test-retest), which were comparable to the original version. To assess individual clinical change, clinical cut-off values and reliable change indices were calculated allowing clinicians and researchers to monitor and evaluate clients' individual change. The Portuguese version of the OQ-45 is a reliable instrument with comparable Portuguese norms and cut-off scores to those from the original version. This allows clinicians and researchers to use this instrument for evaluating change and outcome in psychotherapy. This study provides norms for non-clinical and clinical Portuguese samples and investigates the reliability (internal consistency and test-retest) of the OQ-45. Cut-off values and reliable change index are provided allowing clinicians to evaluate clinical change and clients' response to treatment, monitoring the quality of mental health care services. These can be used, in routine clinical practice, as benchmarks for treatment progress and to empirically base clinical decisions such as continuation of treatment or considering
HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

Science.gov (United States)

López, Yosvany; Nakai, Kenta; Patil, Ashwini

2015-01-01

HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.
Scoring haemophilic arthropathy on X-rays: improving inter- and intra-observer reliability and agreement using a consensus atlas

Energy Technology Data Exchange (ETDEWEB)

Foppen, Wouter; Schaaf, Irene C. van der; Beek, Frederik J.A. [University Medical Center Utrecht, Department of Radiology (Netherlands); Verkooijen, Helena M. [University Medical Center Utrecht, Department of Radiology (Netherlands); University Medical Center Utrecht, Julius Center for Health Sciences and Primary Care, Utrecht (Netherlands); Fischer, Kathelijn [University Medical Center Utrecht, Julius Center for Health Sciences and Primary Care, Utrecht (Netherlands); University Medical Center Utrecht, Van Creveldkliniek, Department of Hematology, Utrecht (Netherlands)

2016-06-15

The radiological Pettersson score (PS) is widely applied for classification of arthropathy to evaluate costly haemophilia treatment. This study aims to assess and improve inter- and intra-observer reliability and agreement of the PS. Two series of X-rays (bilateral elbows, knees, and ankles) of 10 haemophilia patients (120 joints) with haemophilic arthropathy were scored by three observers according to the PS (maximum score 13/joint). Subsequently, (dis-)agreement in scoring was discussed until consensus. Example images were collected in an atlas. Thereafter, a second series of 120 joints was scored using the atlas. One observer rescored the second series after three months. Reliability was assessed by intraclass correlation coefficients (ICC), agreement by limits of agreement (LoA). Median Pettersson score at joint level (PS_joint) of affected joints was 6 (interquartile range 3-9). Using the consensus atlas, inter-observer reliability of the PS_joint improved significantly from 0.94 (95% confidence interval (CI) 0.91-0.96) to 0.97 (CI 0.96-0.98). LoA improved from ±1.7 to ±1.1 for the PS_joint. Therefore, true differences in arthropathy were differences in the PS_joint of >2 points. Intra-observer reliability of the PS_joint was 0.98 (CI 0.97-0.98), intra-observer LoA were ±0.9 points. Reliability and agreement of the PS improved by using a consensus atlas.
The Reliability and Structure of the Classroom Assessment Scoring System in German Pre-Schools

Science.gov (United States)

Stuck, Andrea; Kammermeyer, Gisela; Roux, Susanna

2016-01-01

This study examined the reliability and structure of the Classroom Assessment Scoring System (CLASS; Pianta, R. C., K. M. La Paro, and B. K. Hamre. 2008. "Classroom Assessment Scoring System. Manual Pre-K." Baltimore, MD: Brookes) and the quality of interactional processes in a German pre-school setting, drawing on a sample of 390…
Validity and reliability of Thai version of the Foot and Ankle Outcome Score in patients with arthritis of the foot and ankle.

Science.gov (United States)

Angthong, Chayanin

2016-12-01

Although the Foot and Ankle Outcome Score (FAOS) is commonly used in several languages for a variety of foot disorders, it has not been validated specifically for foot and ankle arthritic conditions. The aims of the present study were to translate the original English FAOS into Thai and to evaluate the validity and reliability of the Thai version of the FAOS for the foot and ankle arthritic conditions. The original FAOS was translated into Thai using forward-backward translation. The Thai FAOS and validated Thai Short Form-36 (SF-36®) questionnaires were distributed to 44 Thai patients suffering from arthritis of the foot and ankle to complete. For validation, Thai FAOS scores were correlated with SF-36 scores. Test-retest reliability and internal consistency were also analyzed in this study. The Thai FAOS score demonstrated sufficient correlation with SF-36 total score in Pain (Pearson's correlation coefficient (r)=0.45, p=0.002), Symptoms (r=0.45, p=0.002), Activities of Daily Living (ADL) (r=0.47, p=0.001), and Quality of Life (QOL) (r=0.38, p=0.011) subscales. The Sports and Recreational Activities (Sports & Rec) subscale did not correlate significantly with the SF-36® (r=0.20, p=0.20). Cronbach's alpha, a measure of internal consistency, for the five subscales was as follows: Pain, 0.94 (pvalidity for the evaluation of foot and ankle arthritis. Although reliability was satisfactory for the major subscale ADL, it was not sufficient for the minor subscales. Our findings suggest that it can be used as a disease-specific instrument to evaluate foot and ankle arthritis and can complement other reliable outcome surveys. Copyright © 2015 European Foot and Ankle Society. Published by Elsevier Ltd. All rights reserved.
Validation of use of subsets of teeth when applying the total mouth periodontal score (TMPS) system in dogs.

Science.gov (United States)

Harvey, Colin E; Laster, Larry; Shofer, Frances S

2012-01-01

A total mouth periodontal score (TMPS) system in dogs has been described previously. Use of buccal and palatal/lingual surfaces of all teeth requires observation and recording of 120 gingivitis scores and 120 periodontitis scores. Although the result is a reliable, repeatable assessment of the extent of periodontal disease in the mouth, observing and recording 240 data points is time-consuming. Using data from a previously reported study of periodontal disease in dogs, correlation analysis was used to determine whether use of any of seven different subsets of teeth can generate TMPS subset gingivitis and periodontitis scores that are highly correlated with TMPS all-site, all-teeth scores. Overall, gingivitis scores were less highly correlated than periodontitis scores. The minimal tooth set with a significant intra-class correlation (≥ 0.9 of means of right and left sides) for both gingivitis scores and attachment loss measurements consisted of the buccal surface of the maxillary third incisor, canine, third premolar, fourth premolar, and first molar teeth, and the mandibular canine, third premolar, fourth premolar and first molar teeth on one side (9 teeth, 15 root sites). Use of this subset of teeth, which reduces the number of data points per dog from 240 to 30 for gingivitis and periodontitis at each scoring episode, is recommended when calculating the gingivitis and periodontitis scores using the TMPS system.
Method of administration of PROMIS scales did not significantly impact score level, reliability, or validity

DEFF Research Database (Denmark)

Bjorner, Jakob B; Rose, Matthias; Gandek, Barbara

2014-01-01

OBJECTIVES: To test the impact of the method of administration (MOA) on score level, reliability, and validity of scales developed in the Patient Reported Outcomes Measurement Information System (PROMIS). STUDY DESIGN AND SETTING: Two nonoverlapping parallel forms each containing eight items from......, no significant mode differences were found and all confidence intervals were within the prespecified minimal important difference of 0.2 standard deviation. Parallel-forms reliabilities were very high (ICC = 0.85-0.93). Only one across-mode ICC was significantly lower than the same-mode ICC. Tests of validity...... questionnaire (PQ), personal digital assistant (PDA), or personal computer (PC) and a second form by PC, in the same administration. Method equivalence was evaluated through analyses of difference scores, intraclass correlations (ICCs), and convergent/discriminant validity. RESULTS: In difference score analyses...
Reliability of the Dutch translation of the Kujala Patellofemoral Score Questionnaire.

Science.gov (United States)

Ummels, P E J; Lenssen, A F; Barendrecht, M; Beurskens, A J H M

2017-01-01

There are no Dutch language disease-specific questionnaires for patients with patellofemoral pain syndrome available that could help Dutch physiotherapists to assess and monitor these symptoms and functional limitations. The aim of this study was to translate the original disease-specific Kujala Patellofemoral Score into Dutch and evaluate its reliability. The questionnaire was translated from English into Dutch in accordance with internationally recommended guidelines. Reliability was determined in 50 stable subjects with an interval of 1 week. The patient inclusion criteria were age between 14 and 60 years; knowledge of the Dutch language; and the presence of at least three of the following symptoms: pain while taking the stairs, pain when squatting, pain when running, pain when cycling, pain when sitting with knees flexed for a prolonged period, grinding of the patella and a positive clinical patella test. The internal consistency, test-retest reliability, measurement error and limits of agreement were calculated. Internal consistency was 0.78 for the first assessment and 0.80 for the second assessment. The intraclass correlation coefficient (ICC_agreement) between the first and second assessments was 0.98. The mean difference between the first and second measurements was 0.64, and standard deviation was 5.51. The standard error of measurement was 3.9, and the smallest detectable change was 11. The Bland and Altman plot shows that the limits of agreement are -10.37 and 11.65. The results of the present study indicated that the test-retest reliability of the translated Dutch version of the Kujala Patellofemoral Score questionnaire is equivalent to that of the original English language version and that the Dutch version has good internal consistency. Trial registration NTR (TC = 3258). Copyright © 2015 John Wiley & Sons, Ltd.
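The measurement-error figures in the record above can be related to one another with the usual test-retest formulas. The sketch below assumes that the standard error of measurement is the standard deviation of the differences divided by √2 and that the smallest detectable change is 1.96·√2·SEM; the abstract does not state which formulas were used, and the function names are hypothetical. The multiplier of 2 for the limits of agreement is likewise an assumption, chosen because it comes closest to the reported limits.

```python
import math


def sem_from_difference_sd(sd_diff):
    """Standard error of measurement from the SD of test-retest differences."""
    return sd_diff / math.sqrt(2)


def smallest_detectable_change(sem, z=1.96):
    """Smallest change in one individual that exceeds measurement error."""
    return z * math.sqrt(2) * sem


def limits_of_agreement(mean_diff, sd_diff, multiplier=1.96):
    """Bland-Altman limits: mean difference plus/minus multiplier * SD."""
    half_width = multiplier * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Values reported in the abstract.
mean_diff, sd_diff = 0.64, 5.51

sem = sem_from_difference_sd(sd_diff)
sdc = smallest_detectable_change(sem)
# Assumption: a multiplier of 2 rather than 1.96.
loa = limits_of_agreement(mean_diff, sd_diff, multiplier=2.0)
```

Under these assumptions the sketch gives an SEM of about 3.9 and a smallest detectable change of about 10.8, in line with the reported 3.9 and 11, and limits of agreement within 0.01 of the reported -10.37 and 11.65.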
A Note on the Score Reliability for the Satisfaction with Life Scale: An RG Study

Science.gov (United States)

Vassar, Matt

2008-01-01

The purpose of the present study was to meta-analytically investigate the score reliability for the Satisfaction With Life Scale. Four-hundred and sixteen articles using the measure were located through electronic database searches and then separated to identify studies which had calculated reliability estimates from their own data. Sixty-two…
Reliable change indices and standardized regression-based change score norms for evaluating neuropsychological change in children with epilepsy.

Science.gov (United States)

Busch, Robyn M; Lineweaver, Tara T; Ferguson, Lisa; Haut, Jennifer S

2015-06-01

Reliable change indices (RCIs) and standardized regression-based (SRB) change score norms permit evaluation of meaningful changes in test scores following treatment interventions, like epilepsy surgery, while accounting for test-retest reliability, practice effects, score fluctuations due to error, and relevant clinical and demographic factors. Although these methods are frequently used to assess cognitive change after epilepsy surgery in adults, they have not been widely applied to examine cognitive change in children with epilepsy. The goal of the current study was to develop RCIs and SRB change score norms for use in children with epilepsy. Sixty-three children with epilepsy (age range: 6-16; M=10.19, SD=2.58) underwent comprehensive neuropsychological evaluations at two time points an average of 12 months apart. Practice effect-adjusted RCIs and SRB change score norms were calculated for all cognitive measures in the battery. Practice effects were quite variable across the neuropsychological measures, with the greatest differences observed among older children, particularly on the Children's Memory Scale and Wisconsin Card Sorting Test. There was also notable variability in test-retest reliabilities across measures in the battery, with coefficients ranging from 0.14 to 0.92. Reliable change indices and SRB change score norms for use in assessing meaningful cognitive change in children following epilepsy surgery are provided for measures with reliability coefficients above 0.50. This is the first study to provide RCIs and SRB change score norms for a comprehensive neuropsychological battery based on a large sample of children with epilepsy. Tables to aid in evaluating cognitive changes in children who have undergone epilepsy surgery are provided for clinical use. An Excel sheet to perform all relevant calculations is also available to interested clinicians or researchers. Copyright © 2015 Elsevier Inc. All rights reserved.
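A practice effect-adjusted RCI of the kind described above can be sketched as follows. This is a generic textbook form, assumed here rather than taken from the paper: SEM = SD x sqrt(1 - r), SE of the difference = sqrt(2) x SEM, and the observed change is reduced by the mean practice effect before dividing. The function name, the 90% cutoff of 1.645, and all input values are hypothetical.

```python
import math

def practice_adjusted_rci(x1, x2, sd_baseline, r_xx, practice_effect=0.0,
                          z_crit=1.645):
    """Return (rci, label) for one child's retest change.

    Assumed formulas (generic, not confirmed as the authors' exact method):
      SEM     = sd_baseline * sqrt(1 - r_xx)
      SE_diff = sqrt(2) * SEM
      RCI     = (x2 - x1 - practice_effect) / SE_diff
    """
    sem = sd_baseline * math.sqrt(1.0 - r_xx)
    se_diff = math.sqrt(2.0) * sem
    rci = (x2 - x1 - practice_effect) / se_diff
    if rci <= -z_crit:
        label = "reliable decline"
    elif rci >= z_crit:
        label = "reliable improvement"
    else:
        label = "no reliable change"
    return rci, label

# Hypothetical example: standard score drops from 100 to 90 on a test with
# SD 15, test-retest reliability 0.80 and a mean practice gain of 2 points.
rci, label = practice_adjusted_rci(100, 90, sd_baseline=15, r_xx=0.80,
                                   practice_effect=2)
print(round(rci, 2), label)
```

The sketch also shows why the authors restrict their tables to measures with reliability above 0.50: as r falls, SE_diff grows and almost no observed change exceeds the cutoff.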
Assessment of reliability, validity, responsiveness and minimally important change of the German Hip dysfunction and osteoarthritis outcome score (HOOS) in patients with osteoarthritis of the hip.

Science.gov (United States)

Arbab, Dariusch; van Ochten, Johannes H M; Schnurr, Christoph; Bouillon, Bertil; König, Dietmar

2017-12-01

Patient-reported outcome measures are a critical tool in evaluating the efficacy of orthopedic procedures. The intention of this study was to evaluate reliability, validity, responsiveness and minimally important change of the German version of the Hip dysfunction and osteoarthritis outcome score (HOOS). The German HOOS was investigated in 251 consecutive patients before and 6 months after total hip arthroplasty. All patients completed HOOS, Oxford-Hip Score, Short-Form (SF-36) and numeric scales for pain and disability. Test-retest reliability, internal consistency, floor and ceiling effects, construct validity and minimal important change were analyzed. The German HOOS demonstrated excellent test-retest reliability with intraclass correlation coefficient values > 0.7. Cronbach's alpha values demonstrated strong internal consistency. As hypothesized, HOOS subscales strongly correlated with corresponding OHS and SF-36 domains. All subscales showed excellent (effect size/standardized response means > 0.8) responsiveness between preoperative assessment and postoperative follow-up. The HOOS and all subdomains showed higher changes than the minimal detectable change, which indicates true changes. The German version of the HOOS demonstrated good psychometric properties. It proved to be a valid, reliable and responsive instrument for use in patients with hip osteoarthritis undergoing total hip replacement.
Reliability of measuring hip abductor strength following total knee arthroplasty using a hand-held dynamometer.

Science.gov (United States)

Schache, Margaret B; McClelland, Jodie A; Webster, Kate E

2016-01-01

To investigate the test-retest reliability of measuring hip abductor strength in patients with total knee arthroplasty (TKA) using a hand-held dynamometer (HHD) with two different types of resistance: belt and manual resistance. Test-retest reliability of 30 subjects (17 female, 13 male, 71.9 ± 7.4 years old), 9.2 ± 2.7 days post TKA was measured using belt and therapist resistance. Retest reliability was calculated with intra-class coefficients (ICC3,1) and 95% confidence intervals (CI) for both the group average and the individual scores. A paired t-test assessed whether a difference existed between the belt and therapist methods of resistance. ICCs were 0.82 and 0.80 for the belt and therapist resisted methods, respectively. Hip abductor strength increases of 8 N (14%) for belt resisted and 14 N (17%) for therapist resisted measurements of the group average exceeded the 95% CI and may represent real change. For individuals, hip abductor strength increases of 33 N (72%) (belt resisted) and 57 N (79%) (therapist resisted) could be interpreted as real change. Hip abductor strength can be reliably measured using HHD in the clinical setting with the described protocol. Belt resistance demonstrated slightly higher test-retest reliability. Reliable measurement of hip abductor muscle strength in patients with TKA is important to ensure deficiencies are addressed in rehabilitation programs and function is maximized. Hip abductor strength can be reliably measured with a hand-held dynamometer in the clinical setting using manual or belt resistance.
A Reliability Generalization Study of Scores on Rotter's and Nowicki-Strickland's Locus of Control Scales

Science.gov (United States)

Beretvas, S. Natasha; Suizzo, Marie-Anne; Durham, Jennifer A.; Yarnell, Lisa M.

2008-01-01

The most commonly used measures of locus of control are Rotter's Internality-Externality Scale (I-E) and Nowicki and Strickland's Internality-Externality Scale (NSIE). A reliability generalization study is conducted to explore variability in I-E and NSIE score reliability. Studies are coded for aspects of the scales used (number of response…
Reliability and Validity Evidence of Scores on the French Version of the Questionnaire about Interpersonal Difficulties for Adolescents

Directory of Open Access Journals (Sweden)

Beatriz Delgado

2015-10-01

This study examined the reliability and validity evidence drawn from the scores of the French version of the Questionnaire about Interpersonal Difficulties for Adolescents (QIDA) in a sample of 957 adolescents (48.5% boys) ranging in age from 11 to 18 years (M = 14.48, SD = 1.85). Principal axis factoring (PAF) and confirmatory factor analyses (CFA) were performed to determine the fit of the factor structure of scores on the QIDA. PAF and CFA replicated the previously identified correlated five-factor structure of the QIDA: Assertiveness, Heterosexual Relationships, Public Speaking, Family Relationships, and Close Friendships. The QIDA yielded acceptable reliability scores for French adolescents. Validity evidence of QIDA was also established through correlations with scores on the School Anxiety Inventory and the Social Anxiety Scale for Adolescents. Most of the correlations were positive and exceeded the established criteria of statistical significance, but the magnitude of these varied according to the scales of the QIDA. Results supported the reliability and validity evidence drawn from the scores of the French version of the QIDA.
Do in-training evaluation reports deserve their bad reputations? A study of the reliability and predictive ability of ITER scores and narrative comments.

Science.gov (United States)

Ginsburg, Shiphra; Eva, Kevin; Regehr, Glenn

2013-10-01

Although scores on in-training evaluation reports (ITERs) are often criticized for poor reliability and validity, ITER comments may yield valuable information. The authors assessed across-rotation reliability of ITER scores in one internal medicine program, ability of ITER scores and comments to predict postgraduate year three (PGY3) performance, and reliability and incremental predictive validity of attendings' analysis of written comments. Numeric and narrative data from the first two years of ITERs for one cohort of residents at the University of Toronto Faculty of Medicine (2009-2011) were assessed for reliability and predictive validity of third-year performance. Twenty-four faculty attendings rank-ordered comments (without scores) such that each resident was ranked by three faculty. Mean ITER scores and comment rankings were submitted to regression analyses; dependent variables were PGY3 ITER scores and program directors' rankings. Reliabilities of ITER scores across nine rotations for 63 residents were 0.53 for both postgraduate year one (PGY1) and postgraduate year two (PGY2). Interrater reliabilities across three attendings' rankings were 0.83 for PGY1 and 0.79 for PGY2. There were strong correlations between ITER scores and comments within each year (0.72 and 0.70). Regressions revealed that PGY1 and PGY2 ITER scores collectively explained 25% of variance in PGY3 scores and 46% of variance in PGY3 rankings. Comment rankings did not improve predictions. ITER scores across multiple rotations showed decent reliability and predictive validity. Comment ranks did not add to the predictive ability, but correlation analyses suggest that trainee performance can be measured through these comments.
Reliability and validity analysis of the open-source Chinese Foot and Ankle Outcome Score (FAOS).

Science.gov (United States)

Ling, Samuel K K; Chan, Vincent; Ho, Karen; Ling, Fona; Lui, T H

2017-12-21

Develop the first reliable and validated open-source outcome scoring system in the Chinese language for foot and ankle problems. Translation of the English FAOS into Chinese following regular protocols. First, two forward-translations were created separately; these were then combined into a preliminary version by an expert committee, which was subsequently back-translated into English. The process was repeated until the original and back translations were congruent. This version was then field tested on actual patients who provided feedback for modification. The final Chinese FAOS version was then tested for reliability and validity. Reliability analysis was performed on 20 subjects while validity analysis was performed on 50 subjects. Tools used to validate the Chinese FAOS were the SF36 and Pain Numeric Rating Scale (NRS). Internal consistency between the FAOS subgroups was measured using Cronbach's alpha. Spearman's correlation was calculated between each subgroup in the FAOS, SF36 and NRS. The Chinese FAOS passed both reliability and validity testing, meaning it is reliable, internally consistent and correlates positively with the SF36 and the NRS. The Chinese FAOS is a free, open-source scoring system that can be used to provide a relatively standardised outcome measure for foot and ankle studies. Copyright © 2017 Elsevier Ltd. All rights reserved.
Reliability of a visual scoring system with fluorescent tracers to assess dermal pesticide exposure.

Science.gov (United States)

Aragon, Aurora; Blanco, Luis; Lopez, Lylliam; Liden, Carola; Nise, Gun; Wesseling, Catharina

2004-10-01

We modified Fenske's semi-quantitative 'visual scoring system' of fluorescent tracer deposited on the skin of pesticide applicators and evaluated its reproducibility in the Nicaraguan setting. The body surface of 33 farmers, divided into 31 segments, was videotaped in the field after spraying with a pesticide solution containing a fluorescent tracer. A portable UV lamp was used for illumination in a foldaway dark room. The videos of five farmers were randomly selected. The scoring was based on a matrix with extension of fluorescent patterns (scale 0-5) on the ordinate and intensity (scale 0-5) on the abscissa, with the product of these two ranks as the final score for each body segment (0-25). Five medical students rated and evaluated the quality of 155 video images having undergone 4 h of training. Cronbach alpha coefficients and two-way random effects intraclass correlation coefficients (ICC) with absolute agreement were computed to assess inter-rater reliability. Consistency was high (Cronbach alpha = 0.96), but the scores differed substantially between raters. The overall ICC was satisfactory [0.75; 95% confidence interval (CI) = 0.62-0.83], but it was lower for intensity (0.54; 95% CI = 0.40-0.66) and higher for extension (0.80; 95% CI = 0.71-0.86). ICCs were lowest for images with low scores and evaluated as low quality, and highest for images with high scores and high quality. Inter-rater reliability coefficients indicate repeatability of the scoring system. However, field conditions for recording fluorescence should be improved to achieve higher quality images, and training should emphasize a better mechanism for the reading of body areas with low contamination.
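The contrast reported above, high consistency (Cronbach alpha) alongside lower absolute agreement (ICC), can be illustrated with a small sketch. It computes both coefficients from the mean squares of a two-way ANOVA without replication, using the standard single-measure absolute-agreement ICC for a two-way random effects model. The ratings are invented for illustration and are not the study's data; the function names are hypothetical.

```python
def two_way_mean_squares(scores):
    """scores: one row per rated image, one column per rater."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-image mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return n, k, msr, msc, mse

def cronbach_alpha(scores):
    """Consistency of the rater average; ignores rater offsets."""
    _, _, msr, _, mse = two_way_mean_squares(scores)
    return (msr - mse) / msr

def icc_absolute_single(scores):
    """Two-way random effects, absolute agreement, single rater."""
    n, k, msr, msc, mse = two_way_mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Invented ratings: three raters rank the images identically, but the
# second and third raters score 5 and 10 points higher than the first.
ratings = [[0, 5, 10], [4, 9, 14], [8, 13, 18], [12, 17, 22]]
alpha = cronbach_alpha(ratings)
icc = icc_absolute_single(ratings)
print(f"alpha = {alpha:.2f}, ICC(absolute, single) = {icc:.2f}")
```

Because constant rater offsets enter only the between-rater mean square, they leave alpha untouched while lowering the absolute-agreement ICC, which is the pattern the abstract describes.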
Pulmonary Exacerbation Score in Cystic Fibrosis Patients: Reliability and Validity Testing

OpenAIRE

Keller, F.

2016-01-01

Background: Lung disease in cystic fibrosis (CF) is characterized by recurrent pulmonary exacerbations (PEs), but consensus on diagnostic criteria for PE is lacking. The use of a consistent definition of PE as an outcome measure in CF clinical trials would allow meaningful comparison across centers. The aim of this study was to assess the reliability and validity of a simplified version of the Seattle Pulmonary Exacerbation Score (SPEX). Materials and Methods: A cross-sectional observational ...
Scoring the full extent of periodontal disease in the dog: development of a total mouth periodontal score (TMPS) system.

Science.gov (United States)

Harvey, Colin E; Laster, Larry; Shofer, Frances; Miller, Bonnie

2008-09-01

The development of a total mouth periodontal scoring system is described. This system uses methods to score the full extent of gingivitis and periodontitis of all tooth surfaces, weighted by size of teeth, and adjusted by size of dog.
A pediatric FOUR score coma scale: interrater reliability and predictive validity.

Science.gov (United States)

Czaikowski, Brianna L; Liang, Hong; Stewart, C Todd

2014-04-01

The Full Outline of UnResponsiveness (FOUR) Score is a coma scale that consists of four components (eye and motor response, brainstem reflexes, and respiration). It was originally validated among the adult population and recently in a pediatric population. To enhance clinical assessment of pediatric intensive care unit patients, including those intubated and/or sedated, at our children's hospital, we modified the FOUR Score Scale for this population. This modified scale would provide many of the same advantages as the original, such as interrater reliability, simplicity, and elimination of the verbal component that is not compatible with the Glasgow Coma Scale (GCS), creating a more valuable neurological assessment tool for the nursing community. Our goal was to potentially provide greater information than the formally used GCS when assessing critically ill, neurologically impaired patients, including those sedated and/or intubated. Experienced pediatric intensive care unit nurses were trained as "expert raters." Two different nurses assessed each subject using the Pediatric FOUR Score Scale (PFSS), GCS, and Richmond Agitation Sedation Scale at three different time points. Data were compared with the Pediatric Cerebral Performance Category (PCPC) assessed by another nurse. Our hypothesis was that the PFSS and PCPC should highly correlate and the GCS and PCPC should correlate lower. Study results show that the PFSS is excellent for interrater reliability for trained nurse-rater pairs and prediction of poor outcome and in-hospital mortality, under various situations, but there were no statistically significant differences between the PFSS and the GCS. However, the PFSS does have the potential to provide greater neurological assessment in the intubated and/or sedated patient based on the outcomes of our study.

The reliability of the McCabe score as a marker of co-morbidity in healthcare-associated infection point prevalence studies.

Science.gov (United States)

Reilly, J S; Coignard, B; Price, L; Godwin, J; Cairns, S; Hopkins, S; Lyytikäinen, O; Hansen, S; Malcolm, W; Hughes, G J

2016-05-01

This study aimed to ascertain the reliability of the McCabe score in a healthcare-associated infection point prevalence survey. A 10 European Union Member States survey in 20 hospitals (n = 1912) indicated that there was a moderate level of agreement (κ = 0.57) with the score. The reliability of the application of the score could be increased by training data collectors, particularly with reference to the ultimately fatal criteria. This is important if the score is to be used to risk adjust data to drive infection prevention and control interventions.
Attenuation of the Squared Canonical Correlation Coefficient under Varying Estimates of Score Reliability

Science.gov (United States)

Wilson, Celia M.

2010-01-01

Research pertaining to the distortion of the squared canonical correlation coefficient has traditionally been limited to the effects of sampling error and associated correction formulas. The purpose of this study was to compare the degree of attenuation of the squared canonical correlation coefficient under varying conditions of score reliability.…
Development and Reliability of the OMERACT Thumb Base Osteoarthritis Magnetic Resonance Imaging Scoring System

DEFF Research Database (Denmark)

Kroon, Féline P B; Conaghan, Philip G; Foltz, Violaine

2017-01-01

: The TOMS assessed the first carpometacarpal (CMC-1) and scaphotrapeziotrapezoid (STT) joints for synovitis, subchondral bone defects (including erosions, cysts, and bone attrition), osteophytes, cartilage, and bone marrow lesions on a 0-3 scale (normal to severe). Subluxation was evaluated only in the CMC......, with better performance for subchondral bone defects, subluxation, and bone marrow lesions. CONCLUSION: A thumb base OA MRI scoring system has been developed. The OMERACT TOMS demonstrated good intrareader and interreader reliability. Longitudinal studies are warranted to investigate reliability of change...
The Reliability of Disease Activity Score in 28 Joints-C-Reactive Protein Might Be Overestimated in a Subgroup of Rheumatoid Arthritis Patients, When the Score Is Solely Based on Subjective Parameters

DEFF Research Database (Denmark)

Jensen Hansen, Inger Marie; Asmussen Andreasen, Rikke; Van Bui Hansen, Mark Nam

2017-01-01

BACKGROUND: Disease Activity Score in 28 Joints (DAS28) is a scoring system to evaluate disease activity and treatment response in rheumatoid arthritis (RA). A DAS28 score of greater than 3.2 is a well-described limit for treatment intensification; however, the reliability of DAS28 might be overestimated. OBJECTIVE: The aim of this study was to evaluate the reliability of DAS28 in RA, especially focusing on a subgroup of patients with a DAS28 score of greater than 3.2. METHODS: Data from RA patients registered in the local part of Danish DANBIO Registry were collected in May 2015. Patients were....... Patients with central sensitization and psychological problems and those with false-positive diagnosis of RA are at high risk of overtreatment. This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where...
A Comparison of the Approaches of Generalizability Theory and Item Response Theory in Estimating the Reliability of Test Scores for Testlet-Composed Tests

Science.gov (United States)

Lee, Guemin; Park, In-Yong

2012-01-01

Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Reliability of ultrasound grading traditional score and new global OMERACT-EULAR score system (GLOESS): results from an inter- and intra-reading exercise by rheumatologists.

Science.gov (United States)

Ventura-Ríos, Lucio; Hernández-Díaz, Cristina; Ferrusquia-Toríz, Diana; Cruz-Arenas, Esteban; Rodríguez-Henríquez, Pedro; Alvarez Del Castillo, Ana Laura; Campaña-Parra, Alfredo; Canul, Efrén; Guerrero Yeo, Gerardo; Mendoza-Ruiz, Juan Jorge; Pérez Cristóbal, Mario; Sicsik, Sandra; Silva Luna, Karina

2017-12-01

This study aims to test the reliability of ultrasound to grade synovitis in static and video images, evaluating grayscale and power Doppler (PD) separately and combined. Thirteen trained rheumatologist ultrasonographers participated in two separate rounds reading 42 images, 15 static and 27 videos, of the 7-joint count [wrist, 2nd and 3rd metacarpophalangeal (MCP), 2nd and 3rd interphalangeal (IPP), 2nd and 5th metatarsophalangeal (MTP) joints]. The images were from six patients with rheumatoid arthritis, performed by one ultrasonographer. Synovitis definition was according to OMERACT. Scoring systems in grayscale, PD separately, and combined (GLOESS, Global OMERACT-EULAR Score System) were reviewed before the exercise. Intra- and inter-reading reliability was calculated with weighted Cohen's kappa and interpreted according to Landis and Koch. Kappa values for inter-reading were good to excellent. The lowest kappa was for GLOESS in static images, and the highest was for the same scoring in videos (k 0.59 and 0.85, respectively). Excellent values were obtained for static PD in 5th MTP joint and for PD video in 2nd MTP joint. Results for GLOESS in general were good to moderate. Poor agreement was observed in 3rd MCP and 3rd IPP in all kinds of images. Intra-reading agreement was greater in grayscale and GLOESS in static images than in videos (k 0.86 vs. 0.77 and k 0.86 vs. 0.71, respectively), but PD was greater in videos than in static images (k 1.0 vs. 0.79). The reliability of the synovitis scoring through static images and videos is in general good to moderate when using grayscale and PD separately or combined.
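For ordinal grades such as these synovitis scores, weighted Cohen's kappa gives partial credit for near misses. The sketch below is a minimal two-reader version; the abstract does not say whether linear or quadratic weights were used, so the weighting scheme is an assumption, and the function name and example grades are hypothetical.

```python
def weighted_kappa(reader1, reader2, categories, weight="linear"):
    """Weighted Cohen's kappa for two readers on an ordinal scale.

    categories must be listed in ascending order; weight is "linear" or
    "quadratic" (an assumption, the abstract does not name the scheme).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(reader1)
    # Joint proportions of (reader1 grade, reader2 grade).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(reader1, reader2):
        obs[index[a]][index[b]] += 1.0 / n
    p1 = [sum(obs[i]) for i in range(k)]
    p2 = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    observed = sum(disagreement(i, j) * obs[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * p1[i] * p2[j]
                   for i in range(k) for j in range(k))
    return 1.0 - observed / expected

# Hypothetical grades (0-3) from two readers on four images.
kappa_w = weighted_kappa([0, 1, 2, 3], [0, 1, 2, 2], categories=[0, 1, 2, 3])
print(round(kappa_w, 3))
```

With 13 readers, as in this exercise, pairwise kappas would typically be averaged or a multi-rater statistic used instead; the sketch covers only the two-reader case.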
A study of the reliability of the Nociception Coma Scale.

Science.gov (United States)

Riganello, F; Cortese, M D; Arcuri, F; Candelieri, A; Guglielmino, F; Dolce, G; Sannita, W G; Schnakers, C

2015-04-01

In this study, we investigated the reliability of the Nociception Coma Scale which has recently been developed to assess nociception in non-communicative, severely brain-injured patients. Prospective cross-sequential study. Semi-intensive care unit and long-term brain injury care. Forty-four patients diagnosed as being in a vegetative state (n=26) or in a minimally conscious state (n=18). Patients were assessed by two experts (rater A and rater B) on two consecutive weeks to measure inter-rater agreement and test-retest reliability. Total scores and subscores of the Nociception Coma Scale. We performed a total of 176 assessments. The inter-rater agreement was moderate for the total scores (k = 0.57) and fair to substantial for the subscores (0.33 ≤ k ≤ 0.62) on week 2. The test-retest reliability was substantial for the total scores (k = 0.66) and moderate to almost perfect for the subscores (0.53 ≤ k ≤ 0.96) for rater A. The inter-rater agreement was weaker on week 1, whereas the test-retest reliability was lower for the least experienced rater (rater B). This study provides further evidence of the psychometric qualities of the Nociception Coma Scale. Future studies should assess the impact of practical experience and background on administration and scoring of the scale. © The Author(s) 2014.
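The verbal labels used in this abstract (fair, moderate, substantial, almost perfect) follow the Landis and Koch benchmarks for kappa. A minimal lookup is sketched below; the function name is hypothetical, and the bands are the conventional ones rather than anything specific to this study.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the conventional Landis and Koch band."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Kappa values quoted in the abstract.
for value in (0.57, 0.33, 0.62, 0.66, 0.53, 0.96):
    print(value, landis_koch_label(value))
```

Applied to the quoted values, the lookup reproduces the wording of the abstract: 0.57 is moderate, 0.33 to 0.62 spans fair to substantial, 0.66 is substantial, and 0.53 to 0.96 spans moderate to almost perfect.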
Effect of rater training on reliability and accuracy of mini-CEX scores: a randomized, controlled trial.

Science.gov (United States)

Cook, David A; Dupras, Denise M; Beckman, Thomas J; Thomas, Kris G; Pankratz, V Shane

2009-01-01

Mini-CEX scores assess resident competence. Rater training might improve mini-CEX score interrater reliability, but evidence is lacking. Evaluate a rater training workshop using interrater reliability and accuracy. Randomized trial (immediate versus delayed workshop) and single-group pre/post study (randomized groups combined). Academic medical center. Fifty-two internal medicine clinic preceptors (31 randomized and 21 additional workshop attendees). The workshop included rater error training, performance dimension training, behavioral observation training, and frame of reference training using lecture, video, and facilitated discussion. Delayed group received no intervention until after posttest. Mini-CEX ratings at baseline (just before workshop for workshop group), and four weeks later using videotaped resident-patient encounters; mini-CEX ratings of live resident-patient encounters one year preceding and one year following the workshop; rater confidence using mini-CEX. Among 31 randomized participants, interrater reliabilities in the delayed group (baseline intraclass correlation coefficient [ICC] 0.43, follow-up 0.53) and workshop group (baseline 0.40, follow-up 0.43) were not significantly different (p = 0.19). Mean ratings were similar at baseline (delayed 4.9 [95% confidence interval 4.6-5.2], workshop 4.8 [4.5-5.1]) and follow-up (delayed 5.4 [5.0-5.7], workshop 5.3 [5.0-5.6]; p = 0.88 for interaction). For the entire cohort, rater confidence (1 = not confident, 6 = very confident) improved from mean (SD) 3.8 (1.4) to 4.4 (1.0), p = 0.018. Interrater reliability for ratings of live encounters (entire cohort) was higher after the workshop (ICC 0.34) than before (ICC 0.18) but the standard error of measurement was similar for both periods. Rater training did not improve interrater reliability or accuracy of mini-CEX scores. clinicaltrials.gov identifier NCT00667940
Using the Hemophilia Joint Health Score for assessment of children: Reliability of the Spanish version.

Science.gov (United States)

R, Cuesta-Barriuso; A, Torres-Ortuño; S, Pérez-Alenda; J, Carrasco Juan; F, Querol; J, Nieto-Munuera; Ja, López-Pina

2018-02-27

Numerous measuring instruments for the evaluation of hemophilic arthropathy have been developed. One of the most used systems is the Hemophilia Joint Health Score (HJHS) given its sensitivity to clinical changes appearing in the joints because of recurrent hemarthrosis. Assessing the interrater reliability, using the Spanish version of the HJHS (version 2.1) in children with hemophilia. Reliability study to assess the interrater reliability of the Spanish version of HJHS. A sample of 36 children aged 7-13 years diagnosed with hemophilia A or B was used. Two physiotherapists performed physical assessments with the Spanish version of the HJHS. Descriptive statistics (range, mean, standard deviation) and the analysis of interrater reliability were calculated. The interrater reliability was heterogeneous since the Kappa coefficient range (ĸ), although significant (p reliability of the Spanish population version of the HJHS is high. This scale should be used generically in evaluating musculoskeletal pediatric patients with hemophilia.
[Reliability and validity of the Chinese version on Comprehensive Scores for Financial Toxicity based on the patient-reported outcome measures].

Science.gov (United States)

Yu, H H; Bi, X; Liu, Y Y

2017-08-10

Objective: To evaluate the reliability and validity of the Chinese version of the Comprehensive Score for Financial Toxicity (COST), based on patient-reported outcome measures. Methods: A total of 118 cancer patients were interviewed face-to-face by well-trained investigators. Cronbach's α and the Pearson correlation coefficient were used to evaluate reliability. Content validity index (CVI) and exploratory factor analysis (EFA) were used to evaluate the content validity and construct validity, respectively. Results: Cronbach's α was 0.889 for the whole questionnaire, and test-retest coefficients ranged from 0.77 to 0.98. The scale-content validity index (S-CVI) was 0.82, with item-content validity indices (I-CVI) between 0.83 and 1.00. Two components were extracted by exploratory factor analysis, with a cumulative explained variance of 68.04% and loading > 0.60 on every item. Conclusion: The Chinese version of the COST scale showed high reliability and good validity and can thus be applied to assess the financial situation of cancer patients.
Validity and Reliability of Scores Obtained on Multiple-Choice Questions: Why Functioning Distractors Matter

Science.gov (United States)

Ali, Syed Haris; Carr, Patrick A.; Ruit, Kenneth G.

2016-01-01

Plausible distractors are important for accurate measurement of knowledge via multiple-choice questions (MCQs). This study demonstrates the impact of higher distractor functioning on validity and reliability of scores obtained on MCQs. Free-response (FR) and MCQ versions of a neurohistology practice exam were given to four cohorts of Year 1 medical…
Reliability and Validity of SERVQUAL Scores Used To Evaluate Perceptions of Library Service Quality.

Science.gov (United States)

Thompson, Bruce; Cook, Colleen

Research libraries are increasingly supplementing collection counts with perceptions of service quality as indices of status and productivity. The present study was undertaken to explore the reliability and validity of scores from the SERVQUAL measurement protocol (A. Parasuraman and others, 1991), which has previously been used in this type of…
Total Mini-Mental State Examination score and regional cerebral blood flow using Z score imaging and automated ROI analysis software in subjects with memory impairment

International Nuclear Information System (INIS)

Ikeda, Eiji; Shiozaki, Kazumasa; Takahashi, Nobukazu; Togo, Takashi; Odawara, Toshinari; Oka, Takashi; Inoue, Tomio; Hirayasu, Yoshio

2008-01-01

The Mini-Mental State Examination (MMSE) is considered a useful supplementary method to diagnose dementia and evaluate the severity of cognitive disturbance. However, the region of the cerebrum that correlates with the MMSE score is not clear. Recently, a new method was developed to analyze regional cerebral blood flow (rCBF) using a Z score imaging system (eZIS). This system shows changes of rCBF when compared with a normal database. In addition, a three-dimensional stereotaxic region of interest (ROI) template (3DSRT), fully automated ROI analysis software was developed. The objective of this study was to investigate the correlation between rCBF changes and total MMSE score using these new methods. The association between total MMSE score and rCBF changes was investigated in 24 patients (mean age±standard deviation (SD) 71.5±9.2 years; 6 men and 18 women) with memory impairment using eZIS and 3DSRT. Step-wise multiple regression analysis was used for multivariate analysis, with the total MMSE score as the dependent variable and rCBF change in 24 areas as the independent variable. Total MMSE score was significantly correlated only with the reduction of left hippocampal perfusion but not with right (P<0.01). Total MMSE score is an important indicator of left hippocampal function. (author)
Intrajudge and Interjudge Reliability of the Stuttering Severity Instrument-Fourth Edition.

Science.gov (United States)

Davidow, Jason H; Scott, Kathleen A

2017-11-08

The Stuttering Severity Instrument (SSI) is a tool used to measure the severity of stuttering. Previous versions of the instrument have known limitations (e.g., Lewis, 1995). The present study examined the intra- and interjudge reliability of the newest version, the Stuttering Severity Instrument-Fourth Edition (SSI-4) (Riley, 2009). Twelve judges who were trained on the SSI-4 protocol participated. Judges collected SSI-4 data while viewing 4 videos of adults who stutter at Time 1 and 4 weeks later at Time 2. Data were analyzed for intra- and interjudge reliability of the SSI-4 subscores (for Frequency, Duration, and Physical Concomitants), total score, and final severity rating. Intra- and interjudge reliability across the subscores and total score concurred with the manual's reported reliability when reliability was calculated using the methods described in the manual. New calculations of judge agreement produced different values from those in the manual-for the 3 subscores, total score, and final severity rating-and provided data absent from the manual. Clinicians and researchers who use the SSI-4 should carefully consider the limitations of the instrument. Investigation into the multitasking demands of the instrument may provide information on whether separating the collection of data for specific variables will improve intra- and interjudge reliability of those variables.
Preliminary testing of the reliability and feasibility of SAGE: a system to measure and score engagement with and use of research in health policies and programs.

Science.gov (United States)

Makkar, Steve R; Williamson, Anna; D'Este, Catherine; Redman, Sally

2017-12-19

Few measures of research use in health policymaking are available, and the reliability of such measures has yet to be evaluated. A new measure called the Staff Assessment of Engagement with Evidence (SAGE) incorporates an interview that explores policymakers' research use within discrete policy documents and a scoring tool that quantifies the extent of policymakers' research use based on the interview transcript and analysis of the policy document itself. We aimed to conduct a preliminary investigation of the usability, sensitivity, and reliability of the scoring tool in measuring research use by policymakers. Nine experts in health policy research and two independent coders were recruited. Each expert used the scoring tool to rate a random selection of 20 interview transcripts, and each independent coder rated 60 transcripts. The distribution of scores among experts was examined, and then interrater reliability was tested within and between the experts and independent coders. Average- and single-measure reliability coefficients were computed for each SAGE subscale. Experts' scores ranged from the limited to extensive scoring bracket for all subscales. Experts as a group also exhibited at least a fair level of interrater agreement across all subscales. Single-measure reliability was at least fair except for three subscales: Relevance Appraisal, Conceptual Use, and Instrumental Use. Average- and single-measure reliability among independent coders was good to excellent for all subscales. Finally, reliability between experts and independent coders was fair to excellent for all subscales. Among experts, the scoring tool was comprehensible, usable, and sensitive enough to discriminate between documents with varying degrees of research use. Secondly, the scoring tool yielded scores with good reliability among the independent coders. There was greater variability among experts, although as a group, the tool was fairly reliable. The alignment between experts' and independent
Sensitivity and Specificity of the Coma Recovery Scale--Revised Total Score in Detection of Conscious Awareness.

Science.gov (United States)

Bodien, Yelena G; Carlowicz, Cecilia A; Chatelle, Camille; Giacino, Joseph T

2016-03-01

To describe the sensitivity and specificity of Coma Recovery Scale-Revised (CRS-R) total scores in detecting conscious awareness. Data were retrospectively extracted from the medical records of patients enrolled in a specialized disorders of consciousness (DOC) program. Sensitivity and specificity analyses were completed using CRS-R-derived diagnoses of minimally conscious state (MCS) or emerged from minimally conscious state (EMCS) as the reference standard for conscious awareness and the total CRS-R score as the test criterion. A receiver operating characteristic curve was constructed to demonstrate the optimal CRS-R total cutoff score for maximizing sensitivity and specificity. Specialized DOC program. Patients enrolled in the DOC program (N=252, 157 men; mean age, 49y; mean time from injury, 48d; traumatic etiology, n=127; nontraumatic etiology, n=125; diagnosis of coma or vegetative state, n=70; diagnosis of MCS or EMCS, n=182). Not applicable. Sensitivity and specificity of CRS-R total scores in detecting conscious awareness. A CRS-R total score of 10 or higher yielded a sensitivity of .78 for correct identification of patients in MCS or EMCS, and a specificity of 1.00 for correct identification of patients who did not meet criteria for either of these diagnoses (ie, were diagnosed with vegetative state or coma). The area under the curve in the receiver operating characteristic curve analysis is .98. A total CRS-R score of 10 or higher provides strong evidence of conscious awareness but resulted in a false-negative diagnostic error in 22% of patients who demonstrated conscious awareness based on CRS-R diagnostic criteria. A cutoff score of 8 provides the best balance between sensitivity and specificity, accurately classifying 93% of cases. The optimal total score cutoff will vary depending on the user's objective. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
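The cutoff analysis described above can be illustrated with a minimal sketch. It assumes that a total score at or above the cutoff counts as a positive test and that the reference diagnosis is available as a boolean flag; the function name and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[int], conscious: Sequence[bool], cutoff: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when 'score >= cutoff' is the positive test."""
    pairs = list(zip(scores, conscious))
    tp = sum(1 for s, c in pairs if c and s >= cutoff)          # conscious, flagged
    fn = sum(1 for s, c in pairs if c and s < cutoff)           # conscious, missed
    tn = sum(1 for s, c in pairs if not c and s < cutoff)       # not conscious, not flagged
    fp = sum(1 for s, c in pairs if not c and s >= cutoff)      # not conscious, flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT the study's data): total scores and reference diagnoses.
toy_scores = [2, 4, 6, 7, 9, 8, 10, 12, 15, 20]
toy_conscious = [False, False, False, False, False, True, True, True, True, True]

print(sens_spec(toy_scores, toy_conscious, cutoff=10))  # (0.8, 1.0)
print(sens_spec(toy_scores, toy_conscious, cutoff=8))   # (1.0, 0.8)
```

Sweeping the cutoff over all possible total scores and plotting sensitivity against 1 - specificity gives the receiver operating characteristic curve. In the toy data, lowering the cutoff trades specificity for sensitivity, which is the kind of trade-off the abstract reports between cutoffs of 10 and 8.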
Examining Reliability and Validity of an Online Score (ALiEM AIR) for Rating Free Open Access Medical Education Resources.

Science.gov (United States)

Chan, Teresa Man-Yee; Grock, Andrew; Paddock, Michael; Kulasegaram, Kulamakan; Yarris, Lalena M; Lin, Michelle

2016-12-01

Since 2014, Academic Life in Emergency Medicine (ALiEM) has used the Approved Instructional Resources (AIR) score to critically appraise online content. The primary goals of this study are to determine the interrater reliability (IRR) of the ALiEM AIR rating score and determine its correlation with expert educator gestalt. We also determine the minimum number of educator-raters needed to achieve acceptable reliability. Eight educators each rated 83 online educational posts with the ALiEM AIR scale. Items include accuracy, usage of evidence-based medicine, referencing, utility, and the Best Evidence in Emergency Medicine rating score. A generalizability study was conducted to determine IRR and rating variance contributions of facets such as rater, blogs, posts, and topic. A randomized selection of 40 blog posts previously rated through ALiEM AIR was then rated again by a blinded group of expert medical educators according to their gestalt. Their gestalt impression was subsequently correlated with the ALiEM AIR score. The IRR for the ALiEM AIR rating scale was 0.81 during the 6-month pilot period. Decision studies showed that at least 9 raters were required to achieve this reliability. Spearman correlations between mean AIR score and the mean expert gestalt ratings were 0.40 for recommendation for learners and 0.35 for their colleagues. The ALiEM AIR scale is a moderately to highly reliable, 5-question tool when used by medical educators for rating online resources. The score displays a fair correlation with expert educator gestalt in regard to the quality of the resources. Copyright © 2016 American College of Emergency Physicians. Published by Elsevier Inc. All rights reserved.
The OMERACT Psoriatic Arthritis Magnetic Resonance Imaging Score (PsAMRIS) is reliable and sensitive to change: results from an OMERACT workshop

DEFF Research Database (Denmark)

Bøyesen, Pernille; McQueen, Fiona M; Gandjbakhch, Frédérique

2011-01-01

The aim of this multireader exercise was to assess the reliability and sensitivity to change of the psoriatic arthritis magnetic resonance imaging score (PsAMRIS) in PsA patients followed for 1 year.
Gait Deviation Index, Gait Profile Score and Gait Variable Score in children with spastic cerebral palsy: Intra-rater reliability and agreement across two repeated sessions.

Science.gov (United States)

Rasmussen, Helle Mätzke; Nielsen, Dennis Brandborg; Pedersen, Niels Wisbech; Overgaard, Søren; Holsgaard-Larsen, Anders

2015-07-01

The Gait Deviation Index (GDI) and Gait Profile Score (GPS) are the most used summary measures of gait in children with cerebral palsy (CP). However, the reliability and agreement of these indices have not been investigated, limiting their clinimetric quality for research and clinical practice. The aim of this study was to investigate the intra-rater reliability and agreement of summary measures of gait (GDI; GPS; and the Gait Variable Score (GVS) derived from the GPS). The intra-rater reliability and agreement were investigated across two repeated sessions in 18 children aged 5-12 years diagnosed with spastic CP. No systematic bias was observed between the sessions and no heteroscedasticity was observed in Bland-Altman plots. For the GDI and GPS, excellent reliability with intraclass correlation coefficient (ICC) values of 0.8-0.9 was found, while the GVS was found to have fair to good reliability with ICCs of 0.4-0.7. The agreement for the GDI and the logarithmically transformed GPS, in terms of the standard error of measurement as a percentage of the grand mean (SEM%) varied from 4.1 to 6.7%, whilst the smallest detectable change in percent (SDC%) ranged from 11.3 to 18.5%. For the logarithmically transformed GVS, we found a fair to large variation in SEM% from 7 to 29% and in SDC% from 18 to 81%. The GDI and GPS demonstrated excellent reliability and acceptable agreement proving that they can both be used in research and clinical practice. However, the observed large variability for some of the GVS requires cautious consideration when selecting outcome measures. Copyright © 2015 Elsevier B.V. All rights reserved.
Estimating Between-Person and Within-Person Subscore Reliability with Profile Analysis.

Science.gov (United States)

Bulut, Okan; Davison, Mark L; Rodriguez, Michael C

2017-01-01

Subscores are of increasing interest in educational and psychological testing due to their diagnostic function for evaluating examinees' strengths and weaknesses within particular domains of knowledge. Previous studies about the utility of subscores have mostly focused on the overall reliability of individual subscores and ignored the fact that subscores should be distinct and have added value over the total score. This study introduces a profile reliability approach that partitions the overall subscore reliability into within-person and between-person subscore reliability. The estimation of between-person reliability and within-person reliability coefficients is demonstrated using subscores from number-correct scoring, unidimensional and multidimensional item response theory scoring, and augmented scoring approaches via a simulation study and a real data study. The effects of various testing conditions, such as subtest length, correlations among subscores, and the number of subtests, are examined. Results indicate that there is a substantial trade-off between within-person and between-person reliability of subscores. Profile reliability coefficients can be useful in determining the extent to which subscores provide distinct and reliable information under various testing conditions.

Reliable Change Indices and Standardized Regression-Based Change Score Norms for Evaluating Neuropsychological Change in Children with Epilepsy

Science.gov (United States)

Busch, Robyn M.; Lineweaver, Tara T.; Ferguson, Lisa; Haut, Jennifer S.

2015-01-01

Reliable change index scores (RCIs) and standardized regression-based change score norms (SRBs) permit evaluation of meaningful changes in test scores following treatment interventions, like epilepsy surgery, while accounting for test-retest reliability, practice effects, score fluctuations due to error, and relevant clinical and demographic factors. Although these methods are frequently used to assess cognitive change after epilepsy surgery in adults, they have not been widely applied to examine cognitive change in children with epilepsy. The goal of the current study was to develop RCIs and SRBs for use in children with epilepsy. Sixty-three children with epilepsy (age range 6–16; M=10.19, SD=2.58) underwent comprehensive neuropsychological evaluations at two time points an average of 12 months apart. Practice adjusted RCIs and SRBs were calculated for all cognitive measures in the battery. Practice effects were quite variable across the neuropsychological measures, with the greatest differences observed among older children, particularly on the Children’s Memory Scale and Wisconsin Card Sorting Test. There was also notable variability in test-retest reliabilities across measures in the battery, with coefficients ranging from 0.14 to 0.92. RCIs and SRBs for use in assessing meaningful cognitive change in children following epilepsy surgery are provided for measures with reliability coefficients above 0.50. This is the first study to provide RCIs and SRBs for a comprehensive neuropsychological battery based on a large sample of children with epilepsy. Tables to aid in evaluating cognitive changes in children who have undergone epilepsy surgery are provided for clinical use. An excel sheet to perform all relevant calculations is also available to interested clinicians or researchers. PMID:26043163
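As a rough illustration of how a practice-adjusted RCI accounts for test-retest reliability and practice effects, the sketch below uses one common textbook formulation: the observed change minus the mean practice effect, divided by the standard error of the difference. The abstract does not state the exact formula the authors used, so the formula, the function names, the 1.645 threshold (a 90% interval), and the numbers are all assumptions made for illustration.

```python
import math


def practice_adjusted_rci(baseline: float, retest: float, mean_practice_effect: float,
                          sd_baseline: float, test_retest_r: float) -> float:
    """Practice-adjusted reliable change index (one common formulation, assumed here)."""
    sem = sd_baseline * math.sqrt(1.0 - test_retest_r)   # standard error of measurement
    se_diff = math.sqrt(2.0) * sem                        # standard error of the difference
    return ((retest - baseline) - mean_practice_effect) / se_diff


def is_reliable_change(rci: float, z_crit: float = 1.645) -> bool:
    """True if the change exceeds the chosen interval (threshold is an assumption)."""
    return abs(rci) > z_crit


# Hypothetical child: score drops from 100 to 92 on a test with SD 15, r = 0.84,
# and an average practice gain of 3 points in the reference sample.
rci = practice_adjusted_rci(100, 92, 3, 15, 0.84)
print(round(rci, 2), is_reliable_change(rci))
```

Because the standard error grows as reliability falls, measures with low test-retest coefficients need very large changes before any change counts as reliable, which is consistent with the study restricting its tables to measures with coefficients above 0.50.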
Coping strategies related to total stress score among post graduate medical students and residents

Directory of Open Access Journals (Sweden)

R. Irawati Ismail

2013-05-01

several dominant coping strategies related to total stress score levels. Methods: A cross-sectional study using purposive sampling was conducted among postgraduate medical students of the Faculty of Medicine, Universitas Indonesia, in April-July 2011. We used a coping strategies questionnaire and the WHO SRQ-20. Linear regression was used to identify dominant coping strategies related to stress levels. Results: This study had 272 subjects, aged 23-47 years. Four items decreased the total stress score (accepting the reality of the fact, talking to someone who could do something, seeking God’s help, and laughing about the situation). However, three factors increased the total stress score (taking one step at a time, talking to someone to find out more about the situation, and admitting an inability to deal with the situation). A one-point increase in accepting the reality of the situation reduced the total stress score by 0.493 points [regression coefficient (β) = -0.493; P=0.002], while a one-point increase in seeking God’s help reduced it by 0.307 points (β = -0.307; P=0.056). However, a one-point increase in taking one step at a time increased the total stress score by 0.540 points (β = 0.540; P=0.005). Conclusions: Accepting the reality of the situation, talking to someone who could do something, seeking God’s help, and laughing about the situation decreased the stress level. However, taking one step at a time, talking to someone to find out more about the situation, and admitting an inability to deal with the situation increased the total stress score. Key words: stress level, coping strategies, age, seeking God’s help
Effect of Clinically Discriminating, Evidence-Based Checklist Items on the Reliability of Scores from an Internal Medicine Residency OSCE

Science.gov (United States)

Daniels, Vijay J.; Bordage, Georges; Gierl, Mark J.; Yudkowsky, Rachel

2014-01-01

Objective structured clinical examinations (OSCEs) are used worldwide for summative examinations but often lack acceptable reliability. Research has shown that reliability of scores increases if OSCE checklists for medical students include only clinically relevant items. Also, checklists are often missing evidence-based items that high-achieving…
Quality Evaluation Scores are no more Reliable than Gestalt in Evaluating the Quality of Emergency Medicine Blogs: A METRIQ Study.

Science.gov (United States)

Thoma, Brent; Sebok-Syer, Stefanie S; Colmers-Gray, Isabelle; Sherbino, Jonathan; Ankel, Felix; Trueger, N Seth; Grock, Andrew; Siemens, Marshall; Paddock, Michael; Purdy, Eve; Kenneth Milne, William; Chan, Teresa M

2018-01-30

Construct: We investigated the quality of emergency medicine (EM) blogs as educational resources. Online medical education resources such as blogs are increasingly used by EM trainees and clinicians. However, quality evaluations of these resources using gestalt are unreliable. We investigated the reliability of two previously derived quality evaluation instruments for blogs. Sixty English-language EM websites that published clinically oriented blog posts between January 1 and February 24, 2016, were identified. A random number generator selected 10 websites, and the 2 most recent clinically oriented blog posts from each site were evaluated using gestalt, the Academic Life in Emergency Medicine (ALiEM) Approved Instructional Resources (AIR) score, and the Medical Education Translational Resources: Impact and Quality (METRIQ-8) score, by a sample of medical students, EM residents, and EM attendings. Each rater evaluated all 20 blog posts with gestalt and 15 of the 20 blog posts with the ALiEM AIR and METRIQ-8 scores. Pearson's correlations were calculated between the average scores for each metric. Single-measure intraclass correlation coefficients (ICCs) evaluated the reliability of each instrument. Our study included 121 medical students, 88 EM residents, and 100 EM attendings who completed ratings. The average gestalt rating of each blog post correlated strongly with the average scores for ALiEM AIR (r = .94) and METRIQ-8 (r = .91). Single-measure ICCs were fair for gestalt (0.37, IQR 0.25-0.56), ALiEM AIR (0.41, IQR 0.29-0.60) and METRIQ-8 (0.40, IQR 0.28-0.59). The average scores of each blog post correlated strongly with gestalt ratings. However, neither ALiEM AIR nor METRIQ-8 showed higher reliability than gestalt. Improved reliability may be possible through rater training and instrument refinement.
Reliability of sonographic assessment of tendinopathy in tennis elbow.

Science.gov (United States)

Poltawski, Leon; Ali, Syed; Jayaram, Vijay; Watson, Tim

2012-01-01

To assess the reliability and compute the minimum detectable change using sonographic scales to quantify the extent of pathology and hyperaemia in the common extensor tendon in people with tennis elbow. The lateral elbows of 19 people with tennis elbow were assessed sonographically twice, 1-2 weeks apart. Greyscale and power Doppler images were recorded for subsequent rating of abnormalities. Tendon thickening, hypoechogenicity, fibrillar disruption and calcification were each rated on four-point scales, and scores were summed to provide an overall rating of structural abnormality; hyperaemia was scored on a five point scale. Inter-rater reliability was established using the intraclass correlation coefficient (ICC) to compare scores assigned independently to the same set of images by a radiologist and a physiotherapist with training in musculoskeletal imaging. Test-retest reliability was assessed by comparing scores assigned by the physiotherapist to images recorded at the two sessions. The minimum detectable change (MDC) was calculated from the test-retest reliability data. ICC values for inter-rater reliability ranged from 0.35 (95% CI: 0.05, 0.60) for fibrillar disruption to 0.77 (0.55, 0.88) for overall greyscale score, and 0.89 (0.79, 0.95) for hyperaemia. Test-retest reliability ranged from 0.70 (0.48, 0.84) for tendon thickening to 0.82 (0.66, 0.90) for overall greyscale score and 0.86 (0.73, 0.93) for calcification. The MDC for the greyscale total score was 2.0/12 and for the hyperaemia score was 1.1/5. The sonographic scoring system used in this study may be used reliably to quantify tendon abnormalities and change over time. A relatively inexperienced imager can conduct the assessment and use the rating scales reliably.
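The abstract states that the MDC was derived from the test-retest reliability data but does not give the formula. A commonly used formulation, assumed here, takes the standard error of measurement as SD × sqrt(1 − ICC) and the 95% MDC as 1.96 × sqrt(2) × SEM. The function names and input values below are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM from the between-subject SD and a test-retest ICC (assumed formulation)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd: float, icc: float) -> float:
    """Minimum detectable change at the 95% confidence level (assumed formulation)."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs: a scale with SD 10 points and test-retest ICC 0.75.
print(round(mdc95(sd=10.0, icc=0.75), 2))
```

Under this formulation a change smaller than the MDC cannot be distinguished from measurement error with 95% confidence, so higher ICCs or less variable samples both shrink the threshold.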
HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species

OpenAIRE

López, Yosvany; Nakai, Kenta; Patil, Ashwini

2015-01-01

HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of p...
Validity and Reliability of the Abbreviated Barratt Impulsiveness Scale in Spanish (BIS-15S)*

Science.gov (United States)

Orozco-Cabal, Luis; Rodríguez, Maritza; Herin, David V.; Gempeler, Juanita; Uribe, Miguel

2010-01-01

Objective: This study determined the validity and reliability of a new, abbreviated version of the Spanish Barratt Impulsiveness Scale (BIS-15S) in Colombian subjects. Method: The BIS-15S was tested in non-clinical (n=283) and clinical (n=164) native Spanish-speakers. Intra-scale reliability was calculated using Cronbach’s α, and test-retest reliability was measured with Pearson correlations. Psychometric properties were determined using standard statistics. A factor analysis was performed to determine BIS-15S factor structure. Results: 447 subjects participated in the study. Clinical subjects were older and more educated compared to non-clinical subjects. Impulsivity scores were normally distributed in each group. BIS-15S total, motor, non-planning and attention scores were significantly lower in non-clinical vs. clinical subjects. Subjects with substance-related disorders had the highest BIS-15S total scores, followed by subjects with bipolar disorders and bulimia nervosa/binge eating. Internal consistency was 0.793 and test-retest reliability was 0.80. Factor analysis confirmed a three-factor structure (attention, motor, non-planning) accounting for 47.87% of the total variance in BIS-15S total scores. Conclusions: The BIS-15S is a valid and reliable self-report measure of impulsivity in this population. Further research is needed to determine additional components of impulsivity not investigated by this measure. PMID:21152412
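For readers unfamiliar with the internal-consistency statistic reported above, the following minimal sketch computes Cronbach's α from a respondents-by-items score matrix using the standard formula α = k/(k−1) × (1 − Σ item variances / variance of total scores). The function name and the toy responses are hypothetical and unrelated to the BIS-15S data.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent and one column per item."""
    k = len(item_scores[0])                                   # number of items
    columns = list(zip(*item_scores))                         # one tuple per item
    item_var_sum = sum(variance(col) for col in columns)      # sum of sample item variances
    total_var = variance([sum(row) for row in item_scores])   # variance of total scores
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


# Hypothetical responses from four people to a three-item scale.
toy_responses: List[List[int]] = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
print(round(cronbach_alpha(toy_responses), 3))
```

Alpha rises when items covary strongly relative to their individual variances, so it reflects both inter-item correlation and the number of items.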
SF-36 total score as a single measure of health-related quality of life: Scoping review.

Science.gov (United States)

Lins, Liliane; Carvalho, Fernando Martins

2016-01-01

According to the 36-Item Short Form Health Survey questionnaire developers, a global measure of health-related quality of life such as the "SF-36 Total/Global/Overall Score" cannot be generated from the questionnaire. However, studies keep on reporting such measure. This study aimed to evaluate the frequency and to describe some characteristics of articles reporting the SF-36 Total/Global/Overall Score in the scientific literature. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses method was adapted to a scoping review. We performed searches in PubMed, Web of Science, SCOPUS, BVS, and Cochrane Library databases for articles using such scores. We found 172 articles published between 1997 and 2015; 110 (64.0%) of them were published from 2010 onwards; 30.0% appeared in journals with Impact Factor 3.00 or greater. Overall, 129 (75.0%) out of the 172 studies did not specify the method for calculating the "SF-36 Total Score"; 13 studies did not specify their methods but referred to the SF-36 developers' studies or others; and 30 articles used different strategies for calculating such score, the most frequent being arithmetic averaging of the eight SF-36 domains scores. We concluded that the "SF-36 Total/Global/Overall Score" has been increasingly reported in the scientific literature. Researchers should be aware of this procedure and of its possible impacts upon human health.
Reliability of scoring arousals in normal children and children with obstructive sleep apnea syndrome.

Science.gov (United States)

Wong, Tat Kong; Galster, Patricia; Lau, Tai Shing; Lutz, Janita M; Marcus, Carole L

2004-09-15

Scoring of arousals in children is based on an extension of adult criteria, as defined by the American Sleep Disorders Association (ASDA). Under these criteria, a minimum duration of 3 seconds is required. A few recent studies utilized modified criteria for the study of children, with durations as short as 1 second. However, the validity and reliability of scoring these shorter arousals have never been verified. Based on studies in adults, we hypothesized that interscorer agreement for scoring arousals shorter than 3 seconds was poor. Retrospective review of polysomnograms by 2 experienced sleep practitioners who independently scored arousals according to the ASDA 3-second criteria and modified duration criteria of 1 and 2 seconds. Academic hospital. 20 polysomnographic studies from children aged 3 to 8 years with mild to severe obstructive sleep apnea syndrome, and 16 polysomnographic studies from normal children. None. The intraclass correlation coefficient for scoring ASDA arousals was 0.90 (95% confidence interval: 0.81-0.95), indicating excellent interscorer agreement. The intraclass correlation coefficients for scoring modified 1-second and 2-second arousals were 0.35 (95% confidence interval: 0.02-0.61) and 0.42 (95% confidence interval: 0.12-0.65), respectively, indicating poor to fair interscorer agreement. Furthermore, modified 1-second and 2-second arousals accounted for less than 15% of all arousals scored. We conclude that there is much poorer interscorer agreement for scoring arousals shorter than 3 seconds, when compared to the standard ASDA criteria. We propose that scoring of arousals in children should follow the standard ASDA criteria.
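The abstract does not say which form of the intraclass correlation coefficient was computed. As an illustration only, the sketch below implements the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1) in the Shrout and Fleiss notation, from a subjects-by-raters table. The function name and the toy ratings are hypothetical.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings has one row per subject and one column per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                     # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical arousal counts from two scorers on four recordings.
identical = [[1, 1], [2, 2], [3, 3], [4, 4]]   # scorers agree exactly
offset = [[1, 2], [2, 3], [3, 4], [4, 5]]      # second scorer always counts one more
print(icc_2_1(identical), round(icc_2_1(offset), 3))
```

In the toy data the second table ranks every recording identically yet scores below 1, because an absolute-agreement ICC penalizes a systematic difference between scorers; a consistency-type ICC would not.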
Examination of the reliability and validity of the Mindful Eating Questionnaire in pregnant women.

Science.gov (United States)

Apolzan, John W; Myers, Candice A; Cowley, Amanda D; Brady, Heather; Hsia, Daniel S; Stewart, Tiffany M; Redman, Leanne M; Martin, Corby K

2016-05-01

Mindfulness is theorized to affect the eating behavior and weight of pregnant women, yet no measure has been validated during pregnancy. This study qualitatively and quantitatively evaluated the reliability and validity of the Mindful Eating Questionnaire (MEQ) in overweight and obese pregnant women. Participants completed focus groups and cognitive interviews. The MEQ was administered twice to measure test-retest reliability. The Eating Inventory (EI) and Mindful Attention Awareness Scale (MAAS) were administered to assess convergent validity, and the Neighborhood Environment Walkability Scale (NEWS) assessed discriminant validity. Participants were 20 ± 8 weeks gestation (mean ± SD), 30 ± 2 years old, and 55% were obese. The MEQ total score had good test-retest reliability (r = .85). The total score internal consistency reliability was poor (Cronbach's α = .56). The external cues subscale (ECS) was not internally consistent (α = .31). Other subscales ranged from α = .59-.68. When the ECS was excluded, the MEQ total score internal consistency was acceptable (α = .62). Convergent validity was supported by the MEQ total score (with and without ECS) correlating significantly with the MAAS and the EI disinhibition and hunger subscales. Discriminant validity of the MEQ was supported by the MEQ and NEWS total scores and subscales not being significantly correlated. The quantitative results were supported by the qualitative context and content analysis. With the exception of the ECS, the MEQ's reliability and validity was supported in pregnant women, and most of the subscales were more robust in pregnant women than in the original sample of healthy adults. The MEQ's use with overweight and obese pregnant women is supported. Copyright © 2016 Elsevier Ltd. All rights reserved.
Interrater and Test-Retest Reliability and Minimal Detectable Change of the Balance Evaluation Systems Test (BESTest) and Subsystems With Community-Dwelling Older Adults.

Science.gov (United States)

Wang-Hsu, Elizabeth; Smith, Susan S

2017-01-10

Falls are a common cause of injuries and hospital admissions in older adults. Balance limitation is a potentially modifiable factor contributing to falls. The Balance Evaluation Systems Test (BESTest), a clinical balance measure, categorizes balance into 6 underlying subsystems. Each of the subsystems is scored individually and summed to obtain a total score. The reliability of the BESTest and its individual subsystems has been reported in patients with various neurological disorders and cancer survivors. However, the reliability and minimal detectable change (MDC) of the BESTest with community-dwelling older adults have not been reported. The purposes of our study were to (1) determine the interrater and test-retest reliability of the BESTest total and subsystem scores; and (2) estimate the MDC of the BESTest and its individual subsystem scores with community-dwelling older adults. We used a prospective cohort methodological design. Community-dwelling older adults (N = 70; aged 70-94 years; mean = 85.0 [5.5] years) were recruited from a senior independent living community. Trained testers (N = 3) administered the BESTest. All participants were tested with the BESTest by the same tester initially and then retested 7 to 14 days later. With 32 of the participants, a second tester concurrently scored the retest for interrater reliability. Testers were blinded to each other's scores. Intraclass correlation coefficients [ICC(2,1)] were used to determine the interrater and test-retest reliability. Test-retest reliability was also analyzed using method error and the associated coefficients of variation (CVME). MDC was calculated using standard error of measurement. Interrater reliability (N = 32) of the BESTest total score was ICC(2, 1) = 0.97 (95% confidence interval [CI], 0.94-0.99). The ICCs for the individual subsystem scores ranged from 0.85 to 0.94. Test-retest reliability (N = 70) of the BESTest total score was ICC(2,1) = 0.93 (95% CI, 0.89-0.96). ICCs for the
The reliability and validity of the Tokyo Autistic Behaviour Scale.

Science.gov (United States)

Kurita, H; Miyake, Y

1990-03-01

The Tokyo Autistic Behavior Scale (TABS) consisting of 39 items provisionally grouped in four areas--interpersonal-social relationship, language-communication, habit-mannerism and others--is an instrument used by a child's caretaker to rate the child's autistic behaviors on a 3-point scale. Test-retest reliability was satisfactory (i.e., an r for a total score was .94). Among six DSM-III diagnostic groups, infantile autism showed a significantly higher total TABS score than the other five groups, and a taxonomic validity coefficient was .54. An r between total scores of the TABS and the Childhood Autism Rating Scale--Tokyo Version was .59. The area scores showed a lower validity than the total score. The TABS appears to be a useful instrument to assess autistic behavior.
Scoring of the radiological picture of idiopathic interstitial pneumonia: a study to verify the reliability of the method

International Nuclear Information System (INIS)

Kocova, Eva; Vanasek, Jiri; Koblizek, Vladimir; Novosad, Jakub; Elias, Pavel; Bartos, Vladimir; Sterclova, Martina

2015-01-01

Idiopathic pulmonary fibrosis (IPF) is a clinical form of usual interstitial pneumonia (UIP). Computed chest tomography (CT) has a fundamental role in the multidisciplinary diagnostics. However, it has not been verified if and how the subjective opinion of radiologists or pneumologists can influence the assessment and overall diagnostic summary. The aim was to verify the reliability of the scoring system. Assessment of conformity of the radiological score of high-resolution CT (HRCT) of lungs in patients with IPF was performed by a group of radiologists and pneumologists. Personal data were blinded and the assessment was performed independently using the Dutka/Vasakova scoring system (modification of the Gay system). The final scores of the individual assessors were then evaluated by means of the paired Spearman’s correlation and analysis of the principal components. Two principal components cumulatively explaining 62% or 73% of the variability in the individual assessors' scores were extracted during the analysis. The groups did not differ in terms of either specialty or experience with the assessment of HRCT findings. According to our study, scoring of a radiological image using the Dutka/Vasakova system is a reliable method in the hands of experienced radiologists. Significant differences occur during the assessment performed by pneumologists, especially during the evaluation of the alveolar changes.
Validation and reliability of a Behcet's Syndrome Activity Scale in Korea.

Science.gov (United States)

Choi, Hyo Jin; Seo, Mi Ryoung; Ryu, Hee Jung; Baek, Han Joo

2016-01-01

We prepared a cross-cultural adaptation of the Behcet's Syndrome Activity Scale (BSAS) and evaluated its reliability and validity in Korea. Fifty patients with Behcet's disease (BD) who attended the Rheumatology Clinic of Gachon University Gil Medical Center were included in this study. The first BSAS questionnaire was administered at each clinic visit, and the second questionnaire was completed at home within 24 hours of the visit. A Behcet's Disease Current Activity Form (BDCAF) and a Behcet's Disease Quality of Life (BDQOL) form were also given to patients. The test-retest reliability was analyzed by intraclass correlation coefficients (ICC). To assess the validity, the total BSAS score was compared with the BDCAF score, the patient/physician global assessment, and the BDQOL by Spearman rank correlation. Twelve males and 38 females were enrolled. The mean age was 48.5 years and the mean disease duration was 6.7 years. Thirty-eight patients (76.0%) returned the questionnaire by mail. For the test-retest reliability, the two assessments were significantly correlated on all 10 items of the BSAS questionnaire (p < 0.05) and the total BSAS score (ICC, 0.925; p < 0.001). The total BSAS score was statistically correlated with the BDQOL, BDCAF, and patient/physician global assessment (p < 0.01). The Korean version of BSAS is a reliable and valid instrument to measure BD activity.
Total hip arthroplasty outcomes assessment using functional and radiographic scores to compare canine systems.

Science.gov (United States)

Iwata, D; Broun, H C; Black, A P; Preston, C A; Anderson, G I

2008-01-01

A retrospective multi-centre study was carried out in order to compare outcomes between cemented and uncemented total hip arthroplasties (THA). A quantitative orthopaedic outcome assessment scoring system was devised in order to relate functional outcome to a numerical score, to allow comparison between treatments and amongst centres. The system combined a radiographic score and a clinical score. Lower scores reflect better outcomes than higher scores. Consecutive cases of THA were included from two specialist practices between July 2002 and December 2005. The study included 46 THA patients (22 uncemented THA followed for 8.3 +/- 4.7 months and 24 cemented THA followed for 26.0 +/- 15.7 months) with a mean age of 4.4 +/- 3.3 years at surgery. Multi-variable linear and logistic regression analyses were performed with adjustments for age at surgery, surgeon, follow-up time, uni- versus bilateral disease, gender and body weight. The differences between treatment groups in terms of functional scores or total scores were not significant (p > 0.05). Radiographic scores were different between treatment groups. However, these scores were usually assessed within two months of surgery and proved unreliable predictors of functional outcome (p > 0.05). The findings reflect relatively short-term follow-up, especially for the uncemented group, and do not include clinician-derived measures, such as goniometry and thigh circumference. Longer-term follow-up for the radiographic assessments is essential. A prospective study including the clinician-derived outcomes needs to be performed in order to validate the outcome instrument in its modified form.
Reliability, Validity, and Optimal Cutoff Score of the Montreal Cognitive Assessment (Changsha Version) in Ischemic Cerebrovascular Disease Patients of Hunan Province, China

Science.gov (United States)

Tu, Qiu-yun; Jin, Hui; Ding, Bin-rong; Yang, Xia; Lei, Zeng-hui; Bai, Song; Zhang, Ying-dong; Tang, Xiang-qi

2013-01-01

Background/Aims The goal of this study was to examine the reliability and validity of the Changsha version of the Montreal Cognitive Assessment (MoCA-CS) in ischemic cerebrovascular disease patients of Hunan Province, China, and to explore the optimal cutoff score for detecting vascular cognitive impairment-no dementia (VCI-ND) and vascular dementia (VD). Methods Three hundred and thirty-eight ischemic cerebrovascular disease patients (131 with normal cognition, 111 with VCI-ND, and 96 with VD) and 132 healthy controls were recruited. All participants accepted examination by the MoCA-CS, Mini-Mental State Examination (MMSE), and other related scales. A detailed neuropsychological battery was used for making a final cognitive diagnosis. SPSS 16.0 statistical software was used for reliability, validity examination, and optimal cutoff score detection. Results Cronbach's α of the MoCA-CS was 0.884, and test-retest and interrater reliability of the MoCA-CS were 0.966 and 0.926, respectively. MoCA-CS scores were highly correlated with MMSE scores (r = 0.867) and simplified intelligence quotients (r = 0.822). The results indicate that 1 point should be added for subjects with less than 6 years of education, and that the optimal cutoff score for detecting VCI-ND is 26/27 (sensitivity 96.1%, specificity 75.6%), whereas the optimal cutoff score for detecting VD is 16/17 (sensitivity 92.7%, specificity 96.3%). Conclusion The MoCA-CS has good reliability and validity, and is a useful cognitive screening instrument for detecting VCI in the Chinese population. PMID:23637698
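The education adjustment and the two cutoffs reported above can be sketched as a small screening rule. This is only an illustration under stated assumptions: it assumes that "26/27" means an adjusted score of 26 or lower screens positive for VCI-ND, that "16/17" means 16 or lower screens positive for VD, and that the MoCA-CS shares the 30-point ceiling of the original MoCA. The function names are hypothetical, and a screening result is not a diagnosis.

```python
MAX_SCORE = 30  # assumption: MoCA-CS uses the same 30-point ceiling as the original MoCA


def adjusted_score(raw_score: int, education_years: float) -> int:
    """Add 1 point for fewer than 6 years of education, capped at MAX_SCORE."""
    bonus = 1 if education_years < 6 else 0
    return min(raw_score + bonus, MAX_SCORE)


def screen(raw_score: int, education_years: float) -> str:
    """Map an adjusted score onto the cutoffs reported in the abstract.

    Assumption: "26/27" is read as "<= 26 screens positive for VCI-ND"
    and "16/17" as "<= 16 screens positive for VD".
    """
    score = adjusted_score(raw_score, education_years)
    if score <= 16:
        return "VD range"
    if score <= 26:
        return "VCI-ND range"
    return "normal range"


if __name__ == "__main__":
    # Same raw score, different schooling: the adjustment moves the second case
    # across the 26/27 cutoff.
    print(screen(26, education_years=10))
    print(screen(26, education_years=5))
```

Because the reported specificity at the 26/27 cutoff is 75.6%, a rule like this would be expected to flag a sizeable share of cognitively normal people, which is why it is framed here as screening only.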
Reliability, Validity, and Optimal Cutoff Score of the Montreal Cognitive Assessment (Changsha Version) in Ischemic Cerebrovascular Disease Patients of Hunan Province, China

Directory of Open Access Journals (Sweden)

Qiu-yun Tu

2013-02-01

Background/Aims: The goal of this study was to examine the reliability and validity of the Changsha version of the Montreal Cognitive Assessment (MoCA-CS) in ischemic cerebrovascular disease patients of Hunan Province, China, and to explore the optimal cutoff score for detecting vascular cognitive impairment-no dementia (VCI-ND) and vascular dementia (VD). Methods: Three hundred and thirty-eight ischemic cerebrovascular disease patients (131 with normal cognition, 111 with VCI-ND, and 96 with VD) and 132 healthy controls were recruited. All participants accepted examination by the MoCA-CS, Mini-Mental State Examination (MMSE), and other related scales. A detailed neuropsychological battery was used for making a final cognitive diagnosis. SPSS 16.0 statistical software was used for reliability, validity examination, and optimal cutoff score detection. Results: Cronbach’s α of the MoCA-CS was 0.884, and test-retest and interrater reliability of the MoCA-CS were 0.966 and 0.926, respectively. MoCA-CS scores were highly correlated with MMSE scores (r = 0.867) and simplified intelligence quotients (r = 0.822). The results indicate that 1 point should be added for subjects with less than 6 years of education, and that the optimal cutoff score for detecting VCI-ND is 26/27 (sensitivity 96.1%, specificity 75.6%), whereas the optimal cutoff score for detecting VD is 16/17 (sensitivity 92.7%, specificity 96.3%). Conclusion: The MoCA-CS has good reliability and validity, and is a useful cognitive screening instrument for detecting VCI in the Chinese population.
Reliability and Validity of the Greek Migraine Disability Assessment (MIDAS) Questionnaire.

Science.gov (United States)

Oikonomidi, Theodora; Vikelis, Michail; Artemiadis, Artemios; Chrousos, George P; Darviri, Christina

2018-03-01

The Migraine Disability Assessment (MIDAS) Questionnaire is a reliable and valid instrument for migraine-related disability. Such a tool is needed to quantify migraine-related disability in the Greek population. This validation study aims to assess the test-retest reliability, internal consistency, item discriminant and convergent validity of the Greek translation of the MIDAS. Adults diagnosed with migraine completed the MIDAS Questionnaire on two occasions 3 weeks apart to assess reliability, and completed the RAND-36 to assess validity. Participants (n = 152) had a median MIDAS score of 24 and mostly severe disability (58% were grade IV). The test-retest reliability analysis (N = 59) revealed excellent reliability for the total score. Internal consistency was α = 0.71 for initial and α = 0.82 for retest completion. For item discriminant validity, the correlations between each question and the total score were significant, with high correlations for questions 2-5 (range 0.67 ≤ r ≤ 0.79; p … MIDAS score tended to have better wellbeing. Psychometric properties are comparable with those of other published validation studies of the MIDAS and the original. Findings on question 1 show that missing work/school days may be closely related with increased affect issues. The Greek version of the MIDAS Questionnaire has good reliability and validity. This study allowed for cross-cultural comparability of research findings.
Reliability of four experimental mechanical pain tests in children

DEFF Research Database (Denmark)

Søe, Ann-Britt Langager; Thomsen, Lise L; Tornoe, Birte

2013-01-01

In order to study pain in children, it is necessary to determine whether pain measurement tools used in adults are reliable measurements in children. The aim of this study was to explore the intrasession reliability of pressure pain thresholds (PPT) in healthy children. Furthermore, the aim was also to study the intersession reliability of the following four tests: (1) Total Tenderness Score; (2) PPT; (3) Visual Analog Scale score at suprapressure pain threshold; and (4) area under the curve (stimulus-response functions for pressure versus pain).
Measuring reliable change in cognition using the Edinburgh Cognitive and Behavioural ALS Screen (ECAS).

Science.gov (United States)

Crockford, Christopher; Newton, Judith; Lonergan, Katie; Madden, Caoifa; Mays, Iain; O'Sullivan, Meabhdh; Costello, Emmet; Pinto-Grau, Marta; Vajda, Alice; Heverin, Mark; Pender, Niall; Al-Chalabi, Ammar; Hardiman, Orla; Abrahams, Sharon

2018-02-01

Cognitive impairment affects approximately 50% of people with amyotrophic lateral sclerosis (ALS). Research has indicated that impairment may worsen with disease progression. The Edinburgh Cognitive and Behavioural ALS Screen (ECAS) was designed to measure neuropsychological functioning in ALS, with its alternate forms (ECAS-A, B, and C) allowing for serial assessment over time. The aim of the present study was to establish reliable change scores for the alternate forms of the ECAS, and to explore practice effects and test-retest reliability of the ECAS's alternate forms. Eighty healthy participants were recruited, with 57 completing two and 51 completing three assessments. Participants were administered alternate versions of the ECAS serially (A-B-C) at four-month intervals. Intra-class correlation analysis was employed to explore test-retest reliability, while analysis of variance was used to examine the presence of practice effects. Reliable change indices (RCI) and regression-based methods were utilized to establish change scores for the ECAS alternate forms. Test-retest reliability was excellent for ALS Specific, ALS Non-Specific, and ECAS Total scores of the combined ECAS A, B, and C (all > .90). No significant practice effects were observed over the three testing sessions. RCI and regression-based methods produced similar change scores. The alternate forms of the ECAS possess excellent test-retest reliability in a healthy control sample, with no significant practice effects. The use of conservative RCI scores is recommended. Therefore, a change of ≥8, ≥4, and ≥9 for ALS Specific, ALS Non-Specific, and ECAS Total score is required for reliable change.
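The abstract does not state which reliable change formula was used, so the following is a minimal sketch of one common formulation (the Jacobson-Truax reliable change index), not the authors' computation. The baseline standard deviation and test-retest reliability in the usage lines are hypothetical placeholders, and this simple form does not correct for practice effects (the study reports none were significant).

```python
import math


def standard_error_of_difference(sd_baseline: float, retest_reliability: float) -> float:
    """SE of the difference between two administrations of the same test.

    SEM = SD * sqrt(1 - r); SE_diff = sqrt(2) * SEM.
    """
    sem = sd_baseline * math.sqrt(1.0 - retest_reliability)
    return math.sqrt(2.0) * sem


def reliable_change_index(score_1: float, score_2: float,
                          sd_baseline: float, retest_reliability: float) -> float:
    """RCI = (retest score - baseline score) / SE_diff."""
    se_diff = standard_error_of_difference(sd_baseline, retest_reliability)
    return (score_2 - score_1) / se_diff


def minimum_reliable_change(sd_baseline: float, retest_reliability: float,
                            z: float = 1.96) -> float:
    """Smallest raw-score change whose |RCI| reaches the chosen z threshold."""
    return z * standard_error_of_difference(sd_baseline, retest_reliability)


if __name__ == "__main__":
    # Hypothetical inputs, not values from the study.
    sd, r = 10.0, 0.90
    print(round(minimum_reliable_change(sd, r), 2))
    print(round(reliable_change_index(100, 91, sd, r), 2))
```

With a formulation like this, higher test-retest reliability shrinks the standard error of the difference, so a smaller raw change counts as reliable.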

Analysis of the reliability and validity of the Turkish version of the intermittent and constant osteoarthritis pain questionnaire.

Science.gov (United States)

Erel, Suat; Şimşek, İbrahim Engin; Özkan, Hüseyin

2015-01-01

The aim of this study was to analyze the validity and reliability of the Turkish version (ICOAP-TR) of the intermittent and constant osteoarthritis pain (ICOAP) questionnaire in patients with knee osteoarthritis (OA). Thirty-eight volunteer patients diagnosed with knee OA answered the questionnaire twice with an interval of 2-4 days. The reliability of the measurement was assessed using Cronbach's alpha coefficient and intraclass correlation (ICC) for test-retest reliability. Criterion validity was tested against the Western Ontario and McMaster Universities Arthritis Index (WOMAC) pain score and visual analog scale (VAS) designed to assess the perceived discomfort rated by the patient. Test-retest reliability was found to be ICC=0.942 for total score, 0.902 for constant pain subscale, and 0.945 for intermittent pain subscale. Internal consistency was tested using Cronbach's alpha and was found to be 0.970 for total score, 0.948 for constant pain subscale, and 0.972 for intermittent pain subscale. For criterion validity, the correlation between the total score of ICOAP-TR and WOMAC pain subscale was r=0.779 (p<0.05), and correlation between total score of ICOAP-TR and VAS was r=0.570 (p<0.05). The ICOAP-TR is a reliable and valid instrument to be used with patients with knee OA.
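Internal consistency figures such as the Cronbach's alpha values above follow from the item variances and the variance of the total score. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of totals), on a small made-up response table; the data and function name are hypothetical and unrelated to the ICOAP-TR sample.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table of responses.

    responses: one row per respondent, each row holding k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Transpose so that each entry holds all respondents' scores for one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical 5 respondents x 3 items.
    table = [
        [3, 4, 3],
        [5, 5, 4],
        [1, 2, 1],
        [4, 4, 5],
        [2, 3, 2],
    ]
    print(round(cronbach_alpha(table), 3))
```

Alpha rises with the number of items as well as with inter-item correlation, so a high value on a long questionnaire does not by itself show that every item measures the same construct.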
The Score Reliability of Draw-a-Person Intellectual Ability Test (DAP: IQ) for Rural Malawi Students

Science.gov (United States)

Khasu, Denis S.; Williams, Thomas O., Jr.

2016-01-01

In this brief article, the reliability of scores for the Draw-A-Person Intellectual Ability Test for Children, Adolescents, and Adults (DAP: IQ; Reynolds & Hickman, 2004) was examined through several analyses with a sample of 147 children from rural Malawi, Africa using a Chichewa translation of instructions. Cronbach alpha coefficients for…
Interrater reliability of the Melbourne Assessment of Unilateral Upper Limb Function for children with hemiplegic cerebral palsy.

LENUS (Irish Health Repository)

Spirtos, Michelle

2012-02-01

OBJECTIVE: We examined the interrater reliability of the Melbourne Assessment of Unilateral Upper Limb Function. METHOD: Three occupational therapists independently scored 34 videotaped assessments of children with hemiplegic cerebral palsy aged 6 yr, 1 mo, to 14 yr, 5 mo. Intraclass correlation coefficients (ICCs) at a 95% confidence interval were calculated for total scores, category scores, and item scores. RESULTS: The correlation between raters' total scores was high (ICC = .961). The highest correlation for test components between raters was found for fluency (ICC = .902), followed by range of movement (ICC = .866), and the lowest correlation was found for quality of movement (ICC = .683). The ICCs for individual test item scores varied and ranged from .368 to .899. CONCLUSION: This study demonstrated high interrater reliability for total scores, with scoring of some individual components and items requiring further consideration from both a clinical and a research perspective.
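The abstract does not say which ICC model was used, so the following is only a sketch of how single-rater ICCs can be derived from a two-way ANOVA decomposition of a subjects-by-raters table. It returns both the absolute-agreement form, often written ICC(2,1), and the consistency form, often written ICC(3,1). The rating table is a made-up illustration, not data from the study.

```python
def icc_two_way(ratings):
    """Single-measure ICCs from a complete subjects x raters table.

    ratings: n rows (subjects), each with k scores (one per rater).
    Returns (absolute_agreement, consistency).
    """
    n, k = len(ratings), len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement penalises systematic differences between raters.
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    # Consistency ignores rater offsets and only asks whether rankings agree.
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return agreement, consistency


if __name__ == "__main__":
    # Hypothetical table: 6 subjects scored by 4 raters.
    table = [
        [9, 2, 5, 8],
        [6, 1, 3, 2],
        [8, 4, 6, 8],
        [7, 1, 2, 6],
        [10, 5, 6, 9],
        [6, 2, 4, 7],
    ]
    agreement, consistency = icc_two_way(table)
    print(round(agreement, 3), round(consistency, 3))
```

When raters differ by a constant offset, the consistency form stays high while the agreement form drops, which is one reason reports often state which form was computed.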
MRI quantitative assessment of brain maturation and prognosis in premature infants using total maturation score

International Nuclear Information System (INIS)

Qi Ying; Wang Xiaoming

2009-01-01

Objective: To quantitatively assess brain maturation and prognosis in premature infants on conventional MRI using total maturation score (TMS). Methods: Nineteen cases of sequelae of white matter damage (WMD group) and 21 cases of matched controls (control group) in premature infants confirmed by MRI examinations were included in the study. All cases underwent conventional MR imaging approximately during the perinatal period after birth. Brain development was quantitatively assessed using Childs AM's validated scoring system of TMS by two sophisticated radiology physicians. Interobserver agreement and reliability were evaluated by using intraclass correlation (ICC). Linear regression analysis between TMS and postmenstrual age (PMA) was made (Y: TMS, X: PMA). Independent-sample t test of the two groups' TMS was made. Results: Sixteen of 19 cases revealed MRI abnormalities. Lesions showing T1 and T2 shortening tended to occur in clusters or a linear pattern in the deep white matter of the centrum semiovale, periventricular white matter. Diffusion-weighted MR image (DWI) showed 3 cases with greater lesions and 4 cases with new lesions in corpus callosum. There was no abnormality in control group on MRI and DWI. The average numbers of TMS between the two observers were 7.13±2.27, 7.13±2.21. Interobserver agreement was found to be high (ICC=0.990, P 2 =0.6401,0.5156 respectively, P 0.05). Conclusion: Conventional MRI is able to quantify the brain maturation and prognosis of premature infants using TMS. (authors)
Reliable categorisation of visual scoring of coronary artery calcification on low-dose CT for lung cancer screening: validation with the standard Agatston score

Energy Technology Data Exchange (ETDEWEB)

Huang, Yi-Luan; Wu, Fu-Zong; Wang, Yen-Chi [Kaohsiung Veterans General Hospital, Department of Radiology, Kaohsiung 813 (China); National Yang Ming University, Faculty of Medicine, School of Medicine, Taipei (China); Ju, Yu-Jeng [National Taiwan University, Department of Psychology, Taipei (China); Mar, Guang-Yuan [Kaohsiung Veterans General Hospital, Division of Cardiology, Department of Medicine, Kaohsiung 813 (China); Chuo, Chiung-Chen [Kaohsiung Veterans General Hospital, Department of Radiology, Kaohsiung 813 (China); Lin, Huey-Shyan [Fooyin University, School of Nursing, Kaohsiung (China); Wu, Ming-Ting [Kaohsiung Veterans General Hospital, Department of Radiology, Kaohsiung 813 (China); National Yang Ming University, Faculty of Medicine, School of Medicine, Taipei (China); National Yang Ming University, Institute of Clinical Medicine, Taipei (China)

2013-05-15

To validate the reliability of the visual coronary artery calcification score (VCACS) on low-dose CT (LDCT) for concurrent screening of CAC and lung cancer. We enrolled 401 subjects receiving LDCT for lung cancer screening and ECG-gated CT for the Agatston score (AS). LDCT was reconstructed with 3- and 5-mm slice thickness (LDCT-3mm and LDCT-5mm respectively) for VCACS to obtain VCACS-3mm and VCACS-5mm respectively. After a training session comprising 32 cases, two observers performed four-scale VCACS (absent, mild, moderate, severe) of 369 data sets independently; the results were compared with four-scale AS (0, 1-100, 101-400, >400). CACs were present in 39.6 % (146/369) of subjects. The sensitivity of VCACS-3mm was higher than for VCACS-5mm (83.6 % versus 74.0 %). The median of AS of the 24 false-negative cases in VCACS-3mm was 2.3 (range 1.1-21.1). The false-negative rate for detecting AS ≥ 10 on LDCT-3mm was 1.9 %. VCACS-3mm had higher concordance with AS than VCACS-5mm (k = 0.813 versus k = 0.685). An extended test of VCACS-3mm for four junior observers showed high inter-observer reliability (intra-class correlation = 0.90) and good concordance with AS (k = 0.662-0.747). This study validated the reliability of VCACS on LDCT for lung cancer screening and showed that LDCT-3mm was more feasible than LDCT-5mm for CAD risk stratification. (orig.)
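The reported sensitivity for VCACS-3mm can be checked directly from the counts in the abstract: 146 subjects had CAC and 24 of them were missed. The sketch below also shows an unweighted Cohen's kappa computed from a square contingency table. Whether the study's k values are weighted or unweighted is not stated, so the kappa function and its example table are illustrative assumptions only.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of truly positive cases that the test detects."""
    return true_positives / (true_positives + false_negatives)


def cohen_kappa(table) -> float:
    """Unweighted Cohen's kappa from a square contingency table.

    table[i][j] counts cases placed in category i by one method
    and in category j by the other.
    """
    size = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Counts from the abstract: 146 subjects with CAC, 24 false negatives at 3 mm.
    print(f"{sensitivity(146 - 24, 24):.1%}")  # 83.6%, matching the reported value

    # Hypothetical 2 x 2 agreement table, for illustration only.
    print(round(cohen_kappa([[20, 5], [10, 15]]), 2))
```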
Examining the interrater reliability of the Hare Psychopathy Checklist-Revised across a large sample of trained raters.

Science.gov (United States)

Blais, Julie; Forth, Adelle E; Hare, Robert D

2017-06-01

The goal of the current study was to assess the interrater reliability of the Psychopathy Checklist-Revised (PCL-R) among a large sample of trained raters (N = 280). All raters completed PCL-R training at some point between 1989 and 2012 and subsequently provided complete coding for the same 6 practice cases. Overall, 3 major conclusions can be drawn from the results: (a) reliability of individual PCL-R items largely fell below any appropriate standards while the estimates for Total PCL-R scores and factor scores were good (but not excellent); (b) the cases representing individuals with high psychopathy scores showed better reliability than did the cases of individuals in the moderate to low PCL-R score range; and (c) there was a high degree of variability among raters; however, rater specific differences had no consistent effect on scoring the PCL-R. Therefore, despite low reliability estimates for individual items, Total scores and factor scores can be reliably scored among trained raters. We temper these conclusions by noting that scoring standardized videotaped case studies does not allow the rater to interact directly with the offender. Real-world PCL-R assessments typically involve a face-to-face interview and much more extensive collateral information. We offer recommendations for new web-based training procedures. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Cross-cultural adaptation and validation of the Japanese version of the new Knee Society Scoring System for osteoarthritic knee with total knee arthroplasty.

Science.gov (United States)

Hamamoto, Yosuke; Ito, Hiromu; Furu, Moritoshi; Ishikawa, Masahiro; Azukizawa, Masayuki; Kuriyama, Shinichi; Nakamura, Shinichiro; Matsuda, Shuichi

2015-09-01

The purposes of this study were to translate the new Knee Society Score (KSS) into Japanese and to evaluate the construct and content validity, test-retest reliability, and internal consistency of the Japanese version of the new KSS. The Japanese version of the KSS was developed according to cross-cultural guidelines by using the "translation-back translation" method to ensure content validity. KSS data were then obtained from patients who had undergone total knee arthroplasty (TKA). The psychometric properties evaluated were as follows: for feasibility, response rate, and floor and ceiling effects; for construct validity, internal consistency using Cronbach's alpha, and correlations with quality of life. Construct validity was evaluated by using Spearman's correlation coefficient to quantify the correlation between the KSS and the Japanese version of the Oxford 12-item Knee Score or Short Form 36 Health Survey (SF-36) questionnaires. The Japanese version of the KSS was sent to 93 consecutive osteoarthritic patients who underwent primary TKA in our institution. Fifty-five patients completed the questionnaires and were included in this study. Neither a floor nor ceiling effect was observed. The reliability proved excellent in the majority of domains, with intraclass correlation coefficients of 0.65-0.88. Internal consistency, assessed by Cronbach's alpha, was good to excellent for all domains (0.78-0.94). All of the four domains of the KSS correlated significantly with the Oxford 12-item Knee Score. The activity and satisfaction domains of the KSS correlated significantly with all and the majority of subscales of the SF-36, respectively, whereas symptoms and expectation domains showed significant correlations only with bodily pain and vitality subscales and with the physical function, bodily pain, and vitality subscales, respectively. The Japanese version of the new KSS is a valid, reliable, and responsive instrument to capture subjective aspects of the functional
Reliability of the Quality of Upper Extremity Skills Test for Children with Cerebral Palsy Aged 2 to 12 Years

Science.gov (United States)

Thorley, Megan; Lannin, Natasha; Cusick, Anne; Novak, Iona; Boyd, Roslyn

2012-01-01

Aim: To investigate reliability of the Quality of Upper Extremity Skills Test (QUEST) scores for children with cerebral palsy (CP) aged 2-12 years. Method: Thirty-one QUESTs from 24 children with CP were rated once by two raters and twice by one rater. Internal consistency of total scores, inter- and intra-rater reliability findings for total,…
Reliability of the Cooking Task in adults with acquired brain injury.

Science.gov (United States)

Poncet, Frédérique; Swaine, Bonnie; Taillefer, Chantal; Lamoureux, Julie; Pradat-Diehl, Pascale; Chevignard, Mathilde

2015-01-01

Acquired brain injury (ABI) often leads to deficits in executive functioning (EF) responsible for severe and long-standing disabilities in daily life activities. The Cooking Task is an ecological and valid test of EF involving multi-tasking in a real environment. Given its complex scoring system, it is important to establish the tool's reliability. The objective of the study was to examine the reliability of the Cooking Task (internal consistency, inter-rater and test-retest reliability). A total of 160 patients with ABI (113 men, mean age 37 years, SD = 14.3) were tested using the Cooking Task. For test-retest reliability, patients were assessed by the same rater on two occasions (mean interval 11 days) while two raters independently and simultaneously observed and scored patients' performances to estimate inter-rater reliability. Internal consistency was high for the global scale (Cronbach α = .74). Inter-rater reliability (n = 66) for total errors was also high (ICC = .93), however the test-retest reliability (n = 11) was poor (ICC = .36). In general the Cooking Task appears to be a reliable tool. The low test-retest results were expected given the importance of EF in the performance of novel tasks.
The use of the SF-36 questionnaire in adult survivors of childhood cancer: evaluation of data quality, score reliability, and scaling assumptions

Directory of Open Access Journals (Sweden)

Winter David L

2006-10-01

Abstract Background The SF-36 has been used in a number of previous studies that have investigated the health status of childhood cancer survivors, but it never has been evaluated regarding data quality, scaling assumptions, and reliability in this population. As health status among childhood cancer survivors is being increasingly investigated, it is important that the measurement instruments are reliable, validated and appropriate for use in this population. The aim of this paper was to determine whether the SF-36 questionnaire is a valid and reliable instrument in assessing self-perceived health status of adult survivors of childhood cancer. Methods We examined the SF-36 to see how it performed with respect to (1) data completeness, (2) distribution of the scale scores, (3) item-internal consistency, (4) item-discriminant validity, (5) internal consistency, and (6) scaling assumptions. For this investigation we used SF-36 data from a population-based study of 10,189 adult survivors of childhood cancer. Results Overall, missing values ranged per item from 0.5 to 2.9 percent. Ceiling effects were found to be highest in the role limitation-physical (76.7%) and role limitation-emotional (76.5%) scales. All correlations between items and their hypothesised scales exceeded the suggested standard of 0.40 for satisfactory item-consistency. Across all scales, the Cronbach's alpha coefficient of reliability was found to be higher than the suggested value of 0.70. Consistent across all cancer groups, the physical health related scale scores correlated strongly with the Physical Component Summary (PCS) scale scores and weakly with the Mental Component Summary (MCS) scale scores. Also, the mental health and role limitation-emotional scales correlated strongly with the MCS scale score and weakly with the PCS scale score. Moderate to strong correlations with both summary scores were found for the general health perception, energy/vitality, and social functioning
A reliable parameter to standardize the scoring of stem cell spheres.

Directory of Open Access Journals (Sweden)

Xiaochen Zhou

Sphere formation assay is widely used in selection and enrichment of normal stem cells or cancer stem cells (CSCs), also known as tumor initiating cells (TICs), based on their ability to grow in serum-free suspension culture for clonal proliferation. However, there is no standardized parameter to accurately score the spheres, which should be reflected by both the number and size of the spheres. Here we define a novel parameter, designated as Standardized Sphere Score (SSS), which is expressed by the total volume of selected spheres divided by the number of cells initially plated. SSS was validated in quantification of both tumor spheres from cancer cell lines and embryonic bodies (EB) from mouse embryonic stem cells with high sensitivity and reproducibility.
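The definition above (total volume of selected spheres divided by the number of cells initially plated) can be sketched as follows. The abstract does not say how volume is measured or how spheres are selected, so two assumptions are made here and should be read as hypothetical: each sphere is treated as an ideal sphere whose volume is derived from a measured diameter, and selection is a simple minimum-diameter threshold.

```python
import math


def sphere_volume(diameter: float) -> float:
    """Volume of an ideal sphere from its diameter (assumed geometry)."""
    return math.pi * diameter ** 3 / 6.0


def standardized_sphere_score(diameters, cells_plated: int,
                              min_diameter: float = 0.0) -> float:
    """Total volume of selected spheres divided by cells initially plated.

    min_diameter is a hypothetical selection rule; the abstract only says
    "selected spheres" without giving the criterion.
    """
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    selected = [d for d in diameters if d >= min_diameter]
    return sum(sphere_volume(d) for d in selected) / cells_plated


if __name__ == "__main__":
    # Hypothetical diameters in micrometres from one well seeded with 1000 cells.
    diameters_um = [55.0, 80.0, 120.0, 40.0]
    print(standardized_sphere_score(diameters_um, cells_plated=1000, min_diameter=50.0))
```

Because volume grows with the cube of the diameter, a score of this kind is dominated by the largest spheres, which is how it reflects size as well as number.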
An analysis of reliability and validity of the papilla index score of implant-supported single crowns of maxillary central incisors

DEFF Research Database (Denmark)

Peng, Min; Fei, Wei; Hosseini, Mandana

2012-01-01

Objectives: To test the reliability and validity of the papilla index scores of the implant-supported single crowns (ISSCs) of maxillary central incisors. Materials and Methods: Twenty-five patients with 25 ISSCs were included. Two prosthodontists evaluated the papilla index score (PIS) of three... fill percent (PP) was calculated. The validity of PIS was tested against the corresponding papilla fill percent (PP) by using the Spearman correlation analysis. Results: The intra-observer agreement was >70% in 4/5 and >50% in all observations, the pooled Cohen’s κ was 0.64 and 0.70 for two observers... inter-observer agreement. The PIS score demonstrated significant correlation to the corresponding PP value (rs=.567, p=.000). Conclusions: The feasibility, reliability and validity of the PIS made the parameter useful for quality control of the peri-implant soft tissue of ISSCs.
Reliability of the Balance Evaluation Systems Test (BESTest) and BESTest sections for adults with hemiparesis

Science.gov (United States)

Rodrigues, Letícia C.; Marques, Aline P.; Barros, Paula B.; Michaelsen, Stella M.

2014-01-01

BACKGROUND: The Balance Evaluation Systems Test (BESTest) was recently created to allow the development of treatments according to the specific balance system affected in each patient. The Brazilian version of the BESTest has not been specifically tested after stroke. OBJECTIVE: To evaluate the intra- and inter-rater reliability and concurrent and convergent validity of the total score of the BESTest and BESTest sections for adults with hemiparesis after stroke. METHOD: The study included 16 subjects (61.1±7.5 years) with chronic hemiparesis (54.5±43.5 months after stroke). The BESTest was administered by two raters in the same week and one of the raters repeated the test after a one-week interval. Intraclass correlation coefficient (ICC) was calculated to assess intra- and interrater reliability. Concurrent validity with the Berg Balance Scale (BBS) and convergent validity with the Activities-specific Balance Confidence scale (ABC-Brazil) were assessed using Pearson's correlation coefficient. RESULTS: Both the BESTest total score (ICC=0.98) and the BESTest sections (ICC between 0.85 and 0.96) have excellent intrarater reliability. Interrater reliability for the total score was excellent (ICC=0.93) and, for the sections, it ranged between 0.71 and 0.94. The correlation coefficient between the BESTest and the BBS and ABC-Brazil were 0.78 and 0.59, respectively. CONCLUSIONS: The Brazilian version of the BESTest demonstrated adequate reliability when measured by sections and could identify what balance system was affected in patients after stroke. Concurrent validity was excellent with the BBS total score and good to excellent with the sections. The total scores but not the sections present adequate convergent validity with the ABC-Brazil. However, other psychometric properties should be further investigated. PMID:25003281
Reliability and concurrent validity of the Dutch hip and knee replacement expectations surveys.

Science.gov (United States)

van den Akker-Scheek, Inge; van Raay, Jos J A M; Reininga, Inge H F; Bulstra, Sjoerd K; Zijlstra, Wiebren; Stevens, Martin

2010-10-19

Preoperative expectations of outcome of total hip and knee arthroplasty are important determinants of patients' satisfaction and functional outcome. Aims of the study were (1) to translate the Hospital for Special Surgery Hip Replacement Expectations Survey and Knee Replacement Expectations Survey into Dutch and (2) to study test-retest reliability and concurrent validity. Patients scheduled for total hip (N = 112) or knee replacement (N = 101) were sent the Dutch Expectations Surveys twice with a 2 week interval to determine test-retest reliability. To determine concurrent validity, the Expectation WOMAC was sent. The results for the Dutch Hip Replacement Expectations Survey revealed good test-retest reliability (ICC 0.87), no bias and good internal consistency (alpha 0.86) (N = 72). The correlation between the Hip Expectations Score and the Expectation WOMAC score was 0.59 (N = 86). The results for the Dutch Knee Replacement Expectations Survey revealed good test-retest reliability (ICC 0.79), no bias and good internal consistency (alpha 0.91) (N = 46). The correlation with the Expectation WOMAC score was 0.52 (N = 57). Both Dutch Expectations Surveys are reliable instruments to determine patients' expectations before total hip or knee arthroplasty. As for concurrent validity, the correlation between both surveys and the Expectation WOMAC was moderate confirming that the same construct was determined. However, patients scored systematically lower on the Expectation WOMAC compared to the Dutch Expectation Surveys. Research on patients' expectations before total hip and knee replacement has only been performed in a limited amount of countries. With the Dutch Expectations Surveys it is now possible to determine patients' expectations in another culture and healthcare setting.
Bearing Procurement Analysis Method by Total Cost of Ownership Analysis and Reliability Prediction

Science.gov (United States)

Trusaji, Wildan; Akbar, Muhammad; Sukoyo; Irianto, Dradjad

2018-03-01

In bearing procurement analysis, both price and reliability must be considered as decision criteria: price determines the direct (acquisition) cost, while bearing reliability determines indirect costs such as maintenance. Although the indirect cost is hard to identify and measure, it contributes heavily to the overall cost incurred, so the indirect cost of reliability must be considered in bearing procurement analysis. This paper explains a bearing evaluation method that uses total cost of ownership analysis to consider price and maintenance cost as decision criteria. Furthermore, since failure data are lacking when the bearing evaluation phase is conducted, a reliability prediction method is used to predict bearing reliability from its dynamic load rating parameter. With this method, a bearing with a higher price but higher reliability is preferable for long-term planning, whereas for short-term planning the cheaper bearing with lower reliability is preferable. This dependence on context can give rise to conflict between stakeholders. Thus, the planning horizon needs to be agreed upon by all stakeholders before making a procurement decision.
Validation and reliability of a Behcet’s Syndrome Activity Scale in Korea

Science.gov (United States)

Choi, Hyo Jin; Seo, Mi Ryoung; Ryu, Hee Jung; Baek, Han Joo

2016-01-01

Background/Aims: We prepared a cross-cultural adaptation of the Behcet’s Syndrome Activity Scale (BSAS) and evaluated its reliability and validity in Korea. Methods: Fifty patients with Behcet’s disease (BD) who attended the Rheumatology Clinic of Gachon University Gil Medical Center were included in this study. The first BSAS questionnaire was administered at each clinic visit, and the second questionnaire was completed at home within 24 hours of the visit. A Behcet’s Disease Current Activity Form (BDCAF) and a Behcet’s Disease Quality of Life (BDQOL) form were also given to patients. The test-retest reliability was analyzed by intraclass correlation coefficients (ICC). To assess the validity, the total BSAS score was compared with the BDCAF score, the patient/physician global assessment, and the BDQOL by Spearman rank correlation. Results: Twelve males and 38 females were enrolled. The mean age was 48.5 years and the mean disease duration was 6.7 years. Thirty-eight patients (76.0%) returned the questionnaire by mail. For the test-retest reliability, the two assessments were significantly correlated on all 10 items of the BSAS questionnaire (p < 0.05) and the total BSAS score (ICC, 0.925; p < 0.001). The total BSAS score was statistically correlated with the BDQOL, BDCAF, and patient/physician global assessment (p < 0.01). Conclusions: The Korean version of BSAS is a reliable and valid instrument to measure BD activity. PMID:26767871
Distribution of Total Depressive Symptoms Scores and Each Depressive Symptom Item in a Sample of Japanese Employees.

Science.gov (United States)

Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Yamada, Hiroshi; Miyake, Hirotsugu; Furukawa, Toshiaki A

2016-01-01

In a previous study, we reported that the distribution of total depressive symptoms scores according to the Center for Epidemiologic Studies Depression Scale (CES-D) in a general population is stable throughout middle adulthood and follows an exponential pattern except for at the lowest end of the symptom score. Furthermore, the individual distributions of 16 negative symptom items of the CES-D exhibit a common mathematical pattern. To confirm the reproducibility of these findings, we investigated the distribution of total depressive symptoms scores and 16 negative symptom items in a sample of Japanese employees. We analyzed 7624 employees aged 20-59 years who had participated in the Northern Japan Occupational Health Promotion Centers Collaboration Study for Mental Health. Depressive symptoms were assessed using the CES-D. The CES-D contains 20 items, each of which is scored in four grades: "rarely," "some," "much," and "most of the time." The descriptive statistics and frequency curves of the distributions were then compared according to age group. The distribution of total depressive symptoms scores appeared to be stable from 30-59 years. The right tail of the distribution for ages 30-59 years exhibited a linear pattern with a log-normal scale. The distributions of the 16 individual negative symptom items of the CES-D exhibited a common mathematical pattern which displayed different distributions with a boundary at "some." The distributions of the 16 negative symptom items from "some" to "most" followed a linear pattern with a log-normal scale. The distributions of the total depressive symptoms scores and individual negative symptom items in a Japanese occupational setting show the same patterns as those observed in a general population. These results show that the specific mathematical patterns of the distributions of total depressive symptoms scores and individual negative symptom items can be reproduced in an occupational population.
Evaluation of Fracture and Osteotomy Union in the Setting of Osteogenesis Imperfecta: Reliability of the Modified Radiographic Union Score for Tibial Fractures (RUST).

Science.gov (United States)

Franzone, Jeanne M; Finkelstein, Mark S; Rogers, Kenneth J; Kruse, Richard W

2017-09-08

Evaluation of the union of osteotomies and fractures in patients with osteogenesis imperfecta (OI) is a critical component of patient care. Studies of the OI patient population have so far used varied criteria to evaluate bony union. The radiographic union score for tibial fractures (RUST), which was subsequently revised to the modified RUST, is an objective standardized method of evaluating fracture healing. We sought to evaluate the reliability of the modified RUST in the setting of the tibias of patients with OI. Tibial radiographs of 30 patients with OI who had fractures or osteotomies were scored by 3 observers on 2 separate occasions. Each of the 4 cortices was given a score (1=no callus, 2=callus present, 3=bridging callus, and 4=remodeled, fracture not visible) and the modified RUST is the sum of these scores (range, 4 to 16). The interobserver and intraobserver reliabilities were evaluated using intraclass correlation coefficients (ICC) with 95% confidence intervals. The ICC representing the interobserver reliability for the first iteration of scores was 0.926 (0.864 to 0.962) and for the second series was 0.915 (0.845 to 0.957). The ICCs representing the intraobserver reliability for each of the 3 reviewers for the measurements in series 1 and 2 were 0.860 (0.707 to 0.934), 0.994 (0.986 to 0.997), and 0.974 (0.946 to 0.988). The modified RUST has excellent interobserver and intraobserver reliability in the setting of OI despite challenges related to the poor quality of the bone and its dysplastic nature. The application and routine use of the modified RUST in the OI population will help standardize our evaluation of osteotomy and fracture healing. Level III-retrospective study of nonconsecutive patients.
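The scoring rule described in the record above (four cortices, each scored 1 to 4, summed to a total between 4 and 16) can be sketched as follows. The names `CORTEX_GRADES` and `modified_rust` are hypothetical labels chosen for this illustration; only the grade definitions and the summation come from the abstract.

```python
# Grade definitions as listed in the abstract
CORTEX_GRADES = {
    1: "no callus",
    2: "callus present",
    3: "bridging callus",
    4: "remodeled, fracture not visible",
}


def modified_rust(cortex_scores):
    """Sum the four per-cortex grades into a modified RUST total (4 to 16)."""
    if len(cortex_scores) != 4:
        raise ValueError("expected one grade for each of the 4 cortices")
    for grade in cortex_scores:
        if grade not in CORTEX_GRADES:
            raise ValueError(f"invalid cortex grade: {grade!r}")
    return sum(cortex_scores)


# Example with made-up grades for one radiograph
total = modified_rust([2, 3, 3, 4])
```

Validating the input before summing keeps the result inside the 4 to 16 range stated in the abstract.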
An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

Science.gov (United States)

Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

2014-05-01

Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
"Reliability of the Norwegian version of the short physical performance battery in older people with and without dementia".

Science.gov (United States)

Olsen, Cecilie Fromholt; Bergland, Astrid

2017-06-09

The purpose of the study was to establish the test-retest reliability of the Norwegian version of the Short Physical Performance Battery (SPPB). This was a cross-sectional reliability study. A convenience sample of 61 older adults with a mean age of 88.4 (8.1) was tested by two different physiotherapists at two time points. The mean time interval between tests was 2.5 days. The Intraclass Correlation Coefficient model 3.1 (ICC, 3.1) with 95% confidence intervals as well as the weighted Kappa (K) were used as measures of relative reliability. The Standard Error of Measurement (SEM) and Minimal Detectable Change (MDC) were used to measure absolute reliability. The results were also analyzed for a subgroup of 24 older people with dementia. The ICC reflected high relative reliability for the SPPB summary score and the 4 m walk test (4mwt), both for the total sample (ICC = 0.92 and 0.91, respectively) and for the subgroup with dementia (ICC = 0.84 and 0.90, respectively). Furthermore, weighted Ks for the SPPB subscales were 0.64 for the chair stand, 0.80 for gait and 0.52 for balance for the total sample and almost identical for the subgroup with dementia. MDC-values at the 95% confidence intervals (MDC95) were calculated at 0.8 for the total score of SPPB and 0.39 m/s for the 4mwt in the total sample. For the subgroup with dementia MDC95 was 1.88 for the total score of SPPB and 0.28 m/s for 4mwt. The SPPB total score and the timed walking test showed overall high relative and absolute reliability for the total sample indicating that the Norwegian version of the SPPB is reliable when used by trained physiotherapists with older people. The reliability of the Norwegian SPPB in older people with dementia seems high, but due to a small sample size this needs further investigation.
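The abstract above reports SEM and MDC95 without stating how they were derived. A minimal sketch follows, assuming the widely used conventions SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. These formulas are an assumption here, not something the abstract confirms, and the function names and input values are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and an ICC (assumed convention)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% level (assumed convention)."""
    # sqrt(2) reflects measurement error entering both test and retest
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical inputs: a score SD of 10 points and an ICC of 0.75
sem = sem_from_icc(10.0, 0.75)
change = mdc95(sem)
```

Under these conventions, a lower ICC or a larger score spread raises the SEM and therefore the smallest change that can be distinguished from measurement noise.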

Spousal concordance and reliability of the 'Prudence Score' as a summary of diet and lifestyle.

Science.gov (United States)

Parekh, Sanjoti; King, David; Owen, Neville; Jamrozik, Konrad

2009-08-01

This paper describes a composite 'Prudence Score' summarising self-reported behavioural risk factors for non-communicable diseases. If proved robust, the 'Prudence score' might be used widely to encourage large numbers of individuals to adopt and maintain simple, healthy changes in their lifestyle. We calculated the 'Prudence Score' based on responses collected in late 2006 to a postal questionnaire sent to 225 adult patients aged 25 to 75 years identified from the records of two general medical practices in Brisbane, Australia. Participants completed the behavioural, dietary and lifestyle items in relation to their spouse as well as themselves. The spouse or partner of each addressee completed their own copy of the study questionnaire. Kappa scores for spousal concordance with probands' reports (n = 45 pairs) on diet-related items varied between 0.35 (for vegetable intake) to 0.77 (for usual type of milk consumed). Spousal concordance values for other behaviours were 0.67 (physical activity), 0.82 (alcohol intake) and 1.0 (smoking habits). Kappa scores for test-retest reliability (n = 53) varied between 0.47 (vegetable intake) and 0.98 (smoking habits). The veracity of self-reported data is a challenge for studies of behavioural change. Our results indicate moderate to substantial agreement from life partners regarding individuals' self-reports for most of the behavioural risk items included in the 'Prudence Score'. This increases confidence that key aspects of diet and lifestyle can be assessed by self-report. The 'Prudence Score' potentially has wide application as a simple and robust tool for health promotion programs.
Reliability of Scores Obtained from Self-, Peer-, and Teacher-Assessments on Teaching Materials Prepared by Teacher Candidates

Science.gov (United States)

Nalbantoglu Yilmaz, Funda

2017-01-01

This study aims to determine the reliability of scores obtained from self-, peer-, and teacher-assessments in terms of teaching materials prepared by teacher candidates. The study group of this research constitutes 56 teacher candidates. In the scope of research, teacher candidates were asked to develop teaching material related to their study.…
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Science.gov (United States)

Uno, Yota; Mizukami, Hitomi; Ando, Masahiko; Yukihiro, Ryoji; Iwasaki, Yoko; Ozaki, Norio

2014-01-01

The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
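The record above notes that a post-test probability can be calculated from the stratum-specific likelihood ratios. A minimal sketch of that calculation follows, using the standard odds form of Bayes' rule. The two likelihood ratios are taken from the abstract, while the 20% pre-test probability and the function name are assumptions chosen purely for illustration.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


PRE_TEST = 0.20  # hypothetical prevalence, not reported in the abstract

# Likelihood ratios reported for the two strata
p_low_stratum = post_test_probability(PRE_TEST, 13.8)   # BIQ <= 65
p_high_stratum = post_test_probability(PRE_TEST, 0.1)   # BIQ >= 76
```

Under the assumed pre-test probability, the large likelihood ratio raises the probability substantially and the small one lowers it well below the starting value, which matches the abstract's statement that intellectual disability could be ruled out or determined.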
Reliability assessment of AOSpine thoracolumbar spine injury classification system and Thoracolumbar Injury Classification and Severity Score (TLICS) for thoracolumbar spine injuries: results of a multicentre study.

Science.gov (United States)

Kaul, Rahul; Chhabra, Harvinder Singh; Vaccaro, Alexander R; Abel, Rainer; Tuli, Sagun; Shetty, Ajoy Prasad; Das, Kali Dutta; Mohapatra, Bibhudendu; Nanda, Ankur; Sangondimath, Gururaj M; Bansal, Murari Lal; Patel, Nishit

2017-05-01

The aim of this multicentre study was to determine whether the recently introduced AOSpine Classification and Injury Severity System has better interrater and intrarater reliability than the already existing Thoracolumbar Injury Classification and Severity Score (TLICS) for thoracolumbar spine injuries. Clinical and radiological data of 50 consecutive patients admitted at a single centre with a diagnosis of an acute traumatic thoracolumbar spine injury were distributed, in the form of a PowerPoint presentation, to eleven attending spine surgeons from six different institutions, who classified them according to both classifications. After a time span of 6 weeks, cases were randomly rearranged and sent again to the same surgeons for re-classification. Interobserver and intraobserver reliability for each component of TLICS and the new AOSpine classification were evaluated using Fleiss Kappa coefficient (k value) and Spearman rank order correlation. Moderate interrater and intrarater reliability was seen for grading fracture type and integrity of posterior ligamentous complex (Fracture type: k = 0.43 ± 0.01 and 0.59 ± 0.16, respectively, PLC: k = 0.47 ± 0.01 and 0.55 ± 0.15, respectively), and fair to moderate reliability (k = 0.29 ± 0.01 interobserver and 0.44 ± 0.10 intraobserver, respectively) for total score according to TLICS. Moderate interrater (k = 0.59 ± 0.01) and substantial intrarater reliability (k = 0.68 ± 0.13) was seen for grading fracture type regardless of subtype according to AOSpine classification. Near perfect interrater and intrarater agreement was seen concerning neurological status for both the classification systems. The recently proposed AOSpine classification has better reliability for identifying fracture morphology than the existing TLICS. Additional studies are clearly necessary concerning the application of these classification systems across multiple physicians at different level of training and trauma centers to evaluate not
Translation and validation of the new version of the Knee Society Score - The 2011 KS Score - into Brazilian Portuguese.

Science.gov (United States)

Silva, Adriana Lucia Pastore E; Croci, Alberto Tesconi; Gobbi, Riccardo Gomes; Hinckel, Betina Bremer; Pecora, José Ricardo; Demange, Marco Kawamura

2017-01-01

Translation, cultural adaptation, and validation of the new version of the Knee Society Score - The 2011 KS Score - into Brazilian Portuguese and verification of its measurement properties, reproducibility, and validity. In 2012, the new version of the Knee Society Score was developed and validated. This scale comprises four separate subscales: (a) objective knee score (seven items: 100 points); (b) patient satisfaction score (five items: 40 points); (c) patient expectations score (three items: 15 points); and (d) functional activity score (19 items: 100 points). A total of 90 patients aged 55-85 years were evaluated in a clinical cross-sectional study. The pre-operative translated version was applied to patients with TKA referral, and the post-operative translated version was applied to patients who underwent TKA. Each patient answered the same questionnaire twice and was evaluated by two experts in orthopedic knee surgery. Evaluations were performed pre-operatively and three, six, or 12 months post-operatively. The reliability of the questionnaire was evaluated using the intraclass correlation coefficient (ICC) between the two applications. Internal consistency was evaluated using Cronbach's alpha. The ICC found no difference between the means of the pre-operative, three-month, and six-month post-operative evaluations between sub-scale items. The Brazilian Portuguese version of The 2011 KS Score is a valid and reliable instrument for objective and subjective evaluation of the functionality of Brazilian patients who undergo TKA and revision TKA.
Boundary curves of individual items in the distribution of total depressive symptom scores approximate an exponential pattern in a general population.

Science.gov (United States)

Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Akutagawa, Maiko; Yamada, Hiroshi; Furukawa, Toshiaki A; Ono, Yutaka

2016-01-01

Previously, we proposed a model for ordinal scale scoring in which individual thresholds for each item constitute a distribution by each item. This led us to hypothesize that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores follow a common mathematical model, which is expressed as the product of the frequency of the total depressive symptom scores and the probability of the cumulative distribution function of each item threshold. To verify this hypothesis, we investigated the boundary curves of the distribution of total depressive symptom scores in a general population. Data collected from 21,040 subjects who had completed the Center for Epidemiologic Studies Depression Scale (CES-D) questionnaire as part of a national Japanese survey were analyzed. The CES-D consists of 20 items (16 negative items and four positive items). The boundary curves of adjacent item scores in the distribution of total depressive symptom scores for the 16 negative items were analyzed using log-normal scales and curve fitting. The boundary curves of adjacent item scores for a given symptom approximated a common linear pattern on a log-normal scale. Curve fitting showed that an exponential fit had a markedly higher coefficient of determination than either linear or quadratic fits. With negative affect items, the gap between the total score curve and boundary curve continuously increased with increasing total depressive symptom scores on a log-normal scale, whereas the boundary curves of positive affect items, which are not considered manifest variables of the latent trait, did not exhibit such increases in this gap. The results of the present study support the hypothesis that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores commonly follow the predicted mathematical model, which was verified to approximate an exponential mathematical pattern.
Boundary curves of individual items in the distribution of total depressive symptom scores approximate an exponential pattern in a general population

Directory of Open Access Journals (Sweden)

Shinichiro Tomitaka

2016-10-01

Background: Previously, we proposed a model for ordinal scale scoring in which individual thresholds for each item constitute a distribution by each item. This led us to hypothesize that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores follow a common mathematical model, which is expressed as the product of the frequency of the total depressive symptom scores and the probability of the cumulative distribution function of each item threshold. To verify this hypothesis, we investigated the boundary curves of the distribution of total depressive symptom scores in a general population. Methods: Data collected from 21,040 subjects who had completed the Center for Epidemiologic Studies Depression Scale (CES-D) questionnaire as part of a national Japanese survey were analyzed. The CES-D consists of 20 items (16 negative items and four positive items). The boundary curves of adjacent item scores in the distribution of total depressive symptom scores for the 16 negative items were analyzed using log-normal scales and curve fitting. Results: The boundary curves of adjacent item scores for a given symptom approximated a common linear pattern on a log-normal scale. Curve fitting showed that an exponential fit had a markedly higher coefficient of determination than either linear or quadratic fits. With negative affect items, the gap between the total score curve and boundary curve continuously increased with increasing total depressive symptom scores on a log-normal scale, whereas the boundary curves of positive affect items, which are not considered manifest variables of the latent trait, did not exhibit such increases in this gap. Discussion: The results of the present study support the hypothesis that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores commonly follow the predicted mathematical model, which was verified to approximate an
Effects of Differential Item Functioning on Examinees' Test Performance and Reliability of Test

Science.gov (United States)

Lee, Yi-Hsuan; Zhang, Jinming

2017-01-01

Simulations were conducted to examine the effect of differential item functioning (DIF) on measurement consequences such as total scores, item response theory (IRT) ability estimates, and test reliability in terms of the ratio of true-score variance to observed-score variance and the standard error of estimation for the IRT ability parameter. The…
Towards a contemporary, comprehensive scoring system for determining technical outcomes of hybrid percutaneous chronic total occlusion treatment: The RECHARGE score.

Science.gov (United States)

Maeremans, Joren; Spratt, James C; Knaapen, Paul; Walsh, Simon; Agostoni, Pierfrancesco; Wilson, William; Avran, Alexandre; Faurie, Benjamin; Bressollette, Erwan; Kayaert, Peter; Bagnall, Alan J; Smith, Dave; McEntegart, Margaret B; Smith, William H T; Kelly, Paul; Irving, John; Smith, Elliot J; Strange, Julian W; Dens, Jo

2018-02-01

This study sought to create a contemporary scoring tool to predict technical outcomes of chronic total occlusion (CTO) percutaneous coronary intervention (PCI) from patients treated by hybrid operators with differing experience levels. Current scoring systems need regular updating to cope with the positive evolutions regarding materials, techniques, and outcomes, while at the same time being applicable for a broad range of operators. Clinical and angiographic characteristics from 880 CTO-PCIs included in the REgistry of CrossBoss and Hybrid procedures in FrAnce, the NetheRlands, BelGium and UnitEd Kingdom (RECHARGE) were analyzed by using a derivation and validation set (2:1 ratio). Variables significantly associated with technical failure in the multivariable analysis were incorporated in the score. Subsequently, the discriminatory capacity was assessed and the validation set was used to compare with the J-CTO score and PROGRESS scores. Technical success in the derivation and validation sets was 83% and 85%, respectively. Multivariate analysis identified six parameters associated with technical failure: blunt stump (beta coefficient (b) = 1.014); calcification (b = 0.908); tortuosity ≥45° (b = 0.964); lesion length 20 mm (b = 0.556); diseased distal landing zone (b = 0.794), and previous bypass graft on CTO vessel (b = 0.833). Score variables remained significant after bootstrapping. The RECHARGE score showed better discriminatory capacity in both sets (area-under-the-curve (AUC) = 0.783 and 0.711), compared to the J-CTO (AUC = 0.676) and PROGRESS (AUC = 0.608) scores. The RECHARGE score is a novel, easy-to-use tool for assessing the risk for technical failure in hybrid CTO-PCI and has the potential to perform well for a broad community of operators. © 2017 Wiley Periodicals, Inc.
Reliability and validity of migraine disability assessment questionnaire-Thai version (Thai-MIDAS).

Science.gov (United States)

Seethong, Piman; Nimmannit, Akarin; Chaisewikul, Rungsan; Prayoonwiwat, Naraporn; Chotinaiwattarakul, Wattanachai

2013-02-01

To assess the validity and test-retest reliability of a Thai translation of the Migraine Disability Assessment (MIDAS) Questionnaire in Thai patients with migraine. Migraineurs from the Headache Clinic in Siriraj Hospital were recruited and asked to complete a 13-week diary and to answer the Thai-MIDAS once. Some participants were asked to complete a second Thai-MIDAS in the next 2 weeks for test-retest reliability. Ninety-three patients completed the 13-week diaries. The age range was 18-58 years, with a mean of 37.69 ± 9.60 years. All 5 items and the total score of Thai-MIDAS were moderately correlated with data from the 13-week diary (Spearman's correlation coefficient = 0.32-0.62). The test-retest reliability of the total score of Thai-MIDAS in 30 patients demonstrated a highly reliable degree of intraclass correlation (ICC = 0.76, 95% CI 0.49-0.88). The present study reveals that the Thai-MIDAS has satisfactory validity and reliability in comparison with the original English MIDAS version.
Malingering in Toxic Exposure. Classification Accuracy of Reliable Digit Span and WAIS-III Digit Span Scaled Scores

Science.gov (United States)

Greve, Kevin W.; Springer, Steven; Bianchini, Kevin J.; Black, F. William; Heinly, Matthew T.; Love, Jeffrey M.; Swift, Douglas A.; Ciota, Megan A.

2007-01-01

This study examined the sensitivity and false-positive error rate of reliable digit span (RDS) and the WAIS-III Digit Span (DS) scaled score in persons alleging toxic exposure and determined whether error rates differed from published rates in traumatic brain injury (TBI) and chronic pain (CP). Data were obtained from the files of 123 persons…
Performance of a novel clinical score, the Pediatric Asthma Severity Score (PASS), in the evaluation of acute asthma.

Science.gov (United States)

Gorelick, Marc H; Stevens, Molly W; Schultz, Theresa R; Scribano, Philip V

2004-01-01

To evaluate the reliability, validity, and responsiveness of a new clinical asthma score, the Pediatric Asthma Severity Score (PASS), in children aged 1 through 18 years in an acute clinical setting. This was a prospective cohort study of children treated for acute asthma at two urban pediatric emergency departments (EDs). A total of 852 patients were enrolled at one site and 369 at the second site. Clinical findings were assessed at the start of the ED visit, after one hour of treatment, and at the time of disposition. Peak expiratory flow rate (PEFR) (for patients aged 6 years and older) and pulse oximetry were also measured. Composite scores including three, four, or five clinical findings were evaluated, and the three-item score (wheezing, prolonged expiration, and work of breathing) was selected as the PASS. Interobserver reliability for the PASS was good to excellent (kappa = 0.72 to 0.83). There was a significant correlation between PASS and PEFR (r = 0.27 to 0.37) and pulse oximetry (r = 0.29 to 0.41) at various time points. The PASS was able to discriminate between those patients who did and did not require hospitalization, with area under the receiver operating characteristic curve of 0.82. Finally, the PASS was shown to be responsive, with a 48% relative increase in score from start to end of treatment and an overall effect size of 0.62, indicating a moderate to large effect. This clinical score, the PASS, based on three clinical findings, is a reliable and valid measure of asthma severity in children and shows both discriminative and responsive properties. The PASS may be a useful tool to assess acute asthma severity for clinical and research purposes.
Reliability and validity of the Physical Activity Scale for the Elderly (PASE) in patients with hip osteoarthritis

Directory of Open Access Journals (Sweden)

Svege Ida

2012-02-01

Background: Physical activity (PA) is beneficial in reducing pain and improving function in lower limb osteoarthritis (OA), and is recommended as a first line treatment. Self-administered questionnaires are used to assess PA, but knowledge about the reliability and validity of these PA questionnaires is limited, in particular for patients with OA. The purpose of this study was to evaluate the reliability and validity of the Physical Activity Scale for the Elderly (PASE) in patients with hip OA. Methods: Forty patients with hip OA (20 men and 20 women, mean age 61.3 ± 10 years) were included. For test-retest reliability PASE was administered twice with a mean time between tests of 9 ± 4 days. Intraclass correlation coefficient (ICC), standard error of measurement (SEM) and minimal detectable change (MDC) were calculated for the total score and for the particular items assessing different PA intensity levels. In addition a Bland-Altman analysis for the total PASE score was performed. Construct validity was evaluated by comparing the PASE results with the Actigraph GT1M accelerometer and the International Physical Activity Questionnaire (IPAQ), using the Spearman rank correlation coefficient. Results: ICC for the total PASE score was 0.78, with relatively large error of measurement; SEM = 31 and MDC = 87. ICC for the intensity items was 0.20 for moderate PA intensity, 0.46 for light PA intensity and 0.68 for vigorous PA intensity. The Spearman rank correlation coefficient between the Actigraph GT1M total counts per minute and the total PASE score was 0.30 (p = 0.089), and ranged from 0.20-0.38 for the different PA intensity categories. The Spearman rank correlation between IPAQ and PASE was 0.61 (p = 0.001) for the total scores. Conclusions: In patients with hip OA the test-retest reliability of the total PASE score was moderate, with acceptable ICC, but with large measurement errors. The construct validity of the PASE was poor when compared to the
Reliability and validity of the new Tanaka B Intelligence Scale scores: a group intelligence test.

Directory of Open Access Journals (Sweden)

Yota Uno

Full Text Available OBJECTIVE: The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. METHODS: The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurrent validity was assessed using the one-way analysis of variance intraclass correlation coefficient. Moreover, receiver operating characteristic analysis for screening for individuals who have a deficit in intellectual function (an FIQ<70) was performed. In addition, stratum-specific likelihood ratios for detection of intellectual disability were calculated. RESULTS: The Cronbach's alpha for the new Tanaka B Intelligence Scale IQ (BIQ) was 0.86, and the intraclass correlation coefficient with FIQ was 0.83. Receiver operating characteristic analysis demonstrated an area under the curve of 0.89 (95% CI: 0.85-0.96). In addition, the stratum-specific likelihood ratio for the BIQ≤65 stratum was 13.8 (95% CI: 3.9-48.9), and the stratum-specific likelihood ratio for the BIQ≥76 stratum was 0.1 (95% CI: 0.03-0.4). Thus, intellectual disability could be ruled out or determined. CONCLUSION: The present results demonstrated that the new Tanaka B Intelligence Scale score had high reliability and concurrent validity with the Wechsler Intelligence Scale for Children-Third Edition score. Moreover, the post-test probability for the BIQ could be calculated when screening for individuals who have a deficit in intellectual function. The new Tanaka B Intelligence Test is convenient and can be administered within a variety of settings. This enables evaluation of intellectual development even in settings where performing intelligence tests has previously been difficult.
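The conclusion above notes that a post-test probability can be calculated from the stratum-specific likelihood ratios. A minimal sketch of that arithmetic, using Bayes' rule in odds form, might look as follows. The function name and the pre-test probability of 0.20 are assumptions chosen for illustration; only the two likelihood ratios come from the abstract.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability into a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Stratum-specific likelihood ratios reported in the abstract.
LR_BIQ_AT_MOST_65 = 13.8
LR_BIQ_AT_LEAST_76 = 0.1

# Hypothetical pre-test probability (an assumption, not a study figure).
pre_test = 0.20

print(round(post_test_probability(pre_test, LR_BIQ_AT_MOST_65), 3))
print(round(post_test_probability(pre_test, LR_BIQ_AT_LEAST_76), 3))
```

Under this assumed pre-test probability, a likelihood ratio well above 1 pushes the probability up sharply and one well below 1 pushes it toward zero, which is the sense in which the abstract says intellectual disability could be "ruled out or determined".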
The validity and reliability of the Moroccan version of the Revised Fibromyalgia Impact Questionnaire.

Science.gov (United States)

Srifi, Najlaa; Bahiri, Rachid; Rostom, Samira; Bendeddouche, Imad; Lazrek, Noufissa; Hajjaj-Hassouni, Najia

2013-01-01

The Revised Fibromyalgia Impact Questionnaire (FIQ-R) is an updated version that attempts to address the limitations of the Fibromyalgia Impact Questionnaire (FIQ). As there is no Moroccan version of the FIQ-R available, we aimed to investigate the validity and reliability of a Moroccan translation of the FIQR in Moroccan fibromyalgia (FM) patients. After translating the FIQR into Moroccan, it was administered to 80 patients with FM. All of the patients filled out the questionnaire together with the Arabic version of the Short Form-36 (SF-36). The tender-point count was calculated from tender points identified by thumb palpation. Three days later, FM patients filled out the Moroccan FIQR at their second visit. The test-retest reliability of the Moroccan FIQR questions ranged from 0.72 to 0.87. The test-retest reliability of the total FIQR score was 0.84. Cronbach's alpha was 0.91 for FIQR visit 1 (the first assessment) and 0.92 for FIQR visit 2 (the second assessment), indicating acceptable levels of internal consistency for both assessments. Significant correlations for construct validity were obtained between the Moroccan FIQ-R total and domain scores and the subscales of the SF-36 (FIQR total versus SF-36 physical component score and mental component score were r = -0.69, P [...] FIQ-R showed adequate reliability and validity. This instrument can be used in the clinical evaluation of Moroccan and Arabic-speaking patients with FM.
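Several records in this listing report Cronbach's alpha as the measure of internal consistency. As a rough illustration of what that statistic computes, the sketch below applies the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response matrix are hypothetical and are not data from any of the studies.

```python
from statistics import variance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha; item_scores[r][i] is respondent r's score on item i."""
    n_respondents = len(item_scores)
    n_items = len(item_scores[0])
    if n_respondents < 2 or n_items < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(n_items)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items (illustrative only).
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
    [3, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
```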
The ERICE-score: the new native cardiovascular score for the low-risk and aged Mediterranean population of Spain.

Science.gov (United States)

Gabriel, Rafael; Brotons, Carlos; Tormo, M José; Segura, Antonio; Rigo, Fernando; Elosua, Roberto; Carbayo, Julio A; Gavrila, Diana; Moral, Irene; Tuomilehto, Jaakko; Muñiz, Javier

2015-03-01

In Spain, data based on large population-based cohorts adequate to provide an accurate prediction of cardiovascular risk have been scarce. Thus, calibration of the EuroSCORE and Framingham scores has been proposed and done for our population. The aim was to develop a native risk prediction score to accurately estimate the individual cardiovascular risk in the Spanish population. Seven Spanish population-based cohorts including middle-aged and elderly participants were assembled. There were 11800 people (6387 women) representing 107915 person-years of follow-up. A total of 1214 cardiovascular events were identified, of which 633 were fatal. Cox regression analyses were conducted to examine the contributions of the different variables to the 10-year total cardiovascular risk. Age was the strongest cardiovascular risk factor. High systolic blood pressure, diabetes mellitus and smoking were strong predictive factors. The contribution of serum total cholesterol was small. Antihypertensive treatment also had a significant impact on cardiovascular risk, greater in men than in women. The model showed a good discriminative power (C-statistic=0.789 in men and C=0.816 in women). Ten-year risk estimations are displayed graphically in risk charts separately for men and women. The ERICE is a new native cardiovascular risk score for the Spanish population derived from the background and contemporaneous risk of several Spanish cohorts. The ERICE score offers the direct and reliable estimation of total cardiovascular risk, taking into consideration the effect of diabetes mellitus and cardiovascular risk factor management. The ERICE score is a practical and useful tool for clinicians to estimate the total individual cardiovascular risk in Spain. Copyright © 2014 Sociedad Española de Cardiología. Published by Elsevier España, S.L.U. All rights reserved.
Reliability, validity, and minimal detectable change of the push-off test scores in assessing upper extremity weight-bearing ability.

Science.gov (United States)

Mehta, Saurabh P; George, Hannah R; Goering, Christian A; Shafer, Danielle R; Koester, Alan; Novotny, Steven

2017-11-01

Clinical measurement study. The push-off test (POT) was recently conceived and found to be reliable and valid for assessing weight bearing through an injured wrist or elbow. However, further research with a larger sample can lend credence to the preliminary findings supporting the use of the POT. This study examined the interrater reliability, construct validity, and measurement error for the POT in patients with wrist conditions. Participants with musculoskeletal (MSK) wrist conditions were recruited. Performance on the POT, grip strength, and isometric strength of the wrist extensors were assessed. The shortened version of the Disabilities of the Arm, Shoulder and Hand and numeric pain rating scale were completed. The intraclass correlation coefficient assessed interrater reliability of the POT. Pearson correlation coefficients (r) examined the concurrent relationships between the POT and other measures. The standard error of measurement and the minimal detectable change at the 90% confidence level were assessed as measurement error and index of true change for the POT. A total of 50 participants with different elbow or wrist conditions (age: 48.1 ± 16.6 years) were included in this study. The results of this study strongly supported the interrater reliability (intraclass correlation coefficient: 0.96 and 0.93 for the affected and unaffected sides, respectively) of the POT in patients with wrist MSK conditions. The POT showed convergent relationships with the grip strength on the injured side (r = 0.89) and the wrist extensor strength (r = 0.7). The POT showed a small standard error of measurement (1.9 kg). The minimal detectable change at the 90% confidence level for the POT was 4.4 kg for the sample. This study provides additional evidence to support the reliability and validity of the POT. This is the first study that provides the values for the measurement error and true change on the POT scores in patients with wrist MSK conditions. Further research should examine the
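The relationship between the standard error of measurement (SEM) and the minimal detectable change (MDC) reported above can be sketched with the commonly used formulas SEM = SD × sqrt(1 − ICC) and MDC = z × sqrt(2) × SEM. The abstract does not state which formulas the authors used, so this is an assumption, and the function names are hypothetical. Applied to the reported SEM of 1.9 kg at 90% confidence, the sketch gives a value consistent with the reported 4.4 kg.

```python
import math
from statistics import NormalDist


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem: float, confidence: float = 0.90) -> float:
    """MDC at a two-sided confidence level: z * sqrt(2) * SEM."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.645 for 90%
    return z * math.sqrt(2.0) * sem


# SEM reported in the abstract (kg).
print(round(minimal_detectable_change(1.9, 0.90), 1))  # 4.4
```

The sqrt(2) factor reflects that a change score is the difference of two measurements, each carrying its own measurement error.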
Reliability and sensitivity to change of the OMERACT rheumatoid arthritis magnetic resonance imaging score in a multireader, longitudinal setting

DEFF Research Database (Denmark)

Haavardsholm, EA; Østergaard, Mikkel; Kvan, NP

2005-01-01

OBJECTIVE: To assess the intra- and interreader reliability and the sensitivity to change of the Outcome Measures in Rheumatology Clinical Trials (OMERACT) Rheumatoid Arthritis Magnetic Resonance Imaging Score (RAMRIS) system on digital images of the wrist joints of patients with early or establi...
Reliability of a retail food store survey and development of an accompanying retail scoring system to communicate survey findings and identify vendors for healthful food and marketing initiatives.

Science.gov (United States)

Ghirardelli, Alyssa; Quinn, Valerie; Sugerman, Sharon

2011-01-01

To develop a retail grocery instrument with weighted scoring to be used as an indicator of the food environment. Twenty-six retail food stores in low-income areas in California. Observational. Inter-rater reliability for grocery store survey instrument. Description of store scoring methodology weighted to emphasize availability of healthful food. Type A intra-class correlation coefficients (ICC) with absolute agreement definition or a κ test for measures using ranges as categories. Measures of availability and price of fruits and vegetables performed well in reliability testing (κ = 0.681-0.800). Items for vegetable quality were better than for fruit (ICC 0.708 vs 0.528). Kappa scores indicated low to moderate agreement (0.372-0.674) on external store marketing measures and higher scores for internal store marketing. "Next to" the checkout counter was more reliable than "within 6 feet." Health departments using the store scoring system reported it as the most useful communication of neighborhood findings. There was good reliability of the measures among the research pairs. The local store scores can show the need to bring in resources and to provide access to fruits and vegetables and other healthful food. Copyright © 2011 Society for Nutrition Education. Published by Elsevier Inc. All rights reserved.
Dutch validation of the low anterior resection syndrome score.

Science.gov (United States)

Hupkens, B J P; Breukink, S O; Olde Reuver Of Briel, C; Tanis, P J; de Noo, M E; van Duijvendijk, P; van Westreenen, H L; Dekker, J W T; Chen, T Y T; Juul, T

2018-04-21

The aim of this study was to validate the Dutch translation of the low anterior resection syndrome (LARS) score in a population of Dutch rectal cancer patients. Patients who underwent surgery for rectal cancer received the LARS score questionnaire, a single quality of life (QoL) category question and the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 questionnaire. A subgroup of patients received the LARS score twice to assess the test-retest reliability. A total of 165 patients were included in the analysis, identified in six Dutch centres. The response rate was 62.0%. The percentage of patients who reported 'major LARS' was 59.4%. There was a high proportion of patients with a perfect or moderate fit between the QoL category question and the LARS score, showing a good convergent validity. The LARS score was able to discriminate between patients with or without neoadjuvant radiotherapy (P = 0.003), between total and partial mesorectal excision (P = 0.008) and between age groups (P = 0.039). There was a statistically significant association between a higher LARS score and an impaired function on the global QoL subscale and the physical, role, emotional and social functioning subscales of the EORTC QLQ-C30 questionnaire. The test-retest reliability of the LARS score was good, with an intraclass correlation coefficient of 0.79. The good psychometric properties of the Dutch version of the LARS score are comparable overall to the earlier validations in other countries. Therefore, the Dutch translation can be considered to be a valid tool for assessing LARS in Dutch rectal cancer patients. Colorectal Disease © 2018 The Association of Coloproctology of Great Britain and Ireland.

The Motivated Strategies for Learning Questionnaire: score validity among medicine residents.

Science.gov (United States)

Cook, David A; Thompson, Warren G; Thomas, Kris G

2011-12-01

The Motivated Strategies for Learning Questionnaire (MSLQ) purports to measure motivation using the expectancy-value model. Although it is widely used in other fields, this instrument has received little study in health professions education. The purpose of this study was to evaluate the validity of MSLQ scores. We conducted a validity study evaluating the relationships of MSLQ scores to other variables and their internal structure (reliability and factor analysis). Participants included 210 internal medicine and family medicine residents participating in a web-based course on ambulatory medicine at an academic medical centre. Measurements included pre-course MSLQ scores, pre- and post-module motivation surveys, post-module knowledge test and post-module Instructional Materials Motivation Survey (IMMS) scores. Internal consistency was universally high for all MSLQ items together (Cronbach's α = 0.93) and for each domain (α ≥ 0.67). Total MSLQ scores showed statistically significant positive associations with post-test knowledge scores. For example, a 1-point rise in total MSLQ score was associated with a 4.4% increase in post-test scores (β = 4.4; p [...] motivation and satisfaction. Scores on MSLQ domains demonstrated associations that generally aligned with our hypotheses. Self-efficacy and control of learning belief scores demonstrated the strongest domain-specific relationships with knowledge scores (β = 2.9 for both). Confirmatory factor analysis showed a borderline model fit. Follow-up exploratory factor analysis revealed the scores of five factors (self-efficacy, intrinsic interest, test anxiety, extrinsic goals, attribution) demonstrated psychometric and predictive properties similar to those of the original scales. Scores on the MSLQ are reliable and predict meaningful outcomes. However, the factor structure suggests a simplified model might better fit the empiric data. Future research might consider how assessing and responding to motivation could enhance
Total Cerebral Small Vessel Disease MRI Score Is Associated With Cognitive Decline In Executive Function In Patients With Hypertension

Directory of Open Access Journals (Sweden)

Renske Uiterwijk

2016-12-01

Full Text Available Objectives: Hypertension is a major risk factor for white matter hyperintensities, lacunes, cerebral microbleeds and perivascular spaces, which are MRI markers of cerebral small vessel disease (SVD). Studies have shown associations between these individual MRI markers and cognitive functioning and decline. Recently, a total SVD score was proposed in which the different MRI markers were combined into one measure of SVD, to capture total SVD-related brain damage. We investigated whether this SVD score was associated with cognitive decline over 4 years in patients with hypertension. Methods: In this longitudinal cohort study, 130 hypertensive patients (91 patients with uncomplicated hypertension and 39 hypertensive patients with a lacunar stroke) were included. They underwent a neuropsychological assessment at baseline and after 4 years. The presence of white matter hyperintensities, lacunes, cerebral microbleeds, and perivascular spaces was rated on baseline MRI. Presence of each individual marker was added to calculate the total SVD score (range 0-4) in each patient. Results: Uncorrected linear regression analyses showed associations between SVD score and decline in overall cognition (p=0.017), executive functioning (p<0.001) and information processing speed (p=0.037), but not with memory (p=0.911). The association between SVD score and decline in overall cognition and executive function remained significant after adjustment for age, sex, education, anxiety and depression score, potential vascular risk factors, patient group and baseline cognitive performance. Conclusions: Our study shows that a total SVD score can predict cognitive decline, specifically in executive function, over 4 years in hypertensive patients. This emphasizes the importance of considering total brain damage due to SVD.
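The total SVD score described above is a simple count of which of the four MRI markers are present, giving a range of 0 to 4. A minimal sketch of that tally is shown below. The class and field names are hypothetical, and the criteria for rating a marker as "present" are not given in the abstract, so the booleans are taken as already decided by the rater.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SvdMarkers:
    """Presence of each MRI marker as rated on baseline MRI (hypothetical structure)."""
    white_matter_hyperintensities: bool
    lacunes: bool
    cerebral_microbleeds: bool
    perivascular_spaces: bool


def total_svd_score(markers: SvdMarkers) -> int:
    """One point per marker present; range 0-4."""
    return sum(
        [
            markers.white_matter_hyperintensities,
            markers.lacunes,
            markers.cerebral_microbleeds,
            markers.perivascular_spaces,
        ]
    )


# Illustrative patient with two of the four markers present.
patient = SvdMarkers(True, False, True, False)
print(total_svd_score(patient))  # 2
```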
Validity and reliability of the Fels physical activity questionnaire for children.

Science.gov (United States)

Treuth, Margarita S; Hou, Ningqi; Young, Deborah R; Maynard, L Michele

2005-03-01

The aim was to evaluate the reliability and validity of the Fels physical activity questionnaire (PAQ) for children 7-19 yr of age. A cross-sectional study was conducted among 130 girls and 99 boys in elementary (N=70), middle (N=81), and high (N=78) schools in rural Maryland. Weight and height were measured on the initial school visit. All the children then wore an Actiwatch accelerometer for 6 d. The Fels PAQ for children was given on two separate occasions to evaluate reliability and was compared with accelerometry data to evaluate validity. The reliability of the Fels PAQ for the girls, boys, and the elementary, middle, and high school age groups ranged from r=0.48 to 0.76. For the elementary school children, the correlation coefficient examining validity between the Fels PAQ total score and Actiwatch (counts per minute) was 0.34 (P=0.004). The correlation coefficients were lower in middle school (r=0.11, P=0.31) and high school (r=0.21, P=0.006) adolescents. The sport index of the Fels PAQ for children had the highest validity in the high school participants (r=0.34, P=0.002). The Fels PAQ for children is moderately reliable for all age groups of children. Validity of the Fels PAQ for children is acceptable for elementary and high school students when the total activity score or the sport index is used. The sport index was similar to the total score for elementary students but was a better measure of physical activity among high school students.
THE RELIABILITY OF THE MANKIN SCORE FOR OSTEOARTHRITIS

NARCIS (Netherlands)

van der Sluijs, J.A.; GEESINK, RGT; van der Linden, A.J.; BULSTRA, SK; Kuijer, Roelof; DRUKKER, J

For the histopathological classification of the severity of osteoarthritic lesions of cartilage, the Mankin score is frequently used. A necessary constraint on the validity of this scoring system is the consistency with which cartilage lesions are classified. The intra- and interobserver agreement
Reliability and validity of the KIPPPI: an early detection tool for psychosocial problems in toddlers.

Directory of Open Access Journals (Sweden)

Ingrid Kruizinga

Full Text Available BACKGROUND: The KIPPPI (Brief Instrument Psychological and Pedagogical Problem Inventory) is a Dutch questionnaire that measures psychosocial and pedagogical problems in 2-year olds and consists of a KIPPPI Total score, Wellbeing scale, Competence scale, and Autonomy scale. This study examined the reliability, validity, screening accuracy and clinical application of the KIPPPI. METHODS: Parents of 5959 2-year-old children in the Rotterdam area, the Netherlands, were invited to participate in the study. Parents of 3164 children (53.1% of all invited parents) completed the questionnaire. Internal consistency was evaluated, and test-retest reliability and concurrent validity with regard to the Child Behavior Checklist (CBCL) were evaluated in subsamples. Discriminative validity was evaluated by comparing scores of parents who worried about their child's upbringing and parents who did not. Screening accuracy of the KIPPPI was evaluated against the CBCL by calculating the Receiver Operating Characteristic (ROC) curves. The clinical application was evaluated by the relation between KIPPPI scores and the clinical decision made by the child health professionals. RESULTS: Psychometric properties of the KIPPPI Total score, Wellbeing scale, Competence scale and Autonomy scale were respectively: Cronbach's alphas: 0.88, 0.86, 0.83, 0.58. Test-retest correlations: 0.80, 0.76, 0.73, 0.60. Concurrent validity was as hypothesised. The KIPPPI was able to discriminate between parents who worried about their child and parents who did not. Screening accuracy was high (>0.90) for the KIPPPI Total score and for the Wellbeing scale. The KIPPPI scale scores and clinical decision of the child health professional were related (p<0.05), indicating a good clinical application. CONCLUSION: The results in this large-scale study of a diverse general population sample support the reliability, validity and clinical application of the KIPPPI Total score, Wellbeing scale and Competence
ERP Reliability Analysis (ERA) Toolbox: An open-source toolbox for analyzing the reliability of event-related brain potentials.

Science.gov (United States)

Clayson, Peter E; Miller, Gregory A

2017-01-01

Generalizability theory (G theory) provides a flexible, multifaceted approach to estimating score reliability. G theory's approach to estimating score reliability has important advantages over classical test theory that are relevant for research using event-related brain potentials (ERPs). For example, G theory does not require parallel forms (i.e., equal means, variances, and covariances), can handle unbalanced designs, and provides a single reliability estimate for designs with multiple sources of error. This monograph provides a detailed description of the conceptual framework of G theory using examples relevant to ERP researchers, presents the algorithms needed to estimate ERP score reliability, and provides a detailed walkthrough of newly-developed software, the ERP Reliability Analysis (ERA) Toolbox, that calculates score reliability using G theory. The ERA Toolbox is open-source, Matlab software that uses G theory to estimate the contribution of the number of trials retained for averaging, group, and/or event types on ERP score reliability. The toolbox facilitates the rigorous evaluation of psychometric properties of ERP scores recommended elsewhere in this special issue. Copyright © 2016 Elsevier B.V. All rights reserved.
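The abstract above describes using G theory to estimate how the number of trials retained for averaging affects ERP score reliability. A minimal sketch of that relationship for the simplest single-facet case follows, assuming a persons-by-trials design in which all error variance is lumped into one component. The function names and variance values are hypothetical, and this is a generic textbook form rather than the ERA Toolbox's own algorithm.

```python
def dependability(var_person: float, var_error: float, n_trials: int) -> float:
    """Reliability-like coefficient for a score averaged over n_trials.

    var_person: between-person (true score) variance component.
    var_error:  single-trial error variance component (all error sources lumped).
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    return var_person / (var_person + var_error / n_trials)


def trials_needed(var_person: float, var_error: float, target: float,
                  max_trials: int = 10_000) -> int:
    """Smallest number of trials whose average reaches the target coefficient."""
    for n in range(1, max_trials + 1):
        if dependability(var_person, var_error, n) >= target:
            return n
    raise ValueError("target not reached within max_trials")


# Hypothetical variance components (illustrative only).
print(trials_needed(var_person=4.0, var_error=10.0, target=0.90))  # 23
```

Averaging over more trials shrinks the error term, so the coefficient rises toward 1 with diminishing returns, which is why a minimum trial count for a chosen threshold is a natural quantity to report.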
A Novel Risk Score in Predicting Failure or Success for Antegrade Approach to Percutaneous Coronary Intervention of Chronic Total Occlusion: Antegrade CTO Score.

Science.gov (United States)

Namazi, Mohammad Hasan; Serati, Ali Reza; Vakili, Hosein; Safi, Morteza; Parsa, Saeed Ali Pour; Saadat, Habibollah; Taherkhani, Maryam; Emami, Sepideh; Pedari, Shamseddin; Vatanparast, Masoomeh; Movahed, Mohammad Reza

2017-06-01

Total occlusion of a coronary artery for more than 3 months is defined as chronic total occlusion (CTO). The goal of this study was to develop a risk score for predicting failure or success during attempted percutaneous coronary intervention (PCI) of CTO lesions using an antegrade approach. This study was based on retrospective analyses of clinical and angiographic characteristics of CTO lesions that were assessed between February 2012 and February 2014. Success was defined as passing through the occlusion with successful stent deployment using an antegrade approach. A total of 188 patients were studied. Mean ± SD age was 59 ± 9 years. Failure rate was 33%. In a stepwise multivariate regression analysis, bridging collaterals (OR = 6.7, CI = 1.97-23.17, score = 2), absence of stump (OR = 5.8, CI = 1.95-17.9, score = 2), presence of calcification (OR = 3.21, CI = 1.46-7.07, score = 1), presence of bending (OR = 2.8, CI = 1.28-6.10, score = 1), presence of a near side branch (OR = 2.7, CI = 1.08-6.57, score = 1), and absence of retrograde filling (OR = 2.5, CI = 1.03-6.17, score = 1) were independent predictors of PCI failure. A score of 7 or more was associated with a 100% failure rate, whereas a score of 2 or less was associated with an over 80% success rate. Most factors associated with failure of CTO-PCI are related to lesion characteristics. A new risk score (range 0-8) was developed to predict CTO-PCI success or failure during an antegrade approach, as a guide before attempting PCI of CTO lesions.
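The point values listed above add up to the stated range of 0-8, so the score can be tallied mechanically. The sketch below does that and attaches the two cut-offs the abstract reports. The identifier names are hypothetical, and the abstract gives no rates for scores of 3 to 6, so that band is left uninterpreted.

```python
# Points per predictor, as listed in the abstract.
POINTS = {
    "bridging_collaterals": 2,
    "absence_of_stump": 2,
    "calcification": 1,
    "bending": 1,
    "near_side_branch": 1,
    "absence_of_retrograde_filling": 1,
}


def antegrade_cto_score(findings: set[str]) -> int:
    """Sum the points for the predictors present in a lesion (range 0-8)."""
    unknown = findings - set(POINTS)
    if unknown:
        raise ValueError(f"unknown findings: {sorted(unknown)}")
    return sum(POINTS[name] for name in findings)


def interpret(score: int) -> str:
    """Map a score to the outcome rates reported for the study cohort."""
    if score >= 7:
        return "100% failure rate reported"
    if score <= 2:
        return "over 80% success rate reported"
    return "no rate reported in the abstract"


lesion = {"bridging_collaterals", "calcification"}
score = antegrade_cto_score(lesion)
print(score, "-", interpret(score))
```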
Automated Quantification of the Landing Error Scoring System With a Markerless Motion-Capture System.

Science.gov (United States)

Mauntel, Timothy C; Padua, Darin A; Stanley, Laura E; Frank, Barnett S; DiStefano, Lindsay J; Peck, Karen Y; Cameron, Kenneth L; Marshall, Stephen W

2017-11-01

The Landing Error Scoring System (LESS) can be used to identify individuals with an elevated risk of lower extremity injury. The limitation of the LESS is that raters identify movement errors from video replay, which is time-consuming and, therefore, may limit its use by clinicians. A markerless motion-capture system may be capable of automating LESS scoring, thereby removing this obstacle. To determine the reliability of an automated markerless motion-capture system for scoring the LESS. Cross-sectional study. United States Military Academy. A total of 57 healthy, physically active individuals (47 men, 10 women; age = 18.6 ± 0.6 years, height = 174.5 ± 6.7 cm, mass = 75.9 ± 9.2 kg). Participants completed 3 jump-landing trials that were recorded by standard video cameras and a depth camera. Their movement quality was evaluated by expert LESS raters (standard video recording) using the LESS rubric and by software that automates LESS scoring (depth-camera data). We recorded an error for a LESS item if it was present on at least 2 of 3 jump-landing trials. We calculated κ statistics, prevalence- and bias-adjusted κ (PABAK) statistics, and percentage agreement for each LESS item. Interrater reliability was evaluated between the 2 expert rater scores and between a consensus expert score and the markerless motion-capture system score. We observed reliability between the 2 expert LESS raters (average κ = 0.45 ± 0.35, average PABAK = 0.67 ± 0.34; percentage agreement = 0.83 ± 0.17). The markerless motion-capture system had similar reliability with consensus expert scores (average κ = 0.48 ± 0.40, average PABAK = 0.71 ± 0.27; percentage agreement = 0.85 ± 0.14). However, reliability was poor for 5 LESS items in both LESS score comparisons. A markerless motion-capture system had the same level of reliability as expert LESS raters, suggesting that an automated system can accurately assess movement. Therefore, clinicians can use
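The item-level statistics in this abstract can be illustrated for a single binary LESS item. The sketch below applies the stated rule (an error counts if present on at least 2 of 3 trials) and then computes percentage agreement, Cohen's κ, and PABAK, using the usual binary form PABAK = 2 × agreement − 1. The function names and ratings are hypothetical, and the abstract does not describe the authors' exact computation.

```python
def item_error(trial_errors: list[bool]) -> bool:
    """An item error is recorded if present on at least 2 of the 3 trials."""
    return sum(trial_errors) >= 2


def agreement_stats(rater_a: list[bool], rater_b: list[bool]) -> dict[str, float]:
    """Percentage agreement, Cohen's kappa and PABAK for one binary item."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a = sum(rater_a) / n  # proportion of "error" ratings by rater A
    p_b = sum(rater_b) / n  # proportion of "error" ratings by rater B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")
    return {
        "percent_agreement": observed,
        "kappa": kappa,
        "pabak": 2 * observed - 1,
    }


# Hypothetical ratings of four participants on one item.
print(agreement_stats([True, True, True, False], [True, True, False, False]))
```

Because κ depends on how often each rater records the error, items that are almost always or almost never present can yield low κ despite high raw agreement, which is the situation PABAK is intended to adjust for.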
End-stage dementia spark of life: reliability and validity of the "GATOS" questionnaire.

Science.gov (United States)

Tsoucalas, Gregory; Bourelia, Stamati; Kalogirou, Vaso; Giatsiou, Styliani; Mavrogiannaki, Eirini; Gatos, Georgios; Galanos, Antonis; Repana, Olga; Iliadou, Eleni; Antoniou, Antonis; Sgantzos, Markos; Gatos, Konstantinos

2015-01-01

Floor effects are present in most dementia assessment tools as dementia progresses, and the in-depth assessment of patients considered to be more or less in a vegetative state is questionable. To develop a questionnaire (the "Gatos Clinical Test-GCT") for the assessment of end-stage demented patients. Five hundred patients with dementia of various causes and an MMSE score between 0 and 2 were enrolled in the study. The GCT consists of 14 closed-type questions rated on a Likert scale. The total score is used to evaluate the patient's dementia. Various aspects of validity and reliability (including face, content and structural validity as well as test-retest reliability) were examined. Three subscales, "Autonomy/Alertness", "Gnosias" and "Somatokinetic function", were defined, with Cronbach's alpha equal to 0.851, 0.756 and 0.598, respectively. The GCT subscales and total score were statistically significantly higher in patients with MMSE score 1 or 2 compared with those with MMSE score 0 (p [...] "GATOS" questionnaire is a valid and reliable test for patients with severe dementia, aiming at identification of those patients who could sustain some quality of life. It is a relatively short and easy to administer tool. As dementia prevalence is expected to rise further worldwide, we believe that GCT could offer valuable services to health professionals, caregivers and patients.
Comparison between the Harris- and Oxford Hip Score to evaluate outcomes one-year after total hip arthroplasty

NARCIS (Netherlands)

Weel, Hanneke; Lindeboom, Robert; Kuipers, Sander E.; Vervest, Ton M. J. S.

2017-01-01

Harris Hip Score (HHS) is a surgeon-administered measurement for assessing hip function before and after total hip arthroplasties (THA). Patient reported outcome measurements (PROMs) such as the Oxford Hip Score (OHS) are increasingly used. HHS was compared to the OHS assessing whether the HHS can
Reliability and accuracy of digital templating for the humeral component of total shoulder arthroplasty.

Science.gov (United States)

Lee, Christopher S; Davis, Shane M; Lane, Christianne J; Koonce, Ryan C; Hartman, Andrew P; Ball, Kenneth; Esch, James C

2015-01-01

This experimental study evaluated the interobserver reliability and accuracy of pre-operative digital templating for humeral head size, stem size and neck angle for total shoulder arthroplasty. Twenty-five patients underwent a total shoulder arthroplasty with a single prosthesis. Four independent, blinded surgeons (two experienced shoulder surgeons and two PGY-6 fellows) used pre-operative radiographs and templating software to generate templates of the humeral head, stem and neck for each patient. Interobserver reliability was calculated using weighted kappa (κ) analysis. Accuracy was assessed by comparing templates to actual implant sizes. Interobserver reliability was fair to substantial (κ = 0.26 to 0.71) for head size, fair to substantial (κ = 0.39 to 0.72) for stem size and slight to fair (κ = 0.16 to 0.34) for neck angle. Templated head size, stem size and neck angle had accuracies of 53%, 77% and 68% within one size variation, respectively. Experience did not affect accuracy (p = 0.11 to 0.48). Digital templating is not a useful guide for pre-operative surgical planning and should not be used to select a prosthesis.
Boundary curves of individual items in the distribution of total depressive symptom scores approximate an exponential pattern in a general population

OpenAIRE

Tomitaka, Shinichiro; Kawasaki, Yohei; Ide, Kazuki; Akutagawa, Maiko; Yamada, Hiroshi; Furukawa, Toshiaki A.; Ono, Yutaka

2016-01-01

[Background] Previously, we proposed a model for ordinal scale scoring in which individual thresholds for each item constitute a distribution by each item. This led us to hypothesize that the boundary curves of each depressive symptom score in the distribution of total depressive symptom scores follow a common mathematical model, which is expressed as the product of the frequency of the total depressive symptom scores and the probability of the cumulative distribution function of each item th...
The Reliability and Validity of Weighted Composite Scores.

Science.gov (United States)

Kane, Michael; Case, Susan

The scores on two distinct tests (e.g., essay and objective) are often combined into a composite score, which is used to make decisions. The validity of the observed composite can sometimes be evaluated relative to a separate criterion. In cases where no criterion is available, the observed composite has generally been evaluated in terms of its…
Validation study of the Forgotten Joint Score-12 as a universal patient-reported outcome measure.

Science.gov (United States)

Matsumoto, Mikio; Baba, Tomonori; Homma, Yasuhiro; Kobayashi, Hideo; Ochi, Hironori; Yuasa, Takahito; Behrend, Henrik; Kaneko, Kazuo

2015-10-01

The Forgotten Joint Score-12 (FJS-12) is for patients to forget their artificial joint and is reportedly a useful patient-reported outcome tool for artificial joints. The purpose of this study was to determine whether the FJS-12 is as useful as the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) or the Japanese Orthopaedic Association Hip Disease Evaluation Questionnaire (JHEQ) in Japan. All patients who visited our hospital's hip joint specialists following unilateral THA from August 2013 to July 2014 were evaluated. Medical staff members other than physicians administered three questionnaires. Items evaluated were (1) the reliability of the FJS-12 and (2) correlations between the FJS-12 and the total and subscale scores of the WOMAC or JHEQ. Of 130 patients, 22 were excluded. Cronbach's α coefficient was 0.97 for the FJS-12. The FJS-12 showed a significantly lower score than the WOMAC or JHEQ (p < 0.01). The FJS-12 was moderately correlated with the total WOMAC score (r = 0.522) and its subscale scores for "stiffness" (r = 0.401) and "function" (r = 0.539) and was weakly correlated with the score for "pain" (r = 0.289). The FJS-12 was favorably correlated with the total JHEQ score (r = 0.686) and its subscale scores (r = 0.530-0.643). The FJS-12 was correlated with and showed reliability similar to that of the JHEQ and WOMAC. The FJS-12, which is not affected by culture or lifestyle, may be useful in Japan.
Validity and Reliability Study of Bahasa Malaysia Version of Voice Handicap Index-10.

Science.gov (United States)

Ong, Fei Ming; Husna Nik Hassan, Nik Fariza; Azman, Mawaddah; Sani, Abdullah; Mat Baki, Marina

2018-05-21

This study aimed to determine the validity and reliability of the Bahasa Malaysia version of the Voice Handicap Index-10 (mVHI-10). This cross-sectional study was carried out in the Otorhinolaryngology, Head and Neck Surgery Department of Universiti Kebangsaan Malaysia Medical Centre (UKMMC) from June 2015 to May 2016. The mVHI-10 was produced following a rigorous forward and backward translation. One hundred participants, including 50 healthy volunteers (17 male, 33 female) and 50 patients with voice disorders (26 male, 24 female), were recruited to complete the mVHI-10 before flexible laryngoscopic examinations and acoustic analysis. The mVHI-10 was repeated after 2 weeks via telephone interview or clinic visit. Its reliability and validity were assessed using interclass correlation. The test-retest reliability for the total mVHI-10 and each item score was high, with a Cronbach's alpha of >0.90. The total mVHI-10 score and domain scores were significantly higher (P [...] Kaiser-Meyer-Olkin measure was 0.92, which indicated excellent construct validity. There was a significant positive correlation between the mVHI-10 score and jitter and shimmer results (P < 0.001). The present study showed good reliability and validity of the mVHI-10 when applied to both healthy volunteers and patients with voice disorders. We recommend the use of the mVHI-10 in daily clinical practice among the Bahasa Malaysia-speaking population. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Reliability performance testing of totally encapsulating chemical protective suits

International Nuclear Information System (INIS)

Johnson, J.S.; Swearengen, P.M.

1991-01-01

The need to assure a high degree of reliability for totally encapsulating chemical protective (TECP) suits has been recognized by Lawrence Livermore National Laboratory's (LLNL) Hazards Control Department for some time. The following four tests were proposed as necessary to provide complete evaluation of TECP suit performance: 1. Quantitative leak test (ASTM draft), 2. Worst-case chemical exposure test (conceptual), 3. Pressure leak-rate test (complete, ASTM F1057-87), and 4. Chemical leak-rate test (ASTM draft). This paper reports on these tests, which should be applied to measuring TECP suit performance in two stages: design qualification tests and field use tests. Tests 1, 2, and 3 are used as design qualification tests, and tests 3 and 4 are used as field use tests.
Negative emotions affect postoperative scores for evaluating functional knee recovery and quality of life after total knee replacement

Directory of Open Access Journals (Sweden)

A. Qi

2016-01-01

This study aimed to determine whether psychological factors affect health-related quality of life (HRQL) and recovery of knee function in total knee replacement (TKR) patients. A total of 119 TKR patients (male: 38; female: 81) completed the Beck Anxiety Inventory (BAI), Beck Depression Inventory (BDI), State Trait Anxiety Inventory (STAI), Eysenck Personality Questionnaire-revised (EPQR-S), Knee Society Score (KSS), and HRQL (SF-36). At 1 and 6 months after surgery, anxiety, depression, and KSS scores in TKR patients were significantly better compared with those preoperatively (P<0.05). SF-36 scores at the sixth month after surgery were significantly improved compared with preoperative scores (P<0.001). Preoperative Physical Component Summary Scale (PCS) and Mental Component Summary Scale (MCS) scores were negatively associated with the extraversion (E) score (B=-0.986 and -0.967, respectively, both P<0.05). Postoperative PCS and State Anxiety Inventory (SAI) scores were negatively associated with the neuroticism (N) score (B=-0.137 and -0.991, respectively, both P<0.05). Postoperative MCS, SAI, Trait Anxiety Inventory (TAI), and BAI scores were also negatively associated with the N score (B=-0.367, -0.107, -0.281, and -0.851, respectively, all P<0.05). The KSS function score at the sixth month after surgery was negatively associated with TAI and N scores (B=-0.315 and -0.532, respectively, both P<0.05), but positively associated with the E score (B=0.215, P<0.05). The postoperative KSS joint score was positively associated with postoperative PCS (B=0.356, P<0.05). In conclusion, for TKR patients, the scores used for evaluating recovery of knee function and HRQL after 6 months are inversely associated with the presence of negative emotions.
Reliability of IOTA score and ADNEX model in the screening of ovarian malignancy in postmenopausal women.

Science.gov (United States)

Nohuz, Erdogan; De Simone, Luisa; Chêne, Gautier

2018-04-28

The IOTA (International Ovarian Tumor Analysis) group has developed the ADNEX (Assessment of Different NEoplasias in the adneXa) model to predict the risk that an ovarian mass is benign, borderline or malignant. This study aimed to test the reliability of these risk prediction models to improve the performance of pelvic ultrasound and discriminate between benign and malignant cysts. Postmenopausal women with an adnexal mass (including ovarian, para-ovarian and tubal) who underwent a standardized ultrasound examination before surgery were included. Prospectively and retrospectively collected data and ultrasound appearances of the tumors were described using the terms and definitions of the IOTA group, tested in accordance with the ADNEX model, and compared to the final histological diagnosis. Of the 107 menopausal patients recruited between 2011 and 2016, 14 were excluded (incomplete inclusion criteria). Thus, 93 patients constituted a cohort in which 89 had benign cysts (83 ovarian and 6 tubal or para-ovarian cysts), 1 had a borderline tumor and 3 had invasive ovarian cancers (1 at first stage, 1 at advanced stage and 1 metastatic tumor in the ovary). The overall prevalence of malignancy was 4.3%. Every benign ovarian cyst was classified as probably benign by the IOTA score, which also showed high specificity: every lesion classified as probably malignant proved malignant on histological examination. The limitation of this score was the high rate of unclassified or undetermined cysts. However, the malignancy risks calculated by the ADNEX model identified all malignancies. Thus, the combination of the two methods of analysis showed sensitivity and specificity of 100% and 98%, respectively. Evaluation of malignancy risks by these 2 tests yielded a negative predictive value of 100% (there were no false negatives) and a positive predictive value of 80%. On the basis of our findings, the IOTA classification and the ADNEX multimodal
RENZI SCORE FOR OBSTRUCTED DEFECATION SYNDROME - VALIDATION OF THE PORTUGUESE VERSION ACCORDING TO THE COSMIN CHECKLIST.

Science.gov (United States)

Caetano, Ana Celia; Dias, Sara; Santa-Cruz, André; Rolanda, Carla

2018-01-01

Recently, the Obstructed Defecation Syndrome score (ODS score) was developed and validated by Renzi to assess clinical staging and to allow evaluation and comparison of the efficacy of treatment of this disorder. Our goal is to validate the Portuguese version of the Renzi ODS score, according to the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) checklist. Following guidelines for cross-cultural validity, the Renzi ODS score was translated into the Portuguese language. Then, a group of patients and healthy controls was invited to fill in the Renzi ODS score at baseline, after 2 weeks and after 3 months. We assessed internal consistency, reliability and measurement error, content and construct validity, responsiveness and interpretability. A total of 113 individuals (77 patients; 36 healthy controls) completed the questionnaire. Seventy and 30 patients repeated the Renzi ODS score after 2 weeks and 3 months, respectively. Factor analysis confirmed the unidimensionality of the scale. Cronbach's α coefficient of 0.77 supported item homogeneity. Weighted quadratic kappa of 0.89 established test-retest reliability. The smallest detectable change at the individual level was 2.66 and at the group level was 0.30. The Renzi ODS score correlated negatively with the total (-0.32) and physical (-0.43) SF-36 scores. The patient and control groups differed significantly (11 points). The change in Renzi ODS score between baseline and 3 months correlated negatively with the clinical evolution (-0.86). ROC analysis showed a minimal important change of 2.00 with AUC 0.97. Neither floor nor ceiling effects were observed. This work validated the Portuguese version of the Renzi ODS score. We can now use this reliable, responsive, and interpretable (at the group level) tool to evaluate Portuguese ODS patients.
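The abstract above reports a smallest detectable change (SDC) at both the individual and the group level without stating how one was derived from the other. A commonly used relation is SDC_group = SDC_individual / √n, with SDC_individual = 1.96 · √2 · SEM and SEM = SD · √(1 - ICC). The sketch below applies that relation; the function names are hypothetical, and the use of n = 77 (the number of patients) is an assumption, not something the abstract confirms.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def sdc_individual(sem, z=1.96):
    """Smallest detectable change for one person (95% level by default)."""
    return z * math.sqrt(2.0) * sem


def sdc_group(sdc_ind, n):
    """Smallest detectable change for the mean of a group of n people."""
    return sdc_ind / math.sqrt(n)


# Reported individual-level SDC (2.66); n = 77 is an assumed group size.
group_level = sdc_group(2.66, 77)
```

Under these assumptions the group-level value comes out at about 0.30, in line with the figure in the abstract.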
Validity and reliability of Abbreviated Mental Test Score (AMTS) among older Iranian.

Science.gov (United States)

Foroughan, Mahshid; Wahlund, Lars-Olof; Jafari, Zahra; Rahgozar, Mehdi; Farahani, Ida G; Rashedi, Vahid

2017-11-01

Cognitive impairment is common among older people and is associated with increased morbidity and mortality. The main aim of this study was to evaluate the validity of the Persian version of the Abbreviated Mental Test Score (AMTS) as a screening tool for dementia. Data were obtained from a cross-sectional study. One hundred and one older adults who were members of the Iranian Alzheimer Association and 101 of their siblings were entered into this study by convenience sampling. The Diagnostic and Statistical Manual of Mental Disorders, 4th edition, criteria for diagnosing dementia and the Mini-Mental State Examination were used as the study tools. The gathered data were analyzed by the Mann-Whitney U-test, the Kruskal-Wallis test, Spearman's rank correlation coefficient, and the receiver-operating characteristic. The AMTS could successfully differentiate the dementia group from the non-dementia group. Scores were significantly correlated with Diagnostic and Statistical Manual of Mental Disorders diagnosis for dementia and Mini-Mental State Examination scores (P < 0.001). Educational level (P < 0.001) and male sex (P = 0.015) were positively associated with AMTS, whereas [...] (P < 0.001) was negatively associated with AMTS. Total Cronbach's α coefficient was 0.90. The scores 6 and 7 showed the optimum balance between sensitivity (99% and 94%, respectively) and specificity (85% and 86%, respectively). The Persian version of the AMTS is a valid cognitive assessment tool for older Iranian adults and can be used for dementia screening in Iran. © 2017 Japanese Psychogeriatric Society.

Reliability, validity and responsiveness of the German self-reported foot and ankle score (SEFAS) in patients with foot or ankle surgery.

Science.gov (United States)

Arbab, Dariusch; Kuhlmann, Katharina; Schnurr, Christoph; Bouillon, Bertil; Lüring, Christian; König, Dietmar

2017-10-10

Patient-reported outcome measures are a critical tool in evaluating the efficacy of orthopedic procedures and are increasingly used in clinical trials to assess outcomes of health care. The intention of this study was to develop and culturally adapt a German version of the Self-reported Foot and Ankle Score (SEFAS) and to evaluate its reliability, validity and responsiveness. Forward and backward translation was performed according to the Cross Cultural Adaptation of Self-Reported Measure guidelines. The German SEFAS was investigated in 177 consecutive patients. All 177 patients completed the German SEFAS, Foot and Ankle Outcome Score (FAOS), Short-Form 36 and numeric rating scales (NRS) for pain and disability before foot or ankle surgery, and 118 patients completed them 6 months after surgery. Test-retest reliability, internal consistency, floor and ceiling effects, construct validity and minimal important change were analyzed. The German SEFAS demonstrated excellent test-retest reliability with an ICC value of 0.97. A Cronbach's alpha (α) value of 0.89 demonstrated strong internal consistency. No floor or ceiling effects were observed for the German version of the SEFAS. As hypothesized, the SEFAS correlated strongly with FAOS and SF-36 domains. It showed moderate (ES/SRM > 0.5) responsiveness between preoperative assessment and postoperative follow-up. The German version of the SEFAS demonstrated good psychometric properties. It proved to be a valid and reliable instrument for use in foot and ankle patients. DRKS00007585.
Development and reliability of a structured interview guide for the Montgomery Asberg Depression Rating Scale (SIGMA).

Science.gov (United States)

Williams, Janet B W; Kobak, Kenneth A

2008-01-01

The Montgomery-Asberg Depression Rating Scale (MADRS) is often used in clinical trials to select patients and to assess treatment efficacy. The scale was originally published without suggested questions for clinicians to use in gathering the information necessary to rate the items. Structured and semi-structured interview guides have been found to improve reliability with other scales. To describe the development and test-retest reliability of a structured interview guide for the MADRS (SIGMA). A total of 162 test-retest interviews were conducted by 81 rater pairs. Each patient was interviewed twice, once by each rater conducting an independent interview. The intraclass correlation for total score between raters using the SIGMA was r=0.93, P [...] reliability. Use of the SIGMA can result in high reliability of MADRS scores in evaluating patients with depression.
Development of a valid and reliable test to assess trauma radiograph interpretation performance

International Nuclear Information System (INIS)

Neep, M.J.; Steffens, T.; Riley, V.; Eastgate, P.; McPhail, S.M.

2017-01-01

Objectives: The purpose of this investigation was to develop and examine the preliminary validity and reliability among radiographers of a test to assess trauma radiograph interpretation performance suitable for use among health professionals. Methods: Stage 1 examined 14,159 consecutive appendicular and axial examinations from a hospital emergency department over a 12 month period to quantify a typical anatomical region case-mix of trauma radiographs. A sample of radiographic cases representative of affected anatomical regions was then developed into the Image Interpretation Test (IIT). Stage 2 involved prospective investigations of the IIT's reliability (inter-rater, intra-rater, internal consistency) and validity (concurrent) among 41 radiographers. Results: The IIT included 60 cases. The median (interquartile range) clinical experience of participants was 5 (2–10) years. Case scores were internally consistent (Cronbach's alpha = 0.90). Favourable inter-rater reliability (kappa > 0.70 for 58/60 cases, Intra-class correlation coefficient (ICC) > 0.99 for total score) and intra-rater reliability (kappa > 0.90 for 60/60 cases, ICC > 0.99 for total score) was observed. There was a positive association between radiographers' confidence in image interpretation and IIT score (coefficient = 1.52, r-squared = 0.60, p < 0.001). Conclusions: The IIT developed during this investigation included a selection of radiographic cases consistent with anatomical regions represented in an adult trauma case-mix. This study has also provided foundational preliminary evidence to support the reliability and validity of the IIT among radiographers. The findings suggest that it is possible to assess image interpretation performance of adult trauma radiographs with this test. - Highlights: • Development of an Image Interpretation Test (IIT). • Cases consistent with anatomical regions represented in a typical adult trauma case-mix. • Development of a
Korean Version of the Delirium Rating Scale-Revised-98: Reliability and Validity

Science.gov (United States)

Ryu, Jian; Lee, Jinyoung; Kim, Hwi-Jung; Shin, Im Hee; Kim, Jeong-Lan; Trzepacz, Paula T.

2011-01-01

Objective: The aims of the present study were 1) to standardize the validity and reliability of the Korean version of Delirium Rating Scale-Revised-98 (DRS-R98-K) and 2) to establish the optimum cut-off value, sensitivity, and specificity for discriminating delirium from other non-delirious psychiatric conditions. Methods: Using DSM-IV criteria, 157 subjects (69 delirium, 29 dementia, 32 schizophrenia, and 27 other psychiatric patients) were enrolled. Subjects were evaluated using DRS-R98-K, DRS-K, Mini-Mental State Examination (MMSE-K), and Clinical Global Impression-Severity (CGI-S) scale. Results: DRS-R98-K total and severity scores showed high correlations with DRS-K. They were significantly different across all groups (p=0.000). However, neither MMSE-K nor CGI-S distinguished delirium from dementia. All DRS-R98-K diagnostic items (#14-16) and items #1 and 2 significantly discriminated delirium from dementia. Cronbach's alpha coefficient revealed high internal consistency for DRS-R98-K total (r=0.91) and severity (r=0.89) scales. Interrater reliability (ICC between 0.96 and 1) was very high. Using receiver operating characteristic analysis, the area under the curve of DRS-R98-K total score was 0.948 between the delirium group and all other groups and 0.873 between the delirium and dementia groups. The best cut-off scores in DRS-R98-K total score were 18.5 and 19.5 between the delirium and the other three groups and 20.5 between the delirium and dementia groups. Conclusion: We demonstrated that DRS-R98-K is a valid and reliable instrument for assessing delirium severity and diagnosis and discriminating delirium from dementia and other psychiatric disorders in Korean patients. PMID: 21519534
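The receiver operating characteristic analysis described above yields an area under the curve and a best cut-off score. The following is a minimal sketch of one common way to obtain both: AUC as the proportion of case/control pairs ranked correctly, and the cut-off that maximizes Youden's J (sensitivity + specificity - 1) over midpoints between adjacent observed scores. The abstract does not state which criterion the authors used, so Youden's J is an assumption; the function names and the score lists are hypothetical and are not study data.

```python
def auc(cases, controls):
    """AUC as the share of (case, control) pairs where the case scores higher; ties count half."""
    wins = sum((p > q) + 0.5 * (p == q) for p in cases for q in controls)
    return wins / (len(cases) * len(controls))


def best_cutoff(cases, controls):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J."""
    values = sorted(set(cases) | set(controls))
    # Candidate thresholds sit midway between adjacent observed scores.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for c in candidates:
        sens = sum(p > c for p in cases) / len(cases)
        spec = sum(q < c for q in controls) / len(controls)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best


# Hypothetical total scores (not study data).
delirium = [24, 27, 19, 30, 22, 21]
other = [12, 17, 20, 9, 15, 18, 14, 16]

area = auc(delirium, other)
cutoff, youden_j, sens, spec = best_cutoff(delirium, other)
```

Placing thresholds at midpoints of integer scores is why half-point cut-offs such as 18.5 arise naturally from integer-valued scales.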
[Reliability and validity of the standardized Mini Mental State Examination in the diagnosis of mild dementia in Turkish population].

Science.gov (United States)

Güngen, Can; Ertan, Turan; Eker, Engin; Yaşar, Resmiye; Engin, Funda

2002-01-01

This study assessed the reliability and validity of the Mini Mental State Examination in differentiating mild dementia from normal controls in the Turkish population. The Standardized Mini Mental State Examination (SMMSE) and its instructions were translated into Turkish. A total of 212 subjects with a mean age of 77 ± 6 were recruited for the study. Of these, 71 were diagnosed with dementia and 141 were evaluated as normal controls. The scale total score was analysed for discriminant validity using Student's t-test. Sensitivity, specificity, positive and negative predictive values and kappa score were calculated for all of the scores between 18 and 29. A kappa value was calculated to compare the dementia diagnoses of the two investigators using the best cut-off score obtained in the analysis above. Statistical analysis revealed that the Turkish version of the SMMSE has high discriminant validity and interrater reliability in the diagnosis of mild dementia. The cut-off score 23/24 was found to have the highest sensitivity (0.91), specificity (0.95), positive and negative predictive values (0.90 and 0.95) and kappa score (0.86). Interrater reliability analysis showed high correlation (r = 0.99) and kappa value (0.92). The results of this study showed that the Turkish version of the SMMSE has high reliability and validity for the diagnosis of mild dementia in the Turkish population.
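Sensitivity, specificity, predictive values and kappa, as reported above, can all be derived from a single 2x2 table of test result against diagnosis. The sketch below shows the arithmetic. The function name `diagnostic_summary` is hypothetical, and the four cell counts are illustrative values chosen only so that they total 71 cases and 141 controls; the abstract does not give the actual cell counts.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of test result vs. reference diagnosis."""
    n = tp + fp + fn + tn
    observed_agreement = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    chance_agreement = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed_agreement - chance_agreement) / (1 - chance_agreement),
    }


# Hypothetical cell counts: 71 cases (65 detected), 141 controls (134 cleared).
summary = diagnostic_summary(tp=65, fp=7, fn=6, tn=134)
```

With these illustrative counts the sensitivity, specificity and kappa come out near 0.92, 0.95 and 0.86, close to but not a reconstruction of the reported values.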
A comparison between patient recall and concurrent measurement of preoperative quality of life outcome in total hip arthroplasty.

Science.gov (United States)

Howell, Jonathan; Xu, Min; Duncan, Clive P; Masri, Bassam A; Garbuz, Donald S

2008-09-01

The objective was to evaluate the reliability of patients' recall of preoperative pain and function during the immediate postoperative period after total hip arthroplasty. A prospective cohort of 104 patients completed a survey about their quality of life before operation, and recalled their preoperative status at 3 days, 6 weeks, and 12 weeks after operation. Quality of life was measured by the Western Ontario and McMaster University Osteoarthritis Index, the Oxford-12 hip score, and the 12-item Short-Form score. The intraclass correlation coefficient and Spearman correlation coefficient were used to compare preoperative quality of life scores to the scores recalled. The reliability of recall remained high up to 3 months after operation. Patients are able to accurately recall their preoperative function for up to 3 months after total hip arthroplasty.
Reliability and validity of Persian version of Western Ontario and McMaster Universities Osteoarthritis index in knee osteoarthritis

Directory of Open Access Journals (Sweden)

Bina Eftekhar-Sadat

2015-08-01

Introduction: This study aimed to test the reliability and validity of a translated and adapted version of the Western Ontario and McMaster (WOMAC) questionnaire in Persian-speaking patients with symptomatic osteoarthritis (OA) of the knee. Methods: 100 consecutive patients attending 3 major referral rehabilitation centers in the northwest of Iran were asked to answer two disease-specific questionnaires, the WOMAC and the knee injury and osteoarthritis outcome score (KOOS). The same patients returned to complete the same questionnaire 24-48 hours after the first visit. Internal consistency, reliability, and validity were assessed. Results: There were statistically significant correlations between WOMAC and KOOS for the pain (P < 0.001) and stiffness (P = 0.004) subscale scores, the sum of difficulty with performing daily activity (DPDA) score (P = 0.001), and the total score (P < 0.001). Internal consistency with Cronbach's alpha for the pain, stiffness, and physical function subscales was 0.96, 0.98, and 0.99, respectively. Internal consistency with Cronbach's alpha for the total score of WOMAC was 0.99. Conclusion: We found that this Persian version of the WOMAC questionnaire is a reliable and valid version for evaluating knee OA.
SENSITIVITY AND SPECIFICITY OF INDIVIDUAL BERG BALANCE ITEMS COMPARED WITH THE TOTAL SCORE TO PREDICT FALLS IN COMMUNITY DWELLING ELDERLY INDIVIDUALS

Directory of Open Access Journals (Sweden)

Hazel Denzil Dias

2014-09-01

Background: Falls are a major problem in the elderly, leading to increased morbidity and mortality in this population. Scores from objective clinical measures of balance have frequently been associated with falls in older adults. The Berg Balance Score (BBS), which is a frequently used scale to test balance impairments in the elderly, takes time to perform and has been found to have scoring inconsistencies. The purpose was to determine if individual items or a group of BBS items would have better accuracy than the total BBS in classifying community dwelling elderly individuals according to fall history. Method: 60 community dwelling elderly individuals were chosen based on a history of falls in this cross sectional study. Each BBS item was dichotomized at three points along the scoring scale of 0 – 4: between scores of 1 and 2, 2 and 3, and 3 and 4. Sensitivity (Sn), specificity (Sp), and positive (+LR) and negative (-LR) likelihood ratios were calculated for all items for each scoring dichotomy based on their accuracy in classifying subjects with a history of multiple falls. These findings were compared with the total BBS score, where the cut-off score was derived from receiver operating characteristic curve analysis. Results: On analysing a combination of BBS items, B9 and B11 were found to have the best sensitivity and specificity when considered together. However, the area under the curve of these items was 0.799, which did not match that of the total score (AUC = 0.837). A combination of 4 BBS items (B9, B11, B12, and B13) also had good Sn and Sp, but the AUC was 0.815. The combination with the AUC closest to that of the total score was a combination of items B11 and B13 (AUC = 0.824). Hence these two items can be used as the best predictor of falls, with a cut-off of 6.5. The ROC curve of the total Berg Balance Scale scores revealed a cut-off score of 48.5. Conclusion: This study showed that combination of items B11 and B13 may be best predictors of falls in
Evaluating the test-retest reliability of symptom indices associated with the ImPACT post-concussion symptom scale (PCSS).

Science.gov (United States)

Merritt, Victoria C; Bradson, Megan L; Meyer, Jessica E; Arnett, Peter A

2018-05-01

The Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a commonly used tool in sports concussion assessment. While test-retest reliabilities have been established for the ImPACT cognitive composites, few studies have evaluated the psychometric properties of the ImPACT's Post-Concussion Symptom Scale (PCSS). The purpose of this study was to establish the test-retest reliability of symptom indices associated with the PCSS. Participants included 38 undergraduate students (50.0% male) who underwent neuropsychological testing as part of their participation in their psychology department's research subject pool. The majority of the participants were Caucasian (94.7%) and had no history of concussion (73.7%). All participants completed the ImPACT at two time points, approximately 6 weeks apart. The PCSS was the main outcome measure, and eight symptom indices were calculated (a total symptom score, three symptom summary indices, and four symptom clusters). Pearson correlations (r) and intraclass correlation coefficients (ICCs) were computed as measures of test-retest reliability. Overall, reliabilities ranged from low to high (r = .44 to .80; ICC = .44 to .77). The cognitive symptom cluster exhibited the highest test-retest reliability (r = .80, ICC = .77), followed by the positive symptom total (PST) index, an indicator of the total number of symptoms endorsed (r = .71, ICC = .69). In contrast, the commonly used total symptom score showed lower test-retest reliability (r = .67, ICC = .62). Paired-samples t tests revealed no significant differences between test and retest for any of the symptom variables (all p > .01). Finally, reliable change indices (RCI) were computed to determine whether differences observed between test and retest represented clinically significant change. RCI values were provided for each symptom index at the 80%, 90%, and 95% confidence intervals. These results suggest that evaluating additional symptom
Reliability analysis of visual ranking of coronary artery calcification on low-dose CT of the thorax for lung cancer screening: comparison with ECG-gated calcium scoring CT.

Science.gov (United States)

Kim, Yoon Kyung; Sung, Yon Mi; Cho, So Hyun; Park, Young Nam; Choi, Hye-Young

2014-12-01

Coronary artery calcification (CAC) is frequently detected on low-dose CT (LDCT) of the thorax. Concurrent assessment of CAC and lung cancer screening using LDCT is beneficial in terms of cost and radiation dose reduction. The aim of our study was to evaluate the reliability of visual ranking of positive CAC on LDCT compared to Agatston score (AS) on electrocardiogram (ECG)-gated calcium scoring CT. We studied 576 patients who were consecutively registered for health screening and undergoing both LDCT and ECG-gated calcium scoring CT. We excluded subjects with an AS of zero. The final study cohort included 117 patients with CAC (97 men; mean age, 53.4 ± 8.5). AS was used as the gold standard (mean score 166.0; range 0.4-3,719.3). Two board-certified radiologists and two radiology residents participated in an observer performance study. Visual ranking of CAC was performed according to four categories (1-10, 11-100, 101-400, and 401 or higher) for coronary artery disease risk stratification. Weighted kappa statistics were used to measure the degree of reliability on visual ranking of CAC on LDCT. The degree of reliability on visual ranking of CAC on LDCT compared to ECG-gated calcium scoring CT was excellent for board-certified radiologists and good for radiology residents. A high degree of association was observed with 71.6% of visual rankings in the same category as the Agatston category and 98.9% varying by no more than one category. Visual ranking of positive CAC on LDCT is reliable for predicting AS rank categorization.
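Agreement between ordinal categories, as in the four-category visual ranking above, is typically summarized with a weighted kappa that penalizes a two-category miss more than a one-category miss. The following is a minimal sketch. The abstract does not say whether linear or quadratic weights were used, so the `weights` parameter is an assumption; the function names and the example ratings are hypothetical. The category boundaries follow the ranges listed in the abstract.

```python
def agatston_category(score):
    """Map an Agatston score to the category index 0..3 (1-10, 11-100, 101-400, 401+)."""
    if score <= 10:
        return 0
    if score <= 100:
        return 1
    if score <= 400:
        return 2
    return 3


def weighted_kappa(rater_a, rater_b, n_categories, weights="quadratic"):
    """Weighted kappa for two lists of category indices in 0..n_categories-1."""
    n = len(rater_a)
    k = n_categories
    # Joint distribution of the two ratings.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d * d if weights == "quadratic" else d

    pairs = [(i, j) for i in range(k) for j in range(k)]
    num = sum(disagreement(i, j) * observed[i][j] for i, j in pairs)
    den = sum(disagreement(i, j) * row[i] * col[j] for i, j in pairs)
    return 1.0 - num / den


# Hypothetical example: reference categories vs. one reader's visual ranks.
reference = [agatston_category(s) for s in (5, 50, 250, 900, 8, 70, 300, 1200)]
visual = [0, 1, 2, 3, 1, 1, 2, 2]
kappa = weighted_kappa(reference, visual, n_categories=4)
```

In this illustrative data set two of the eight ratings are off by one category and none by more, which gives a quadratic weighted kappa of 0.875.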
Interobserver and Intraobserver Reliability of Three-Dimensional Preoperative Planning Software in Total Hip Arthroplasty.

Science.gov (United States)

Wako, Yasushi; Nakamura, Junichi; Miura, Michiaki; Kawarai, Yuya; Sugano, Masahiko; Nawata, Kento

2018-02-01

The purpose of this study is to clarify interobserver and intraobserver reliabilities of the three-dimensional (3D) templating of total hip arthroplasty (THA). We selected preoperative computed tomography from 60 hips in 46 patients (14 men and 32 women) who underwent primary THA. To evaluate interobserver and intraobserver reliability, 6 orthopedic surgeons performed 3D templating twice over a 4-week interval. We investigated intraclass correlation coefficients (ICCs) and percent agreement of component size and alignment, comparing morphological differences in the hip. Reproducibility was also compared between groups with osteoarthritis (OA) and those with osteonecrosis (ON). The interobserver reliabilities for mean cup size and stem size were excellent, with ICC = 0.907 and 0.944, respectively. The value was significantly higher in the ON group than in the OA group. In the OA group, the reliability of cup size and alignment decreased in hips with severe subluxation. Percent agreement of stem size was significantly different between the shapes of femoral canal. For intraobserver reliability, the mean ICC of cup size was 0.965 overall, while the value in the ON group was significantly higher than in the OA group. The mean ICC of stem size was 0.972 overall. Computed tomography-based 3D templating showed excellent reliability for component size and alignment in THA. Deformity of the affected joint influenced the reliability of preoperative planning. Copyright © 2017 Elsevier Inc. All rights reserved.
Reliability and validity of the international dementia alliance schedule for the assessment and staging of care in China.

Science.gov (United States)

Wang, Xiao; Sun, Zhenghai; Xiong, Lingchuan; Semrau, Maya; He, Jianhua; Li, Yang; Zhu, Jianzhong; Zhang, Nan; Wang, Aimin; Jiang, Qinpu; Mu, Nan; Zhao, Yuping; Chen, Wei; Wu, Donghui; Zheng, Zhanjie; Sun, Yongan; Zhang, Jing; Xu, Jun; Meng, Xue; Zhao, Mei; Zhang, Haifeng; Lv, Xiaozhen; Sartorius, Norman; Li, Tao; Yu, Xin; Wang, Huali

2017-11-21

Clinical and social services are both important for dementia care. The International Dementia Alliance (IDEAL) Schedule for the Assessment and Staging of Care was developed to guide clinical and social care for dementia. Our study aimed to assess the validity and reliability of the IDEAL schedule in China. Two hundred eighty-two dementia patients and their caregivers were recruited from 15 hospitals in China. Each patient-caregiver dyad was assessed with the IDEAL schedule by a rater and an observer simultaneously. The Clinical Dementia Rating (CDR), Mini-Mental Status Examination (MMSE), and Caregiver Burden Inventory (CBI) were assessed for criterion validity. A repeated IDEAL assessment was conducted 7-10 days after the initial interview for 62 dyads. Two hundred seventy-seven patient-caregiver dyads completed the IDEAL assessment. Inter-rater reliability for the total score of the IDEAL schedule was 0.93 (95% CI = 0.92-0.95). The inter-class coefficient for the total score of IDEAL was 0.95 for the interviewers and 0.93 for the silent raters. The IDEAL total score correlated with the global CDR score (ρ = 0.72, p [...] valid and reliable tool for the staging of care for dementia in the Chinese population.
Interrater and Intrarater Reliability of the Tuck Jump Assessment by Health Professionals of Varied Educational Backgrounds

Directory of Open Access Journals (Sweden)

Lisa A. Dudley

2013-01-01

Objective. The Tuck Jump Assessment (TJA), a clinical plyometric assessment, identifies 10 jumping and landing technique flaws. The study objective was to investigate TJA interrater and intrarater reliability with raters of different educational and clinical backgrounds. Methods. Forty participants were video recorded performing the TJA using the published protocol and instructions. Five raters of varied educational and clinical backgrounds scored the TJA. Each score of the 10 technique flaws was summed for the total TJA score. Approximately one month later, 3 raters scored the videos again. Intraclass correlation coefficients determined interrater (5 and 3 raters for first and second session, resp.) and intrarater (3 raters) reliability. Results. Interrater reliability with 5 raters was poor (ICC = 0.47; 95% confidence interval (CI) 0.33–0.62). Interrater reliability between 3 raters who completed 2 scoring sessions improved from 0.52 (95% CI 0.35–0.68) for session one to 0.69 (95% CI 0.55–0.81) for session two. Intrarater reliability was poor to moderate, ranging from 0.44 (95% CI 0.22–0.68) to 0.72 (95% CI 0.55–0.84). Conclusion. The published protocol and training of raters were insufficient to allow consistent TJA scoring. There may be a learning effect with the TJA, since interrater reliability improved with repetition. TJA instructions and training should be modified and enhanced before clinical implementation.
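An intraclass correlation coefficient of this kind can be computed from the mean squares of a two-way ANOVA on a subjects-by-raters table. The sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1); the abstract does not say which ICC form was used, and the function name `icc_2_1` and the ratings matrix are illustrative only:

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is a list of rows, one row per subject, one column per rater.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols                   # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative table: 6 subjects (rows) scored by 4 raters (columns)
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(ratings), 2))  # about 0.29: raters differ systematically
```

In this toy table the raters rank subjects similarly but use different parts of the scale, so an absolute-agreement ICC stays low. That is one way summed scores can show poor interrater reliability even when raters broadly agree on who performed better.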
The reliability and validity of the Turkish version of Fullerton Advanced Balance (FAB-T) scale.

Science.gov (United States)

Iyigun, Gozde; Kirmizigil, Berkiye; Angin, Ender; Oksuz, Sevim; Can, Filiz; Eker, Levent; Rose, Debra J

2018-06-04

The aim of this study was to evaluate the reliability and validity of the Turkish version of the FAB (FAB-T) scale in older Turkish adults. The reliability and validity of the scale were tested on 200 community-dwelling older adults. The FAB-T scale was scored by different physiotherapists on different days to evaluate inter-rater and intra-rater reliability. The Berg Balance Scale (BBS) was used for the evaluation of convergent validity, and the content validity of the FAB-T scale was investigated. The FAB-T scale showed very high inter- and intra-rater reliability. For inter-rater agreement, the ICC values for the individual test items and the total score were 0.92 (95% CI: 0.90-0.94) and 0.96 (95% CI: 0.95-0.97), respectively. For intra-rater agreement, the ICC values for the individual test items and the total score were 0.93 (95% CI: 0.91-0.95) and 0.96 (95% CI: 0.95-0.97), respectively. There was good agreement between the FAB-T and BBS scales. A high correlation was found between the BBS and FAB-T scales [rho = 0.70 (95% CI: 0.62-0.76)], indicating good convergent validity. Considering the content validity of the FAB-T scale, no floor (floor score: 0%) or ceiling (ceiling score: 6.5%) effect was detected. The FAB-T scale was successfully translated from the original English version (FAB) and demonstrated strong psychometric features. It was found that the FAB-T scale has very high inter-rater and intra-rater reliability. Considering the convergent validity, the scale has a high correlation with the BBS. The FAB-T has no floor or ceiling effect. Copyright © 2018 Elsevier B.V. All rights reserved.
Reliability and Validity of the Beijing Version of the Montreal Cognitive Assessment in the Evaluation of Cognitive Function of Adult Patients with OSAHS.

Directory of Open Access Journals (Sweden)

Xiong Chen

Patients with obstructive sleep apnea hypopnea syndrome (OSAHS) tend to develop cognitive deficits, which usually go unrecognized and can affect their daily life. The Beijing version of the Montreal Cognitive Assessment (MoCA-BJ), a Chinese version of the MoCA, has been used for the assessment of cognitive functions of OSAHS patients in clinical practice. So far, its reliability and validity have not been tested. This study examined the reliability and validity of the MoCA-BJ in a cohort of adult OSAHS patients. A total of 152 OSAHS patients, ranging from mild and moderate to severe, 49 primary snoring subjects and 40 normal controls were evaluated for cognitive functions using both the MoCA-BJ and the Mini Mental State Examination (MMSE). Forty of them were re-tested with the MoCA-BJ 14 days after the first test. Internal consistency, test-retest reliability, and discriminant and concurrent validity of the MoCA-BJ were analyzed. Internal consistency reliability by Cronbach's alpha was adequate (0.73). The intra-class correlation coefficient (ICC), a measure of test-retest reliability, was 0.87 (P<0.001). The total MoCA-BJ scores were significantly higher in normal controls than in OSAHS groups (p<0.05). Visuospatial performance in the severe OSAHS group was significantly weaker than in normal controls and the primary snoring group. Executive performance in severe OSAHS patients was weaker than in normal controls. An optimal cut-off between normal controls and non-normal subjects was at 26 points (total MoCA score). Moreover, the cut-off between non-severe and severe OSAHS was at 2 points on the visuospatial subscale. Analysis of the correlation between MoCA total scores and MMSE total scores revealed a statistically significant, though relatively weak, correlation (r=0.41, P<0.05). In conclusion, our study showed that the Beijing version of the MoCA was reliable and stable. The MoCA-BJ was capable of detecting cognitive dysfunction by visuospatial and total MoCA-BJ score.
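Cronbach's alpha, the internal-consistency statistic reported here, compares the sum of the item variances with the variance of the total score. A minimal sketch using only the Python standard library; the function name `cronbach_alpha` and the response matrix are hypothetical and are not data from the study:

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table with one row per respondent and one column per item."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five respondents to a four-item scale
answers = [
    [3, 4, 3, 5],
    [2, 2, 3, 3],
    [4, 5, 4, 5],
    [1, 2, 2, 2],
    [3, 3, 4, 4],
]
print(cronbach_alpha(answers))
```

Alpha approaches 1 when items rise and fall together across respondents, so that the total-score variance is much larger than the summed item variances.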
Hospital Value-Based Purchasing (HVBP) – Total Performance Score

Data.gov (United States)

U.S. Department of Health & Human Services — A list of hospitals participating in the Hospital VBP Program and their Clinical Process of Care domain scores, Patient Experience of Care dimension scores, and...
Reliability and Validity of Beliefs about Substance Use (BSU) Questionnaire in Alcohol Dependent Patients.

Directory of Open Access Journals (Sweden)

Selçuk ASLAN

2012-12-01

Results: Mean ages of the addicted patients, healthy controls and social drinkers were 42.3 ± 7.0, 33.5 ± 9.9 and 33.2 ± 8.9, respectively. In the patient group, the mean BSU score was 46.4 ± 21.2. For alcohol addicts, the internal reliability of the BSU was found to be adequate (Cronbach's alpha = 0.91), and item-total score correlations were between 0.33 and 0.69. Principal component analysis showed a single main factor. A positive correlation was found between BSU scores and both CBQ and ATQ scores. No correlations were found between total and subscale scores of DAS and total scores of CIWA, BAI and BSU. In the evaluation of validity, mean BSU scores of alcohol addicts were significantly higher than those of healthy controls and social drinkers. Conclusion: Our findings support that the Turkish version of the BSU is an adequate tool for evaluating alcohol-addicted patients' cognitive beliefs about alcohol use. [JCBPR 2012; 1(3): 162-170]
A clinical assessment tool used for physiotherapy students--is it reliable?

Science.gov (United States)

Lewis, Lucy K; Stiller, Kathy; Hardy, Frances

2008-01-01

Educational institutions providing professional programs such as physiotherapy must provide high-quality student assessment procedures. To ensure that assessment is consistent, assessment tools should have an acceptable level of reliability. There is a paucity of research evaluating the reliability of clinical assessment tools used for physiotherapy students. This study evaluated the inter- and intrarater reliability of an assessment tool used for physiotherapy students during a clinical placement. Five clinical educators and one academic participated in the study. Each rater independently marked 22 student written assessments that had been completed by students after viewing a videotaped patient physiotherapy assessment. The raters repeated the marking process 7 weeks later, with the assessments provided in a randomised order. The interrater reliability (Intraclass Correlation Coefficient) for the total scores was 0.32, representing a poor level of reliability. A high level of intrarater reliability (percentage agreement) was found for the clinical educators, with a difference in section scores of one mark or less on 93.4% of occasions. Further research should be undertaken to reevaluate the reliability of this clinical assessment tool following training. The reliability of clinical assessment tools used in other areas of physiotherapy education should be formally measured rather than assumed.
Reliability and Validity of the Pain Anxiety Symptom Scale in Persian Speaking Chronic Low Back Pain Patients.

Science.gov (United States)

Shanbehzadeh, Sanaz; Salavati, Mahyar; Tavahomi, Mahnaz; Khatibi, Ali; Talebian, Saeed; Khademi-Kalantari, Khosro

2017-11-01

Psychometric testing of the Persian version of Pain Anxiety Symptom Scale 20. The aim of this study was to assess the reliability and construct validity of the PASS-20 in nonspecific chronic low back pain (LBP) patients. The PASS-20 is a self-report questionnaire that assesses pain-related anxiety. The psychometric properties of this instrument have not been assessed in Persian-speaking chronic LBP patients. One hundred and sixty participants with chronic LBP completed the Persian version of PASS-20, Tampa Scale of Kinesiophobia (TSK), Fear-Avoidance Beliefs Questionnaire (FABQ), Pain Catastrophizing Scale (PCS), trait form of the State-Trait Anxiety (STAI-T), Oswestry Low Back Pain Disability Index (ODI), Beck Depression Inventory (BDI-II), and Visual Analogue Scale (VAS). To evaluate test-retest reliability, 60 patients filled out the PASS-20 6 to 8 days after the first visit. Test-retest reliability (intraclass correlation coefficient [ICC], standard error of measurement [SEM], and minimal detectable change [MDC]), internal consistency, dimensionality, and construct validity were examined. The ICCs of the PASS-20 subscales and total score ranged from 0.71 to 0.8. The SEM for the PASS-20 total score was 7.29, and for the subscales it ranged from 2.43 to 2.98. The MDC for the total score was 20.14, and for the subscales it ranged from 6.71 to 8.23. The Cronbach alpha values for the subscales and total score ranged from 0.70 to 0.91. Significant positive correlations were found between the PASS-20 total score and PCS, TSK, FABQ, ODI, BDI, STAI-T, and pain intensity. The Persian version of the PASS-20 showed acceptable psychometric properties for the assessment of pain-related anxiety in Persian-speaking patients with chronic LBP. 3.
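The reported SEM and MDC values are linked by a simple formula. The sketch below assumes the conventional 95% minimal detectable change, MDC95 = 1.96 × √2 × SEM, and the common definition SEM = SD × √(1 − ICC). The abstract does not state which formulas were used, so both functions (names hypothetical) are assumptions and not the authors' method:

```python
import math


def sem_from_sd(sd, icc):
    """Standard error of measurement from the sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    # sqrt(2) accounts for measurement error in both the first and the second test
    return 1.96 * math.sqrt(2.0) * sem


# SEM of the PASS-20 total score as reported in the abstract
print(round(mdc95(7.29), 2))  # about 20.2, close to the reported MDC of 20.14
```

Under these assumptions, an individual patient's PASS-20 total score would need to change by roughly 20 points before the change exceeds measurement error.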

MRI-based radiologic scoring system for extent of brain injury in children with hemiplegia.

Science.gov (United States)

Shiran, S I; Weinstein, M; Sirota-Cohen, C; Myers, V; Ben Bashat, D; Fattal-Valevski, A; Green, D; Schertz, M

2014-12-01

Brain MR imaging is recommended in children with cerebral palsy. Descriptions of MR imaging findings lack uniformity, due to the absence of a validated quantitative approach. We developed a quantitative scoring method for brain injury based on anatomic MR imaging and examined the reliability and validity in correlation to motor function in children with hemiplegia. Twenty-seven children with hemiplegia underwent MR imaging (T1, T2-weighted sequences, DTI) and motor assessment (Manual Ability Classification System, Gross Motor Functional Classification System, Assisting Hand Assessment, Jebsen Taylor Test of Hand Function, and Children's Hand Experience Questionnaire). A scoring system devised in our center was applied to all scans. Radiologic score covered 4 domains: number of affected lobes, volume and type of white matter injury, extent of gray matter damage, and major white matter tract injury. Inter- and intrarater reliability was evaluated and the relationship between radiologic score and motor assessments determined. Mean total radiologic score was 11.3 ± 4.5 (range 4-18). Good inter- (ρ = 0.909, P [...] classification systems (ρ = 0.708, P [...] high inter- and intrarater reliability and significant associations with manual ability classification systems and motor evaluations. This score provides a standardized radiologic assessment of brain injury extent in hemiplegic patients with predominantly unilateral injury, allowing comparison between groups, and providing an additional tool for counseling families. © 2014 by American Journal of Neuroradiology.
Cumulative trauma disorders in the upper extremities: reliability of the postural and repetitive risk-factors index.

Science.gov (United States)

James, C P; Harburn, K L; Kramer, J F

1997-08-01

This study addresses test-retest reliability of the Postural and Repetitive Risk-Factors Index (PRRI) for work-related upper body injuries. This assessment was developed by the present authors. A repeated measures design was used to assess the test-retest reliability of a videotaped work-site assessment of subjects' movements. Ten heavy users of video display terminals (VDTs) from a local banking industry participated in the study. The 10 subjects' movements were videotaped for 2 hours on each of 2 separate days, while working on-site at their VDTs. The videotaped assessment, which utilized known postural risk factors for developing musculoskeletal disorder, pain, and discomfort in heavy VDT users (ie, repetitiveness, awkward and static postures, and contraction time), was called the PRRI. The videotaped movement assessments were subsequently analyzed in 15-minute sessions (five sessions per 2-hour videotape, which produced a total of 10 sessions over the 2 testing days), and each session was chosen randomly from the videotape. The subjects' movements were given a postural risk score according to the criteria in the PRRI. Each subject was therefore tested a total of 10 times (ie, 10 sessions), over two days. The maximum PRRI score for both sides of the body was 216 points. Reliability coefficients (RCs) for the PRRI scores were calculated, and the reliability of any one session met the minimum criterion for excellent reliability, which was .75. A two-way analysis of variance (ANOVA) confirmed that there was no statistically significant difference between sessions (p < .05). Calculations using the standard error of measurement (SEM) indicated that an individual tested once, on one day and with a PRRI score of 25, required a change of at least 8 points in order to be confident that a true change in score had occurred. The significant results from the reliability tests indicated that the PRRI was a reliable measurement tool that could be used by occupational health
The Americleft Speech Project: A Training and Reliability Study.

Science.gov (United States)

Chapman, Kathy L; Baylis, Adriane; Trost-Cardamone, Judith; Cordero, Kelly Nett; Dixon, Angela; Dobbelsteyn, Cindy; Thurmes, Anna; Wilson, Kristina; Harding-Bell, Anne; Sweeney, Triona; Stoddard, Gregory; Sell, Debbie

2016-01-01

To describe the results of two reliability studies and to assess the effect of training on interrater reliability scores. The first study (1) examined interrater and intrarater reliability scores (weighted and unweighted kappas) and (2) compared interrater reliability scores before and after training on the use of the Cleft Audit Protocol for Speech-Augmented (CAPS-A) with British English-speaking children. The second study examined interrater and intrarater reliability on a modified version of the CAPS-A (CAPS-A Americleft Modification) with American and Canadian English-speaking children. Finally, comparisons were made between the interrater and intrarater reliability scores obtained for Study 1 and Study 2. The participants were speech-language pathologists from the Americleft Speech Project. In Study 1, interrater reliability scores improved for 6 of the 13 parameters following training on the CAPS-A protocol. Comparison of the reliability results for the two studies indicated lower scores for Study 2 compared with Study 1. However, this appeared to be an artifact of the kappa statistic that occurred due to insufficient variability in the reliability samples for Study 2. When percent agreement scores were also calculated, the ratings appeared similar across Study 1 and Study 2. The findings of this study suggested that improvements in interrater reliability could be obtained following a program of systematic training. However, improvements were not uniform across all parameters. Acceptable levels of reliability were achieved for those parameters most important for evaluation of velopharyngeal function.
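The kappa artifact described in this abstract can be reproduced with two agreement tables that share the same percent agreement but differ in how evenly the categories are used. A minimal sketch with hypothetical counts (not data from the Americleft studies); the function name `agreement_and_kappa` is an assumption:

```python
def agreement_and_kappa(table):
    """Percent agreement and unweighted Cohen's kappa from a square contingency table.

    table[i][j] is the number of cases rater A put in category i and rater B in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    p_observed = sum(table[i][i] for i in range(k)) / n
    p_expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical samples of 100 cases each, both with 94% raw agreement
balanced = [[47, 3], [3, 47]]   # both categories occur about equally often
skewed = [[94, 3], [3, 0]]      # almost every case falls in the first category

print(agreement_and_kappa(balanced))  # agreement 0.94, kappa 0.88
print(agreement_and_kappa(skewed))    # agreement 0.94, kappa about -0.03
```

When nearly all samples fall into one category, chance agreement is already very high, so kappa can collapse even though the raters almost always agree. Reporting percent agreement alongside kappa, as the authors did, exposes this.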
A locally adapted functional outcome measurement score for total ...

African Journals Online (AJOL)

... in Europe or North America and seem not optimally suited for a general West ... We introduce a cross-cultural adaptation of the Lequesne index as a new score. ... Keywords: THR, Hip, Africa, Functional score, Hip replacement, Arthroscopy ...
The Validity and Reliability of Autism Behavior Checklist

Directory of Open Access Journals (Sweden)

Negin Yousefi

2015-11-01

Objectives: The aim of this study was to evaluate the psychometric features of the Persian version of the Autism Behavior Checklist (ABC). Method: The International Quality of Life Assessment (IQOLA) approach was used to translate the English ABC into Persian. A total sample of 184 parents of children, including 114 children with autism disorder (mean age = 7.21, SD = 1.65) and 70 typically developing children (mean age = 6.82, SD = 1.75), completed the ABC. Internal consistency, test-retest reliability, concurrent and discriminant validity, and cut-off score were assessed. Results: The results of this study revealed that the Persian version of the ABC has an acceptable degree of internal consistency (.73). Test–retest comparisons using interclass correlation confirmed the instrument’s time stability (.83). The instrument’s concurrent validity with the Gilliam Autism Rating Scale (GARS) was verified; the correlation between total scores was .94. In the discriminant validity analysis, the autism group had significantly higher scores compared to the normal group. Receiver Operating Characteristic (ROC) analysis revealed that individuals with total scores below 25 are less likely to be in the autism group. Conclusion: The Persian version of the ABC can be used as an initial screening tool in clinical contexts.
[Reliability and validity of the modified Perceived Health Competence Scale (PHCS) Japanese version].

Science.gov (United States)

Togari, Taisuke; Yamazaki, Yoshihiko; Koide, Syotaro; Miyata, Ayako

2006-01-01

In community and workplace health plans, the Perceived Health Competence Scale (PHCS) is employed as an index of health competency. The purpose of this research was to examine the reliability and validity of a modified Japanese PHCS. Interviews were sought with 3,000 randomly selected Japanese individuals using a two-step stratified method. Valid PHCS responses were obtained from 1,910 individuals, yielding a 63.7% response rate. Reliability was assessed using Cronbach's alpha coefficient (henceforth, alpha) to evaluate internal consistency, and by employing item-total correlation and alpha coefficient analyses to assess the effect of removal of variables from the model. To examine content validity, we assessed the correlation between the PHCS score and four respondent attribute characteristics, that is, sex, age, the presence of chronic disease, and the existence of chronic disease at age 18. The correlation between PHCS score and commonly employed healthy lifestyle indices was examined to assess construct validity. General linear model statistical analysis was employed. The modified Japanese PHCS demonstrated a satisfactory alpha coefficient of 0.869. Moreover, reliability was confirmed by item-total correlation and alpha coefficient analyses after removal of variables from the model. Differences in PHCS scores were seen between individuals 60 years and older, and younger individuals. Those with current chronic disease, or who had had a chronic disease at age 18, tended to have lower PHCS scores. After controlling for the presence of current or age 18 chronic disease, age, and sex, significant correlations were seen between PHCS scores and tobacco use, dietary habits, and exercise, but not alcohol use or frequency of medical consultation. This study supports the reliability and validity, and hence supports the use, of the modified Japanese PHCS. Future longitudinal research is needed to evaluate the predictive power of modified Japanese PHCS scores, to examine
Can the pre-operative Western Ontario and McMaster score predict patient satisfaction following total hip arthroplasty?

Science.gov (United States)

Rogers, B A; Alolabi, B; Carrothers, A D; Kreder, H J; Jenkinson, R J

2015-02-01

In this study we evaluated whether pre-operative Western Ontario and McMaster Universities (WOMAC) osteoarthritis scores can predict satisfaction following total hip arthroplasty (THA). Prospective data for a cohort of patients undergoing THA from two large academic centres were collected, and pre-operative and one-year post-operative WOMAC scores and a 25-point satisfaction questionnaire were obtained for 446 patients. Satisfaction scores were dichotomised into either improvement or deterioration. Scatter plots and Spearman's rank correlation coefficient were used to describe the association between pre-operative WOMAC and one-year post-operative WOMAC scores and patient satisfaction. Satisfaction was compared using receiver operating characteristic (ROC) analysis against pre-operative, post-operative and δ WOMAC scores. We found no relationship between pre-operative WOMAC scores and one-year post-operative WOMAC or satisfaction scores, with Spearman's rank correlation coefficients of 0.16 and -0.05, respectively. The ROC analysis showed areas under the curve (AUC) of 0.54 (pre-operative WOMAC), 0.67 (post-operative WOMAC) and 0.43 (δ WOMAC), respectively, for an improvement in satisfaction. We conclude that the pre-operative WOMAC score does not predict the post-operative WOMAC score or patient satisfaction after THA, and that WOMAC scores can therefore not be used to prioritise patient care. ©2015 The British Editorial Society of Bone & Joint Surgery.
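The area under the ROC curve can be read as the probability that a randomly chosen patient from the positive group has a higher score than a randomly chosen patient from the negative group. A minimal sketch of that pairwise definition; the function name `auc_pairwise` and the scores are hypothetical, not study data:

```python
def auc_pairwise(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical pre-operative scores for patients whose satisfaction improved or did not
improved = [42, 55, 61, 48, 70]
not_improved = [45, 58, 50, 66]
print(auc_pairwise(improved, not_improved))
```

An AUC near 0.5, such as the 0.54 reported for the pre-operative WOMAC score, means the score ranks patients little better than chance.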
A flexible latent class approach to estimating test-score reliability

NARCIS (Netherlands)

van der Palm, D.W.; van der Ark, L.A.; Sijtsma, K.

2014-01-01

The latent class reliability coefficient (LCRC) is improved by using the divisive latent class model instead of the unrestricted latent class model. This results in the divisive latent class reliability coefficient (DLCRC), which unlike LCRC avoids making subjective decisions about the best solution
[German validation of the Acute Cystitis Symptom Score].

Science.gov (United States)

Alidjanov, J F; Pilatz, A; Abdufattaev, U A; Wiltink, J; Weidner, W; Naber, K G; Wagenlehner, F

2015-09-01

The Uzbek version of the Acute Cystitis Symptom Score (ACSS) was developed as a simple self-reporting questionnaire to improve diagnosis and therapy of women with acute cystitis (AC). The purpose of this work was to validate the ACSS in the German language. The ACSS consists of 18 questions in four subscales: (1) typical symptoms, (2) differential diagnosis, (3) quality of life, and (4) additional circumstances. Translation of the ACSS into German was performed according to international guidelines. For the validation process 36 German-speaking women (age: 18-90 years), with and without symptoms of AC, were included in the study. Classification of participants into two groups (patients or controls) was based on the presence or absence of typical symptoms and significant bacteriuria (≥ 10^3 CFU/ml). Statistical evaluations of reliability, validity, and predictive ability were performed. ROC curve analysis was performed to assess sensitivity and specificity of the ACSS and its subscales. The Mann-Whitney U test and t-test were used to compare the scores of the groups. Of the 36 German-speaking women (age: 40 ± 19 years), 19 were diagnosed with AC (patient group), while 17 women served as controls. Cronbach's α for the German ACSS total scale was 0.87. A threshold score of ≥ 6 points in category 1 (typical symptoms) significantly predicted AC (sensitivity 94.7%, specificity 82.4%). There were no significant differences in ACSS scores in patients and controls compared to the original Uzbek version of the ACSS. The German version of the ACSS showed a high reliability and validity. Therefore, the German version of the ACSS can be reliably used in clinical practice and research for diagnosis and therapeutic monitoring of patients suffering from AC.
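Sensitivity and specificity follow directly from the counts of correctly and incorrectly classified patients and controls. In the sketch below the counts are inferred, as an assumption, from the reported group sizes (19 patients, 17 controls) and percentages; the abstract itself does not list them, and the function name is hypothetical:

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the four cells of a 2x2 diagnostic table."""
    sensitivity = true_pos / (true_pos + false_neg)   # share of patients detected
    specificity = true_neg / (true_neg + false_pos)   # share of controls correctly cleared
    return sensitivity, specificity


# Assumed counts consistent with the abstract: 18 of 19 patients and 14 of 17 controls
# correctly classified at a threshold of >= 6 points on the typical-symptoms subscale
sens, spec = sensitivity_specificity(true_pos=18, false_neg=1, true_neg=14, false_pos=3)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")  # 94.7% and 82.4%
```

With groups this small, one additional misclassified participant shifts either figure by about five to six percentage points, so the estimates carry wide uncertainty.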
Intra-operative reliability of ShapeMatch cutting guide placement in total knee arthroplasty.

Science.gov (United States)

Clark, Gavin; Leong, Anthony; McEwen, Peter; Steele, Robert; Tran, Ton; Trivett, Adrian

2013-01-01

Custom cutting guides based on pre-operative imaging have been introduced for total knee arthroplasty (TKA). The aim of this prospective cohort study was to assess the reliability of repeated placement of custom cutting guides by multiple surgeons in a group of patients undergoing TKA. Custom cutting guides (ShapeMatch®, Stryker Orthopaedics) were designed from pre-operative MRI scans. The treating surgeon placed each guide on the femur and tibia of each patient three times without pinning the block. The three-dimensional position and orientation of the guide was measured for each repetition using a computer navigation system. The surgeon was blinded to the navigation system display. Data from 24 patients and 6 surgeons were analyzed. Intraclass correlation coefficients for all measurement parameters were in the range 0.889-0.997 (excellent), and all comparisons were statistically significant (p [...] reliable.
Coronary calcium screening with dual-source CT: reliability of ungated, high-pitch chest CT in comparison with dedicated calcium-scoring CT

Energy Technology Data Exchange (ETDEWEB)

Hutt, Antoine; Faivre, Jean-Baptiste; Remy, Jacques; Remy-Jardin, Martine [CHRU et Universite de Lille, Department of Thoracic Imaging, Hospital Calmette (EA 2694), Lille (France); Duhamel, Alain; Deken, Valerie [CHRU et Universite de Lille, Department of Biostatistics (EA 2694), Lille (France); Molinari, Francesco [Centre Hospitalier General de Tourcoing, Department of Radiology, Tourcoing (France)

2016-06-15

To investigate the reliability of ungated, high-pitch dual-source CT for coronary artery calcium (CAC) screening. One hundred and eighty-five smokers underwent a dual-source CT examination with acquisition of two sets of images during the same session: (a) ungated, high-pitch and high-temporal resolution acquisition over the entire thorax (i.e., chest CT); (b) prospectively ECG-triggered acquisition over the cardiac cavities (i.e., cardiac CT). Sensitivity and specificity of chest CT for detecting positive CAC scores were 96.4 % and 100 %, respectively. There was excellent inter-technique agreement for determining the quantitative CAC score (ICC = 0.986). The mean difference between the two techniques was 11.27, representing 1.81 % of the average of the two techniques. The inter-technique agreement for categorizing patients into the four ranks of severity was excellent (weighted kappa = 0.95; 95 % CI 0.93-0.98). The inter-technique differences for quantitative CAC scores did not correlate with BMI (r = 0.05, p = 0.575) or heart rate (r = -0.06, p = 0.95); 87.2 % of them were explained by differences at the level of the right coronary artery (RCA: 0.8718; LAD: 0.1008; LCx: 0.0139; LM: 0.0136). Ungated, high-pitch dual-source CT is a reliable imaging mode for CAC screening in the conditions of routine chest CT examinations. (orig.)
Interrater reliability of the mind map assessment rubric in a cohort of medical students

Directory of Open Access Journals (Sweden)

Zipp Genevieve

2009-04-01

Background: Learning strategies are thinking tools that students can use to actively acquire information. Examples of learning strategies include mnemonics, charts, and maps. One strategy that may help students master the tsunami of information presented in medical school is the mind map learning strategy. Currently, there is no valid and reliable rubric to grade mind maps and this may contribute to their underutilization in medicine. Because concept maps and mind maps engage learners similarly at a metacognitive level, a valid and reliable concept map assessment scoring system was adapted to form the mind map assessment rubric (MMAR). The MMAR can assess mind map depth based upon concept-links, cross-links, hierarchies, examples, pictures, and colors. The purpose of this study was to examine interrater reliability of the MMAR. Methods: This exploratory study was conducted at a US medical school as part of a larger investigation on learning strategies. Sixty-six (N = 66) first-year medical students were given a 394-word text passage followed by a 30-minute presentation on mind mapping. After the presentation, subjects were again given the text passage and instructed to create mind maps based upon the passage. The mind maps were collected and independently scored using the MMAR by 3 examiners. Interrater reliability was measured using the intraclass correlation coefficient (ICC) statistic. Statistics were calculated using SPSS version 12.0 (Chicago, IL). Results: Analysis of the mind maps revealed the following: concept-links ICC = .05 (95% CI, -.42 to .38), cross-links ICC = .58 (95% CI, .37 to .73), hierarchies ICC = .23 (95% CI, -.15 to .50), examples ICC = .53 (95% CI, .29 to .69), pictures ICC = .86 (95% CI, .79 to .91), colors ICC = .73 (95% CI, .59 to .82), and total score ICC = .86 (95% CI, .79 to .91). Conclusion: The high ICC value for total mind map score indicates strong MMAR interrater reliability. Pictures and colors demonstrated moderate
Interrater reliability of the mind map assessment rubric in a cohort of medical students.

Science.gov (United States)

D'Antoni, Anthony V; Zipp, Genevieve Pinto; Olson, Valerie G

2009-04-28

Learning strategies are thinking tools that students can use to actively acquire information. Examples of learning strategies include mnemonics, charts, and maps. One strategy that may help students master the tsunami of information presented in medical school is the mind map learning strategy. Currently, there is no valid and reliable rubric to grade mind maps and this may contribute to their underutilization in medicine. Because concept maps and mind maps engage learners similarly at a metacognitive level, a valid and reliable concept map assessment scoring system was adapted to form the mind map assessment rubric (MMAR). The MMAR can assess mind map depth based upon concept-links, cross-links, hierarchies, examples, pictures, and colors. The purpose of this study was to examine interrater reliability of the MMAR. This exploratory study was conducted at a US medical school as part of a larger investigation on learning strategies. Sixty-six (N = 66) first-year medical students were given a 394-word text passage followed by a 30-minute presentation on mind mapping. After the presentation, subjects were again given the text passage and instructed to create mind maps based upon the passage. The mind maps were collected and independently scored using the MMAR by 3 examiners. Interrater reliability was measured using the intraclass correlation coefficient (ICC) statistic. Statistics were calculated using SPSS version 12.0 (Chicago, IL). Analysis of the mind maps revealed the following: concept-links ICC = .05 (95% CI, -.42 to .38), cross-links ICC = .58 (95% CI, .37 to .73), hierarchies ICC = .23 (95% CI, -.15 to .50), examples ICC = .53 (95% CI, .29 to .69), pictures ICC = .86 (95% CI, .79 to .91), colors ICC = .73 (95% CI, .59 to .82), and total score ICC = .86 (95% CI, .79 to .91). The high ICC value for total mind map score indicates strong MMAR interrater reliability. Pictures and colors demonstrated moderate to strong interrater reliability. We conclude that the
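The design in this record (every mind map scored by the same 3 examiners) is the usual setting for a two-way intraclass correlation. The abstract does not say which ICC form was requested in SPSS, so the following is only a minimal sketch that assumes the single-rater, absolute-agreement form often written ICC(2,1). The function name and the toy ratings are hypothetical and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject, one column per rater.
    """
    n = len(ratings)        # number of subjects (e.g. mind maps)
    k = len(ratings[0])     # number of raters (e.g. examiners)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical total scores for 5 maps, each scored by 3 examiners
toy = [[12, 14, 13], [20, 22, 19], [7, 9, 8], [15, 15, 16], [25, 27, 24]]
print(round(icc_2_1(toy), 3))
```

A consistency-type ICC would drop the rater term from the denominator; which form is appropriate depends on whether systematic differences between examiners should count as disagreement.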
[Reliability and validity of the Severe Impairment Battery, short form (SIB-s), in patients with dementia in Spain].

Science.gov (United States)

Cruz-Orduña, Isabel; Agüera-Ortiz, Luis F; Montorio-Cerrato, Ignacio; León-Salas, Beatriz; Valle de Juan, M Cristina; Martínez-Martín, Pablo

2015-01-01

People with progressive dementia evolve into a state where traditional neuropsychological tests are not effective. Severe Impairment Battery (SIB) and short form (SIB-s) were developed for evaluating the cognitive status in patients with severe dementia. To evaluate the psychometric attributes of the SIB-s in patients with severe dementia. 127 institutionalized patients (female: 86.6%; mean age: 82.6 ± 7.5 years-old) with dementia were assessed with the SIB-s, the Global Deterioration Scale (GDS), Mini-Mental State Examination (MMSE), Severe Mini-Mental State Examination (sMMSE), Barthel Index and FAST. SIB-s acceptability, reliability, validity and precision were analyzed. The mean total score for scale was 19.1 ± 15.34 (range: 0-48). Floor effect was 18.1%, only marginally higher than the desirable 15%. Factor analysis identified a single factor explaining 68% of the total variance of the scale. Cronbach's alpha coefficient was 0.96 and the item-total corrected correlation ranged from 0.27 to 0.83. The item homogeneity value was 0.43. Test-retest and inter-rater reliability for the total score was satisfactory (ICC: 0.96 and 0.95, respectively). The SIB-s showed moderate correlation with functional dependency scales (Barthel Index: 0.48, FAST: -0.74). Standard error of measurement was 3.07 for the total score. The SIB-s is a reliable and valid instrument for evaluating patients with severe dementia in the Spanish population of relatively brief instruments.
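The standard error of measurement reported in this record can be related to the other figures in the abstract. A common definition is SEM = SD × sqrt(1 − reliability). The abstract does not state which reliability coefficient the authors used, so the sketch below simply assumes 0.96 (the value reported for both Cronbach's alpha and test-retest ICC) together with the reported SD of 15.34. The function name is hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), a common textbook definition."""
    return sd * math.sqrt(1.0 - reliability)


# Values taken from the abstract: SD of the total score and a reliability of 0.96
sem = standard_error_of_measurement(sd=15.34, reliability=0.96)
print(round(sem, 2))  # 3.07, matching the SEM reported for the total score
```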
Novel Semiquantitative Bone Marrow Oedema Score and Fracture Score for the Magnetic Resonance Imaging Assessment of the Active Charcot Foot in Diabetes

Science.gov (United States)

Meacock, L.; Donaldson, Ana; Isaac, A.; Briody, A.; Ramnarine, R.; Edmonds, M. E.; Elias, D. A.

2017-01-01

There are no accepted methods to grade bone marrow oedema (BMO) and fracture on magnetic resonance imaging (MRI) scans in Charcot osteoarthropathy. The aim was to devise semiquantitative BMO and fracture scores on foot and ankle MRI scans in diabetic patients with active osteoarthropathy and to assess the agreement in using these scores. Three radiologists assessed 45 scans (Siemens Avanto 1.5T, dedicated foot and ankle coil) and scored independently twenty-two bones (proximal phalanges, medial and lateral sesamoids, metatarsals, tarsals, distal tibial plafond, and medial and lateral malleoli) for BMO (0—no oedema, 1—oedema 50% of bone volume) and fracture (0—no fracture, 1—fracture, and 2—collapse/fragmentation). Interobserver agreement and intraobserver agreement were measured using multilevel modelling and intraclass correlation (ICC). The interobserver agreement for the total BMO and fracture scores was very good (ICC = 0.83, 95% confidence intervals (CI) 0.76, 0.91) and good (ICC = 0.62; 95% CI 0.48, 0.76), respectively. The intraobserver agreement for the total BMO and fracture scores was good (ICC = 0.78, 95% CI 0.6, 0.95) and fair to moderate (ICC = 0.44; 95% CI 0.14, 0.74), respectively. The proposed BMO and fracture scores are reliable and can be used to grade the extent of bone damage in the active Charcot foot. PMID:29230422
The test-retest reliability of the latent construct of executive function depends on whether tasks are represented as formative or reflective indicators.

Science.gov (United States)

Willoughby, Michael T; Kuhn, Laura J; Blair, Clancy B; Samek, Anya; List, John A

2017-10-01

This study investigates the test-retest reliability of a battery of executive function (EF) tasks with a specific interest in testing whether the method that is used to create a battery-wide score would result in differences in the apparent test-retest reliability of children's performance. A total of 188 4-year-olds completed a battery of computerized EF tasks twice across a period of approximately two weeks. Two different approaches were used to create a score that indexed children's overall performance on the battery, i.e., (1) the mean score of all completed tasks and (2) a factor score estimate which used confirmatory factor analysis (CFA). Pearson and intra-class correlations were used to investigate the test-retest reliability of individual EF tasks, as well as an overall battery score. Consistent with previous studies, the test-retest reliability of individual tasks was modest (rs ≈ .60). The test-retest reliability of the overall battery scores differed depending on the scoring approach (r_mean = .72; r_factor_score = .99). It is concluded that children's performance on individual EF tasks exhibits modest levels of test-retest reliability. This underscores the importance of administering multiple tasks and aggregating performance across these tasks in order to improve precision of measurement. However, the specific strategy that is used has a large impact on the apparent test-retest reliability of the overall score. These results replicate our earlier findings and provide additional cautionary evidence against the routine use of factor analytic approaches for representing individual performance across a battery of EF tasks.
Construct validity and reliability of the Finnish version of the Knee Injury and Osteoarthritis Outcome Score.

Science.gov (United States)

Multanen, Juhani; Honkanen, Mikko; Häkkinen, Arja; Kiviranta, Ilkka

2018-05-22

The Knee Injury and Osteoarthritis Outcome Score (KOOS) is a commonly used knee assessment and outcome tool in both clinical work and research. However, it has not been formally translated and validated in Finnish. The purpose of this study was to translate and culturally adapt the KOOS questionnaire into Finnish and to determine its validity and reliability among Finnish middle-aged patients with knee injuries. KOOS was translated and culturally adapted from English into Finnish. Subsequently, 59 patients with knee injuries completed the Finnish version of KOOS, Western Ontario and McMaster Osteoarthritis Index (WOMAC), Short-Form 36 Health Survey (SF-36) and Numeric Pain Rating Scale (Pain-NRS). The same KOOS questionnaire was re-administered 2 weeks later. Psychometric assessment of the Finnish KOOS was performed by testing its construct validity and reliability by using internal consistency, test-retest reliability and measurement error. The floor and ceiling effects were also examined. The cross-cultural adaptation revealed only minor cultural differences and was well received by the patients. For construct validity, high to moderate Spearman's Correlation Coefficients were found between the KOOS subscales and the WOMAC, SF-36, and Pain-NRS subscales. The Cronbach's alpha was from 0.79 to 0.96 for all subscales indicating acceptable internal consistency. The test-retest reliability was good to excellent, with Intraclass Correlation Coefficients ranging from 0.73 to 0.86 for all KOOS subscales. The minimal detectable change ranged from 17 to 34 on an individual level and from 2 to 4 on a group level. No floor or ceiling effects were observed. This study yielded an appropriately translated and culturally adapted Finnish version of KOOS which demonstrated good validity and reliability. Our data indicate that the Finnish version of KOOS is suitable for assessment of the knee status of Finnish patients with different knee complaints. Further studies are needed to
Validity and reliability of short form-12 questionnaire in Iranian hemodialysis patients

DEFF Research Database (Denmark)

Pakpour, Amir H.; Nourozi, Saeedeh; Mølsted, Stig

2011-01-01

INTRODUCTION: The aim of the study was to assess the validity and reliability of the SF-12 questionnaire in a sample of Iranian patients undergoing hemodialysis. MATERIALS AND METHODS: One hundred and forty-four hemodialysis patients were included from dialysis centers in Zanjan, Iran, and were...... asked to complete the SF-12 and SF-36 questionnaires. An initial test-retest reliability evaluation was performed on a sample of 70 patients from the total group, with a retest interval of 14 days. Reliability was estimated by internal consistency and validity was assessed using known-group comparisons...... and construct validity on the patient group as a whole. A linear regression analysis was used to assess any variation in the physical component summary and mental component summary scores of the SF-36 with the respective component summary scores of the SF-12. In addition, the factor structure...
Reliability of patient specific instrumentation in total knee arthroplasty.

Science.gov (United States)

Jennart, Harold; Ngo Yamben, Marie-Ange; Kyriakidis, Theofylaktos; Zorman, David

2015-12-01

The aim of this study was to compare the precision between Patient Specific Instrumentation (PSI) and Conventional Instrumentation (CI) as determined intra-operatively by a pinless navigation system. Eighty patients were included in this prospective comparative study and they were divided into two homogeneous groups. We defined an original score from 6 to 30 points to evaluate the accuracy of the position of the cutting guides. This score is based on 6 objective criteria. The analysis indicated that PSI was not superior to conventional instrumentation in the overall score (p = 0.949). Moreover, no statistically significant difference was observed for any individual criteria of our score. Level of evidence II.
Crosscultural Adaptation and Validation of the Korean Version of the New Knee Society Knee Scoring System.

Science.gov (United States)

Kim, Seok Jin; Basur, Mohnish Singh; Park, Chang Kyu; Chong, Suri; Kang, Yeon Gwi; Kim, Moon Ju; Jeong, Jeong Seong; Kim, Tae Kyun

2017-06-01

The 2011 Knee Society Score © (2011 KS Score © ) is used to characterize the expectations, symptoms, physical activity, and satisfaction of patients who undergo TKA and is widely used to assess the outcome of TKA. However, it has not been adapted or validated for use in Korea. We developed a Korean version of the 2011 KS Score and evaluated the (1) test-retest reliability, (2) convergent validity, and (3) responsiveness of the Korean version. The Korean version of the 2011 KS Score was derived by using a well-established translational procedure based on international guidelines, which include translation, synthesis, back-translation, expert committee review, pretesting, and submission for appraisal. A total of 123 patients with knee osteoarthritis who were scheduled to undergo TKA were recruited for the study. Ninety percent of the patients (111 of 123) were women, which is an exact representation of the Korean population having TKAs. To evaluate reliability, the patients were evaluated twice during a 4-week interval using the questionnaire. Reliability was assessed by using intraclass correlation coefficients (ICCs) and internal consistency by using Cronbach's alpha to determine the validity of the Korean version of the 2011 KS Score. The patients were evaluated by using the validated Korean versions of the WOMAC and SF-36 questionnaires. Spearman's correlation coefficient was used for validation. Responsiveness was determined by calculating the standardized response mean from the preoperative and postoperative test scores in the Korean version of the 2011 KS Score. To address the gender disparity in our study we identified 53 males who underwent TKA for osteoarthritis after completion of this study and generated age-matched controlled groups to evaluate construct validity and responsiveness in Korean males. The reliability proved good to excellent with an ICC between 0.69 and 0.85, depending on the clinical properties tested, which included the following

Validity and reliability of the Dutch version of the Copenhagen Hip And Groin Outcome Score (HAGOS-NL) in patients with hip pathology.

Directory of Open Access Journals (Sweden)

Hilde Giezen

The Copenhagen Hip And Groin Outcome Score (HAGOS) was developed to assess disease-specific consequences in young to middle-aged, physically active hip and/or groin patients. The study aimed to determine validity and reliability of the Dutch version of the HAGOS (HAGOS-NL) for middle-aged patients with hip complaints. To assess validity, 117 participants completed five questionnaires: HAGOS-NL, international Hip Outcome Tool (iHOT-12NL), Hip disability and Osteoarthritis Outcome Score (HOOS), RAND-36 Health Survey and Tegner activity scale. Structural validity was determined by conducting confirmatory factor analysis. Construct validity was analyzed by formulating predefined hypotheses regarding relationships between the HAGOS-NL and subscales of the iHOT-12NL, HOOS, RAND-36 and Tegner activity scale. The HAGOS-NL was filled out again by 67 patients to explore test-retest reliability. Reliability was assessed in terms of Cronbach's alpha, Intraclass Correlation Coefficient (ICC), Standard Error of Measurement (SEM) and Minimal Detectable Change (MDC). The Bland and Altman method was used to explore absolute agreement. Factor analysis confirmed that the HAGOS-NL consists of six subscales. All hypotheses were confirmed, indicating good construct validity. Internal consistency was good, with Cronbach's alpha values ranging from 0.89 to 0.98. Test-retest reliability was considered good, with ICC values of 0.80 and higher. The SEM ranged from 6.6 to 12.3, and MDC at individual level from 18.3 to 34.1 and at group level from 2.3 to 4.4. Bland and Altman analyses showed no bias. The HAGOS-NL is a reliable and valid instrument for measuring pain, physical functioning and quality of life in middle-aged patients with hip complaints.
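The SEM and individual-level MDC ranges in this record are linked by a widely used convention, MDC95 = 1.96 × sqrt(2) × SEM. The abstract does not spell out the formula, so the following is a hedged check under that assumption rather than a statement of the authors' method. The function name is hypothetical.

```python
import math


def mdc_individual(sem, z=1.96):
    """Minimal detectable change at the individual level: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem


# Lower and upper SEM values reported in the abstract
low = mdc_individual(6.6)
high = mdc_individual(12.3)
print(round(low, 1), round(high, 1))  # 18.3 34.1, matching the reported individual-level MDC range
```

The sqrt(2) factor reflects that a change score combines the measurement error of two administrations.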
Rhythm and Melody Tasks for School-Aged Children With and Without Musical Training: Age-Equivalent Scores and Reliability

Directory of Open Access Journals (Sweden)

Kierla Ireland

2018-04-01

Measuring musical abilities in childhood can be challenging. When music training and maturation occur simultaneously, it is difficult to separate the effects of specific experience from age-based changes in cognitive and motor abilities. The goal of this study was to develop age-equivalent scores for two measures of musical ability that could be reliably used with school-aged children (7–13) with and without musical training. The children's Rhythm Synchronization Task (c-RST) and the children's Melody Discrimination Task (c-MDT) were adapted from adult tasks developed and used in our laboratories. The c-RST is a motor task in which children listen and then try to synchronize their taps with the notes of a woodblock rhythm while it plays twice in a row. The c-MDT is a perceptual task in which the child listens to two melodies and decides if the second was the same or different. We administered these tasks to 213 children in music camps (musicians, n = 130) and science camps (non-musicians, n = 83). We also measured children's paced tapping, non-paced tapping, and phonemic discrimination as baseline motor and auditory abilities. We estimated internal-consistency reliability for both tasks, and compared children's performance to results from studies with adults. As expected, musically trained children outperformed those without music lessons, scores decreased as difficulty increased, and older children performed the best. Using non-musicians as a reference group, we generated a set of age-based z-scores, and used them to predict task performance with additional years of training. Years of lessons significantly predicted performance on both tasks, over and above the effect of age. We also assessed the relation between musicians' scores on music tasks, baseline tasks, auditory working memory, and non-verbal reasoning. Unexpectedly, musician children outperformed non-musicians in two of three baseline tasks. The c-RST and c-MDT fill an important need for
Rhythm and Melody Tasks for School-Aged Children With and Without Musical Training: Age-Equivalent Scores and Reliability.

Science.gov (United States)

Ireland, Kierla; Parker, Averil; Foster, Nicholas; Penhune, Virginia

2018-01-01

Measuring musical abilities in childhood can be challenging. When music training and maturation occur simultaneously, it is difficult to separate the effects of specific experience from age-based changes in cognitive and motor abilities. The goal of this study was to develop age-equivalent scores for two measures of musical ability that could be reliably used with school-aged children (7-13) with and without musical training. The children's Rhythm Synchronization Task (c-RST) and the children's Melody Discrimination Task (c-MDT) were adapted from adult tasks developed and used in our laboratories. The c-RST is a motor task in which children listen and then try to synchronize their taps with the notes of a woodblock rhythm while it plays twice in a row. The c-MDT is a perceptual task in which the child listens to two melodies and decides if the second was the same or different. We administered these tasks to 213 children in music camps (musicians, n = 130) and science camps (non-musicians, n = 83). We also measured children's paced tapping, non-paced tapping, and phonemic discrimination as baseline motor and auditory abilities. We estimated internal-consistency reliability for both tasks, and compared children's performance to results from studies with adults. As expected, musically trained children outperformed those without music lessons, scores decreased as difficulty increased, and older children performed the best. Using non-musicians as a reference group, we generated a set of age-based z-scores, and used them to predict task performance with additional years of training. Years of lessons significantly predicted performance on both tasks, over and above the effect of age. We also assessed the relation between musicians' scores on music tasks, baseline tasks, auditory working memory, and non-verbal reasoning. Unexpectedly, musician children outperformed non-musicians in two of three baseline tasks. The c-RST and c-MDT fill an important need for researchers
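The record above mentions age-based z-scores built from a reference group of non-musicians. The abstract does not describe how the norms were computed (for example, per age band or by regression on age), so this is only a minimal sketch assuming simple per-age means and standard deviations. All names and numbers are hypothetical.

```python
from statistics import mean, stdev


def build_norms(reference_scores):
    """reference_scores: dict mapping age (years) -> list of reference-group scores."""
    return {age: (mean(s), stdev(s)) for age, s in reference_scores.items()}


def age_z_score(score, age, norms):
    """Express a child's score relative to same-age reference children."""
    mu, sigma = norms[age]
    return (score - mu) / sigma


# Hypothetical task scores for non-musician children aged 8 and 9
norms = build_norms({8: [10, 12, 14], 9: [13, 15, 17, 19]})
print(age_z_score(14, 8, norms))
```

A z-score of 0 means the child performs at the reference-group mean for that age; positive values indicate performance above it.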
Modified Ashworth scale and spasm frequency score in spinal cord injury

DEFF Research Database (Denmark)

Baunsgaard, C. B.; Nissen, U. V.; Christensen, K. B.

2016-01-01

.94 and inter-rater κweighted=0.93. Correlation between MAS and SFS showed non-significant correlation coefficients from-0.11 to 0.90. CONCLUSION: Reliability of MAS is highly affected by the weighting scheme. With a weighted-κ it was overall reliable and simple-κ overall unreliability. Repeated tests should......STUDY DESIGN: Intra- and inter-rater reliability study. OBJECTIVES: To assess intra- and inter-rater reliability of the Modified Ashworth Scale (MAS) and Spasm Frequency Score (SFS) in lower extremities in a population of spinal cord-injured persons, as well as correlations between the two scales....... SETTING: Clinic for Spinal Cord Injuries, Rigshospitalet, Hornbaek, Denmark. METHODS: Thirty-one persons participated in the study and were tested four times in total with MAS and SFS by three experienced raters. Cohen's kappa (κ), simple and quadratic weighted (nominal and ordinal scale level...
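The conclusion of this record, that MAS reliability depends heavily on the weighting scheme, can be illustrated with a small implementation of Cohen's kappa. This is a sketch under stated assumptions: MAS grades are encoded here as integers 0 to 5 (for 0, 1, 1+, 2, 3, 4), and the two rating lists are invented to show near-miss disagreements. None of this reproduces the study's data.

```python
def cohen_kappa(r1, r2, n_cat, weights=None):
    """Cohen's kappa for two raters using integer categories 0..n_cat-1.

    weights: None (simple kappa), "linear", or "quadratic".
    """
    n = len(r1)
    # Joint proportion table of rater 1 (rows) by rater 2 (columns)
    obs = [[0.0] * n_cat for _ in range(n_cat)]
    for a, b in zip(r1, r2):
        obs[a][b] += 1.0 / n
    p1 = [sum(obs[i]) for i in range(n_cat)]
    p2 = [sum(obs[i][j] for i in range(n_cat)) for j in range(n_cat)]

    def disagreement(i, j):
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (n_cat - 1)
        return d if weights == "linear" else d * d

    cells = [(i, j) for i in range(n_cat) for j in range(n_cat)]
    observed = sum(disagreement(i, j) * obs[i][j] for i, j in cells)
    expected = sum(disagreement(i, j) * p1[i] * p2[j] for i, j in cells)
    return 1.0 - observed / expected


# Hypothetical MAS grades from two raters; every disagreement is one grade apart
rater_a = [0, 1, 2, 3, 4, 5, 1, 2]
rater_b = [0, 2, 2, 4, 4, 5, 1, 3]
simple = cohen_kappa(rater_a, rater_b, 6)
quadratic = cohen_kappa(rater_a, rater_b, 6, weights="quadratic")
print(round(simple, 2), round(quadratic, 2))
```

Because quadratic weights penalise one-grade disagreements only lightly, the weighted value comes out much higher than the simple one on ordinal data like this.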
Interpreting Quality of Life after Brain Injury Scores: Cross-Walk with the Short Form-36.

Science.gov (United States)

Wilson, Lindsay; Marsden-Loftus, Isaac; Koskinen, Sanna; Bakx, Wilbert; Bullinger, Monika; Formisano, Rita; Maas, Andrew; Neugebauer, Edmund; Powell, Jane; Sarajuuri, Jaana; Sasse, Nadine; von Steinbuechel, Nicole; von Wild, Klaus; Truelle, Jean-Luc

2017-01-01

The Quality of Life after Brain Injury (QOLIBRI) instruments are traumatic brain injury (TBI)-specific assessments of health-related quality of life (HRQoL), with established validity and reliability. The purpose of the study is to help improve the interpretability of the two QOLIBRI summary scores (the QOLIBRI Total score and the QOLIBRI Overall Scale [OS] score). An analysis was conducted of 761 patients with TBI who took part in the QOLIBRI validation studies. A cross-walk between QOLIBRI scores and the SF-36 Mental Component Summary norm-based scoring system was performed using geometric mean regression analysis. The exercise supports a previous suggestion that QOLIBRI Total scores GOSE), as a measure of global function, are presented in the form of means and standard deviations that allow comparison with other studies, and data on age and sex are presented for the QOLIBRI-OS. While bearing in mind the potential imprecision of the comparison, the findings provide a framework for evaluating QOLIBRI summary scores in relation to generic HRQoL that improves their interpretability.
Reliability and validity of the foot and ankle outcome score: a validation study from Iran.

Science.gov (United States)

Negahban, Hossein; Mazaheri, Masood; Salavati, Mahyar; Sohani, Soheil Mansour; Askari, Marjan; Fanian, Hossein; Parnianpour, Mohamad

2010-05-01

The aims of this study were to culturally adapt and validate the Persian version of Foot and Ankle Outcome Score (FAOS) and present data on its psychometric properties for patients with different foot and ankle problems. The Persian version of FAOS was developed after a standard forward-backward translation and cultural adaptation process. The sample included 93 patients with foot and ankle disorders who were asked to complete two questionnaires: FAOS and Short-Form 36 Health Survey (SF-36). To determine test-retest reliability, 60 randomly chosen patients completed the FAOS again 2 to 6 days after the first administration. Test-retest reliability and internal consistency were assessed using intraclass correlation coefficient (ICC) and Cronbach's alpha, respectively. To evaluate convergent and divergent validity of FAOS compared to similar and dissimilar concepts of SF-36, the Spearman's rank correlation was used. Dimensionality was determined by assessing item-subscale correlation corrected for overlap. The results of test-retest reliability show that all the FAOS subscales have a very high ICC, ranging from 0.92 to 0.96. The minimum Cronbach's alpha level of 0.70 was exceeded by most subscales. The Spearman's correlation coefficient for convergent construct validity fell within 0.32 to 0.58 for the main hypotheses presented a priori between FAOS and SF-36 subscales. For dimensionality, the minimum Spearman's correlation coefficient of 0.40 was exceeded by most items. In conclusion, the results of our study show that the Persian version of FAOS seems to be suitable for Iranian patients with various foot and ankle problems especially lateral ankle sprain. Future studies are needed to establish stronger psychometric properties for patients with different foot and ankle problems.
Automatic Sleep Scoring in Normals and in Individuals with Neurodegenerative Disorders According to New International Sleep Scoring Criteria

DEFF Research Database (Denmark)

Jensen, Peter S.; Sørensen, Helge Bjarup Dissing; Leonthin, Helle

2010-01-01

The aim of this study was to develop a fully automatic sleep scoring algorithm on the basis of a reproduction of new international sleep scoring criteria from the American Academy of Sleep Medicine. A biomedical signal processing algorithm was developed, allowing for automatic sleep depth....... Based on an observed reliability of the manual scorer of 92.5% (Cohen's Kappa: 0.87) in the normal group and 85.3% (Cohen's Kappa: 0.73) in the abnormal group, this study concluded that although the developed algorithm was capable of scoring normal sleep with an accuracy around the manual interscorer...... reliability, it failed in accurately scoring abnormal sleep as encountered for the Parkinson disease/multiple system atrophy patients....
Automatic sleep scoring in normals and in individuals with neurodegenerative disorders according to new international sleep scoring criteria

DEFF Research Database (Denmark)

Jensen, Peter S; Sorensen, Helge B D; Jennum, Poul

2010-01-01

The aim of this study was to develop a fully automatic sleep scoring algorithm on the basis of a reproduction of new international sleep scoring criteria from the American Academy of Sleep Medicine. A biomedical signal processing algorithm was developed, allowing for automatic sleep depth....... Based on an observed reliability of the manual scorer of 92.5% (Cohen's Kappa: 0.87) in the normal group and 85.3% (Cohen's Kappa: 0.73) in the abnormal group, this study concluded that although the developed algorithm was capable of scoring normal sleep with an accuracy around the manual interscorer...... reliability, it failed in accurately scoring abnormal sleep as encountered for the Parkinson disease/multiple system atrophy patients....
Fall Risk Score at the Time of Discharge Predicts Readmission Following Total Joint Arthroplasty.

Science.gov (United States)

Ravi, Bheeshma; Nan, Zhang; Schwartz, Adam J; Clarke, Henry D

2017-07-01

Readmission among Medicare recipients is a leading driver of healthcare expenditure. To date, most predictive tools are too coarse for direct clinical application. Our objective in this study is to determine if a pre-existing tool to identify patients at increased risk for inpatient falls, the Hendrich Fall Risk Score, could be used to accurately identify Medicare patients at increased risk for readmission following arthroplasty, regardless of whether the readmission was due to a fall. This study is a retrospective cohort study. We identified 2437 Medicare patients who underwent a primary elective total joint arthroplasty (TJA) of the hip or knee for osteoarthritis between 2011 and 2014. The Hendrich Fall Risk score was recorded for each patient preoperatively and postoperatively. Our main outcome measure was hospital readmission within 30 days of discharge. Of 2437 eligible TJA recipients, there were 226 (9.3%) patients who had a score ≥6. These patients were more likely to have an unplanned readmission (unadjusted odds ratio 2.84, 95% confidence interval 1.70-4.76, P 3 days (49.6% vs 36.6%, P = .0001), and were less likely to be sent home after discharge (20.8% vs 35.8%, P fall risk score after TJA is strongly associated with unplanned readmission. Application of this tool will allow hospitals to identify these patients and plan their discharge. Copyright © 2017 Elsevier Inc. All rights reserved.
Reliability and Validity of Beliefs about Substance Use (BSU) Questionnaire in Alcohol Dependent Patients.

Directory of Open Access Journals (Sweden)

Selçuk ASLAN

2012-11-01

Objective: In this study, we aimed to evaluate the validity and reliability of the Beliefs About Substance Use Questionnaire (BSU), which was originally developed by Wright (1993). Method: Seventy alcohol-addicted inpatients who were admitted to Ankara Dışkapı Yıldırım Beyazıt Education and Research Hospital Psychiatry Clinic, 31 healthy volunteers who had never used alcohol, and 33 social drinkers were evaluated. The BSU and the Craving Beliefs Questionnaire (CBQ) were used as assessment tools for all groups; the Beck Anxiety Inventory (BAI), Clinical Institute Withdrawal Assessment (CIWA), Dysfunctional Attitudes Questionnaire (DAS) and Automatic Thoughts Questionnaire (ATQ) were additionally used for the patient group. The correlations and differences between the questionnaires were studied. Results: Mean ages of the addicted patients, healthy controls and social drinkers were 42.3 ± 7.0, 33.5 ± 9.9 and 33.2 ± 8.9, respectively. In the patient group, the mean BSU score was 46.4 ± 21.2. For alcohol addicts, the internal reliability of the BSU was found to be adequate (Cronbach's alpha = 0.91) and item-total score correlations were between 0.33 and 0.69. Principal component analysis showed one main factor. A positive correlation was found between BSU scores and both CBQ and ATQ scores. No correlations were found between total and subscale scores of DAS and total scores of CIWA, BAI and BSU. In the evaluation of validity, mean BSU scores of alcohol addicts were found to be significantly higher than those of healthy controls and social drinkers. Conclusion: Our findings support that the Turkish version of the BSU is an adequate tool that can be used to evaluate alcohol-addicted patients' cognitive beliefs about alcohol use.
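Internal consistency in this record is summarised by Cronbach's alpha. As a rough illustration of what that coefficient measures, the sketch below uses the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the small response matrix are hypothetical and unrelated to the BSU data.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of columns, each a list of respondents' scores on one item."""
    k = len(items)
    # Total score per respondent, summing across items
    totals = [sum(vals) for vals in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1.0 - item_var / variance(totals))


# Hypothetical responses of 5 people to 3 items on a 1-7 scale
responses = [
    [2, 4, 5, 6, 7],
    [1, 4, 4, 6, 6],
    [3, 3, 5, 7, 7],
]
print(round(cronbach_alpha(responses), 2))
```

Alpha rises when items vary together across respondents, which is why it is usually read alongside item-total correlations such as those reported above.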
The validity, reliability and normative scores of the parent, teacher and self report versions of the Strengths and Difficulties Questionnaire in China

Directory of Open Access Journals (Sweden)

Coghill David

2008-04-01

Background: The Strengths and Difficulties Questionnaire (SDQ) has become one of the most widely used measurement tools in child and adolescent mental health work across the globe. The SDQ was originally developed and validated within the UK and, whilst its reliability and validity have been replicated in several countries, important cross-cultural issues have been raised. We describe normative data, reliability and validity of the Chinese translation of the SDQ (parent, teacher and self report versions) in a large group of children from Shanghai. Methods: The SDQ was administered to the parents and teachers of students from 12 of Shanghai's 19 districts, aged between 3 and 17 years old, and to those young people aged between 11 and 17 years. Retest data was collected from parents and teachers for 45 students six weeks later. Data was analysed to describe normative scores, bandings and cut-offs for normal, borderline and abnormal scores. Reliability was assessed from analyses of internal consistency, inter-rater agreement, and temporal stability. Structural validity, convergent and discriminant validity were assessed. Results: Full parent and teacher data was available for 1965 subjects and self report data for 690 subjects. Normative data for this Chinese urban population with bandings and cut-offs for borderline and abnormal scores are described. Principal components analysis indicates partial agreement with the original five-factor subscale structure; however, this appears to hold more strongly for the Prosocial Behaviour, Hyperactivity – Inattention and Emotional Symptoms subscales than for Conduct Problems and Peer Problems. Internal consistency as measured by Cronbach's α coefficient was generally low, ranging between 0.30 and 0.83, with only parent and teacher Hyperactivity – Inattention and teacher Prosocial Behaviour subscales having α > 0.7. Inter-rater correlations were similar to those reported previously (range 0.23 – 0
Reliability Generalization of the Alcohol Use Disorder Identification Test.

Science.gov (United States)

Shields, Alan L.; Caruso, John C.

2002-01-01

Evaluated the reliability of scores from the Alcohol Use Disorders Identification Test (AUDIT; J. Saunders and others, 1993) in a reliability generalization study based on 17 empirical journal articles. Results show AUDIT scores to be generally reliable for basic assessment. (SLD)
Acute Radiation Syndrome Severity Score System in Mouse Total-Body Irradiation Model.

Science.gov (United States)

Ossetrova, Natalia I; Ney, Patrick H; Condliffe, Donald P; Krasnopolsky, Katya; Hieber, Kevin P

2016-08-01

Radiation accidents or terrorist attacks can result in serious consequences for the civilian population and for military personnel responding to such emergencies. The early medical management situation requires quantitative indications for early initiation of cytokine therapy in individuals exposed to life-threatening radiation doses and effective triage tools for first responders in mass-casualty radiological incidents. Previously established animal (Mus musculus, Macaca mulatta) total-body irradiation (γ-exposure) models have evaluated a panel of radiation-responsive proteins that, together with peripheral blood cell counts, create a multiparametric dose-predictive algorithm with a threshold for detection of ~1 Gy from 1 to 7 d after exposure as well as demonstrate the acute radiation syndrome severity score systems created similar to the Medical Treatment Protocols for Radiation Accident Victims developed by Fliedner and colleagues. The authors present a further demonstration of the acute radiation sickness severity score system in a mouse (CD2F1, males) TBI model (1-14 Gy, ⁶⁰Co γ-rays at 0.6 Gy/min) based on multiple biodosimetric endpoints. This includes the acute radiation sickness severity Observational Grading System, survival rate, weight changes, temperature, peripheral blood cell counts and radiation-responsive protein expression profile: Flt-3 ligand, interleukin 6, granulocyte-colony stimulating factor, thrombopoietin, erythropoietin, and serum amyloid A. Results show that use of the multiple-parameter severity score system facilitates identification of animals requiring enhanced monitoring after irradiation and that proteomics are a complementary approach to conventional biodosimetry for early assessment of radiation exposure, enhancing accuracy and discrimination index for acute radiation sickness response categories and early prediction of outcome.
Developing of Individual Instrument Performance Anxiety Scale: Validity-Reliability Study

Directory of Open Access Journals (Sweden)

Esra DALKIRAN

2016-07-01

This study aimed to develop a scale unique to our culture concerning the individual instrument performance anxiety of students receiving instrument training in Departments of Music Education. The descriptive research model was used and qualitative research techniques were utilized. The study population consists of the students attending the 23 universities that have a Music Education Department. The sample consists of 750 students in total (438 girls and 312 boys) studying in the Department of Music Education of 10 randomly selected universities. As a result of the exploratory and confirmatory factor analyses that were performed, a one-dimensional structure consisting of 14 items was obtained. To assess the distinguishing power of the items, item-total correlation coefficients and t-scores for the difference between the lower and upper 27% groups were calculated, and both analyses showed that the items are distinguishing. The Cronbach's alpha coefficient of internal consistency of the scale was calculated as .94, and the test-retest reliability coefficient as .93. As a result, a valid and reliable assessment and evaluation instrument that measures the exam performance anxiety of students studying in the Department of Music Education has been developed.
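Several records in this listing report Cronbach's alpha as the internal-consistency coefficient. As a minimal sketch of how that coefficient is usually computed (not the authors' own analysis), the snippet below assumes a complete respondents-by-items score table with no missing values; the function name `cronbach_alpha` and the demo data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes complete data (no missing answers) and uses sample variances.
    """
    if len(scores) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least 2 items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (summed) score
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 respondents answering 4 items
demo = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
]
alpha = cronbach_alpha(demo)
print(f"alpha = {alpha:.3f}")
```

Alpha rises both with the average inter-item correlation and with the number of items, so a high value on a long scale does not by itself show that the scale is one-dimensional.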
Do Press Ganey Scores Correlate With Total Knee Arthroplasty-Specific Outcome Questionnaires in Postsurgical Patients?

Science.gov (United States)

Chughtai, Morad; Patel, Nirav K; Gwam, Chukwuweike U; Khlopas, Anton; Bonutti, Peter M; Delanois, Ronald E; Mont, Michael A

2017-09-01

The purpose of this study was to assess whether Center for Medicaid and Medicare services-implemented satisfaction (Press Ganey [PG]) survey results correlate with established total knee arthroplasty (TKA) assessment tools. Data from 736 patients who underwent TKA and received a PG survey between November 2009 and January 2015 were analyzed. The PG survey overall hospital rating scores were correlated with standardized validated outcome assessment tools for TKA (Short form-12 and 36 Health Survey; Knee Society Score; Western Ontario and McMaster Universities Arthritis Index; University of California, Los Angeles; and visual analog scale) at a mean follow-up of 1154 days post-TKA. There was no correlation between PG survey overall hospital rating score and the above-mentioned outcome assessment tools. Our study shows that there is no statistically significant relationship between established arthroplasty assessment tools and the PG overall hospital rating. Therefore, PG surveys may not be an appropriate tool to determine reimbursement for orthopedists performing TKAs. Copyright © 2017 Elsevier Inc. All rights reserved.
Reliability and Validity of 3 Methods of Assessing Orthopedic Resident Skill in Shoulder Surgery.

Science.gov (United States)

Bernard, Johnathan A; Dattilo, Jonathan R; Srikumaran, Uma; Zikria, Bashir A; Jain, Amit; LaPorte, Dawn M

Traditional measures for evaluating resident surgical technical skills (e.g., case logs) assess operative volume but not level of surgical proficiency. Our goal was to compare the reliability and validity of 3 tools for measuring surgical skill among orthopedic residents when performing 3 open surgical approaches to the shoulder. A total of 23 residents at different stages of their surgical training were tested for technical skill pertaining to 3 shoulder surgical approaches using the following measures: Objective Structured Assessment of Technical Skills (OSATS) checklists, the Global Rating Scale (GRS), and a final pass/fail assessment determined by 3 upper extremity surgeons. Adverse events were recorded. The Cronbach α coefficient was used to assess reliability of the OSATS checklists and GRS scores. Interrater reliability was calculated with intraclass correlation coefficients. Correlations among OSATS checklist scores, GRS scores, and pass/fail assessment were calculated with Spearman ρ. Validity of OSATS checklists was determined using analysis of variance with postgraduate year (PGY) as a between-subjects factor. Significance was set at p shoulder approaches. Checklist scores showed superior interrater reliability compared with GRS and subjective pass/fail measurements. GRS scores were positively correlated across training years. The incidence of adverse events was significantly higher among PGY-1 and PGY-2 residents compared with more experienced residents. OSATS checklists are a valid and reliable assessment of technical skills across 3 surgical shoulder approaches. However, checklist scores do not measure quality of technique. Documenting adverse events is necessary to assess quality of technique and ultimate pass/fail status. Multiple methods of assessing surgical skill should be considered when evaluating orthopedic resident surgical performance. Copyright © 2016 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights
Translation and Adaptation of Knee Injury and Osteoarthritis Outcome Score (KOOS) into Persian and Testing Persian Version Reliability Among Iranians with Osteoarthritis

Directory of Open Access Journals (Sweden)

Solaleh Saraei-Pour

2007-04-01

Objective: To achieve a reliable tool for measuring health-related quality of life among Iranians with knee osteoarthritis, by translating and culturally adapting the Knee injury and Osteoarthritis Outcome Score (KOOS) to Persian and testing the reliability and internal consistency of the Iranian version. Materials & Methods: This was a non-experimental methodological study. KOOS was translated and culturally adapted to the Persian language and culture in three phases, following the IQOLA project. To examine test-retest reliability, the Iranian version of KOOS was completed twice, at an interval of at least two days and at most one week, by 30 Iranian people with knee OA who were referred to Municipality and 110 physiotherapy clinics of Tehran with a PT order from physicians. Convenience (non-probability) sampling was used. Psychometric evaluation: the data collected from the questionnaires were scored and analyzed with SPSS software for test-retest reliability, absolute reliability, and subscale and item internal consistency. Results: Internal consistency, calculated by Cronbach's α, was high for all the subscales (at least 0.76), except for the "symptom" subscale, which was moderate, and showed that the items of each subscale measured the same construct. Item internal consistency after correction for overlap was higher than the optimal value (0.4), except for the items of the "symptom" subscale, which demonstrated good item internal consistency. SEM and ICC, which were used to evaluate absolute and test-retest reliability respectively, showed that all the subscales had good test-retest reliability (0.7), and the absolute reliability was also very good, in that the highest calculated SEM for the Persian version was 7.44, which was less than the Minimal Perceptible Clinical Improvement (MPCI), estimated at 8 to 10 for the KOOS questionnaire. Conclusion: With the Persian
The Reliability of Clock Drawing Test Scoring Systems Modeled on the Normative Data in Healthy Aging and Nonamnestic Mild Cognitive Impairment.

Science.gov (United States)

Mazancova, Adela Fendrych; Nikolai, Tomas; Stepankova, Hana; Kopecek, Miloslav; Bezdicek, Ondrej

2017-10-01

The Clock Drawing Test (CDT) is a commonly used tool in clinical practice and research for cognitive screening among older adults. The main goal of the present study was to analyze the interrater reliability of three different CDT scoring systems (by Shulman et al., Babins et al., and Cohen et al.). We used a clock with a predrawn circle. The CDT was evaluated by three independent raters based on the normative data set of healthy older and very old adults and patients with nonamnestic mild cognitive impairment (naMCI; N = 438; aged 61-94). We confirmed a high interrater reliability measured by the intraclass correlation coefficients (ICCs): Shulman ICC = .809, Babins ICC = .894, and Cohen ICC = .862, all p < .001. We found that age and education levels have a significant effect on CDT performance, yet there was no influence of gender. Finally, the scoring systems differentiated between naMCI and age- and education-matched controls: Shulman's area under the receiver operating characteristic curve (AUC) = .84, Cohen AUC = .71, all p < .001; and a slightly lower discriminative ability was shown by Babins: AUC = .65, p = .012.
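Interrater reliability in records like the one above is summarised with an intraclass correlation coefficient, but abstracts often do not say which ICC form was used. The sketch below assumes one common choice, ICC(2,1) (two-way random effects, absolute agreement, single rater), computed from ANOVA mean squares on a complete subjects-by-raters table. The function name `icc_2_1` and the small ratings table are illustrative only and are not data from any study listed here.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of n subjects, each a list of k ratings (one per rater).
    Assumes a complete table with n >= 2 and k >= 2.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # raters
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative table: 6 subjects scored by 4 raters
table = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(f"ICC(2,1) = {icc_2_1(table):.2f}")
```

In this table the raters rank the subjects similarly but differ in overall level, so an absolute-agreement ICC comes out much lower than a consistency ICC would. That is one reason a reported ICC is hard to interpret without its form.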
Reliability and Accuracy of Cross-sectional Radiographic Assessment of Severe Knee Osteoarthritis: Role of Training and Experience.

Science.gov (United States)

Klara, Kristina; Collins, Jamie E; Gurary, Ellen; Elman, Scott A; Stenquist, Derek S; Losina, Elena; Katz, Jeffrey N

2016-07-01

To determine the reliability of radiographic assessment of knee osteoarthritis (OA) by nonclinician readers compared to an experienced radiologist. The radiologist trained 3 nonclinicians to evaluate radiographic characteristics of knee OA. The radiologist and nonclinicians read preoperative films of 36 patients prior to total knee replacement. Intrareader and interreader reliability were measured using the weighted κ statistic and intraclass correlation coefficient (ICC). Intrareader reliability among nonclinicians (κ) ranged from 0.40 to 1.0 for individual radiographic features and 0.72 to 1.0 for Kellgren-Lawrence (KL) grade. ICC ranged from 0.89 to 0.98 for the Osteoarthritis Research Society International (OARSI) summary score. Interreader agreement among nonclinicians ranged from κ of 0.45 to 0.94 for individual features, and 0.66 to 0.97 for KL grade. ICC ranged from 0.87 to 0.96 for the OARSI Summary Score. Interreader reliability between nonclinicians and the radiologist ranged from κ of 0.56 to 0.85 for KL grade. ICC ranged from 0.79 to 0.88 for the OARSI Summary Score. Intrareader and interreader agreement was variable for individual radiograph features but substantial for summary KL grade and OARSI Summary Score. Investigators face tradeoffs between cost and reader experience. These data suggest that in settings where costs are constrained, trained nonclinicians may be suitable readers of radiographic knee OA, particularly if a summary score (KL grade or OARSI Score) is used to determine radiographic severity.
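The weighted κ statistic used for ordinal grades such as KL 0-4 can be sketched as follows. The abstract does not state the weighting scheme, so linear disagreement weights are an assumption here; the function name `weighted_kappa` and the reader grades are hypothetical.

```python
from collections import Counter


def weighted_kappa(r1, r2, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    r1, r2: equal-length sequences of ratings; categories: ordered labels.
    """
    n = len(r1)
    k = len(categories)
    if n == 0 or n != len(r2):
        raise ValueError("ratings must be non-empty and of equal length")
    if k < 2:
        raise ValueError("need at least 2 categories")
    idx = {c: i for i, c in enumerate(categories)}
    observed = Counter((idx[a], idx[b]) for a, b in zip(r1, r2))
    marg1 = Counter(idx[a] for a in r1)
    marg2 = Counter(idx[b] for b in r2)

    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # linear weight: 0 on the diagonal
            obs_disagreement += w * observed[(i, j)] / n
            exp_disagreement += w * (marg1[i] / n) * (marg2[j] / n)
    if exp_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs_disagreement / exp_disagreement


# Hypothetical KL grades (0-4) from two readers on 8 films
reader_a = [0, 1, 2, 2, 3, 4, 4, 3]
reader_b = [0, 1, 2, 3, 3, 4, 3, 3]
kappa = weighted_kappa(reader_a, reader_b, [0, 1, 2, 3, 4])
print(f"weighted kappa = {kappa:.2f}")
```

With linear weights a one-grade disagreement is penalised a quarter as much as a four-grade disagreement, so near-misses on an ordinal scale cost less than they would under unweighted κ.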
Development, reliability and validity of the psychosocial adaptation scale for Parkinson's disease in Chinese population.

Science.gov (United States)

Zhang, Tingting; Yin, Anchun; Sun, Xiaohong; Liu, Qigui; Song, Guirong; Li, Lianhong

2015-01-01

To develop a psychosocial adaptation scale for Parkinson's disease (PD) in the Chinese population and evaluate its reliability and validity. The items were designed by literature review, expert consultation and semi-structured interview. Corrected item-total correlation, discrimination analysis and exploratory factor analysis were used for item selection. A total of 427 valid scales from PD patients were collected in the study to test the reliability and validity. The scale incorporated six dimensions (anxiety, self-esteem, attitude, self-acceptance, self-efficacy and social support) with a total of 32 items. The scale possessed good internal consistency. The test-retest correlation coefficient was 0.99 and the average content validation rate was 0.97. The Hoehn and Yahr stage was correlated with the total score of the scale. The psychosocial adaptation scale in this study showed good reliability and validity; it can be used as a reliable and valid instrument to evaluate the psychosocial adaptation of PD objectively and effectively.

Use of the Liverpool Elbow Score as a postal questionnaire for the assessment of outcome after total elbow arthroplasty.

Science.gov (United States)

Ashmore, Alexander M; Gozzard, Charles; Blewitt, Neil

2007-01-01

The Liverpool Elbow Score (LES) is a newly developed, validated elbow-specific score. It consists of a patient-answered questionnaire (PAQ) and a clinical assessment. The purpose of this study was to determine whether the PAQ portion of the LES could be used independently as a postal questionnaire for the assessment of outcome after total elbow arthroplasty and to correlate the LES and the Mayo Elbow Performance Score (MEPS). A series of 51 total elbow replacements were reviewed by postal questionnaire. Patients then attended the clinic for assessment by use of both the LES and the MEPS. There was an excellent response rate to the postal questionnaire (98%), and 44 elbows were available for clinical review. Good correlation was shown between the LES and the MEPS (Spearman correlation coefficient, 0.84; P PAQ portion of the LES and the MEPS (Spearman correlation coefficient, 0.76; P PAQ component and the MEPS, suggesting that outcome assessment is possible by postal questionnaire.
Translation, cross-cultural adaptation and validation of the Danish version of the Oxford hip score

DEFF Research Database (Denmark)

Paulsen, A; Odgaard, Anders; Overgaard, S

2012-01-01

missing to calculate a sum score. Construct validity was adequate and 80% of our predefined hypotheses regarding the correlation between scores on the Danish OHS and the other questionnaires were confirmed. The intraclass correlation (ICC) of the different items ranged from 0.80 to 0.95 and the average......Objectives The Oxford hip score (OHS) is a 12-item questionnaire designed and developed to assess function and pain from the perspective of patients who are undergoing total hip replacement (THR). The OHS has been shown to be consistent, reliable, valid and sensitive to clinical change following...
Automatic sleep scoring in normals and in individuals with neurodegenerative disorders according to new international sleep scoring criteria

DEFF Research Database (Denmark)

Jensen, Peter S.; Sørensen, Helge Bjarup Dissing; Jennum, P. J.

2010-01-01

Medicine (AASM). Methods: A biomedical signal processing algorithm was developed, allowing for automatic sleep depth quantification of routine polysomnographic (PSG) recordings through feature extraction, supervised probabilistic Bayesian classification, and heuristic rule-based smoothing. The performance......Introduction: Reliable polysomnographic classification is the basis for evaluation of sleep disorders in neurological diseases. Aim: To develop a fully automatic sleep scoring algorithm on the basis of a reproduction of new international sleep scoring criteria from the American Academy of Sleep....... Conclusion: The developed algorithm was capable of scoring normal sleep with an accuracy around the manual inter-scorer reliability, it failed in accurately scoring abnormal sleep as encountered for the PD/MSA patients, which is due to the abnormal micro- and macrostructure pattern in these patients....
Preliminary validation of 2 magnetic resonance image scoring systems for osteoarthritis of the hip according to the OMERACT filter.

Science.gov (United States)

Maksymowych, Walter P; Cibere, Jolanda; Loeuille, Damien; Weber, Ulrich; Zubler, Veronika; Roemer, Frank W; Jaremko, Jacob L; Sayre, Eric C; Lambert, Robert G W

2014-02-01

Development of a validated magnetic resonance image (MRI) scoring system is essential in hip OA because radiographs are insensitive to change. We assessed the feasibility and reliability of 2 previously developed scoring methods: (1) the Hip Inflammation MRI Scoring System (HIMRISS) and (2) the Hip Osteoarthritis MRI Scoring System (HOAMS). Six readers (3 radiologists, 3 rheumatologists) participated in 2 reading exercises. In Reading Exercise 1, MRI of the hip of 20 subjects were read at a single time point followed by further standardization of methodology. In Reading Exercise 2, MRI of the hip of 18 subjects from a randomized controlled trial, assessed at 2 timepoints, and 27 subjects from a cross-sectional study were read for HIMRISS and HOAMS bone marrow lesions (BML) and synovitis. Reliability was assessed using intraclass correlation coefficient (ICC) and kappa statistics. Both methods were considered feasible. For Reading 1, HIMRISS ICC were 0.52, 0.61, 0.70, and 0.58 for femoral BML, acetabular BML, effusion, and total scores, respectively; and for HOAMS, summed BML and synovitis ICC were 0.52 and 0.46, respectively. For Reading 2, HIMRISS and HOAMS ICC for BML and synovitis-effusion improved substantially. Interobserver reliability for change scores was 0.81 and 0.71 for HIMRISS femoral and HOAMS summed BML, respectively. Responsiveness and discrimination was moderate to high for synovitis-effusion. Significant associations were noted between BML or synovitis scores and Western Ontario and McMaster Universities Osteoarthritis Index pain scores for baseline values (p ≤ 0.001). The BML and synovitis-effusion components of both HIMRISS and HOAMS scoring systems are feasible and reliable, and should be validated further.
Reliability and validity of a brief sleep questionnaire for children in Japan.

Science.gov (United States)

Okada, Masakazu; Kitamura, Shingo; Iwadare, Yoshitaka; Tachimori, Hisateru; Kamei, Yuichi; Higuchi, Shigekazu; Mishima, Kazuo

2017-09-15

There is a dearth of sleep questionnaires with few items and confirmed reliability and validity that can be used for the early detection of sleep problems in children. The aim of this study was to develop a questionnaire with few items and assess its reliability and validity in both children at high risk of sleep disorders and a community population. Data for analysis were derived from two populations targeted by the Children's Sleep Habits Questionnaire (CSHQ): 178 children attending elementary school and 432 children who visited a pediatric psychiatric hospital (aged 6-12 years). The new questionnaire was constructed as a subset of the CSHQ. The newly developed short version of the sleep questionnaire for children (19 items) had an acceptable internal consistency (0.65). Using the cutoff value of the CSHQ, the total score of the new questionnaire was confirmed to have discriminant validity (27.2 ± 3.9 vs. 22.0 ± 2.1, p questionnaire was significantly correlated with total score (r = 0.81, p questionnaire demonstrated an adequate reliability and validity in both high-risk children and a community population, as well as similar screening ability to the CSHQ. It could thus be a convenient instrument to detect sleep problems in children.
Interobserver variability of the neurological optimality score

NARCIS (Netherlands)

Monincx, W. M.; Smolders-de Haas, H.; Bonsel, G. J.; Zondervan, H. A.

1999-01-01

To assess the interobserver reliability of the neurological optimality score. The neurological optimality score of 21 full term healthy, neurologically normal newborn infants was determined by two well trained observers. The interclass correlation coefficient was 0.31. Kappa for optimality (score of
How do cognitively impaired elderly patients define "testament": reliability and validity of the testament definition scale.

Science.gov (United States)

Heinik, J; Werner, P; Lin, R

1999-01-01

The testament definition scale (TDS) is a specifically designed six-item scale aimed at measuring the respondent's capacity to define "testament." We assessed the reliability and validity of this new short scale in 31 community-dwelling cognitively impaired elderly patients. Interrater reliability for the six items ranged from .87 to .97. The interrater reliability for the total score was .77. Significant correlations were found between the TDS score and the Mini-Mental State Examination (MMSE) and the Cambridge Cognitive Examination scores (r = .71 and .72 respectively, p = .001). Criterion validity yielded significantly different means for subjects with MMSE scores of 24-30 and 0-23: mean 3.9 and 1.6 respectively (t(20) = 4.7, p = .001). Using a cutoff point of 0-2 vs. 3+, 79% of the subjects were correctly classified as severely cognitively impaired, with only 8.3% false positives, and a positive predictive value of 94%. Thus, TDS was found both reliable and valid. This scale, however, is not synonymous with testamentary capacity. The discussion deals with the methodological limitations of this study, and highlights the practical as well as the theoretical relevance of TDS. Future studies are warranted to elucidate the relationships between TDS and existing legal requirements of testamentary capacity.
Association between Diet-Quality Scores, Adiposity, Total Cholesterol and Markers of Nutritional Status in European Adults: Findings from the Food4Me Study

Directory of Open Access Journals (Sweden)

Rosalind Fallaize

2018-01-01

Diet-quality scores (DQS), which are developed across the globe, are used to define adherence to specific eating patterns and have been associated with risk of coronary heart disease and type-II diabetes. We explored the association between five diet-quality scores (Healthy Eating Index, HEI; Alternate Healthy Eating Index, AHEI; MedDietScore, MDS; PREDIMED Mediterranean Diet Score, P-MDS; Dutch Healthy Diet-Index, DHDI) and markers of metabolic health (anthropometry, objective physical activity levels (PAL), and dried blood spot total cholesterol (TC), total carotenoids, and omega-3 index) in the Food4Me cohort, using regression analysis. Dietary intake was assessed using a validated Food Frequency Questionnaire. Participants (n = 1480) were adults recruited from seven European Union (EU) countries. Overall, women had higher HEI and AHEI than men (p < 0.05), and scores varied significantly between countries. For all DQS, higher scores were associated with lower body mass index, lower waist-to-height ratio and waist circumference, and higher total carotenoids and omega-3-index (p trends < 0.05). Higher HEI, AHEI, DHDI, and P-MDS scores were associated with increased daily PAL, moderate and vigorous activity, and reduced sedentary behaviour (p trend < 0.05). We observed no association between DQS and TC. To conclude, higher DQS, which reflect better dietary patterns, were associated with markers of better nutritional status and metabolic health.
Basic Concepts in Classical Test Theory: Tests Aren't Reliable, the Nature of Alpha, and Reliability Generalization as a Meta-analytic Method.

Science.gov (United States)

Helms, LuAnn Sherbeck

This paper discusses the fact that reliability is about scores and not tests and how reliability limits effect sizes. The paper also explores the classical reliability coefficients of stability, equivalence, and internal consistency. Stability is concerned with how stable test scores will be over time, while equivalence addresses the relationship…
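The claim that score reliability limits effect sizes can be made concrete with the standard classical test theory attenuation relation. This is a sketch of the textbook result, not a formula quoted from the paper; the symbols ($r_{XY}$ for the observed correlation, $\rho_{T_X T_Y}$ for the correlation between true scores, $r_{XX'}$ and $r_{YY'}$ for the score reliabilities) are introduced here for illustration.

```latex
% Observed correlation is the true-score correlation shrunk by unreliability
r_{XY} = \rho_{T_X T_Y}\,\sqrt{r_{XX'}\, r_{YY'}}
\qquad\Longrightarrow\qquad
|r_{XY}| \le \sqrt{r_{XX'}\, r_{YY'}}

% Worked example with assumed reliabilities 0.70 and 0.80
\sqrt{0.70 \times 0.80} = \sqrt{0.56} \approx 0.75
```

Under these assumed reliabilities, even a perfect relationship between true scores would appear as an observed correlation of about 0.75 at most.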
Manual muscle testing and hand-held dynamometry in people with inflammatory myopathy: An intra- and interrater reliability and validity study.

Science.gov (United States)

Baschung Pfister, Pierrette; de Bruin, Eling D; Sterkele, Iris; Maurer, Britta; de Bie, Rob A; Knols, Ruud H

2018-01-01

Manual muscle testing (MMT) and hand-held dynamometry (HHD) are commonly used in people with inflammatory myopathy (IM), but their clinimetric properties have not yet been sufficiently studied. To evaluate the reliability and validity of MMT and HHD, maximum isometric strength was measured in eight muscle groups across three measurement events. To evaluate reliability of HHD, intra-class correlation coefficients (ICC), the standard error of measurements (SEM) and smallest detectable changes (SDC) were calculated. To measure reliability of MMT, linear Cohen's kappa was computed for single muscle groups and ICC for total score. Additionally, correlations between MMT8 and HHD were evaluated with Spearman Correlation Coefficients. Fifty people with myositis (56±14 years, 76% female) were included in the study. Intra- and interrater reliability of HHD yielded excellent ICCs (0.75-0.97) for all muscle groups, except for interrater reliability of ankle extension (0.61). The corresponding SEMs% ranged from 8 to 28% and the SDCs% from 23 to 65%. MMT8 total score revealed excellent intra- and interrater reliability (ICC>0.9). Intrarater reliability of single muscle groups was substantial for shoulder and hip abduction, elbow and neck flexion, and hip extension (0.64-0.69); moderate for wrist (0.53) and knee extension (0.49) and fair for ankle extension (0.35). Interrater reliability was moderate for neck flexion (0.54) and hip abduction (0.44); fair for shoulder abduction, elbow flexion, wrist and ankle extension (0.20-0.33); and slight for knee extension (0.08). Correlations between the two tests were low for wrist, knee, ankle, and hip extension; moderate for elbow flexion, neck flexion and hip abduction; and good for shoulder abduction. In conclusion, the MMT8 total score is a reliable assessment to consider general muscle weakness in people with myositis but not for single muscle groups. In contrast, our results confirm that HHD can be recommended to evaluate strength of
Conditional Standard Errors of Measurement for Scale Scores.

Science.gov (United States)

Kolen, Michael J.; And Others

1992-01-01

A procedure is described for estimating the reliability and conditional standard errors of measurement of scale scores incorporating the discrete transformation of raw scores to scale scores. The method is illustrated using a strong true score model, and practical applications are described. (SLD)
Reliability and Validity of the Multidimensional Scale of Perceived Social Support (MSPSS): Thai Version.

Science.gov (United States)

Wongpakaran, Tinakon; Wongpakaran, Nahathai; Ruktrakul, Ruk

2011-01-01

This study examines the Thai version of the Multidimensional Scale of Perceived Social Support (MSPSS) for its psychometric properties. In total 462 participants were recruited - 310 medical students from Chiang Mai University and 152 psychiatric patients, and they completed the Thai version of the MSPSS, the State Trait Anxiety Inventory (STAI), the Rosenberg Self-Esteem Scale (RSES) and the Thai Depression Inventory (TDI). Test-retest reliability was conducted over a four week period. Factor analysis produced three-factor solutions for both patient (PG) and student groups (SG), and overall the model demonstrated adequate fit indices. The mean total score and the sub-scale score for the SG were statistically higher than those in the PG, except for 'Significant Others'. The internal consistency of the scale was good, with a Cronbach's alpha of 0.91 for the SG and 0.87 for the PG. After a four week retest for reliability exercise, the intra-class correlation coefficient (ICC) was found to be 0.84. The Thai-MSPSS was found to have a negative correlation with the STAI and the TDI, but was positively correlated with the RSES. The Thai MSPSS is a reliable and valid instrument to use.
Translation, reliability, and clinical utility of the Melbourne Assessment 2.

Science.gov (United States)

Gerber, Corinna N; Plebani, Anael; Labruyère, Rob

2017-10-12

The aims were to (i) provide a German translation of the Melbourne Assessment 2 (MA2), a quantitative test to measure unilateral upper limb function in children with neurological disabilities and (ii) to evaluate its reliability and aspects of clinical utility. After its translation into German and approval of the back translation by the original authors, the MA2 was performed and videotaped twice with 30 children with neuromotor disorders. For each participant, two raters scored the video of the first test for inter-rater reliability. To determine test-retest reliability, one rater additionally scored the video of the second test while the other rater repeated the scoring of the first video to evaluate intra-rater reliability. Time needed for rater training, test administration, and scoring was recorded. The four subscale scores showed excellent intra-, inter-rater, and test-retest reliability with intraclass correlation coefficients of 0.90-1.00 (95%-confidence intervals 0.78-1.00). Score items revealed substantial to almost perfect intra-rater reliability (weighted kappa κw = 0.66-1.00) for the more affected side. Score item inter-rater and test-retest reliability of the same extremity were, with one exception, moderate to almost perfect (κw = 0.42-0.97; κw = 0.40-0.89). Furthermore, the MA2 was feasible and acceptable for patients and clinicians. The MA2 showed excellent subscale and moderate to almost perfect score item reliability. Implications for Rehabilitation There is a lack of high-quality studies about psychometric properties of upper limb measurement tools in the neuropediatric population. The Melbourne Assessment 2 is a promising tool for reliable measurement of unilateral upper limb movement quality in the neuropediatric population. The Melbourne Assessment 2 is acceptable and practicable to therapists and patients for routine use in clinical care.
Translation and validation of the Dutch new Knee Society Scoring System ©.

Science.gov (United States)

Van Der Straeten, Catherine; Witvrouw, Erik; Willems, Tine; Bellemans, Johan; Victor, Jan

2013-11-01

A new version of The Knee Society Knee Scoring System(©) (KSS) has recently been developed. Before this scale can be used in non-English-speaking populations, it has to be translated and validated for a particular population. We evaluated the construct and content validity, the test-retest reliability, and the internal consistency of the Dutch version of the New Knee Society KSS. A Dutch translation was performed using a forward-backward translation protocol. We tested the construct validity of the Dutch New KSS by comparing it with the Dutch versions of the WOMAC, Knee Injury and Osteoarthritis Outcome Score (KOOS), and SF-12 scores in 137 patients undergoing total knee arthroplasty (TKA). Content validity was assessed by comparing pre- and postoperative scores and by checking floor and ceiling effects. To evaluate test-retest reliability and consistency, 47 patients completed the questionnaire a second time with a mean of 8 days interval (range, 2-20 days) between tests. Construct validity was demonstrated because the Dutch New KSS correlated well with the Dutch WOMAC (r = -0.751; p Dutch KOOS (r = -0.723; p Dutch SF-12 (r = 0.569; p Dutch New KSS is an excellent instrument to evaluate TKA outcome in Dutch-speaking patients.
Reliability and validity of the Korean version of the Connor-Davidson Resilience Scale.

Science.gov (United States)

Baek, Hyun-Sook; Lee, Kyoung-Uk; Joo, Eun-Jeong; Lee, Mi-Young; Choi, Kyeong-Sook

2010-06-01

The Connor-Davidson Resilience Scale (CD-RISC) measures various aspects of psychological resilience in patients with posttraumatic stress disorder (PTSD) and other psychiatric ailments. This study sought to assess the reliability and validity of the Korean version of the Connor-Davidson Resilience Scale (K-CD-RISC). In total, 576 participants were enrolled (497 females and 79 males), including hospital nurses, university students, and firefighters. Subjects were evaluated using the K-CD-RISC, the Beck Depression Inventory (BDI), the Impact of Event Scale-Revised (IES-R), the Rosenberg Self-Esteem Scale (RSES), and the Perceived Stress Scale (PSS). Test-retest reliability and internal consistency were examined as a measure of reliability, and convergent validity and factor analysis were also performed to evaluate validity. Cronbach's alpha coefficient and test-retest reliability were 0.93 and 0.93, respectively. The total score on the K-CD-RISC was positively correlated with the RSES (r=0.56, preliability and validity for measurement of resilience among Korean subjects.
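Internal consistency in records like this one is reported as Cronbach's alpha. As a rough illustration only, the sketch below computes alpha from a respondents-by-items score table using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy data are hypothetical and unrelated to the K-CD-RISC data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item questionnaire answered by four respondents.
alpha = cronbach_alpha([[1, 1], [2, 3], [3, 2], [4, 4]])
```

Alpha rises both with the average inter-item correlation and with the number of items, so a long scale can reach a high alpha even when individual items correlate only modestly.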
Validity and Reliability of Visual Analog Scale Foot and Ankle: The Turkish Version.

Science.gov (United States)

Gur, Gozde; Turgut, Elif; Dilek, Burcu; Baltaci, Gul; Bek, Nilgun; Yakut, Yavuz

The present study tested the reliability and validity of the Turkish version of the visual analog scale foot and ankle (VAS-FA) among healthy subjects and patients with foot problems. A total of 128 participants, 65 healthy subjects and 63 patients with foot problems, were evaluated. The VAS-FA was translated into Turkish and administered to the 128 subjects on 2 separate occasions with a 5-day interval. The test-retest reliability and internal consistency were assessed with the intraclass correlation coefficient and Cronbach's α. The validity was assessed using the correlations with Turkish versions of the Foot Function Index, the Foot and Ankle Outcome Score, and the Short-Form 36-item Health Survey. A statistically significant difference was found between the healthy group and the patient group in the overall score and subscale scores of the VAS-FA (p Foot Function Index, Foot and Ankle Outcome Score, and Short-Form 36-item Health Survey scores in the healthy and patient groups both. The Turkish version of the VAS-FA is sensitive enough to distinguish foot and ankle-specific pathologic conditions from asymptomatic conditions. The Turkish version of the VAS-FA is a reliable and valid method and can be used for foot-related problems. Copyright © 2017 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Development, reliability and validity of the psychosocial adaptation scale for Parkinson’s disease in Chinese population

Science.gov (United States)

Zhang, Tingting; Yin, Anchun; Sun, Xiaohong; Liu, Qigui; Song, Guirong; Li, Lianhong

2015-01-01

Objective: To develop a psychosocial adaptation scale for Parkinson’s disease (PD) in the Chinese population and evaluate its reliability and validity. Methods: The items were designed by literature review, expert consultation and semi-structured interview. Corrected item-total correlation, discrimination analysis and exploratory factor analysis were used for item selection. In total, 427 valid scales from PD patients were collected to test reliability and validity. Results: The scale incorporated six dimensions (anxiety, self-esteem, attitude, self-acceptance, self-efficacy and social support) with a total of 32 items. The scale possessed good internal consistency. The test-retest correlation coefficient was 0.99 and the average content validation rate was 0.97. The Hoehn and Yahr stage was correlated with the total score of the scale. Conclusions: The psychosocial adaptation scale in this study showed good reliability and validity; it can be used as a reliable and valid instrument to evaluate the psychosocial adaptation of PD patients objectively and effectively. PMID:26770638
External Validation and Evaluation of Reliability and Validity of the Modified Seoul National University Renal Stone Complexity Scoring System to Predict Stone-Free Status After Retrograde Intrarenal Surgery.

Science.gov (United States)

Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong

2015-08-01

The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (pR revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-titered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-titered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
Reliability and Validity of the Korean Cancer Pain Assessment Tool (KCPAT)

Science.gov (United States)

Kim, Jeong A; Lee, Juneyoung; Park, Jeanno; Lee, Myung Ah; Yeom, Chang Hwan; Jang, Se Kwon; Yoon, Duck Mi; Kim, Jun Suk

2005-01-01

The Korean Cancer Pain Assessment Tool (KCPAT), which was developed in 2003, consists of questions concerning the location of pain, the nature of pain, the present pain intensity, the symptoms associated with the pain, and psychosocial/spiritual pain assessments. This study was carried out to evaluate the reliability and validity of the KCPAT. A stratified, proportional-quota, clustered, systematic sampling procedure was used. The study population (903 cancer patients) was 1% of the target population (90,252 cancer patients). A total of 314 (34.8%) questionnaires were collected. The results showed that the average pain score (5 point on Likert scale) according to the cancer type and the at-present average pain score (VAS, 0-10) were correlated (r=0.56, p<0.0001), and showed moderate agreement (kappa=0.364). The mean satisfaction score was 3.8 (1-5). The average time to complete the questionnaire was 8.9 min. In conclusion, the KCPAT is a reliable and valid instrument for assessing cancer pain in Koreans. PMID:16224166
Reliability of four experimental mechanical pain tests in children

Directory of Open Access Journals (Sweden)

Soee AL

2013-02-01

Ann-Britt L Soee (1), Lise L Thomsen (2), Birte Tornoe (1,3), Liselotte Skov (1). (1) Department of Pediatrics, Children’s Headache Clinic, Copenhagen University Hospital Herlev, Copenhagen, Denmark; (2) Department of Neuropediatrics, Juliane Marie Centre, Copenhagen University Hospital Rigshospitalet, København Ø, Denmark; (3) Department of Physiotherapy, Medical Department O, Copenhagen University Hospital Herlev, Herlev, Denmark. Purpose: In order to study pain in children, it is necessary to determine whether pain measurement tools used in adults are reliable measurements in children. The aim of this study was to explore the intrasession reliability of pressure pain thresholds (PPT) in healthy children. Furthermore, the aim was also to study the intersession reliability of the following four tests: (1) Total Tenderness Score; (2) PPT; (3) Visual Analog Scale score at suprapressure pain threshold; and (4) area under the curve (stimulus–response functions) for pressure versus pain. Participants and methods: Twenty-five healthy school children, 8–14 years of age, participated. Test 2, PPT, was repeated three times at 2 minute intervals on the same day to estimate PPT intrasession reliability using Cronbach’s alpha. Tests 1–4 were repeated after a median of 21 (interquartile range 10.5–22) days, and Pearson’s correlation coefficient was used to describe the intersession reliability. Results: The PPT test was precise and reliable (Cronbach’s alpha ≥ 0.92). All tests showed a good to excellent correlation between days (intersessions r = 0.66–0.81). There were no indications of significant systematic differences found in any of the four tests between days. Conclusion: All tests seemed to be reliable measurements in pain evaluation in healthy children aged 8–14 years. Given the small sample size, this conclusion needs to be confirmed in future studies. Keywords: repeatability, intraindividual reliability, pressure pain threshold, pain measurement, algometer

Prediction of true test scores from observed item scores and ancillary data.

Science.gov (United States)

Haberman, Shelby J; Yao, Lili; Sinharay, Sandip

2015-05-01

In many educational tests which involve constructed responses, a traditional test score is obtained by adding together item scores obtained through holistic scoring by trained human raters. For example, this practice was used until 2008 in the case of GRE(®) General Analytical Writing and until 2009 in the case of TOEFL(®) iBT Writing. With use of natural language processing, it is possible to obtain additional information concerning item responses from computer programs such as e-rater(®). In addition, available information relevant to examinee performance may include scores on related tests. We suggest application of standard results from classical test theory to the available data to obtain best linear predictors of true traditional test scores. In performing such analysis, we require estimation of variances and covariances of measurement errors, a task which can be quite difficult in the case of tests with limited numbers of items and with multiple measurements per item. As a consequence, a new estimation method is suggested based on samples of examinees who have taken an assessment more than once. Such samples are typically not random samples of the general population of examinees, so that we apply statistical adjustment methods to obtain the needed estimated variances and covariances of measurement errors. To examine practical implications of the suggested methods of analysis, applications are made to GRE General Analytical Writing and TOEFL iBT Writing. Results obtained indicate that substantial improvements are possible both in terms of reliability of scoring and in terms of assessment reliability. © 2015 The British Psychological Society.
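The abstract refers to best linear predictors of true scores from classical test theory. The simplest textbook special case, with a single observed score and a known reliability, is Kelley's formula: the predicted true score is the group mean plus the reliability times the deviation of the observed score from that mean. The sketch below shows only this special case; it is not the authors' estimation method, which also uses ancillary data and repeater samples, and all names and numbers are hypothetical.

```python
def kelley_true_score(observed, group_mean, reliability):
    """Kelley's regressed estimate of a true score (classical test theory).

    Shrinks the observed score toward the group mean in proportion to the
    unreliability of the test. Illustrative special case only.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return group_mean + reliability * (observed - group_mean)


# Hypothetical examinee: observed 80 on a test with mean 70 and reliability 0.8.
estimate = kelley_true_score(observed=80.0, group_mean=70.0, reliability=0.8)
```

With reliability 1 the estimate equals the observed score, and with reliability 0 it collapses to the group mean, which is why gains in scoring reliability translate directly into less shrinkage.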
Sway Area and Velocity Correlated With MobileMat Balance Error Scoring System (BESS) Scores.

Science.gov (United States)

Caccese, Jaclyn B; Buckley, Thomas A; Kaminski, Thomas W

2016-08-01

The Balance Error Scoring System (BESS) is often used for sport-related concussion balance assessment. However, moderate intratester and intertester reliability may cause low initial sensitivity, suggesting that a more objective balance assessment method is needed. The MobileMat BESS was designed for objective BESS scoring, but the outcome measures must be validated with reliable balance measures. Thus, the purpose of this investigation was to compare MobileMat BESS scores to linear and nonlinear measures of balance. Eighty-eight healthy collegiate student-athletes (age: 20.0 ± 1.4 y, height: 177.7 ± 10.7 cm, mass: 74.8 ± 13.7 kg) completed the MobileMat BESS. MobileMat BESS scores were compared with 95% area, sway velocity, approximate entropy, and sample entropy. MobileMat BESS scores were significantly correlated with 95% area for single-leg (r = .332) and tandem firm (r = .474), and double-leg foam (r = .660); and with sway velocity for single-leg (r = .406) and tandem firm (r = .601), and double-leg (r = .575) and single-leg foam (r = .434). MobileMat BESS scores were not correlated with approximate or sample entropy. MobileMat BESS scores were low to moderately correlated with linear measures, suggesting the ability to identify changes in the center of mass-center of pressure relationship, but not higher-order processing associated with nonlinear measures. These results suggest that the MobileMat BESS may be a clinically-useful tool that provides objective linear balance measures.
Coronary artery calcification score by multislice computed tomography predicts the outcome of dobutamine cardiovascular magnetic resonance imaging

International Nuclear Information System (INIS)

Janssen, Caroline H.C.; Vliegenthart, Rozemarijn; Overbosch, Jelle; Oudkerk, Matthijs; Kuijpers, Dirkjan; Dijkman, Paul R.M. van; Zijlstra, Felix

2005-01-01

The aim of this study was to determine whether a coronary artery calcium (CAC) score of less than 11 can reliably rule out myocardial ischemia detected by dobutamine cardiovascular magnetic resonance imaging (CMR) in patients suspected of having myocardial ischemia. In 114 of 136 consecutive patients clinically suspected of myocardial ischemia with an inconclusive diagnosis of myocardial ischemia, dobutamine CMR was performed and the CAC score was determined. The CAC score was obtained by 16-row multidetector computed tomography (MDCT) and was calculated according to the method of Agatston. The CAC score and the results of the dobutamine CMR were correlated and the positive predictive value (PPV) and the negative predictive value (NPV) of the CAC score for dobutamine CMR were calculated. A total of 114 (87%) of the patients were eligible for this study. There was a significant correlation between the CAC score and dobutamine CMR (p<0.001). Patients with a CAC score of less than 11 showed no signs of inducible ischemia during dobutamine CMR. For a CAC score of less than 101, the NPV and the PPV of the CAC score for the outcome of dobutamine CMR were, respectively, 0.96 and 0.29. In patients with an inconclusive diagnosis of myocardial ischemia a MDCT CAC score of less than 11 reliably rules out myocardial ischemia detected by dobutamine CMR. (orig.)
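The positive and negative predictive values reported here follow directly from a 2x2 table of test result against reference outcome. A minimal sketch is given below; the counts are invented for illustration and are not the study's data, which the abstract does not report at cell level.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from the four cells of a 2x2 diagnostic table.

    tp/fp: test-positive cases with/without the reference condition.
    tn/fn: test-negative cases without/with the reference condition.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative test result")
    ppv = tp / (tp + fp)  # share of positive tests that are true positives
    npv = tn / (tn + fn)  # share of negative tests that are true negatives
    return ppv, npv


# Hypothetical counts, NOT taken from the abstract above.
ppv, npv = predictive_values(tp=10, fp=20, tn=65, fn=5)
```

A high NPV with a low PPV, the pattern reported in the abstract, is typical of a rule-out test: a negative result is informative, while a positive result needs confirmation.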
Reliability of histologic assessment in patients with eosinophilic oesophagitis.

Science.gov (United States)

Warners, M J; Ambarus, C A; Bredenoord, A J; Verheij, J; Lauwers, G Y; Walsh, J C; Katzka, D A; Nelson, S; van Viegen, T; Furuta, G T; Gupta, S K; Stitt, L; Zou, G; Parker, C E; Shackelton, L M; D Haens, G R; Sandborn, W J; Dellon, E S; Feagan, B G; Collins, M H; Jairath, V; Pai, R K

2018-04-01

The validity of the eosinophilic oesophagitis (EoE) histologic scoring system (EoEHSS) has been demonstrated, but only preliminary reliability data exist. Formally assess the reliability of the EoEHSS and additional histologic features. Four expert gastrointestinal pathologists independently reviewed slides from adult patients with EoE (N = 45) twice, in random order, using standardised training materials and scoring conventions for the EoEHSS and additional histologic features agreed upon during a modified Delphi process. Intra- and inter-rater reliability for scoring the EoEHSS, a visual analogue scale (VAS) of overall histopathologic disease severity, and additional histologic features were assessed using intra-class correlation coefficients (ICCs). Almost perfect intra-rater reliability was observed for the composite EoEHSS scores and the VAS. Inter-rater reliability was also almost perfect for the composite EoEHSS scores and substantial for the VAS. Of the EoEHSS items, eosinophilic inflammation was associated with the highest ICC estimates and consistent with almost perfect intra- and inter-rater reliability. With the exception of dyskeratotic epithelial cells and surface epithelial alteration, ICC estimates for the remaining EoEHSS items were above the benchmarks for substantial intra-rater, and moderate inter-rater reliability. Estimation of peak eosinophil count and number of lamina propria eosinophils were associated with the highest ICC estimates among the exploratory items. The composite EoEHSS and most component items are associated with substantial reliability when assessed by central pathologists. Future studies should assess responsiveness of the score to change after a therapeutic intervention to facilitate its use in clinical trials. © 2018 John Wiley & Sons Ltd.
Interformat reliability of digital psychiatric self-report questionnaires: a systematic review.

Science.gov (United States)

Alfonsson, Sven; Maathz, Pernilla; Hursti, Timo

2014-12-03

Research on Internet-based interventions typically uses digital versions of pen and paper self-report symptom scales. However, adaptation into the digital format could affect the psychometric properties of established self-report scales. Several studies have investigated differences between digital and pen and paper versions of instruments, but no systematic review of the results has yet been done. This review aims to assess the interformat reliability of self-report symptom scales used in digital or online psychotherapy research. Three databases (MEDLINE, Embase, and PsycINFO) were systematically reviewed for studies investigating the reliability between digital and pen and paper versions of psychiatric symptom scales. From a total of 1504 publications, 33 were included in the review, and interformat reliability of 40 different symptom scales was assessed. Significant differences in mean total scores between formats were found in 10 of 62 analyses. These differences were found in just a few studies, which indicates that the results were due to study effects and sample effects rather than unreliable instruments. The interformat reliability ranged from r=.35 to r=.99; however, the majority of instruments showed a strong correlation between format scores. The quality of the included studies varied, and several studies had insufficient power to detect small differences between formats. When digital versions of self-report symptom scales are compared to pen and paper versions, most scales show high interformat reliability. This supports the reliability of results obtained in psychotherapy research on the Internet and the comparability of the results to traditional psychotherapy research. There are, however, some instruments that consistently show low interformat reliability, suggesting that these conclusions cannot be generalized to all questionnaires. Most studies had at least some methodological issues, with insufficient statistical power being the most common issue.
The MOBID-2 pain scale: reliability and responsiveness to pain in patients with dementia.

Science.gov (United States)

Husebo, B S; Ostelo, R; Strand, L I

2014-11-01

The Mobilization-Observation-Behavior-Intensity-Dementia-2 (MOBID-2) pain scale is a staff-administered pain tool for patients with dementia. This study explores MOBID-2's test-retest reliability, measurement error and responsiveness to change. Analyses are based upon data from a cluster randomized trial including 352 patients with advanced dementia from 18 Norwegian nursing homes. Test-retest reliability between baseline and week 2 (n = 163), and weeks 2 and 4 (n = 159) was examined in patients not expected to change (controls), using the intraclass correlation coefficient (ICC2.1), standard error of measurement (SEM) and smallest detectable change (SDC). Responsiveness was examined by testing six a priori formulated hypotheses about the association between change scores on MOBID-2 and other outcome measures. ICCs of the total MOBID-2 scores were 0.81 (0-2 weeks) and 0.85 (2-4 weeks). SEM and SDC were 1.9 and 3.1 (0-2 weeks) and 1.4 and 2.3 (2-4 weeks), respectively. Five out of six hypotheses were confirmed: MOBID-2 discriminated (p Mini-Mental State Examination, Functional Assessment Staging and Activity of Daily Living. Expected associations between change scores of MOBID-2 and Neuropsychiatric Inventory - Nursing Home version were not confirmed. The SEM and SDC in connection with the MOBID-2 pain scale indicate that the instrument is responsive to a decrease in pain after a SPTP. Satisfactory test-retest reliability across test periods was demonstrated. Change scores ≥ 3 on total and subscales are clinically relevant and are beyond measurement error. © 2014 The Authors. European Journal of Pain published by John Wiley & Sons Ltd on behalf of European Pain Federation - EFIC®.
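The coefficient named in this abstract, ICC(2,1), is the two-way random-effects, absolute-agreement, single-measurement intraclass correlation. A minimal sketch of the textbook computation from ANOVA mean squares follows. The function name is hypothetical, and the demonstration data are the widely reproduced six-subject, four-judge example from Shrout and Fleiss (1979), not data from the MOBID-2 trial.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a complete subjects-by-raters table (list of rows)."""
    n = len(ratings)      # subjects (rows)
    k = len(ratings[0])   # raters or occasions (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with at least 2 rows and 2 columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1)) # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Example table: six subjects (rows) rated by four judges (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
```

Because this form counts systematic differences between raters (or occasions) as error, it is lower than the consistency form whenever one rater scores uniformly higher than another.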
Effect of Antihypertensive Therapy on SCORE-Estimated Total Cardiovascular Risk: Results from an Open-Label, Multinational Investigation—The POWER Survey

Science.gov (United States)

De Backer, Guy; Petrella, Robert J.; Goudev, Assen R.; Radaideh, Ghazi Ahmad; Rynkiewicz, Andrzej; Pathak, Atul

2013-01-01

Background. High blood pressure is a substantial risk factor for cardiovascular disease. Design & Methods. The Physicians' Observational Work on patient Education according to their vascular Risk (POWER) survey was an open-label investigation of eprosartan-based therapy (EBT) for control of high blood pressure in primary care centers in 16 countries. A prespecified element of this research was appraisal of the impact of EBT on estimated 10-year risk of a fatal cardiovascular event as determined by the Systematic Coronary Risk Evaluation (SCORE) model. Results. SCORE estimates of CVD risk were obtained at baseline from 12,718 patients in 15 countries (6504 men) and from 9577 patients at 6 months. During EBT mean (±SD) systolic/diastolic blood pressures declined from 160.2 ± 13.7/94.1 ± 9.1 mmHg to 134.5 ± 11.2/81.4 ± 7.4 mmHg. This was accompanied by a 38% reduction in mean SCORE-estimated CVD risk and an improvement in SCORE risk classification of one category or more in 3506 patients (36.6%). Conclusion. Experience in POWER affirms that (a) effective pharmacological control of blood pressure is feasible in the primary care setting and is accompanied by a reduction in total CVD risk and (b) the SCORE instrument is effective in this setting for the monitoring of total CVD risk. PMID:23997946
[Validity and Reliability Studies of Modified Mini Mental State Examination (MMSE-E) For Turkish Illiterate Patients With Diagnosis of Alzheimer Disease].

Science.gov (United States)

Babacan-Yıldız, Gülsen; Ur-Özçelik, Emel; Kolukısa, Mehmet; Işık, Ahmet Turan; Gürsoy, Esra; Kocaman, Gülşen; Çelebi, Arif

2016-01-01

To investigate the validity and reliability of the modified Mini Mental State Examination (MMSE-E) for illiterate patients in a Turkish population with Alzheimer's disease (AD). A total of 107 illiterate patients with Alzheimer's disease (women: 65, men: 42) and 68 illiterate healthy volunteer subjects (women: 36, men: 32) were included in the study. The MMSE-I and the Geriatric Depression Scale (GDS) were administered to all subjects; Alzheimer patients were also administered the Basic Activities of Daily Living (B-ADL). The Clinical Dementia Rating (CDR) was used to determine the severity of disease, while a receiver operating characteristic (ROC) analysis was performed to analyze the cut-off scores of the MMSE-I, and the positive/negative predictive values were calculated for the optimal cut-off scores. Internal consistency was measured using Cronbach's coefficient α. Additionally, correlations between total MMSE-I score and the CDR, B-ADL, and GDS scores were examined. The MMSE-I scores significantly and inversely correlated with CDR (-0.82, p=0.000) and B-ADL scores (-0.051, p=0.000). The optimal cut-off points of MMSE-I were 23/24, which yielded a sensitivity of 99.0%-100.0%, a specificity of 98.5%-97.0%, and an AUC of 1.0/1.0, respectively. Reliability of the MMSE-I was high (α = 0.70). The total MMSE-I score was able to differentiate the AD group from the control group.
Reliability and comparative validity of a Diet Quality Index for assessing dietary patterns of preschool-aged children in Sydney, Australia.

Science.gov (United States)

Kunaratnam, Kanita; Halaki, Mark; Wen, Li Ming; Baur, Louise A; Flood, Victoria M

2018-03-01

To report on the reliability and validity of a Diet Quality Index (DQI) to assess preschoolers' dietary patterns using a short food frequency questionnaire (sFFQ) and 3-day food records (3d-FR). Seventy-seven preschool carers/parents completed a telephone interview on preschoolers' (2-5-year olds) dietary habits in metropolitan Sydney. Agreement in scores was assessed using intraclass correlation (ICC) and paired t-tests for repeated sFFQ-DQI scores, and Bland-Altman methods and paired t-tests for sFFQ-DQI and 3d-FR-DQI scores. The ICC for mean total sFFQ-DQI scores was high (0.89; 95% CI 0.81, 0.93). There was weak agreement between sFFQ-DQI and 3d-FR-DQI scores (r = 0.36, p < 0.01). The 3d-FR-DQI scores were positively associated with carbohydrate, folate, ß-carotene, magnesium, calcium, protein, and total fat, and negatively associated with sugar, starch, niacin, vitamin C, phosphorus, polyunsaturated fat, and monounsaturated fat. The sFFQ-DQI demonstrated good reliability but weak validity. Associations between nutrients and 3d-FR-DQI scores indicate promising usability and warrant further investigation. Further research is needed to establish its validity in accurately scoring children's diet quality using the sFFQ compared to the 3d-FR before the tool can be implemented for use in population settings.
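The Bland-Altman method mentioned in this abstract summarises agreement between two instruments as the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the paired differences. The sketch below is illustrative only; the function name and the paired scores are hypothetical and are not DQI data from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)         # systematic difference between the methods
    spread = stdev(diffs)      # sample SD of the paired differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


# Hypothetical paired scores from two instruments on four children.
bias, lower, upper = bland_altman_limits([10, 12, 14, 16], [9, 12, 15, 15])
```

Unlike a correlation coefficient, the limits of agreement are expressed in the units of the score itself, so they show directly how far two instruments can differ for an individual child.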
Evidences of validity and reliability of the Luria-Nebraska Test for Children

Directory of Open Access Journals (Sweden)

Ricardo Franco de Lima

2016-01-01

This paper aimed to verify evidence of validity and reliability of the Luria-Nebraska Test for Children (TLN-C, in Portuguese). Three hundred eighty-seven students aged 6–13 years old, with learning difficulties, comprised the study sample. They were assessed with the Wechsler Intelligence Scale for Children (WISC-III) and the TLN-C; the effect of age differences and accuracy, as rated by internal consistency, were investigated. Age effects were found for all subtests and in the general score, except for the receptive speech subtest, even when the total IQ effect was controlled. Reliability analysis had satisfactory results (0.79). The TLN-C showed evidence of validity and reliability. The receptive speech subtest requires revision.
The Depression Anxiety Stress Scales-21 (DASS-21): further examination of dimensions, scale reliability, and correlates.

Science.gov (United States)

Osman, Augustine; Wong, Jane L; Bagge, Courtney L; Freedenthal, Stacey; Gutierrez, Peter M; Lozano, Gregorio

2012-12-01

We conducted two studies to examine the dimensions, internal consistency reliability estimates, and potential correlates of the Depression Anxiety Stress Scales-21 (DASS-21; Lovibond & Lovibond, 1995). Participants in Study 1 included 887 undergraduate students (363 men and 524 women, aged 18 to 35 years; mean [M] age = 19.46, standard deviation [SD] = 2.17) recruited from two public universities to assess the specificity of the individual DASS-21 items and to evaluate estimates of internal consistency reliability. Participants in a follow-up study (Study 2) included 410 students (168 men and 242 women, aged 18 to 47 years; M age = 19.65, SD = 2.88) recruited from the same universities to further assess factorial validity and to evaluate potential correlates of the original DASS-21 total and scale scores. Item bifactor and confirmatory factor analyses revealed that a general factor accounted for the greatest proportion of common variance in the DASS-21 item scores (Study 1). In Study 2, the fit statistics showed good fit for the bifactor model. In addition, the DASS-21 total scale score correlated more highly with scores on a measure of mixed depression and anxiety than with scores on the proposed specific scales of depression or anxiety. Coefficient omega estimates for the DASS-21 scale scores were good. Further investigations of the bifactor structure and psychometric properties of the DASS-21, specifically its incremental and discriminant validity, using known clinical groups are needed. © 2012 Wiley Periodicals, Inc.
The Korean Version of the Cognitive Assessment Scale for Stroke Patients (K-CASP): A Reliability and Validity Study.

Science.gov (United States)

Park, Kwon-Hee; Lee, Hee-Won; Park, Kee-Boem; Lee, Jin-Youn; Cho, Ah-Ra; Oh, Hyun-Mi; Park, Joo Hyun

2017-06-01

To develop the Korean version of the Cognitive Assessment Scale for Stroke Patients (K-CASP) and to evaluate the test reliability and validity of the K-CASP in stroke patients. The original CASP was translated into Korean, back-translated into English, then reviewed and compared with the original version. Thirty-three stroke patients were assessed independently by two examiners using the K-CASP twice, with a one-day interval, for a total of four test results. To evaluate the reliability of the K-CASP, intra-class correlation coefficients were used. Pearson correlations were calculated and simple regression analyses performed with the Korean version of Mini-Mental State Examination (K-MMSE) and the aphasia quotient (AQ) to assess the validity. The mean score was 24.42±9.47 (total score 36) for the K-CASP and 21.50±7.01 (total score 30) for the K-MMSE. The inter-rater correlation coefficients of the K-CASP were 0.992 on the first day and 0.995 on the second day. The intra-rater correlation coefficients of the K-CASP were 0.997 for examiner 1 and 0.996 for examiner 2. In the Pearson correlation analysis, the K-CASP score significantly correlated with the K-MMSE score (r=0.825, p [...] reliable and valid instrument for cognitive dysfunction screening in post-stroke patients. It is more applicable than other cognitive assessment tools in stroke patients with aphasia.
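The abstract does not say which form of the intraclass correlation coefficient was computed. As a minimal sketch, the code below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), and derives it from the ANOVA mean squares. The function name and the 6 x 4 ratings matrix are illustrative only.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a subjects-by-raters matrix (list of equal-length rows)."""
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings: 6 subjects scored by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
```

Because this form penalises systematic differences between raters, it is lower than a consistency-type ICC computed on the same data whenever raters differ in their average level.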
A Novel Risk Scoring System Reliably Predicts Readmission Following Pancreatectomy

Science.gov (United States)

Valero, Vicente; Grimm, Joshua C.; Kilic, Arman; Lewis, Russell L.; Tosoian, Jeffrey J.; He, Jin; Griffin, James; Cameron, John L.; Weiss, Matthew J.; Vollmer, Charles M.; Wolfgang, Christopher L.

2015-01-01

Background: Postoperative readmissions have been proposed by Medicare as a quality metric and may impact provider reimbursement. Since readmission following pancreatectomy is common, we sought to identify factors associated with readmission in order to establish a predictive risk scoring system (RSS). Study Design: A retrospective analysis of 2,360 pancreatectomies performed at nine high-volume pancreatic centers between 2005 and 2011 was performed. Forty-five factors strongly associated with readmission were identified. To derive and validate a RSS, the population was randomly divided into two cohorts in a 4:1 fashion. A multivariable logistic regression model was constructed and scores were assigned based on the relative odds ratio of each independent predictor. A composite Readmission After Pancreatectomy (RAP) score was generated and then stratified to create risk groups. Results: Overall, 464 (19.7%) patients were readmitted within 90 days. Eight pre- and postoperative factors, including prior myocardial infarction (OR 2.03), ASA Class ≥ 3 (OR 1.34), dementia (OR 6.22), hemorrhage (OR 1.81), delayed gastric emptying (OR 1.78), surgical site infection (OR 3.31), sepsis (OR 3.10) and short length of stay (OR 1.51), were independently predictive of readmission. The 32-point RAP score generated from the derivation cohort was highly predictive of readmission in the validation cohort (AUC 0.72). The low (0-3), intermediate (4-7) and high risk (>7) groups correlated to 11.7%, 17.5% and 45.4% observed readmission rates, respectively (p [...] readmission following pancreatectomy. Identification of patients with increased risk of readmission using the RAP score will allow efficient resource allocation aimed to attenuate readmission rates. It also has potential to serve as a new metric for comparative research and quality assessment. PMID:25797757
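The risk stratification reported above can be written as a small lookup. The abstract does not list the points assigned to each predictor, so this sketch takes an already computed total RAP score as input; the function and dictionary names are hypothetical, while the group boundaries and observed rates are those quoted in the abstract.

```python
# Observed 90-day readmission rates per risk group, as quoted in the abstract.
OBSERVED_READMISSION_RATE = {
    "low": 0.117,
    "intermediate": 0.175,
    "high": 0.454,
}


def rap_risk_group(score):
    """Map a total RAP score (0-32 points) to its reported risk group."""
    if not 0 <= score <= 32:
        raise ValueError("RAP score must lie between 0 and 32")
    if score <= 3:
        return "low"
    if score <= 7:
        return "intermediate"
    return "high"


group = rap_risk_group(5)
rate = OBSERVED_READMISSION_RATE[group]
```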
Clinical outcome scoring of intra-articular calcaneal fractures

NARCIS (Netherlands)

Schepers, Tim; Heetveld, Martin J.; Mulder, Paul G. H.; Patka, Peter

2008-01-01

Outcome reporting of intra-articular calcaneal fractures is inconsistent. This study aimed to identify the most cited outcome scores in the literature and to analyze their reliability and validity. A systematic literature search identified 34 different outcome scores. The most cited outcome score
Rating scales for dystonia in cerebral palsy: reliability and validity.

Science.gov (United States)

Monbaliu, E; Ortibus, E; Roelens, F; Desloovere, K; Deklerck, J; Prinzie, P; de Cock, P; Feys, H

2010-06-01

This study investigated the reliability and validity of the Barry-Albright Dystonia Scale (BADS), the Burke-Fahn-Marsden Movement Scale (BFMMS), and the Unified Dystonia Rating Scale (UDRS) in patients with bilateral dystonic cerebral palsy (CP). Three raters independently scored videotapes of 10 patients (five males, five females; mean age 13 y 3 mo, SD 5 y 2 mo, range 5-22 y). One patient each was classified at levels I-IV in the Gross Motor Function Classification System and six patients were classified at level V. Reliability was measured by (1) intraclass correlation coefficient (ICC) for interrater reliability, (2) standard error of measurement (SEM) and smallest detectable difference (SDD), and (3) Cronbach's alpha for internal consistency. Validity was assessed by Pearson's correlations among the three scales used and by content analysis. Moderate to good interrater reliability was found for total scores of the three scales (ICC: BADS=0.87; BFMMS=0.86; UDRS=0.79). However, many subitems showed low reliability, in particular for the UDRS. SEM and SDD were respectively 6.36% and 17.72% for the BADS, 9.88% and 27.39% for the BFMMS, and 8.89% and 24.63% for the UDRS. High internal consistency was found. Pearson's correlations were high. Content validity showed insufficient accordance with the new CP definition and classification. Our results support the internal consistency and concurrent validity of the scales; however, taking into consideration the limitations in reliability, including the large SDD values and the content validity, further research on methods of assessment of dystonia is warranted.
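The smallest detectable difference is commonly derived from the standard error of measurement as SDD = 1.96 × √2 × SEM. The abstract does not state the formula used, so the sketch below assumes this conventional one; under that assumption it reproduces the reported SDD values to within about 0.1 percentage points.

```python
import math


def smallest_detectable_difference(sem, z=1.96):
    """SDD at the 95% level from a standard error of measurement."""
    # sqrt(2) accounts for measurement error in both of the two scores compared.
    return z * math.sqrt(2) * sem


# SEM values (in %) as reported in the abstract.
reported_sem = {"BADS": 6.36, "BFMMS": 9.88, "UDRS": 8.89}
sdd = {scale: smallest_detectable_difference(v) for scale, v in reported_sem.items()}
```

A change in an individual patient's score smaller than the SDD cannot be distinguished from measurement error, which is why the large values here limit use of the scales for tracking individual patients.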
The accuracy of Internet search engines to predict diagnoses from symptoms can be assessed with a validated scoring system.

Science.gov (United States)

Shenker, Bennett S

2014-02-01

To validate a scoring system that evaluates the ability of Internet search engines to correctly predict diagnoses when symptoms are used as search terms. We developed a five-point scoring system to evaluate the diagnostic accuracy of Internet search engines. We identified twenty diagnoses common to a primary care setting to validate the scoring system. One investigator entered the symptoms for each diagnosis into three Internet search engines (Google, Bing, and Ask) and saved the first five webpages from each search. Other investigators reviewed the webpages and assigned a diagnostic accuracy score. They rescored a random sample of webpages two weeks later. To validate the five-point scoring system, we calculated convergent validity and test-retest reliability using Kendall's W and Spearman's rho, respectively. We used the Kruskal-Wallis test to look for differences in accuracy scores for the three Internet search engines. A total of 600 webpages were reviewed. Kendall's W for the raters was 0.71 (p [...] search engines is a valid and reliable instrument. The scoring system may be used in future Internet research. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Impact of the Occlusion Duration on the Performance of J-CTO Score in Predicting Failure of Percutaneous Coronary Intervention for Chronic Total Occlusion.

Science.gov (United States)

de Castro-Filho, Antonio; Lamas, Edgar Stroppa; Meneguz-Moreno, Rafael A; Staico, Rodolfo; Siqueira, Dimytri; Costa, Ricardo A; Braga, Sergio N; Costa, J Ribamar; Chamié, Daniel; Abizaid, Alexandre

2017-06-01

The present study examined the association between Multicenter CTO Registry in Japan (J-CTO) score in predicting failure of percutaneous coronary intervention (PCI) correlating with the estimated duration of chronic total occlusion (CTO). The J-CTO score does not incorporate estimated duration of the occlusion. This was an observational retrospective study that involved all consecutive procedures performed at a single tertiary-care cardiology center between January 2009 and December 2014. A total of 174 patients, median age 59.5 years (interquartile range [IQR], 53-65 years), undergoing CTO-PCI were included. The median estimated occlusion duration was 7.5 months (IQR, 4.0-12.0 months). The lesions were classified as easy (score = 0), intermediate (score = 1), difficult (score = 2), and very difficult (score ≥3) in 51.1%, 33.9%, 9.2%, and 5.7% of the patients, respectively. Failure rate significantly increased with higher J-CTO score (7.9%, 20.3%, 50.0%, and 70.0% in groups with J-CTO scores of 0, 1, 2, and ≥3, respectively; P [...] J-CTO score predicted failure of CTO-PCI independently of the estimated occlusion duration (P=.24). Areas under receiver-operating characteristic curves were computed and it was observed that for each occlusion time period, the discriminatory capacity of the J-CTO score in predicting CTO-PCI failure was good, with a C-statistic >0.70. The estimated duration of occlusion had no influence on the J-CTO score performance in predicting failure of PCI in CTO lesions. The probability of failure was mainly determined by grade of lesion complexity.

Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire

Science.gov (United States)

Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra

2018-05-29

Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Methodological and cross-sectional study. A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-half reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct validity. The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain.
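The two reliability statistics reported above can be sketched as follows, assuming the standard definitions: Cronbach's α from item and total-score variances, and the Spearman-Brown correction that steps a half-test correlation up to full test length. The abstract does not say whether such a correction was applied; the function names and the response matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of item scores."""
    k = len(responses[0])  # number of items
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def spearman_brown(half_correlation):
    """Estimate full-length reliability from a split-half correlation."""
    return 2 * half_correlation / (1 + half_correlation)


# Hypothetical responses: 5 respondents, 3 items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
stepped_up = spearman_brown(0.89)  # half-test correlation quoted in the abstract
```

Under these assumptions, stepping up the reported half-test correlation of 0.89 gives roughly 0.94, in line with the internal consistency coefficient quoted in the abstract.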
Evaluation of revised trauma score in polytraumatized patients

International Nuclear Information System (INIS)

Ahmad, H.N.

2004-01-01

Objective: To determine the prognostic value and reliability of revised trauma score (RTS) in polytraumatized patients. Subjects and Methods: Thirty adult patients of road traffic accidents sustaining multisystem injuries due to high energy blunt trauma were managed according to the protocols of advanced trauma life support (ATLS) and from their first set of data RTS was calculated. Score of each patient was compared with his final outcome at the time of discharge from the hospital. Results: The revised trauma score was found to be a reliable predictor of prognosis of polytraumatized patients but a potentially weak predictor for those patients having severe injury involving a single anatomical region. The higher the RTS the better the prognosis of polytrauma patient and vice versa. Revised trauma score <8 turned out to be an indicator of severe injury with high mortality and morbidity and overall mortality in polytraumatized patients was 26.66%. However, RTS-6 was associated with 50% mortality. Conclusion: The revised trauma score is a reliable indicator of prognosis of polytraumatized patients. Therefore, it can be used for field and emergency room triage. (author)
Reliability of maximal isometric knee strength testing with modified hand-held dynamometry in patients awaiting total knee arthroplasty: useful in research and individual patient settings? A reliability study

NARCIS (Netherlands)

Koblbauer, Ian F. H.; Lambrecht, Yannick; van der Hulst, Micheline L. M.; Neeter, Camille; Engelbert, Raoul H. H.; Poolman, Rudolf W.; Scholtes, Vanessa A.

2011-01-01

Patients undergoing total knee arthroplasty (TKA) often experience strength deficits both pre- and post-operatively. As these deficits may have a direct impact on functional recovery, strength assessment should be performed in this patient population. For these assessments, reliable measurements
Reliability of maximal isometric knee strength testing with modified hand-held dynamometry in patients awaiting total knee arthroplasty: useful in research and individual patient settings? A reliability study

NARCIS (Netherlands)

Koblbauer, I.F.H.; Lambrecht, Y.; van der Hulst, M.L.M.; Neeter, C.; Engelbert, R.H.H.; Poolman, R.W.; Scholtes, V.A.

2011-01-01

Background: Patients undergoing total knee arthroplasty (TKA) often experience strength deficits both pre- and post-operatively. As these deficits may have a direct impact on functional recovery, strength assessment should be performed in this patient population. For these assessments, reliable
Hypertension Knowledge-Level Scale (HK-LS): A Study on Development, Validity and Reliability

Directory of Open Access Journals (Sweden)

Cemalettin Kalyoncu

2012-03-01

This study was conducted to develop a scale to measure knowledge about hypertension among Turkish adults. The Hypertension Knowledge-Level Scale (HK-LS) was generated based on content, face, and construct validity, internal consistency, test re-test reliability, and discriminative validity procedures. The final scale had 22 items with six sub-dimensions. The scale was applied to 457 individuals aged ≥18 years, and 414 of them were re-evaluated for test-retest reliability. The six sub-dimensions encompassed 60.3% of the total variance. Cronbach alpha coefficients were 0.82 for the entire scale and 0.92, 0.59, 0.67, 0.77, 0.72, and 0.76 for the sub-dimensions of definition, medical treatment, drug compliance, lifestyle, diet, and complications, respectively. The scale ensured internal consistency in reliability and construct validity, as well as stability over time. Significant relationships were found between knowledge score and age, gender, educational level, and history of hypertension of the participants. No correlation was found between knowledge score and working at an income-generating job. The present scale, developed to measure the knowledge level of hypertension among Turkish adults, was found to be valid and reliable.
Hypertension Knowledge-Level Scale (HK-LS): a study on development, validity and reliability.

Science.gov (United States)

Erkoc, Sultan Baliz; Isikli, Burhanettin; Metintas, Selma; Kalyoncu, Cemalettin

2012-03-01

This study was conducted to develop a scale to measure knowledge about hypertension among Turkish adults. The Hypertension Knowledge-Level Scale (HK-LS) was generated based on content, face, and construct validity, internal consistency, test re-test reliability, and discriminative validity procedures. The final scale had 22 items with six sub-dimensions. The scale was applied to 457 individuals aged ≥ 18 years, and 414 of them were re-evaluated for test-retest reliability. The six sub-dimensions encompassed 60.3% of the total variance. Cronbach alpha coefficients were 0.82 for the entire scale and 0.92, 0.59, 0.67, 0.77, 0.72, and 0.76 for the sub-dimensions of definition, medical treatment, drug compliance, lifestyle, diet, and complications, respectively. The scale ensured internal consistency in reliability and construct validity, as well as stability over time. Significant relationships were found between knowledge score and age, gender, educational level, and history of hypertension of the participants. No correlation was found between knowledge score and working at an income-generating job. The present scale, developed to measure the knowledge level of hypertension among Turkish adults, was found to be valid and reliable.
The revised Generalized Expectancy for Success Scale: a validity and reliability study.

Science.gov (United States)

Hale, W D; Fiedler, L R; Cochran, C D

1992-07-01

The Generalized Expectancy for Success Scale (GESS; Fibel & Hale, 1978) was revised and assessed for reliability and validity. The revised version was administered to 199 college students along with other conceptually related measures, including the Rosenberg Self-Esteem Scale, the Life Orientation Test, and Rotter's Internal-External Locus of Control Scale. One subsample of students also completed the Eysenck Personality Inventory, while another subsample performed a criterion-related task that involved risk taking. Item analysis yielded 25 items with correlations of .45 or higher with the total score. Results indicated high internal consistency and test-retest reliability.
[Validity, reliability, and acceptability of the brief version of the self-management knowledge, attitude, and behavior assessment scale for diabetes patients].

Science.gov (United States)

Wu, Y Z; Wang, W J; Feng, N P; Chen, B; Li, G C; Liu, J W; Liu, H L; Yang, Y Y

2016-07-06

To evaluate the validity, reliability, and acceptability of the brief version of the self-management knowledge, attitude, and behavior (KAB) assessment scale for diabetes patients. Diabetes patients who were managed at the Xinkaipu Community Health Service Center of Tianxin in Changsha, Hunan Province were selected for survey by cluster sampling. A total of 350 diabetes patients were surveyed using the brief scale to collect data on knowledge, attitudes, and behaviors of self-management. Content validity was evaluated by Pearson correlation coefficient between the brief scale and subscales of knowledge, attitude, and behavior. Structure validity was evaluated by factor analysis, and discrimination validity was evaluated by an independent sample t-test between the high-score and low-score groups. Reliability was tested by internal consistency reliability and split-half reliability. The evaluation indexes of internal consistency reliability were Cronbach's α coefficients, θ coefficient, and Ω coefficient. Acceptability was evaluated by valid response rate and completion time of the brief scale. A total of 346(98.9%) valid questionnaires were returned, with average survey time of (11.43±3.4) minutes. Average score of the brief scale was 78.85 ± 11.22; scores of the knowledge, attitude, and behavior subscales were 16.45 ± 4.42, 21.33 ± 2.03, and 41.07 ± 8.34, respectively. Pearson correlation coefficients between the brief scale and the knowledge, attitude, and behavior subscales were 0.92, 0.42, and 0.60, respectively; P-values were all less than 0.01, indicating that the face validity and content validity of the brief scale were achieved to a good level. The common factor cumulative variance contribution rate of the brief scale and three subscales was from 53.66% to 61.75%, which achieved more than 50% of the approved standard. There were 11 common factors; 41 of the total 42 items had factor loadings above 0.40 in their relevant common factor, indicating
The reliability and validity of the Everyday Feelings Questionnaire in a clinical population.

Science.gov (United States)

Mann, Joanna; Henley, William; O'Mahen, Heather; Ford, Tamsin

2013-06-01

Depression could be considered to be on a continuum with well-being and some have argued that it is important to measure well-being as well as distress. The Everyday Feelings Questionnaire was designed to measure both these aspects. Its validity has been assessed in a nonclinical population. This project aims to assess the validity and reliability of the EFQ in a clinical population. The EFQ was completed by 105 clients within a mental health clinical setting. The following aspects of the EFQ were explored: its internal structure, concurrent validity, re-test reliability and internal consistency. The EFQ had good internal consistency and correlated highly with other measures of anxiety and depression. The correlation between total EFQ scores on the two occasions was reasonable and there was no effect of time during completion. A Bland-Altman plot showed no obvious pattern between the difference between EFQ scores and the mean score. A one factor model showed a moderate fit to the data. This study does not explore the acceptability or sensitivity to change of the EFQ, and a larger sample size would be needed to extend the analysis conducted. The EFQ is a valid and reliable measure when used in this clinical population. Copyright © 2012 Elsevier B.V. All rights reserved.
Translation, cross-culturally adaptation and validation of the Danish version of Oxford Hip Score (OHS)

DEFF Research Database (Denmark)

Paulsen, Aksel

Objective: The Oxford Hip Score is a patient reported outcome questionnaire designed to assess pain and function in patients undergoing total hip arthroplasty (THA). The Oxford Hip Score is valid, reliable and consistent, and different language versions have been developed. Since there was no properly translated, adapted and validated Danish language version available, a translation to Danish, cross-cultural adaptation and validation of the Danish Oxford Hip Score was warranted. Material and Methods: We translated and cross-culturally adapted the Oxford Hip Score into Danish, in accordance... ....9 % ceiling effect on this cohort of postoperative patients. Only in 1.2 % of the patients no sum score could be calculated, due to missing items. In relation to construct validity 80 % of predefined hypotheses were confirmed. The different items had an intraclass correlation in the range of 0...
The Vocal Cord Dysfunction Questionnaire: Validity and Reliability of the Persian Version.

Science.gov (United States)

Ghaemi, Hamide; Khoddami, Seyyedeh Maryam; Soleymani, Zahra; Zandieh, Fariborz; Jalaie, Shohreh; Ahanchian, Hamid; Khadivi, Ehsan

2017-12-25

The aim of this study was to develop, validate, and assess the reliability of the Persian version of Vocal Cord Dysfunction Questionnaire (VCDQ P). The study design was cross-sectional or cultural survey. Forty-four patients with vocal fold dysfunction (VFD) and 40 healthy volunteers were recruited for the study. To assess the content validity, the prefinal questions were given to 15 experts to comment on their essentiality. Ten patients with VFD rated the importance of VCDQ P in detecting face validity. Eighteen of the patients with VFD completed the VCDQ 1 week later for test-retest reliability. To detect absolute reliability, standard error of measurement and smallest detected change were calculated. Concurrent validity was assessed by completing the Persian Chronic Obstructive Pulmonary Disease (COPD) Assessment Test (CAT) by 34 patients with VFD. Discriminant validity was measured from 34 participants. The VCDQ was further validated by administering the questionnaire to 40 healthy volunteers. Validation of the VCDQ as a treatment outcome tool was conducted in 18 patients with VFD using pre- and posttreatment scores. The internal consistency was confirmed (Cronbach α = 0.78). The test-retest reliability was excellent (intraclass correlation coefficient = 0.97). The standard error of measurement and smallest detected change values were acceptable (0.39 and 1.08, respectively). There was a significant correlation between the VCDQ P and the CAT total scores (P [...] validity was significantly different. The VCDQ scores in patients with VFD before and after treatment were significantly different (P [...] valid and reliable self-administered questionnaire in Persian-speaking population. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
The Structured Interview & Scoring Tool-Massachusetts Alzheimer's Disease Research Center (SIST-M): development, reliability, and cross-sectional validation of a brief structured clinical dementia rating interview.

Science.gov (United States)

Okereke, Olivia I; Copeland, Maura; Hyman, Bradley T; Wanggaard, Taylor; Albert, Marilyn S; Blacker, Deborah

2011-03-01

The Clinical Dementia Rating (CDR) and CDR Sum-of-Boxes can be used to grade mild but clinically important cognitive symptoms of Alzheimer disease. However, sensitive clinical interview formats are lengthy. To develop a brief instrument for obtaining CDR scores and to assess its reliability and cross-sectional validity. Using legacy data from expanded interviews conducted among 347 community-dwelling older adults in a longitudinal study, we identified 60 questions (from a possible 131) about cognitive functioning in daily life using clinical judgment, inter-item correlations, and principal components analysis. Items were selected in 1 cohort (n=147), and a computer algorithm for generating CDR scores was developed in this same cohort and re-run in a replication cohort (n=200) to evaluate how well the 60 items retained information from the original 131 items. Short interviews based on the 60 items were then administered to 50 consecutively recruited older individuals, with no symptoms or mild cognitive symptoms, at an Alzheimer's Disease Research Center. Clinical Dementia Rating scores based on short interviews were compared with those from independent long interviews. In the replication cohort, agreement between short and long CDR interviews ranged from κ=0.65 to 0.79, with κ=0.76 for Memory, κ=0.77 for global CDR, and intraclass correlation coefficient for CDR Sum-of-Boxes=0.89. In the cross-sectional validation, short interview scores were slightly lower than those from long interviews, but good agreement was observed for global CDR and Memory (κ≥0.70) as well as for CDR Sum-of-Boxes (intraclass correlation coefficient=0.73). The Structured Interview & Scoring Tool-Massachusetts Alzheimer's Disease Research Center is a brief, reliable, and sensitive instrument for obtaining CDR scores in persons with symptoms along the spectrum of mild cognitive change.
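Agreement between the short and long interviews is summarised above with κ. The abstract does not say whether a weighted form was used; as a minimal sketch, the code below computes unweighted Cohen's kappa for two sets of categorical ratings. The function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = counts_a.keys() | counts_b.keys()
    # Agreement expected by chance if the two sets of ratings were independent.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    # Note: undefined (division by zero) if expected agreement equals 1.
    return (observed - expected) / (1 - expected)


# Hypothetical global ratings from short and long interviews.
short_form = ["0", "0", "0.5", "0.5", "0.5", "1", "0", "0.5", "0", "0.5"]
long_form = ["0", "0", "0.5", "0.5", "0", "1", "0", "0.5", "0.5", "0.5"]
kappa = cohens_kappa(short_form, long_form)
```

For ordered categories such as these, a weighted kappa that gives partial credit to near-misses is often preferred; the unweighted form shown here treats every disagreement equally.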
Anterior Cruciate Ligament OsteoArthritis Score (ACLOAS)

DEFF Research Database (Denmark)

Roemer, Frank W; Frobell, Richard; Lohmander, Stefan

2014-01-01

OBJECTIVE: To develop a whole joint scoring system, the Anterior Cruciate Ligament OsteoArthritis Score (ACLOAS), for magnetic resonance imaging (MRI)-based assessment of acute anterior cruciate ligament (ACL) injury and follow-up of structural sequelae, and to assess its reliability. DESIGN...
Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score.

Science.gov (United States)

Lee, Hayan; Schatz, Michael C

2012-08-15

Genome resequencing and short read mapping are two of the primary tools of genomics and are used for many important applications. The current state-of-the-art in mapping uses the quality values and mapping quality scores to evaluate the reliability of the mapping. These attributes, however, are assigned to individual reads and do not directly measure the problematic repeats across the genome. Here, we present the Genome Mappability Score (GMS) as a novel measure of the complexity of resequencing a genome. The GMS is a weighted probability that any read could be unambiguously mapped to a given position and thus measures the overall composition of the genome itself. We have developed the Genome Mappability Analyzer to compute the GMS of every position in a genome. It leverages the parallelism of cloud computing to analyze large genomes, and enabled us to identify the 5-14% of the human, mouse, fly and yeast genomes that are difficult to analyze with short reads. We examined the accuracy of the widely used BWA/SAMtools polymorphism discovery pipeline in the context of the GMS, and found discovery errors are dominated by false negatives, especially in regions with poor GMS. These errors are fundamental to the mapping process and cannot be overcome by increasing coverage. As such, the GMS should be considered in every resequencing project to pinpoint the 'dark matter' of the genome, including of known clinically relevant variations in these regions. The source code and profiles of several model organisms are available at http://gma-bio.sourceforge.net
Reliability of Holistic Scoring for the 1985 MCAT Essays.

Science.gov (United States)

Mitchell, Karen J.; Anderson, Judith A.

A pilot essay was included in the 1985 Spring and Fall administrations of the Medical College Admission Test. A sample of 320 of the essays written by Fall examinees who had expressed an interest in allopathic medicine was used to calculate interrater reliability estimates. Sixteen of 20 readers who had been trained by White's suggestions for…
Reliability and Model Fit

Science.gov (United States)

Stanley, Leanne M.; Edwards, Michael C.

2016-01-01

The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Validity Evidence and Scoring Guidelines for Standardized Patient Encounters and Patient Notes From a Multisite Study of Clinical Performance Examinations in Seven Medical Schools.

Science.gov (United States)

Park, Yoon Soo; Hyderi, Abbas; Heine, Nancy; May, Win; Nevins, Andrew; Lee, Ming; Bordage, Georges; Yudkowsky, Rachel

2017-11-01

To examine validity evidence of local graduation competency examination scores from seven medical schools using shared cases and to provide rater training protocols and guidelines for scoring patient notes (PNs). Between May and August 2016, clinical cases were developed, shared, and administered across seven medical schools (990 students participated). Raters were calibrated using training protocols, and guidelines were developed collaboratively across sites to standardize scoring. Data included scores from standardized patient encounters for history taking, physical examination, and PNs. Descriptive statistics were used to examine scores from the different assessment components. Generalizability studies (G-studies) using variance components were conducted to estimate reliability for composite scores. Validity evidence was collected for response process (rater perception), internal structure (variance components, reliability), relations to other variables (interassessment correlations), and consequences (composite score). Student performance varied by case and task. In the PNs, justification of differential diagnosis was the most discriminating task. G-studies showed that schools accounted for less than 1% of total variance; however, for the PNs, there were differences in scores for varying cases and tasks across schools, indicating a school effect. Composite score reliability was maximized when the PN was weighted between 30% and 40%. Raters preferred using case-specific scoring guidelines with clear point-scoring systems. This multisite study presents validity evidence for PN scores based on scoring rubric and case-specific scoring guidelines that offer rigor and feedback for learners. Variability in PN scores across participating sites may signal different approaches to teaching clinical reasoning among medical schools.
Introducing the HOPE (Hypospadias Objective Penile Evaluation)-score: A validation study of an objective scoring system for evaluating cosmetic appearance in hypospadias patients

NARCIS (Netherlands)

van der Toorn, Fred; de Jong, Tom P. V. M.; de Gier, Robert P. E.; Callewaert, Piet R. H.; van der Horst, Eric H. J. R.; Steffens, Martijn G.; Hoebeke, Piet; Nijman, Rien J. M.; Bush, Nicol C.; Wolffenbuttel, Katja P.; van den Heijkant, Marleen M. C.; van Capelle, Jan-Willem; Wildhagen, Mark; Timman, Reinier; van Busschbach, Jan J. V.

2013-01-01

Objective: To determine the reliability and internal validity of the Hypospadias Objective Penile Evaluation (HOPE)-score, a newly developed scoring system assessing the cosmetic outcome in hypospadias. Patients and methods: The HOPE scoring system incorporates all surgically-correctable items:
Validity and Reliability of the Upper Extremity Work Demands Scale.

Science.gov (United States)

Jacobs, Nora W; Berduszek, Redmar J; Dijkstra, Pieter U; van der Sluis, Corry K

2017-12-01

Purpose: To evaluate validity and reliability of the upper extremity work demands (UEWD) scale. Methods: Participants from different levels of physical work demands, based on the Dictionary of Occupational Titles categories, were included. A historical database of 74 workers was added for factor analysis. Criterion validity was evaluated by comparing observed and self-reported UEWD scores. To assess structural validity, a factor analysis was executed. For reliability, the difference between two self-reported UEWD scores, the smallest detectable change (SDC), test-retest reliability and internal consistency were determined. Results: Fifty-four participants were observed at work and 51 of them filled in the UEWD twice with a mean interval of 16.6 days (SD 3.3, range = 10-25 days). Criterion validity of the UEWD scale was moderate (r = .44, p = .001). Factor analysis revealed that 'force and posture' and 'repetition' subscales could be distinguished with Cronbach's alpha of .79 and .84, respectively. Reliability was good; there was no significant difference between repeated measurements. An SDC of 5.0 was found. Test-retest reliability was good (intraclass correlation coefficient for agreement = .84) and all item-total correlations were >.30. There were two pairs of highly related items. Conclusion: Reliability of the UEWD scale was good, but criterion validity was moderate. Based on current results, a modified UEWD scale (2 items removed, 1 item reworded, divided into 2 subscales) was proposed. Since observation appeared to be an inappropriate gold standard, we advise investigating other types of validity, such as construct validity, in further research.
Using the Nursing Culture Assessment Tool (NCAT) in Long-Term Care: An Update on Psychometrics and Scoring Standardization.

Science.gov (United States)

Kennerly, Susan; Heggestad, Eric D; Myers, Haley; Yap, Tracey L

2015-07-29

An effective workforce performing within the context of a positive cultural environment is central to a healthcare organization's ability to achieve quality outcomes. The Nursing Culture Assessment Tool (NCAT) provides nurses with a valid and reliable tool that captures the general aspects of nursing culture. This study extends earlier work confirming the tool's construct validity and dimensionality by standardizing the scoring approach and establishing norm-referenced scoring. Scoring standardization provides a reliable point of comparison for NCAT users. NCAT assessments support nursing's ability to evaluate nursing culture, use results to shape the culture into one that supports change, and advance nursing's best practices and care outcomes. Registered nurses, licensed practical nurses, and certified nursing assistants from 54 long-term care facilities in Kentucky, Nevada, North Carolina, and Oregon were surveyed. Confirmatory factor analysis yielded six first order factors forming the NCAT's subscales (Expectations, Behaviors, Teamwork, Communication, Satisfaction, Commitment) (Comparative Fit Index 0.93) and a second order factor, the Total Culture Score. Aggregated facility level comparisons of observed group variance with expected random variance using rwg(J) statistics are presented. Normative scores and cumulative rank percentages are provided, along with guidance on how the NCAT can be used in implementing planned change.

Evaluation of the Validity and Reliability of the Chinese Healthy Eating Index

Directory of Open Access Journals (Sweden)

Ya-Qun Yuan

2018-01-01

The Chinese Healthy Eating Index (CHEI) is a measuring instrument of diet quality in accordance with the Dietary Guidelines for Chinese (DGC-2016). The objective of the study was to evaluate the validity and reliability of the CHEI. Data from 12,473 adults from the China Health and Nutrition Survey (CHNS-2011), including 3-day–24-h dietary recalls, were used in this study. The CHEI was assessed by four exemplary menus developed by the DGC-2016, the general linear models, the independent t-test and the Mann–Whitney U-test, the Spearman’s correlation analysis, the principal components analysis (PCA), the Cronbach’s coefficient, and the Pearson correlation with nutrient intakes. A higher CHEI score was linked with lower exposure to known risk factors of Chinese diets. The CHEI scored nearly perfect for exemplary menus for adult men (99.8), adult women (99.7), and the healthy elderly (99.1), but not for young children (91.2). The CHEI was able to distinguish the difference in diet quality between smokers and non-smokers (P < 0.0001), people with higher and lower education levels (P < 0.0001), and people living in urban and rural areas (P < 0.0001). Low correlations with energy intake for the CHEI total and component scores (|r| < 0.34, P < 0.01) supported that the index assessed diet quality independently of diet quantity. The PCA indicated that underlying multiple dimensions compose the CHEI, and Cronbach’s coefficient α was 0.22. Components of dairy, fruits and cooking oils had the greatest impact on the total score. People with a higher CHEI score had not only a higher absolute intake of nutrients (P < 0.001), but also a more nutrient-dense diet (P < 0.001). Our findings support the validity and reliability of the CHEI when using the 3-day–24-h recalls.
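
The Cronbach's coefficient α reported in this abstract is conventionally computed from the item variances and the variance of the total score, α = k/(k−1) × (1 − Σ item variances / total-score variance). The sketch below is a minimal illustration of that standard formula; the function name and the data are hypothetical and are not taken from the CHEI study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data, NOT from the CHEI study.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]  # items rise and fall together
unrelated = [[1, 2], [2, 1], [1, 1], [2, 2]]    # items vary independently
print(cronbach_alpha(consistent))
print(cronbach_alpha(unrelated))
```

With items that move together the coefficient approaches 1; with items that vary independently of one another it falls toward 0.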
Neurology objective structured clinical examination reliability using generalizability theory.

Science.gov (United States)

Blood, Angela D; Park, Yoon Soo; Lukas, Rimas V; Brorson, James R

2015-11-03

This study examines factors affecting reliability, or consistency of assessment scores, from an objective structured clinical examination (OSCE) in neurology through generalizability theory (G theory). Data include assessments from a multistation OSCE taken by 194 medical students at the completion of a neurology clerkship. Facets evaluated in this study include cases, domains, and items. Domains refer to areas of skill (or constructs) that the OSCE measures. G theory is used to estimate variance components associated with each facet, derive reliability, and project the number of cases required to obtain a reliable (consistent, precise) score. Reliability using G theory is moderate (Φ coefficient = 0.61, G coefficient = 0.64). Performance is similar across cases but differs by the particular domain, such that the majority of variance is attributed to the domain. Projections in reliability estimates reveal that students need to participate in 3 OSCE cases in order to increase reliability beyond the 0.70 threshold. This novel use of G theory in evaluating an OSCE in neurology provides meaningful measurement characteristics of the assessment. Differing from prior work in other medical specialties, the cases students were randomly assigned did not influence their OSCE score; rather, scores varied in expected fashion by domain assessed. © 2015 American Academy of Neurology.
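
The projection step described above (estimating how many cases are needed to pass a reliability threshold) can be sketched for the simplest possible design. This is only an illustration: it assumes a single-facet persons-by-cases design, whereas the study also models domains and items, and the variance components below are hypothetical placeholders rather than the study's estimates.

```python
def projected_g(var_person, var_residual, n_cases):
    """Relative G coefficient for a score averaged over n_cases.

    Simplified persons-by-cases design: person variance is the signal,
    and the residual (person-by-case, error) variance shrinks with n_cases.
    """
    if n_cases < 1:
        raise ValueError("n_cases must be at least 1")
    return var_person / (var_person + var_residual / n_cases)


def min_cases_for(threshold, var_person, var_residual, max_cases=100):
    """Smallest number of cases whose projected coefficient exceeds threshold."""
    for n_cases in range(1, max_cases + 1):
        if projected_g(var_person, var_residual, n_cases) > threshold:
            return n_cases
    return None  # threshold not reachable within max_cases


# Hypothetical variance components, NOT the estimates from the study.
VAR_PERSON, VAR_RESIDUAL = 0.4, 0.6
for n in range(1, 6):
    print(n, round(projected_g(VAR_PERSON, VAR_RESIDUAL, n), 3))
print(min_cases_for(0.70, VAR_PERSON, VAR_RESIDUAL))
```

The design choice here is to treat only the residual term as divisible by the number of cases; a multi-facet decision study would divide each error component by the sample size of its own facet.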
Carpal erosions in children with juvenile idiopathic arthritis: repeatability of a newly devised MR-scoring system

International Nuclear Information System (INIS)

Boavida, Peter; Lambot-Juhan, Karen; Ording Mueller, Lil-Sofie; Damasio, Beatrice; Malattia, Clara; Tanturri de Horatio, Laura; Owens, Catherine M.; Rosendahl, Karen

2015-01-01

Juvenile idiopathic arthritis (JIA) is characterized by synovial inflammation, with potential risk of developing progressive joint destruction. Personalized state-of-the-art treatment depends on valid markers for disease activity to monitor response; however, no such markers exist. To evaluate the reliability of scoring of carpal bone erosions on MR in children with JIA using two semi-quantitative scoring systems. A total of 1,236 carpal bones (91 MR wrist examinations) were scored twice by two independent pediatric musculoskeletal radiologists. Bony erosions were scored according to estimated bone volume loss using a 0-4 scale and a 0-10 scale. An aggregate erosion score comprising the sum total carpal bone volume loss was calculated for each examination. The 0-4 scoring system resulted in good intra-reader agreement and moderate to good inter-observer agreement in the assessment of individual bones. Fair and moderate agreement were achieved for inter-reader and intra-reader agreement, respectively, using the 0-10 scale. Intra- and particularly inter-reader aggregate score variability were much less favorable, with wide limits of agreement. Further analysis of erosive disease patterns compared with normal subjects is required to facilitate the development of an alternative means of quantifying disease. (orig.)
Carpal erosions in children with juvenile idiopathic arthritis: repeatability of a newly devised MR-scoring system

Energy Technology Data Exchange (ETDEWEB)

Boavida, Peter [Great Ormond Street Hospital for Children, Department of Radiology, London (United Kingdom); Lambot-Juhan, Karen [Hospital Necker Enfants Malades, Department of Radiology, Paris (France); Ording Mueller, Lil-Sofie [Oslo University Hospital, Department of Radiology, Oslo (Norway); Damasio, Beatrice; Malattia, Clara [Ospedale Pediatrico Gaslini, Department of Rheumatology, Genoa (Italy); Tanturri de Horatio, Laura [Ospedale Pediatrico Bambino Gesu, Department of Radiology, Rome (Italy); Owens, Catherine M. [Great Ormond Street Hospital for Children, Department of Radiology, London (United Kingdom); UCL, Institute of Child Health, London (United Kingdom); Rosendahl, Karen [Haukeland University Hospital, Department of Radiology, Bergen (Norway); University of Bergen, Department of Clinical Medicine, Bergen (Norway)

2015-12-15

Juvenile idiopathic arthritis (JIA) is characterized by synovial inflammation, with potential risk of developing progressive joint destruction. Personalized state-of-the-art treatment depends on valid markers for disease activity to monitor response; however, no such markers exist. To evaluate the reliability of scoring of carpal bone erosions on MR in children with JIA using two semi-quantitative scoring systems. A total of 1,236 carpal bones (91 MR wrist examinations) were scored twice by two independent pediatric musculoskeletal radiologists. Bony erosions were scored according to estimated bone volume loss using a 0-4 scale and a 0-10 scale. An aggregate erosion score comprising the sum total carpal bone volume loss was calculated for each examination. The 0-4 scoring system resulted in good intra-reader agreement and moderate to good inter-observer agreement in the assessment of individual bones. Fair and moderate agreement were achieved for inter-reader and intra-reader agreement, respectively, using the 0-10 scale. Intra- and particularly inter-reader aggregate score variability were much less favorable, with wide limits of agreement. Further analysis of erosive disease patterns compared with normal subjects is required to facilitate the development of an alternative means of quantifying disease. (orig.)
Reliability and validity of the Chinese version of the Patient Health Questionnaire (PHQ-9) in the general population.

Science.gov (United States)

Wang, Wenzheng; Bian, Qian; Zhao, Yan; Li, Xu; Wang, Wenwen; Du, Jiang; Zhang, Guofang; Zhou, Qing; Zhao, Min

2014-01-01

Depression is one of the most common mental illnesses. The reliability and the validity of the Patient Health Questionnaire (PHQ)-9, a depression screening tool, have not been examined in the general population in China. Thus, this study evaluated the reliability and the validity of the Chinese version of the PHQ-9 in detecting major depression in residents of a Chinese community. A total of 1045 participants from a Shanghai community were enrolled in our study. Participants completed the Chinese versions of the PHQ-9, the Self-Rating Depression Scale (SDS), the 36-item Short Form Health Survey (SF-36), and the Mini International Neuropsychiatric Interview. One hundred participants were randomly selected to complete the PHQ-9 again 2 weeks after the initial assessment. The reliability, the validity and the receiver operating characteristic (ROC) curve of the PHQ-9 were analyzed. Cronbach's alpha for the internal consistency reliability of the Chinese version of the PHQ-9 was 0.86 for the entire scale. The correlation coefficient for the 2-week test-retest of the total score was 0.86. The PHQ-9 scale correlated positively with the SDS (r=0.29, p<0.001) and correlated negatively with all subscale scores of the SF-36 (correlation coefficients ranged from -0.11 to -0.47, p<0.001). The area under the curve of the ROC was 0.92 (95% confidence interval: 0.86-0.97). A cutoff score of 7 or higher on the PHQ-9 had a sensitivity of 0.86 and a specificity of 0.86. In the general Chinese population, the Chinese version of the PHQ-9 is a valid and efficient tool for screening depression, with a recommended cutoff score of 7 or more. Copyright © 2014 Elsevier Inc. All rights reserved.
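
The sensitivity and specificity quoted for a cutoff score can be illustrated with a short sketch. It assumes that a score at or above the cutoff counts as a positive screen (matching "7 or higher" above) and that a reference diagnosis is available for each participant; the scores and diagnoses below are hypothetical, not data from the study.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is a positive screen."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        screened_positive = score >= cutoff
        if diseased and screened_positive:
            tp += 1
        elif diseased:
            fn += 1
        elif screened_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the condition")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical questionnaire totals and reference diagnoses (NOT study data).
scores = [2, 5, 7, 9, 12, 3, 8, 15]
has_condition = [False, False, False, True, True, False, True, True]
for cutoff in (5, 7, 10):
    print(cutoff, sensitivity_specificity(scores, has_condition, cutoff))
```

Sweeping the cutoff as in the loop traces out the points of a ROC curve; raising the cutoff trades sensitivity for specificity.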
Simulation-based Assessment to Reliably Identify Key Resident Performance Attributes.

Science.gov (United States)

Blum, Richard H; Muret-Wagstaff, Sharon L; Boulet, John R; Cooper, Jeffrey B; Petrusa, Emil R; Baker, Keith H; Davidyuk, Galina; Dearden, Jennifer L; Feinstein, David M; Jones, Stephanie B; Kimball, William R; Mitchell, John D; Nadelberg, Robert L; Wiser, Sarah H; Albrecht, Meredith A; Anastasi, Amanda K; Bose, Ruma R; Chang, Laura Y; Culley, Deborah J; Fisher, Lauren J; Grover, Meera; Klainer, Suzanne B; Kveraga, Rikante; Martel, Jeffrey P; McKenna, Shannon S; Minehart, Rebecca D; Mitchell, John D; Mountjoy, Jeremi R; Pawlowski, John B; Pilon, Robert N; Shook, Douglas C; Silver, David A; Warfield, Carol A; Zaleski, Katherine L

2018-04-01

Obtaining reliable and valid information on resident performance is critical to patient safety and training program improvement. The goals were to characterize important anesthesia resident performance gaps that are not typically evaluated, and to further validate scores from a multiscenario simulation-based assessment. Seven high-fidelity scenarios reflecting core anesthesiology skills were administered to 51 first-year residents (CA-1s) and 16 third-year residents (CA-3s) from three residency programs. Twenty trained attending anesthesiologists rated resident performances using a seven-point behaviorally anchored rating scale for five domains: (1) formulate a clear plan, (2) modify the plan under changing conditions, (3) communicate effectively, (4) identify performance improvement opportunities, and (5) recognize limits. A second rater assessed 10% of encounters. Scores and variances for each domain, each scenario, and the total were compared. Low domain ratings (1, 2) were examined in detail. Interrater agreement was 0.76; reliability of the seven-scenario assessment was r = 0.70. CA-3s had a significantly higher average total score (4.9 ± 1.1 vs. 4.6 ± 1.1, P = 0.01, effect size = 0.33). CA-3s significantly outscored CA-1s for five of seven scenarios and domains 1, 2, and 3. CA-1s had a significantly higher proportion of worrisome ratings than CA-3s (chi-square = 24.1, P < 0.01, effect size = 1.50). Ninety-eight percent of residents rated the simulations more educational than an average day in the operating room. Sensitivity of the assessment to CA-1 versus CA-3 performance differences for most scenarios and domains supports validity. No differences, by experience level, were detected for two domains associated with reflective practice. Smaller score variances for CA-3s likely reflect a training effect; however, worrisome performance scores for both CA-1s and CA-3s suggest room for improvement.
Testing the reliability of the Fall Risk Screening Tool in an elderly ambulatory population.

Science.gov (United States)

Fielding, Susan J; McKay, Michael; Hyrkas, Kristiina

2013-11-01

To identify and test the reliability of a fall risk screening tool in an ambulatory outpatient clinic. The Fall Risk Screening Tool (Albert Lea Medical Center, MN, USA) was scripted for an interview format. Two interviewers separately screened a convenience sample of 111 patients (age ≥ 65 years) in an ambulatory outpatient clinic in a northeastern US city. The interviewers' scoring of fall risk categories was similar. There was good internal consistency (Cronbach's α = 0.834-0.889) and inter-rater reliability [intra-class correlation coefficients (ICC) = 0.824-0.881] for total, Risk Factor and Client's Health Status subscales. The Physical Environment scores indicated acceptable internal consistency (Cronbach's α = 0.742) and adequate reliability (ICC = 0.688). Two Physical Environment items (furniture and medical equipment condition) had low reliabilities [Kappa (K) = 0.323, P = 0.08; K = -0.078, P = 0.648], respectively. The scripted Fall Risk Screening Tool demonstrated good reliability in this sample. Rewording two Physical Environment items will be considered. A reliable instrument such as the scripted Fall Risk Screening Tool provides a standardised assessment for identifying high fall risk patients. This tool is especially useful because it assesses personal, behavioural and environmental factors specific to community-dwelling patients; the interview format also facilitates patient-provider interaction. © 2013 John Wiley & Sons Ltd.
Measurement Properties of Performance-Specific Pain Ratings of Patients Awaiting Total Joint Arthroplasty as a Consequence of Osteoarthritis

Science.gov (United States)

Stratford, Paul W.; Kennedy, Deborah M.; Woodhouse, Linda J.; Spadoni, Gregory

2008-01-01

Purpose: To estimate the test–retest reliability of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain sub-scale and performance-specific assessments of pain, as well as the association between these measures for patients awaiting primary total hip or knee arthroplasty as a consequence of osteoarthritis. Methods: A total of 164 patients awaiting unilateral primary hip or knee arthroplasty completed four performance measures (self-paced walk, timed up and go, stair test, six-minute walk) and the WOMAC. Scores for 22 of these patients provided test–retest reliability data. Estimates of test–retest reliability (Type 2,1 intraclass correlation coefficient [ICC] and standard error of measurement [SEM]) and the association between measures were examined. Results: ICC values for individual performance-specific pain ratings were between 0.70 and 0.86; SEM values were between 0.97 and 1.33 pain points. ICC estimates for the four-item performance pain ratings and the WOMAC pain sub-scale were 0.82 and 0.57 respectively. The correlation between the sum of the pain scores for the four performance measures and the WOMAC pain sub-scale was 0.62. Conclusion: Reliability estimates for the performance-specific assessments of pain using the numeric pain rating scale were consistent with values reported for patients with a spectrum of musculoskeletal conditions. The reliability estimate for the WOMAC pain sub-scale was lower than typically reported in the literature. The level of association between the WOMAC pain sub-scale and the various performance-specific pain scales suggests that the scores can be used interchangeably when applied to groups but not for individual patients. PMID:20145758
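
The Type 2,1 ICC named above is usually obtained from the mean squares of a two-way analysis of variance (subjects by occasions or raters). The following sketch implements that standard formula in plain Python; the ratings are illustrative and not from the study, and the SEM helper uses one common convention (SD × √(1 − ICC)), which is an assumption since the abstract does not say how SEM was derived.

```python
from math import sqrt


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of n subjects, each a list of k scores (raters or occasions).
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two measurements")
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def sem_from_icc(sd, icc):
    """Standard error of measurement under the SD * sqrt(1 - ICC) convention."""
    return sd * sqrt(1 - icc)


# Illustrative ratings: 6 subjects scored by 4 raters (NOT study data).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

Because the rater mean square enters the denominator, systematic differences between raters or occasions lower this coefficient, which is why it is described as an agreement (rather than consistency) index.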
The reliability, validity and responsiveness of the Dutch version of the Oxford elbow score

Directory of Open Access Journals (Sweden)

Patka Peter

2011-07-01

Background: The Oxford elbow score (OES) is an English questionnaire that measures the patients' subjective experience of elbow surgery. The OES comprises three domains: elbow function, pain, and social-psychological effects. This questionnaire can be completed by the patient and used as an outcome measure after elbow surgery. The aim of this study was to develop and evaluate the Dutch version of the translated OES for reliability, validity and responsiveness with respect to patients after elbow trauma and surgery. Methods: The 12 items of the English-language OES were translated into Dutch and then back-translated; the back-translated questionnaire was then compared to the original English version. The OES Dutch version was completed by 69 patients (group A), 60 of whom had an elbow luxation, four an elbow fracture and five an epicondylitis. QuickDASH, the visual analogue pain scale (VAS) and the Mayo Elbow Performance Index (MEPI) were also completed to examine the convergent validity of the OES in group A. To calculate the test-retest reliability and responsiveness of the OES, this questionnaire was completed three times by 43 different patients (group B). An average of 52 days elapsed between therapy and the administration of the third OES (SD = 24.1). Results: The Cronbach's α coefficients for the function, pain and social-psychological domains were 0.90, 0.87 and 0.90, respectively. The intra-class correlation coefficients for the domains were 0.87 for function, 0.89 for pain and 0.87 for social-psychological. The standardised response means for the domains were 0.69, 0.46 and 0.60, respectively, and the minimal detectable changes were 27.6, 21.7 and 24.0, respectively. The convergent validity for the function, pain and social-psychological domains, which were measured as the Spearman's correlation of the OES domains with the MEPI, were 0.68, 0.77 and 0.77, respectively. The Spearman's correlations of the OES domains with QuickDASH were
A Test-Retest Reliability Study of the Whiplash Disability Questionnaire in Patients With Acute Whiplash-Associated Disorders.

Science.gov (United States)

Stupar, Maja; Côté, Pierre; Beaton, Dorcas E; Boyle, Eleanor; Cassidy, J David

2015-01-01

The purpose of this study was to determine the test-retest reliability and the Minimal Detectable Change (MDC) of the Whiplash Disability Questionnaire (WDQ) in individuals with acute whiplash-associated disorders (WADs). We performed a test-retest reliability study. We included insurance claimants from Ontario who were at least 18 years of age, within 21 days of their motor vehicle collision and diagnosed as having acute WAD grades I to III. The WDQ, a 13-item questionnaire scored from 0 (no disability) to 130 (complete disability), was administered to all participants at baseline and by telephone 3 days later. We computed the intraclass correlation coefficient (model 2,1) and the MDC with 95% confidence intervals (CIs; MDC95). The mean (SD) age of the 66 participants was 41.6 (12.7) years and 71.2% were female. Twenty-nine percent had WAD I and 71.2% had WAD II. Time since injury ranged from 0 to 19 days. The mean (SD) baseline WDQ score was 49.3 (28.8) and 46.5 (29.8) 3 days later. The intraclass correlation coefficient for the WDQ total score was 0.89 (95% CI, 0.85-0.92) in the entire sample and 0.83 (95% CI, 0.69-0.93) for the 15 participants reporting no change in neck pain. The MDC95 of the WDQ was 21.4 (SD = 14.9) for participants reporting no change. The WDQ was reliable in individuals with acute WAD. There is 95% confidence that a change of approximately one-sixth of the total score is beyond the daily variation of a stable condition. This level of measurement error must be taken into consideration when interpreting change in WDQ scores. Copyright © 2015 National University of Health Sciences. Published by Elsevier Inc. All rights reserved.
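
The "one-sixth of the total score" statement can be checked directly from the figures in the abstract, and the MDC95 can be related to a standard error of measurement. The sketch below assumes the conventional relation MDC95 = 1.96 × √2 × SEM; the abstract does not state which formula the authors used, so the implied SEM is only a back-calculation under that assumption.

```python
from math import sqrt

Z_95 = 1.96  # two-sided 95% normal quantile


def mdc95_from_sem(sem):
    """Minimal detectable change at 95% confidence (conventional formula)."""
    return Z_95 * sqrt(2) * sem


def sem_from_mdc95(mdc95):
    """Invert the conventional formula to recover the SEM it implies."""
    return mdc95 / (Z_95 * sqrt(2))


# Figures quoted in the abstract above.
WDQ_MAX_SCORE = 130
REPORTED_MDC95 = 21.4

fraction_of_scale = REPORTED_MDC95 / WDQ_MAX_SCORE
implied_sem = sem_from_mdc95(REPORTED_MDC95)  # assumption-dependent
print(f"MDC95 as a fraction of the scale: {fraction_of_scale:.3f}")
print(f"SEM implied by the conventional formula: {implied_sem:.2f}")
```

The fraction comes out at roughly 0.165, close to 1/6, and the implied SEM is roughly 7.7 points on the 0 to 130 scale.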
Reliability and Validity of Objective Structured Clinical Examination for Residents of Obstetrics and Gynecology at Kermanshah University of Medical Sciences

Directory of Open Access Journals (Sweden)

Nasrin Jalilian

2012-11-01

Introduction: Objective structured clinical examination (OSCE) is used for the evaluation of clinical competence in medicine, for which it is essential to measure validity and reliability. This study aimed to investigate the validity and reliability of OSCE for residents of obstetrics and gynecology at Kermanshah University of Medical Sciences in 2011. Methods: A descriptive-correlation study was designed and the data of OSCE for obstetrics and gynecology were collected via learning behavior checklists in method stations and multiple choice questions in question stations. The data were analyzed through Pearson correlation coefficient and Cronbach's alpha, using SPSS software (version 16). To determine the criterion validity, correlation of OSCE scores with scores of resident promotion test, direct observation of procedural skills, and theoretical knowledge was determined; for reliability, Cronbach's alpha was used. Total sample consisted of 25 participants taking part in 14 stations. P value of less than 0.05 was considered as significant. Results: The mean OSCE score was 22.66 (±6.85). Criterion validity of the stations with resident promotion theoretical test, first theoretical knowledge test, second theoretical knowledge, and direct observation of procedural skills (DOPS) was 0.97, 0.74, 0.49, and 0.79, respectively. In question stations, criterion validity was 0.15, and total validity of OSCE was 0.77. Conclusion: Findings of the present study indicated acceptable validity and reliability of OSCE for residents of obstetrics and gynecology.
Test-retest reliability and minimal detectable change of two simplified 3-point balance measures in patients with stroke.

Science.gov (United States)

Chen, Yi-Miau; Huang, Yi-Jing; Huang, Chien-Yu; Lin, Gong-Hong; Liaw, Lih-Jiun; Lee, Shih-Chieh; Hsieh, Ching-Lin

2017-10-01

The 3-point Berg Balance Scale (BBS-3P) and 3-point Postural Assessment Scale for Stroke Patients (PASS-3P) were simplified from the BBS and PASS to overcome the complex scoring systems. The BBS-3P and PASS-3P were more feasible in busy clinical practice and showed similarly sound validity and responsiveness to the original measures. However, the reliability of the BBS-3P and PASS-3P is unknown, limiting their utility and the interpretability of scores. We aimed to examine the test-retest reliability and minimal detectable change (MDC) of the BBS-3P and PASS-3P in patients with stroke. Cross-sectional study. The rehabilitation departments of a medical center and a community hospital. A total of 51 chronic stroke patients (64.7% male). Both balance measures were administered twice 7 days apart. The test-retest reliability of both the BBS-3P and PASS-3P was examined by intraclass correlation coefficients (ICC). The MDC and its percentage over the total score (MDC%) of each measure were calculated for examining the random measurement errors. The ICC values of the BBS-3P and PASS-3P were 0.99 and 0.97, respectively. The MDC% (MDC) of the BBS-3P and PASS-3P were 9.1% (5.1 points) and 8.4% (3.0 points), respectively, indicating that both measures had small and acceptable random measurement errors. Our results showed that both the BBS-3P and the PASS-3P had good test-retest reliability, with small and acceptable random measurement error. These two simplified 3-level balance measures can provide reliable results over time. Our findings support the repeated administration of the BBS-3P and PASS-3P to monitor the balance of patients with stroke. The MDC values can help clinicians and researchers interpret the change scores more precisely.
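
MDC% is described above as the MDC expressed as a percentage of the total score. As a small consistency check, the sketch below back-calculates the scale maximum implied by each reported MDC and MDC% pair. The function names are hypothetical, and the implied maxima are derived from rounded figures in the abstract rather than stated in it.

```python
def mdc_percent(mdc, max_score):
    """MDC expressed as a percentage of the scale's maximum total score."""
    return 100.0 * mdc / max_score


def implied_max_score(mdc, mdc_pct):
    """Scale maximum implied by a reported MDC and MDC% pair."""
    return mdc / (mdc_pct / 100.0)


# (MDC in points, MDC% in percent) as quoted in the abstract above.
reported = {"BBS-3P": (5.1, 9.1), "PASS-3P": (3.0, 8.4)}
for name, (mdc, pct) in reported.items():
    print(name, round(implied_max_score(mdc, pct), 1))
```

The two pairs imply maximum totals of about 56 and about 36 points, respectively, allowing for rounding in the reported values.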
Construct Validity and Reliability of the SARA Gait and Posture Sub-scale in Early Onset Ataxia

Directory of Open Access Journals (Sweden)

Tjitske F. Lawerman

2017-12-01

Aim: In children, gait and posture assessment provides a crucial marker for the early characterization, surveillance and treatment evaluation of early onset ataxia (EOA). For reliable data entry of studies targeting gait and posture improvement, uniform quantitative biomarkers are necessary. Until now, the pediatric test construct of gait and posture scores of the Scale for Assessment and Rating of Ataxia (SARA) sub-scale is still unclear. In the present study, we aimed to assess the construct validity and reliability of the pediatric SARA_GAIT/POSTURE sub-scale. Methods: We included 28 EOA patients [15.5 (6–34) years; median (range)]. For inter-observer reliability, we determined the ICC on EOA SARA_GAIT/POSTURE sub-scores by three independent pediatric neurologists. For convergent validity, we associated SARA_GAIT/POSTURE sub-scores with: (1) Ataxic gait Severity Measurement by Klockgether (ASMK; dynamic balance), (2) Pediatric Balance Scale (PBS; static balance), (3) Gross Motor Function Classification Scale, extended and revised version (GMFCS-E&R), (4) SARA-kinetic scores (SARA_KINETIC; kinetic function of the upper and lower limbs), (5) Archimedes Spiral (AS; kinetic function of the upper limbs), and (6) total SARA scores (SARA_TOTAL; i.e., summed SARA_GAIT/POSTURE, SARA_KINETIC, and SARA_SPEECH sub-scores). For discriminant validity, we investigated whether EOA co-morbidity factors (myopathy and myoclonus) could influence SARA_GAIT/POSTURE sub-scores. Results: The inter-observer agreement (ICC) on EOA SARA_GAIT/POSTURE sub-scores was high (0.97). SARA_GAIT/POSTURE was strongly correlated with the other ataxia and functional scales [ASMK (r_s = -0.819; p < 0.001); PBS (r_s = -0.943; p < 0.001); GMFCS-E&R (r_s = -0.862; p < 0.001); SARA_KINETIC (r_s = 0.726; p < 0.001); AS (r_s = 0.609; p = 0.002); and SARA_TOTAL (r_s = 0.935; p < 0.001)]. Comorbid myopathy influenced SARA_GAIT/POSTURE scores by concurrent muscle weakness, whereas comorbid myoclonus predominantly influenced
Reliability of maximal isometric knee strength testing with modified hand-held dynamometry in patients awaiting total knee arthroplasty: useful in research and individual patient settings? A reliability study

Directory of Open Access Journals (Sweden)

Koblbauer Ian FH

2011-10-01

Full Text Available Abstract Background Patients undergoing total knee arthroplasty (TKA often experience strength deficits both pre- and post-operatively. As these deficits may have a direct impact on functional recovery, strength assessment should be performed in this patient population. For these assessments, reliable measurements should be used. This study aimed to determine the inter- and intrarater reliability of hand-held dynamometry (HHD in measuring isometric knee strength in patients awaiting TKA. Methods To determine interrater reliability, 32 patients (81.3% female were assessed by two examiners. Patients were assessed consecutively by both examiners on the same individual test dates. To determine intrarater reliability, a subgroup (n = 13 was again assessed by the examiners within four weeks of the initial testing procedure. Maximal isometric knee flexor and extensor strength were tested using a modified Citec hand-held dynamometer. Both the affected and unaffected knee were tested. Reliability was assessed using the Intraclass Correlation Coefficient (ICC. In addition, the Standard Error of Measurement (SEM and the Smallest Detectable Difference (SDD were used to determine reliability. Results In both the affected and unaffected knee, the inter- and intrarater reliability were good for knee flexors (ICC range 0.76-0.94 and excellent for knee extensors (ICC range 0.92-0.97. However, measurement error was high, displaying SDD ranges between 21.7% and 36.2% for interrater reliability and between 19.0% and 57.5% for intrarater reliability. Overall, measurement error was higher for the knee flexors than for the knee extensors. Conclusions Modified HHD appears to be a reliable strength measure, producing good to excellent ICC values for both inter- and intrarater reliability in a group of TKA patients. High SEM and SDD values, however, indicate high measurement error for individual measures. This study demonstrates that a modified HHD is appropriate to
Validity and Reliability of the Arabic Version of the Positive and Negative Syndrome Scale.

Science.gov (United States)

Yehya, Arij; Ghuloum, Suhaila; Mahfoud, Ziyad; Opler, Mark; Khan, Anzalee; Hammoudeh, Samer; Abdulhakam, Abdulmoneim; Al-Mujalli, Azza; Hani, Yahya; Elsherbiny, Reem; Al-Amin, Hassen

The Positive and Negative Syndrome Scale (PANSS) is widely used for patients with schizophrenia. This scale is reliable and valid. The PANSS was translated and validated in several languages. The aim of this study was to translate and validate the PANSS in the Arab population. The PANSS was translated into formal Arabic language using the back-translation method. 101 Arab patients with schizophrenia and 98 Arabs with no diagnosis of any mental disorder were recruited. The Arabic version of the Mini International Neuropsychiatric Interview (MINI-6) was used as a diagnostic tool to confirm the diagnosis of schizophrenia or rule out any diagnosis for the healthy control group. Reliability of the scale was assessed by calculating internal consistency, interrater reliability and test-retest reliability. Construct validity was assessed using the Arabic version of the MINI-6. PANSS total scores were correlated with the Clinical Global Impression-Severity scale. Our findings showed that the internal consistency was good (0.92). Scores on the PANSS of the patients were much higher than those of the healthy controls. The PANSS showed good interrater reliability and test-retest reliability (0.92 and 0.75, respectively). In comparison with the MINI-6, the PANSS showed good sensitivity and specificity, which implies good construct validity of this version. In conclusion, the Arabic version of the PANSS is a reliable and valid instrument for the assessment of patients with schizophrenia in the Arab population. © 2016 S. Karger AG, Basel.
Reliability of ultrasound thickness measurement of the abdominal muscles during clinical isometric endurance tests.

Science.gov (United States)

ShahAli, Shabnam; Arab, Amir Massoud; Talebian, Saeed; Ebrahimi, Esmaeil; Bahmani, Andia; Karimi, Noureddin; Nabavi, Hoda

2015-07-01

The study was designed to evaluate the intra-examiner reliability of ultrasound (US) thickness measurement of abdominal muscles activity when supine lying and during two isometric endurance tests in subjects with and without Low back pain (LBP). A total of 19 women (9 with LBP, 10 without LBP) participated in the study. Within-day reliability of the US thickness measurements at supine lying and the two isometric endurance tests were assessed in all subjects. The intra-class correlation coefficient (ICC) was used to assess the relative reliability of thickness measurement. The standard error of measurement (SEM), minimal detectable change (MDC) and the coefficient of variation (CV) were used to evaluate the absolute reliability. Results indicated high ICC scores (0.73-0.99) and also small SEM and MDC scores for within-day reliability assessment. The Bland-Altman plots of agreement in US measurement of the abdominal muscles during the two isometric endurance tests demonstrated that 95% of the observations fall between the limits of agreement for test and retest measurements. Together the results indicate high intra-tester reliability for the US measurement of the thickness of abdominal muscles in all the positions tested. According to the study's findings, US imaging can be used as a reliable method for assessment of abdominal muscles activity in supine lying and the two isometric endurance tests employed, in participants with and without LBP. Copyright © 2014 Elsevier Ltd. All rights reserved.
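The Bland-Altman agreement analysis mentioned in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' analysis: the thickness values are hypothetical, and the limits are assumed to be the conventional bias ± 1.96 standard deviations of the paired differences.

```python
from statistics import mean, stdev


def bland_altman_limits(test, retest):
    """Return (bias, lower limit, upper limit) for paired test-retest values."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)                # mean test-retest difference
    spread = 1.96 * stdev(diffs)      # sample SD of the differences
    return bias, bias - spread, bias + spread


# Hypothetical muscle thickness values (mm), not data from the study
test = [4.1, 3.8, 5.0, 4.6, 4.3]
retest = [4.0, 3.9, 5.2, 4.5, 4.3]
bias, lower, upper = bland_altman_limits(test, retest)
print(f"bias = {bias:.3f} mm, limits of agreement = ({lower:.3f}, {upper:.3f}) mm")
```

If the differences are roughly normally distributed, about 95% of them are expected to fall between the two limits, which is the pattern the abstract reports.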
Assessing the suitability of written stroke materials: an evaluation of the interrater reliability of the suitability assessment of materials (SAM) checklist.

Science.gov (United States)

Hoffmann, Tammy; Ladner, Yvette

2012-01-01

Written materials are frequently used to provide education to stroke patients and their carers. However, poor quality materials are a barrier to effective information provision. A quick and reliable method of evaluating material quality is needed. This study evaluated the interrater reliability of the Suitability Assessment of Materials (SAM) checklist in a sample of written stroke education materials. Two independent raters evaluated the materials (n = 25) using the SAM, and ratings were analyzed to reveal total percentage agreements and weighted kappa values for individual items and overall SAM rating. The majority of the individual SAM items had high interrater reliability, with 17 of the 22 items achieving substantial, almost perfect, or perfect weighted kappa value scores. The overall SAM rating achieved a weighted kappa value of 0.60, with a percentage total agreement of 96%. Health care professionals should evaluate the content and design characteristics of written education materials before using them with patients. A tool such as the SAM checklist can be used; however, raters should exercise caution when interpreting results from items with more subjective scoring criteria. Refinements to the scoring criteria for these items are recommended. The value of the SAM is that it can be used to identify specific elements that should be modified before education materials are provided to patients.
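A weighted kappa of the kind reported here can be sketched as below. The abstract does not say which weighting scheme was used, so linear weights are an assumption, and the ratings and the three-level 0/1/2 item scale are hypothetical illustration data.

```python
def weighted_kappa(rater1, rater2, categories):
    """Cohen's kappa with linear disagreement weights for ordered categories."""
    k = len(categories)
    if k < 2 or len(rater1) != len(rater2) or not rater1:
        raise ValueError("need >= 2 categories and equally long, non-empty ratings")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)
    # Observed joint proportions
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        obs[index[a]][index[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)          # linear disagreement weight
            observed += w * obs[i][j]
            expected += w * row[i] * col[j]
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed / expected


# Hypothetical ratings of one item across eight documents
r1 = [0, 1, 2, 1, 2, 0, 1, 2]
r2 = [0, 1, 2, 2, 2, 0, 1, 1]
print(f"weighted kappa = {weighted_kappa(r1, r2, [0, 1, 2]):.2f}")
```

Because kappa corrects for chance agreement, it can be moderate even when raw percentage agreement is very high, which is consistent with the 0.60 kappa and 96% agreement reported for the overall SAM rating.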
Investigation of reliability, validity and normality Persian version of the California Critical Thinking Skills Test; Form B (CCTST

Directory of Open Access Journals (Sweden)

Khallli H

2003-04-01

Full Text Available Background: To evaluate the effectiveness of the present educational programs in terms of students' achieving problem solving, decision making and critical thinking skills, reliable, valid and standard instruments are needed. Purposes: To investigate the reliability, validity and norms of the CCTST Form B. The California Critical Thinking Skills Test contains 34 multiple-choice questions, each with one correct answer, in the five Critical Thinking (CT) cognitive skills domains. Methods: The translated CCTST Form B was given to 405 BSN nursing students of Nursing Faculties located in Tehran (Tehran, Iran and Shahid Beheshti Universities) who were selected through random sampling. In order to determine the face and content validity, the test was translated and edited by Persian and English language professors and researchers. It was also confirmed by the judgments of a panel of medical education experts and psychology professors. CCTST reliability was determined with internal consistency using KR-20. The construct validity of the test was investigated with factor analysis, internal consistency and group difference. Results: The reliability coefficient of the test was 0.62. Factor analysis indicated that the CCTST is formed from 5 factors (elements), namely: Analysis, Evaluation, Inference, Inductive and Deductive Reasoning. The internal consistency method showed that all subscales had high and positive correlations with the total test score. The group difference method between nursing and philosophy students (n=50) indicated that there is a meaningful difference between nursing and philosophy students' scores (t=-4.95, p=0.0001). The percentile norms also show that the 50th percentile corresponds to a raw score of 11, and the 95th and 5th percentiles correspond to raw scores of 17 and 6, respectively. Conclusions: The results revealed that the test is sufficiently reliable as a research tool, and all subscales measure a single construct (Critical Thinking) and are able to distinguish the
Evaluation of the Influence of the Logistic Operations Reliability on the Total Costs of a Supply Chain

Directory of Open Access Journals (Sweden)

Lukinskiy Valery

2016-12-01

Full Text Available In logistics, integrated processes linking material flows and related flows in supply chains are increasingly being developed. However, despite the growing volume of statistical data reflecting these integrated processes, the influence of logistic operation reliability indices on total logistics costs remains an open question that requires further research.
Validity and reliability of English and Marathi Oswestry Disability Index (version 2.1a) in Indian population.

Science.gov (United States)

Joshi, Veena D; Raiturker, Pradyumna P Pai; Kulkarni, Aditi A

2013-05-15

A total of 200 patients with low back pain (LBP) completed an English and Marathi Oswestry Disability Index (ODI) questionnaires (100 each), visual analogue scale, and Roland-Morris Disability Questionnaire. To validate the English and Marathi versions of ODI (version 2.1a). Patient-orientated assessment methods are important in the evaluation of treatment outcome. The ODI is one of the condition-specific questionnaires recommended for the use of patients with LBP. An adaptation of the ODI (version 2.1a) for Marathi language was carried out according to established guidelines. Average age of patients who answered the English ODI was 42 ± 15, whereas that of Marathi-speaking patients was 52 ± 15 years. About 40% were males. The Cronbach α reliability score was 0.877 for English and 0.943 for Marathi. Forty-seven and 53 of these patients were retested with English and Marathi ODI within 2 weeks (to assess test-retest reliability). The intraclass correlation coefficient (ICC) for the test-retest reliability of the questionnaire was 0.877 and 0.943 for English and Marathi respectively. The ODI scores correlated with visual analogue scale pain intensity (r = 0.67, P Disability Questionnaire score (r = 0.71, P Disability Questionnaire scores (r = 0.503, P Oswestry questionnaire is reliable and valid, and shows psychometric characteristics as good as the English version. It should represent a valuable tool for use in future patient-orientated outcome studies for population with LBP in India.

TEST-RETEST RELIABILITY OF THE CLOSED KINETIC CHAIN UPPER EXTREMITY STABILITY TEST (CKCUEST) IN ADOLESCENTS: RELIABILITY OF CKCUEST IN ADOLESCENTS.

Science.gov (United States)

de Oliveira, Valéria M A; Pitangui, Ana C R; Nascimento, Vinícius Y S; da Silva, Hítalo A; Dos Passos, Muana H P; de Araújo, Rodrigo C

2017-02-01

The Closed Kinetic Chain Upper Extremity Stability Test (CKCUEST) has been proposed as an option to assess upper limb function and stability; however, there are few studies that support the use of this test in adolescents. The purpose of the present study was to investigate the intersession reliability and agreement of three CKCUEST scores in adolescents and establish clinimetric values for this test. Test-retest reliability. Twenty-five healthy adolescents of both sexes were evaluated. The subjects performed two CKCUEST with an interval of one week between the tests. An intraclass correlation coefficient (ICC 3,3 ) two-way mixed model with a 95% interval of confidence was utilized to determine intersession reliability. A Bland-Altman graph was plotted to analyze the agreement between assessments. The presence of systematic error was evaluated by a one-sample t test. The difference between the evaluation and reevaluation was observed using a paired-sample t test. The level of significance was set at 0.05. Standard error of measurements and minimum detectable changes were calculated. The intersession reliability of the average touches score, normalized score, and power score were 0.68, 0.68 and 0.87, the standard error of measurement were 2.17, 1.35 and 6.49, and the minimal detectable change was 6.01, 3.74 and 17.98, respectively. The presence of systematic error (p test with moderate to excellent reliability when used with adolescents. The CKCUEST is a measurement with moderate to excellent reliability for adolescents. 2b.
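The relationship between the standard errors of measurement and the minimal detectable changes reported above can be checked with a short sketch. It assumes the conventional 95% formula, MDC95 = 1.96 × √2 × SEM, which the abstract does not state explicitly; the function name is illustrative.

```python
import math


def mdc95(sem):
    """Minimal detectable change at 95% confidence from a standard error of measurement."""
    return 1.96 * math.sqrt(2) * sem


# SEM values as reported in the abstract
reported_sem = {
    "average touches score": 2.17,
    "normalized score": 1.35,
    "power score": 6.49,
}
for label, sem in reported_sem.items():
    print(f"{label}: SEM = {sem}, MDC95 = {mdc95(sem):.2f}")
```

Under this assumption the computed values agree with the reported minimal detectable changes (6.01, 3.74 and 17.98) to within rounding of the published SEM values.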
The Truth about Scores Children Achieve on Tests.

Science.gov (United States)

Brown, Jonathan R.

1989-01-01

The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)
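One common way to calculate the SEm is from the standard deviation of the test scores and a reliability coefficient, SEm = SD × √(1 − r). The sketch below assumes that formula; the record does not give the procedure used in the article, and the scale values (SD of 15, reliability of 0.91, observed score of 104) are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEm = SD * sqrt(1 - reliability), for a reliability coefficient in [0, 1]."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sem, z=1.96):
    """Band around an observed score; z = 1.96 gives roughly 95% coverage."""
    return observed - z * sem, observed + z * sem


# Hypothetical standardized test
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
low, high = score_band(observed=104.0, sem=sem)
print(f"SEm = {sem:.2f}; the true score plausibly lies between {low:.1f} and {high:.1f}")
```

Reporting the band rather than the single observed score makes the measurement uncertainty visible when a child's result is interpreted.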
Reliable porcine coronary model of chronic total occlusion using copper wire stents and bioabsorbable levo-polylactic acid polymer.

Science.gov (United States)

Sim, Doo Sun; Jeong, Myung Ho; Cha, Kyoung Rae; Park, Suk Ho; Park, Jong Oh; Shin, Young Min; Shin, Heungsoo; Hong, Young Joon; Ahn, Youngkeun; Schwartz, Robert S; Kang, Jung Chaee

2012-12-01

Chronic total occlusion (CTO) remains a challenge in interventional cardiology. We investigated the feasibility and reliability of copper wire stents and levo-polylactic acid (l-PLA) as a means of CTO induction in a porcine model. In one group of 20 swine, copper stents were crimped on a 3.0mm angioplasty balloon and inserted into the mid-left anterior descending coronary artery (LAD). In the other group of 20 swine, l-PLA was wrapped on a guidewire and pushed into the distal LAD with a 3.0mm balloon catheter to induce embolization. Of 20 swine which underwent copper stent implantation, 13 died of stent thrombosis. In the remaining 7 swine, total or near total occlusion with collateral circulation was observed at 5 weeks. Of 20 swine which underwent l-PLA embolization, 4 died of ventricular fibrillation during or shortly after the procedure. Serial histopathologic studies showed complete absorption of the polymer with replacement by fibrotic tissue approximately 4 weeks following the polymer implantation. CTO could be reliably induced in porcine coronary arteries by copper stents and l-PLA. These models may support investigation of new percutaneous devices to facilitate CTO interventions. Copyright © 2012 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.
Inter-rater Reliability for Metrics Scored in a Binary Fashion-Performance Assessment for an Arthroscopic Bankart Repair.

Science.gov (United States)

Gallagher, Anthony G; Ryu, Richard K N; Pedowitz, Robert A; Henn, Patrick; Angelo, Richard L

2018-05-02

To determine the inter-rater reliability (IRR) of a procedure-specific checklist scored in a binary fashion for the evaluation of surgical skill and whether it meets a minimum level of agreement (≥0.8 between 2 raters) required for high-stakes assessment. In a prospective randomized and blinded fashion, and after detailed assessment training, 10 Arthroscopy Association of North America Master/Associate Master faculty arthroscopic surgeons (in 5 pairs) with an average of 21 years of surgical experience assessed the video-recorded 3-anchor arthroscopic Bankart repair performance of 44 postgraduate year 4 or 5 residents from 21 Accreditation Council for Graduate Medical Education orthopaedic residency training programs from across the United States. No paired scores of resident surgeon performance evaluated by the 5 teams of faculty assessors dropped below the 0.8 IRR level (mean = 0.93; range 0.84-0.99; standard deviation = 0.035). A comparison between the 5 assessor groups with 1 factor analysis of variance showed that there was no significant difference between the groups (P = .205). Pearson's product-moment correlation coefficient revealed a strong and statistically significant negative correlation, that is, -0.856 (P fashion meet the need and can show a high (>80%) IRR. Copyright © 2018 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
[Reliability and validity of the Turkish version of the internalized stigma of mental illness scale].

Science.gov (United States)

Ersoy, Mehmet Akif; Varan, Azmi

2007-01-01

The aim of this study was to evaluate the reliability and validity of the Turkish version of the Internalized Stigma of Mental Illness Scale (ISMI) in patients with psychiatric disorders. The study included 203 patients diagnosed with various psychiatric disorders in a psychiatry outpatient clinic of a university hospital. The reliability of the scale was assessed by investigation of its internal consistency and split-half reliability. The convergent validity of the scale was demonstrated by the relationship between the Turkish form of the ISMI and various criteria scales. Cronbach's alpha value was 0.93 for the entire scale and ranged between 0.63 and 0.87 for the 5 subscales of the ISMI. In terms of convergent validity, the total score of the Turkish ISMI significantly correlated with the Beck Depression Inventory, Rosenberg Self-Esteem Scale, Sociotropy-Autonomy Scale, Brief Symptom Inventory, Multidimensional Scale of Perceived Social Support, Clinical Global Impression Scale, and Global Assessment of Functioning Scale scores. All values were in the expected direction. In the light of the findings, it was concluded that the Turkish version of ISMI could be used as a reliable and valid tool in assessing internalized stigma of the Turkish psychiatric patients.
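Cronbach's alpha, the internal consistency index reported here, can be computed from item-level responses as sketched below. The response matrix is hypothetical (five respondents, four items) and is not ISMI data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1.0 - item_variance_sum / total_variance)


# Hypothetical responses on a 1-4 scale
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Alpha rises when items covary strongly relative to their individual variances, and it also tends to rise with the number of items, so a full scale usually shows a higher alpha than its shorter subscales.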
Reliability of a novel, semi-quantitative scale for classification of structural brain magnetic resonance imaging in children with cerebral palsy.

Science.gov (United States)

Fiori, Simona; Cioni, Giovanni; Klingels, Katrjin; Ortibus, Els; Van Gestel, Leen; Rose, Stephen; Boyd, Roslyn N; Feys, Hilde; Guzzetta, Andrea

2014-09-01

To describe the development of a novel rating scale for classification of brain structural magnetic resonance imaging (MRI) in children with cerebral palsy (CP) and to assess its interrater and intrarater reliability. The scale consists of three sections. Section 1 contains descriptive information about the patient and MRI. Section 2 contains the graphical template of brain hemispheres onto which the lesion is transposed. Section 3 contains the scoring system for the quantitative analysis of the lesion characteristics, grouped into different global scores and subscores that assess separately side, regions, and depth. A larger interrater and intrarater reliability study was performed in 34 children with CP (22 males, 12 females; mean age at scan of 9 y 5 mo [SD 3 y 3 mo], range 4 y-16 y 11 mo; Gross Motor Function Classification System level I, [n=22], II [n=10], and level III [n=2]). Very high interrater and intrarater reliability of the total score was found with indices above 0.87. Reliability coefficients of the lobar and hemispheric subscores ranged between 0.53 and 0.95. Global scores for hemispheres, basal ganglia, brain stem, and corpus callosum showed reliability coefficients above 0.65. This study presents the first visual, semi-quantitative scale for classification of brain structural MRI in children with CP. The high degree of reliability of the scale supports its potential application for investigating the relationship between brain structure and function and examining treatment response according to brain lesion severity in children with CP. © 2014 Mac Keith Press.
Reliability of the Phi angle to assess rotational alignment of the talar component in total ankle replacement.

Science.gov (United States)

Manzi, Luigi; Villafañe, Jorge Hugo; Indino, Cristian; Tamini, Jacopo; Berjano, Pedro; Usuelli, Federico Giuseppe

2017-11-08

The purpose of this study was to investigate the test-retest reliability of the Phi angle in patients undergoing total ankle replacement (TAR) for end stage ankle osteoarthritis (OA) to assess the rotational alignment of the talar component. Retrospective observational cross-sectional study of prospectively collected data. Post-operative anteroposterior radiographs of the foot of 170 patients who underwent TAR for the ankle OA were evaluated. Three physicians measured Phi on the 170 randomly sorted and anonymized radiographs on two occasions, one week apart (test and retest conditions), inter and intra-observer agreement were evaluated. Test-retest reliability of Phi angle measurement was excellent for patients with Hintegra TAR (ICC=0.995; pPhi angle measurement between patients with Hintegra vs. Zimmer implants (p>0.05). Measurement of Phi angle on weight-bearing dorsoplantar radiograph showed an excellent reliability among orthopaedic surgeons in determining the position of the talar component in the axial plane. Level II, cross sectional study. Copyright © 2017 European Foot and Ankle Society. Published by Elsevier Ltd. All rights reserved.
The development, validity, and reliability of the Addiction Profile Index (API).

Science.gov (United States)

Ögel, Kültegin; Evren, Cüneyt; Karadağ, Figen; Gürol, Defne Tamar

2012-01-01

The objective of this study was to develop a practical questionnaire for multidimensional assessment of problems associated with alcohol and substance abuse that would also be useful for treatment planning. The Addiction Profile Index (API) is a self-report questionnaire consisting of 37 items and the following 5 subscales: characteristics of substance use; dependency diagnosis; the effects of substance use on the user; craving; motivation to quit using substances. The study included 345 alcohol and/or substance abusers from 2 addiction treatment clinics and a prison addiction service. The validity of the questionnaire was assessed using the Michigan Alcoholism Screening Test (MAST), Readiness to Change Questionnaire (SOCRATES), Penn Alcohol Craving Scale (PACS), Drug Craving Scale (DCS), Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I), and Addiction Severity Index (ASI). The Cronbach's alpha coefficient for the total API was 0.89 and for the subscales it ranged from 0.63 to 0.86. Item-total correlation coefficients ranged from 0.42 to 0.89. The Spearman-Brown split-half coefficient for the total API was 0.83. In all, 4 factors were obtained using explanatory factor analysis that represented 52.3% of the total variance. The API craving subscale was observed to be consistent with PACS and the API motivation subscale was consistent with SOCRATES. The API total score was strongly correlated with the mean MAST score, and the composite ASI medical status, substance use, legal status, and family social relations subscale scores. Based on ROC analyses, the area under the curve was 0.90. With a total API cut-off score of 4, the scale's sensitivity and specificity were 0.85 and 0.78, respectively. The findings show that the API is a valid and reliable questionnaire that can be used to measure the severity of different dimensions of substance dependency.
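The Spearman-Brown split-half coefficient reported for the API can be illustrated with the sketch below. It assumes an odd-even item split, which the abstract does not specify, and the response matrix is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a half-score has no variance")
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step the half-test correlation up to full-test length."""
    if r_half <= -1.0:
        raise ValueError("correction is undefined for r_half <= -1")
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(rows):
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    odd = [sum(r[0::2]) for r in rows]    # items 1, 3, 5, ...
    even = [sum(r[1::2]) for r in rows]   # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))


# Hypothetical responses (five respondents, four items)
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
]
print(f"split-half reliability = {split_half_reliability(responses):.2f}")
```

The correction is needed because the raw correlation between two halves describes a test of half the length, which understates the reliability of the full questionnaire.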
Assessing the reliability of the borderline regression method as a standard setting procedure for objective structured clinical examination

Directory of Open Access Journals (Sweden)

Sara Mortaz Hejri

2013-01-01

Full Text Available Background: One of the methods used for standard setting is the borderline regression method (BRM). This study aims to assess the reliability of BRM when the pass-fail standard in an objective structured clinical examination (OSCE) was calculated by averaging the BRM standards obtained for each station separately. Materials and Methods: In nine stations of the OSCE with direct observation, the examiners gave each student a checklist score and a global score. Using a linear regression model for each station, we calculated the checklist score cut-off on the regression equation for the global scale cut-off set at 2. The OSCE pass-fail standard was defined as the average of all stations' standards. To determine the reliability, the root mean square error (RMSE) was calculated. The R2 coefficient and the inter-grade discrimination were calculated to assess the quality of the OSCE. Results: The mean total test score was 60.78. The OSCE pass-fail standard and its RMSE were 47.37 and 0.55, respectively. The R2 coefficients ranged from 0.44 to 0.79. The inter-grade discrimination score varied greatly among stations. Conclusion: The RMSE of the standard was very small, indicating that BRM is a reliable method of setting the standard for an OSCE, which has the advantage of providing data for quality assurance.
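The per-station calculation described in the methods can be sketched as follows: regress checklist scores on global scores by ordinary least squares, read off the predicted checklist score at the borderline global rating of 2, and average the station cut-offs. The station data and function names are hypothetical, and the sketch does not reproduce the study's RMSE calculation.

```python
def station_cutoff(global_scores, checklist_scores, borderline=2.0):
    """Checklist score predicted by least-squares regression at the borderline global rating."""
    n = len(global_scores)
    if n != len(checklist_scores) or n < 2:
        raise ValueError("need two equally long series with at least 2 students")
    mx = sum(global_scores) / n
    my = sum(checklist_scores) / n
    sxx = sum((x - mx) ** 2 for x in global_scores)
    if sxx == 0:
        raise ValueError("global scores must vary to fit a regression line")
    slope = sum((x - mx) * (y - my) for x, y in zip(global_scores, checklist_scores)) / sxx
    intercept = my - slope * mx
    return intercept + slope * borderline


def exam_standard(stations):
    """Pass-fail standard as the mean of the per-station cut-offs."""
    cutoffs = [station_cutoff(g, c) for g, c in stations]
    return sum(cutoffs) / len(cutoffs)


# Hypothetical (global ratings, checklist scores) for two stations
stations = [
    ([1, 2, 3, 4, 5], [35, 48, 62, 71, 88]),
    ([1, 2, 2, 3, 4], [30, 44, 50, 63, 80]),
]
print(f"exam pass-fail standard = {exam_standard(stations):.1f}")
```

Because every examinee's scores contribute to the regression line, the cut-off does not depend only on the few students actually rated as borderline.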
Differences of wells scores accuracy, caprini scores and padua scores in deep vein thrombosis diagnosis

Science.gov (United States)

Gatot, D.; Mardia, A. I.

2018-03-01

Deep vein thrombosis (DVT) is a venous thrombus in the lower limbs. It is diagnosed by venography or compression ultrasound. However, these examinations are not yet available in some health facilities, so many scoring systems have been developed for the diagnosis of DVT. Scoring methods are practical and safe to use, as well as efficacious and effective in terms of treatment and costs. The existing scoring systems are the Wells, Caprini and Padua scores. Many studies have compared the accuracy of these scores, but none in Medan. Therefore, we conducted comparative research on the Wells, Caprini and Padua scores in Medan. An observational, analytical, case-control study was conducted to perform diagnostic tests on the Wells, Caprini and Padua scores to predict the risk of DVT. The study was conducted at H. Adam Malik Hospital in Medan. Of a total of 72 subjects, 39 (54.2%) were men, and the mean age was 53.14 years. The Wells, Caprini and Padua scores had sensitivities of 80.6%, 61.1% and 50%, respectively; specificities of 80.65, 66.7% and 75%, respectively; and accuracies of 87.5%, 64.3% and 65.7%, respectively. The Wells score has better sensitivity, specificity and accuracy than the Caprini and Padua scores in diagnosing DVT.
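Sensitivity, specificity and accuracy of a diagnostic score are all derived from the same 2×2 table of score results against the reference diagnosis. The sketch below uses hypothetical counts, since the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from a 2x2 diagnostic table."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased subject")
    sensitivity = tp / (tp + fn)                 # diseased subjects flagged by the score
    specificity = tn / (tn + fp)                 # non-diseased subjects cleared by the score
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct classifications
    return sensitivity, specificity, accuracy


# Hypothetical counts, not taken from the study
sens, spec, acc = diagnostic_metrics(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}, accuracy = {acc:.1%}")
```

Accuracy is a weighted average of sensitivity and specificity, so for any single table it must lie between the two.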
Validity and Reliability of Baseline Testing in a Standardized Environment.

Science.gov (United States)

Higgins, Kathryn L; Caze, Todd; Maerlender, Arthur

2017-08-11

The Immediate Postconcussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery commonly used to determine cognitive recovery from concussion based on comparing post-injury scores to baseline scores. This model is based on the premise that ImPACT baseline test scores are a valid and reliable measure of optimal cognitive function at baseline. Growing evidence suggests that this premise may not be accurate and a large contributor to invalid and unreliable baseline test scores may be the protocol and environment in which baseline tests are administered. This study examined the effects of a standardized environment and administration protocol on the reliability and performance validity of athletes' baseline test scores on ImPACT by comparing scores obtained in two different group-testing settings. Three hundred-sixty one Division 1 cohort-matched collegiate athletes' baseline data were assessed using a variety of indicators of potential performance invalidity; internal reliability was also examined. Thirty-one to thirty-nine percent of the baseline cases had at least one indicator of low performance validity, but there were no significant differences in validity indicators based on environment in which the testing was conducted. Internal consistency reliability scores were in the acceptable to good range, with no significant differences between administration conditions. These results suggest that athletes may be reliably performing at levels lower than their best effort would produce. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Clinical Outcome Scoring of Intra-articular Calcaneal Fractures

NARCIS (Netherlands)

T. Schepers (Tim); M.J. Heetveld (Martin); P.G.H. Mulder (Paul); P. Patka (Peter)

2008-01-01

Outcome reporting of intra-articular calcaneal fractures is inconsistent. This study aimed to identify the most cited outcome scores in the literature and to analyze their reliability and validity. A systematic literature search identified 34 different outcome scores. The most cited
Hip Inflammation MRI Scoring System (HIMRISS) to predict response to hyaluronic acid injection in hip osteoarthritis

DEFF Research Database (Denmark)

Deseyne, Nicolas; Conrozier, Thierry; Lellouche, Henri

2018-01-01

OBJECTIVE: To assess predictors of response, according to hip MRI inflammatory scoring system (HIMRISS), in a sample of patients with hip osteoarthritis (OA) treated by hyaluronic acid (HA) injection. METHOD: Sixty patients with hip OA were included. Clinical outcomes were assessed at baseline...... SP=0.97, sensitivity SN=0.39, and positive and negative predictive values of 0.91 and 0.64, respectively. CONCLUSION: HIMRISS is reliable for total scores and sub-domains. It permits identification of responders to HA injection in hip OA patients........64, 0.83 and 0.78. Associations between MRI features and clinical data were assessed. Logistic regression (univariate and multivariate) was used to explore associations between MRI features and response to HA injection, according to WOMAC50 response at three months. RESULTS: In total, 45.5% of patients...
[Evaluation on the validity and reliability of the Diabetes Self-management Knowledge, Attitude, and Behavior Assessment Scale (DSKAB)].

Science.gov (United States)

Liu, Xiaoli; Dai, Long; Chen, Bo; Feng, Nongping; Wu, Qianhui; Lin, Yonghai; Zhang, Lan; Tan, Dong; Zhang, Jinhua; Tu, Huijuan; Li, Changfeng; Wang, Wenjuan

2016-01-01

To evaluate the validity and reliability of Diabetes Self-management Knowledge, Attitude, and Behavior Assessment Scale (DSKAB). We selected 460 patients with diabetes in the community, used the scale which was after two rounds of the Delphi method and pilot study. Investigators surveyed the patients by the way of face to face. by draw lots, we selected 25 community diabetes randomly for repeating investigations after one week. The validity analyses included face validity, content validity, construct validity and discriminant validity. The reliability analyses included Cronbach's α coefficient, θ coefficient, Ω coefficient, split-half reliability and test-retest reliability. This study distributed a total of 460 questionnaires, reclaimed 442, qualified 432. The score of the scale was 254.59 ± 28.90, the scores of the knowledge, attitude, behavior sub-scales were 82.44 ± 11.24, 63.53 ± 5.77 and 108.61 ± 17.55, respectively. It had excellent face validity and content validity. The correlation coefficient was from 0.71 to 0.91 among three sub-scales and the scale, Pvalidity. The scores of high group and low group in three sub-scales were: knowledge (91.12 ± 3.62) and (69.96 ± 11.20), attitude (68.75 ± 4.51) and (58.79 ± 4.87), behavior (129.38 ± 8.53) and (89.65 ± 11.34),mean scores of three sub-scales were apparently different, which compared between high score group and low score group, the t value were - 19.45, -16.24 and -30.29, respectively, Pvalidity. The Cronbach's α coefficient of the scale and three sub-scales was from 0.79 to 0.93, the θ coefficient was from 0.86 to 0.95, the Ω coefficient was from 0.90 to 0.98, split-half reliability was from 0.89 to 0.95.Test-retest reliability of the scale was 0.51;the three sub-scales was from 0.46 to 0.52, Pvalidity and reliability of the Diabetes Self-management Knowledge, Attitude, and Behavior Assessment Scale are excellent, which is a suitable instrument to evaluate the self-management for patients
Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire

Directory of Open Access Journals (Sweden)

Hazel Ekin Akmaz

2018-05-01

Background: Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. Aims: To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Study Design: Methodological and cross-sectional study. Methods: A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-half reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct validity. Results: The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. Conclusion: The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance
Grant Peer Review: Improving Inter-Rater Reliability with Training.

Science.gov (United States)

Sattler, David N; McKnight, Patrick E; Naney, Linda; Mathis, Randy

2015-01-01

This study developed and evaluated a brief training program for grant reviewers that aimed to increase inter-rater reliability, rating scale knowledge, and effort to read the grant review criteria. Enhancing reviewer training may improve the reliability and accuracy of research grant proposal scoring and funding recommendations. Seventy-five Public Health professors from U.S. research universities watched the training video we produced and assigned scores to the National Institutes of Health scoring criteria proposal summary descriptions. For both novice and experienced reviewers, the training video increased scoring accuracy (the percentage of scores that reflect the true rating scale values), inter-rater reliability, and the amount of time reading the review criteria compared to the no video condition. The increase in reliability for experienced reviewers is notable because it is commonly assumed that reviewers--especially those with experience--have good understanding of the grant review rating scale. The findings suggest that both experienced and novice reviewers who had not received the type of training developed in this study may not have appropriate understanding of the definitions and meaning for each value of the rating scale and that experienced reviewers may overestimate their knowledge of the rating scale. The results underscore the benefits of and need for specialized peer reviewer training.
Translation, cross-cultural adaptation, and psychometric properties of the German version of the hip disability and osteoarthritis outcome score.

Science.gov (United States)

Blasimann, Angela; Dauphinee, Sharon Wood; Staal, J Bart

2014-12-01

Clinical measurement. To translate and cross-culturally adapt the Hip disability and Osteoarthritis Outcome Score (HOOS) from English into German, and to study its psychometric properties in patients after hip surgery. There is no specific hip questionnaire in German that not only measures symptoms and function but also contains items about hip-related quality of life. The translation and cross-cultural adaptation involved forward translation, harmonization, cognitive debriefing, back translation, and comparison to the original HOOS following international guidelines. The German version was tested in 51 Swiss inpatients 8 weeks after different types of hip surgery, mainly total hip replacement. The mean age of the participants was 62.5 years, and the age range was from 27 to 87 years. Thirty (58.8%) of the participants were women. Internal consistency and test-retest reliability were estimated using Cronbach alpha and intraclass correlation coefficients for agreement. For construct validity, total scores of the German HOOS were correlated with those of the Western Ontario and McMaster Universities Osteoarthritis Index. The HOOS was also compared to the Medical Outcomes Study 36-Item Short-Form Health Survey. Cronbach alpha values for all German HOOS subscales were between .87 and .93. For test-retest reliability, the intraclass correlation coefficient for agreement was 0.85 for the total scores of the German HOOS. The Spearman rho for the Medical Outcomes Study 36-Item Short-Form Health Survey physical functioning subscale compared to the sum of all HOOS subscales was 0.71, and that for the Medical Outcomes Study 36-Item Short-Form Health Survey physical component summary was 0.97. The German HOOS has demonstrated adequate reliability and validity. Use of the German HOOS is recommended for assessment of patients after hip surgery, with the proviso that additional psychometric testing should be done in future research.
The validation of the visual analogue scale for patient satisfaction after total hip arthroplasty.

Science.gov (United States)

Brokelman, Roy B G; Haverkamp, Daniel; van Loon, Corné; Hol, Annemiek; van Kampen, Albert; Veth, Rene

2012-06-01

INTRODUCTION: Patient satisfaction is becoming more important in our modern health care system. The assessment of satisfaction is difficult because it is a multifactorial item for which no gold standard exists. One of the potential methods of measuring satisfaction is by using the well-known visual analogue scale (VAS). In this study, we validated VAS for satisfaction. PATIENTS AND METHODS: In this prospective study, we studied 147 patients (153 hips). The construct validity was measured using the Spearman correlation test that compares the satisfaction VAS with the Harris hip score, pain VAS at rest and during activity, Oxford hip score, Short Form 36 and Western Ontario McMaster Universities Osteoarthritis Index. The reliability was tested using the intra-class coefficient. RESULTS: The Pearson correlation test showed correlations in the range of 0.40-0.80. The satisfaction VAS had a high correlation with the pain VAS and Oxford hip score, which could mean that pain is one of the most important factors in patient satisfaction. The intra-class coefficient was 0.95. CONCLUSIONS: There is a moderate to marked degree of correlation between the satisfaction VAS and the currently available subjective and objective scoring systems. The intra-class coefficient of 0.95 indicates an excellent test-retest reliability. The VAS satisfaction is a simple instrument to quantify the satisfaction of a patient after total hip arthroplasty. In this study, we showed that the satisfaction VAS has a good validity and reliability.
Pattern analysis of total item score and item response of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative sample of US adults

Directory of Open Access Journals (Sweden)

Shinichiro Tomitaka

2017-02-01

Background Several recent studies have shown that total scores on depressive symptom measures in a general population approximate an exponential pattern except for the lower end of the distribution. Furthermore, we confirmed that the exponential pattern is present for the individual item responses on the Center for Epidemiologic Studies Depression Scale (CES-D). To confirm the reproducibility of such findings, we investigated the total score distribution and item responses of the Kessler Screening Scale for Psychological Distress (K6) in a nationally representative study. Methods Data were drawn from the National Survey of Midlife Development in the United States (MIDUS), which comprises four subsamples: (1) a national random digit dialing (RDD) sample, (2) oversamples from five metropolitan areas, (3) siblings of individuals from the RDD sample, and (4) a national RDD sample of twin pairs. K6 items are scored using a 5-point scale: “none of the time,” “a little of the time,” “some of the time,” “most of the time,” and “all of the time.” The pattern of total score distribution and item responses were analyzed using graphical analysis and an exponential regression model. Results The total score distributions of the four subsamples exhibited an exponential pattern with similar rate parameters. The item responses of the K6 approximated a linear pattern from “a little of the time” to “all of the time” on log-normal scales, while “none of the time” response was not related to this exponential pattern. Discussion The total score distribution and item responses of the K6 showed exponential patterns, consistent with other depressive symptom scales.
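The abstract above does not say how its exponential regression was fitted. One common way to check whether a score distribution looks exponential is to regress the logarithm of the frequency on the score and read the rate parameter from the slope. The sketch below assumes that log-linear approach; the function name and the synthetic frequencies are hypothetical and do not reproduce the study's analysis.

```python
import math


def fit_exponential(scores, counts):
    """Least-squares fit of counts ~ scale * exp(-rate * score).

    Fits a straight line to log(count) versus score and returns (rate, scale).
    Scores with a zero count are skipped because log(0) is undefined.
    """
    pairs = [(s, math.log(c)) for s, c in zip(scores, counts) if c > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic frequencies that follow an exact exponential decay (illustration only).
scores = list(range(1, 10))
counts = [1000 * math.exp(-0.3 * s) for s in scores]
rate, scale = fit_exponential(scores, counts)
```

On a real distribution, a roughly straight line on the log scale over most of the score range (excluding the lowest scores, as the abstract notes) is what an "exponential pattern" would look like.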
Using the Nursing Culture Assessment Tool (NCAT) in Long-Term Care: An Update on Psychometrics and Scoring Standardization

Directory of Open Access Journals (Sweden)

Susan Kennerly

2015-07-01

An effective workforce performing within the context of a positive cultural environment is central to a healthcare organization’s ability to achieve quality outcomes. The Nursing Culture Assessment Tool (NCAT) provides nurses with a valid and reliable tool that captures the general aspects of nursing culture. This study extends earlier work confirming the tool’s construct validity and dimensionality by standardizing the scoring approach and establishing norm-referenced scoring. Scoring standardization provides a reliable point of comparison for NCAT users. NCAT assessments support nursing’s ability to evaluate nursing culture, use results to shape the culture into one that supports change, and advance nursing’s best practices and care outcomes. Registered nurses, licensed practical nurses, and certified nursing assistants from 54 long-term care facilities in Kentucky, Nevada, North Carolina, and Oregon were surveyed. Confirmatory factor analysis yielded six first order factors forming the NCAT’s subscales (Expectations, Behaviors, Teamwork, Communication, Satisfaction, Commitment) (Comparative Fit Index 0.93) and a second order factor, the Total Culture Score. Aggregated facility level comparisons of observed group variance with expected random variance using rwg(J) statistics are presented. Normative scores and cumulative rank percentages and how the NCAT can be used in implementing planned change are provided.

The birth satisfaction scale: Turkish adaptation, validation and reliability study

Science.gov (United States)

Cetin, Fatma Cosar; Sezer, Ayse; Merih, Yeliz Dogan

2015-01-01

OBJECTIVE: The objective of this study is to investigate the validity and the reliability of the Birth Satisfaction Scale (BSS) and to adapt it into the Turkish language. This scale is used for measuring maternal satisfaction with birth in order to evaluate women’s birth perceptions. METHODS: The study included 150 women who attended an inpatient postpartum clinic. The participants filled in an information form and the BSS questionnaire forms. The properties of the scale were tested by conducting reliability and validation analyses. RESULTS: BSS entails 30 Likert-type questions. It was developed by Hollins Martin and Fleming. Total scale scores ranged between 30–150 points. Higher scores on the scale indicate greater birth satisfaction. Three overarching themes were identified in the scale: service provision (home assessment, birth environment, support, relationships with health care professionals); personal attributes (ability to cope during labour, feeling in control, childbirth preparation, relationship with baby); and stress experienced during labour (distress, obstetric injuries, receiving sufficient medical care, obstetric intervention, pain, prolonged labour and baby’s health). Cronbach’s alpha coefficient was 0.62. CONCLUSION: According to the present study, BSS entails 30 Likert-type questions and evaluates women’s birth perceptions. The Turkish version of BSS has been proven to be a valid and a reliable scale. PMID:28058355
Major influence of interobserver reliability on polytrauma identification with the Injury Severity Score (ISS): Time for a centralised coding in trauma registries?

Science.gov (United States)

Maduz, Roman; Kugelmeier, Patrick; Meili, Severin; Döring, Robert; Meier, Christoph; Wahl, Peter

2017-04-01

The Abbreviated Injury Scale (AIS) and the Injury Severity Score (ISS) find increasingly widespread use to assess trauma burden and to perform interhospital benchmarking through trauma registries. Since 2015, public resource allocation in Switzerland is even to be derived from such data. As every trauma centre is responsible for its own coding and data input, this study aims at evaluating interobserver reliability of AIS and ISS coding. Interobserver reliability of the AIS and ISS is analysed from a cohort of 50 consecutive severely injured patients treated in 2012 at our institution, coded retrospectively by 3 independent and specifically trained observers. Considering a cutoff of ISS≥16, only 38/50 patients (76%) were uniformly identified as polytraumatised or not. Increasing the cutoff to ≥20, this increased to 41/50 patients (82%). A difference in the AIS of ≥ 1 was present in 261 (16%) of possible codes. Excluding the vast majority of uninjured body regions, uniformly identical AIS severity values were attributed in 67/193 (35%) body regions, or 318/579 (55%) possible observer pairings. Injury severity all too often is neither identified correctly nor consistently when using the AIS. This leads to wrong identification of severely injured patients using the ISS. Improving consistency of coding through centralisation is recommended before scores based on the AIS are to be used for interhospital benchmarking and resource allocation in the treatment of severely injured patients. Copyright © 2017. Published by Elsevier Ltd.
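To make the cutoff comparison concrete, the sketch below computes an ISS from per-region AIS values and checks whether several observers place a patient on the same side of a cutoff. The abstract does not spell out the ISS formula; the code uses the standard definition (sum of squares of the three highest region scores, with any AIS of 6 giving 75). The function names and example codings are hypothetical.

```python
def injury_severity_score(region_ais):
    """ISS from a mapping of ISS body region -> highest AIS severity (0-6).

    Standard definition: square the three highest region scores and add them;
    an AIS of 6 in any region sets the ISS to 75.
    """
    values = list(region_ais.values())
    if 6 in values:
        return 75
    top_three = sorted(values, reverse=True)[:3]
    return sum(a * a for a in top_three)


def uniformly_classified(iss_by_observer, cutoff=16):
    """True if all observers agree on whether the patient meets the cutoff."""
    flags = [score >= cutoff for score in iss_by_observer]
    return all(flags) or not any(flags)


# Hypothetical patient coded by three observers.
codings = [
    {"head": 3, "chest": 2, "extremities": 2},   # ISS 17
    {"head": 3, "chest": 3, "extremities": 2},   # ISS 22
    {"head": 2, "chest": 2, "extremities": 2},   # ISS 12
]
iss_values = [injury_severity_score(c) for c in codings]
agreed = uniformly_classified(iss_values, cutoff=16)
```

Because each AIS value is squared, a one-point coding difference in a single region can move a patient across the cutoff, which is consistent with the disagreement rates the abstract reports.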
Cross-cultural adaptation and validation of the reliability of the Thai version of the Hip disability and Osteoarthritis Outcome Score (HOOS).

Science.gov (United States)

Trathitiphan, Warayos; Paholpak, Permsak; Sirichativapee, Winai; Wisanuyotin, Taweechok; Laupattarakasem, Pat; Sukhonthamarn, Kamolsak; Jeeravipoolvarn, Polasak; Kosuwon, Weerachai

2016-10-01

HOOS was developed as an extension of the Western Ontario and McMaster Universities' Osteoarthritis Index questionnaire for measuring symptoms and functional limitations related to the hip(s) of patients with osteoarthritis. To determine the validity and reliability of the Thai version of the Hip disability and Osteoarthritis Outcome Score (HOOS) vis-à-vis hip osteoarthritis, the original HOOS was translated into a Thai version of HOOS, according to international recommendations. Patients with hip osteoarthritis (n = 57; 25 males) were asked to complete the Thai version of HOOS twice: once, then again after a 3-week interval. The test-retest reliability was analyzed using the intraclass correlation coefficient (ICC). Internal consistencies were analyzed using Cronbach's alpha, while the construct validity was tested by comparing the Thai HOOS with the Thai modified SF-36 and calculating the Spearman's rank correlation coefficients. The Thai HOOS produced good reliability (i.e., the ICC was greater than 0.9 in all five subscales). All Cronbach's alpha values showed that the Thai HOOS had high internal consistency (Cronbach's alpha greater than 0.8), especially for the pain and ADL subscales (0.89 and 0.90, respectively). The Spearman's rank correlation for all five subscales of the Thai HOOS had moderate correlation with the Bodily Pain subscale of the Thai SF-36. The pain subscale of the Thai HOOS had a high correlation with the Vitality and Social Function subscales of the Thai SF-36 (r = 0.55 and 0.54), with which the symptom subscale had a moderate correlation. The Thai version of HOOS had excellent internal consistency, excellent test-retest reliability, and good construct validity. It can be used as a reliable tool for assessing quality of life for patients with hip osteoarthritis in Thailand.
Total Longitudinal Moment Calculation and Reliability Analysis of Yacht Structures

Science.gov (United States)

Zhi, Wenzheng; Lin, Shaofen

To check the reliability of a yacht built from FRP (fiber-reinforced plastic) materials, this paper analyzes the vertical force and the calculation method for the overall longitudinal bending moment on the yacht, focusing in particular on the effect of speed on the still-water bending moment. Then, considering the mechanical properties of the cap-type stiffeners in composite materials, the ultimate bearing capacity of the yacht is worked out. Finally, the reliability of the yacht is calculated using response surface methodology. The results can be used in yacht design and yacht operation.
Reliability of the Brazilian Portuguese version of the Gross Motor Function Measure in children with cerebral palsy

Science.gov (United States)

Almeida, Kênnea M.; Albuquerque, Karolina A.; Ferreira, Marina L.; Aguiar, Stéphany K. B.; Mancini, Marisa C.

2016-01-01

OBJECTIVE: To test the intra- and interrater reliability of the Brazilian Portuguese version of the 66-item Gross Motor Function Measure (GMFM-66). METHOD: The sample included 48 children with cerebral palsy (CP), ranging from 2-17 years old, classified at levels I to IV of the Gross Motor Function Classification System (GMFCS) and four child rehabilitation examiners. A main examiner evaluated all children using the GMFM-66 and video-recorded the assessments. The other examiners watched the video recordings and scored them independently for the assessment of interrater reliability. For the intrarater reliability evaluation, the main examiner watched the video recordings one month after the evaluation and re-scored each child. We calculated reliability by using intraclass correlation coefficients (ICC) with their respective 95% confidence intervals. RESULTS: Excellent test reliability was documented. The intrarater reliability of the total sample was ICC=0.99 (95% CI 0.98-0.99), and the interrater reliability was ICC=0.97 (95% CI 0.95-0.98). The reliability across GMFCS levels ranged from ICC=0.92 (95% CI 0.72-0.98) to ICC=0.99 (95% CI 0.99-0.99); the lowest value was the interrater reliability for the GMFCS IV group. Reliability in the five GMFM dimensions varied from ICC=0.95 (95% CI 0.93-0.97) to ICC=0.99 (95% CI 0.99-0.99). CONCLUSION: The Brazilian Portuguese version of the GMFM-66 showed excellent intra- and interrater reliability when used in Brazilian children with CP levels GMFCS I to IV. PMID:26786081
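Many records in this listing report intraclass correlation coefficients without naming the exact form. As an illustration only, the sketch below computes one common variant, the two-way random-effects, absolute-agreement, single-measure ICC (ICC(2,1) in the Shrout and Fleiss notation), from an ANOVA decomposition. Treating this as the form used by any particular study is an assumption; the function name and ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of subjects, each a list with one score per rater
    (every subject is scored by the same k raters).
    """
    n = len(ratings)          # subjects
    k = len(ratings[0])       # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subject mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: 5 children, each scored by 2 examiners.
example = [[60, 62], [45, 44], [71, 73], [55, 55], [38, 40]]
icc = icc_2_1(example)
```

The absolute-agreement form penalizes a rater who scores systematically higher or lower than the others, whereas a consistency form would not; the two can differ noticeably when raters have different baselines.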
Mobile health technology transforms injury severity scoring in South Africa.

Science.gov (United States)

Spence, Richard Trafford; Zargaran, Eiman; Hameed, S Morad; Navsaria, Pradeep; Nicol, Andrew

2016-08-01

The burden of data collection associated with injury severity scoring has limited its application in areas of the world with the highest incidence of trauma. Since January 2014, electronic records (electronic Trauma Health Records [eTHRs]) replaced all handwritten records at the Groote Schuur Hospital Trauma Unit in South Africa. Data fields required for Glasgow Coma Scale, Revised Trauma Score, Kampala Trauma Score, Injury Severity Score (ISS), and Trauma Score-Injury Severity Score calculations are now prospectively collected. Fifteen months after implementation of eTHR, the injury severity scores were compared as predictors of mortality on three accounts: (1) ability to discriminate (area under receiver operating curve, ROC); (2) ability to calibrate (observed versus expected ratio, O/E); and (3) feasibility of data collection (rate of missing data). A total of 7460 admissions were recorded by eTHR from April 1, 2014 to July 7, 2015, including 770 severely injured patients (ISS > 15) and 950 operations. The mean age was 33.3 y (range 13-94), 77.6% were male, and the mechanism of injury was penetrating in 39.3% of cases. The cohort experienced a mortality rate of 2.5%. Patient reserve predictors required by the scores were 98.7% complete, physiological injury predictors were 95.1% complete, and anatomic injury predictors were 86.9% complete. The discrimination and calibration of Trauma Score-Injury Severity Score was superior for all admissions (ROC 0.9591 and O/E 1.01) and operatively managed patients (ROC 0.8427 and O/E 0.79). In the severely injured cohort, the discriminatory ability of Revised Trauma Score was superior (ROC 0.8315), but no score provided adequate calibration. Emerging mobile health technology enables reliable and sustainable injury severity scoring in a high-volume trauma center in South Africa. Copyright © 2016 Elsevier Inc. All rights reserved.
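The two performance measures named in this abstract, discrimination (area under the ROC curve) and calibration (observed versus expected deaths), can be sketched in a few lines. The code below is a generic illustration, not the study's procedure: it assumes each score has already been converted to a predicted probability of death, and the function names and data are hypothetical.

```python
def roc_auc(risk, died):
    """Area under the ROC curve via pairwise comparison.

    risk: predicted risk of death per patient (higher = more likely to die).
    died: 1 if the patient died, 0 otherwise.
    Counts the share of (died, survived) pairs in which the patient who died
    had the higher risk; ties count as half.
    """
    pos = [r for r, d in zip(risk, died) if d]
    neg = [r for r, d in zip(risk, died) if not d]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def observed_expected_ratio(risk, died):
    """Observed deaths divided by the sum of predicted death probabilities."""
    return sum(died) / sum(risk)


# Hypothetical cohort of six patients.
risk = [0.90, 0.60, 0.30, 0.10, 0.05, 0.05]
died = [1, 1, 0, 0, 0, 0]
auc = roc_auc(risk, died)
oe = observed_expected_ratio(risk, died)
```

An O/E ratio near 1 means the number of deaths matches what the model predicted; a ratio below 1, such as the 0.79 reported for operatively managed patients, means fewer deaths were observed than expected.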
A new scoring system for predicting survival in patients with non-small cell lung cancer

International Nuclear Information System (INIS)

Schild, Steven E; Tan, Angelina D; Wampfler, Jason A; Ross, Helen J; Yang, Ping; Sloan, Jeff A

2015-01-01

This analysis was performed to create a scoring system to estimate the survival of patients with non-small cell lung cancer (NSCLC). Data from 1274 NSCLC patients were analyzed to create and validate a scoring system. Univariate (UV) and multivariate (MV) Cox models were used to evaluate the prognostic importance of each baseline factor. Prognostic factors that were significant on both UV and MV analyses were used to develop the score. These included quality of life, age, performance status, primary tumor diameter, nodal status, distant metastases, and smoking cessation. The score for each factor was determined by dividing the 5-year survival rate (%) by 10 and summing these scores to form a total score. MV models and the score were validated using bootstrapping with 1000 iterations from the original samples. The score for each prognostic factor ranged from 1 to 7 points with higher scores reflective of better survival. Total scores (sum of the scores from each independent prognostic factor) of 32–37 correlated with a 5-year survival of 8.3% (95% CI = 0–17.1%), 38–43 correlated with a 5-year survival of 20% (95% CI = 13–27%), 44–47 correlated with a 5-year survival of 48.3% (95% CI = 41.5–55.2%), 48–49 correlated to a 5-year survival of 72.1% (95% CI = 65.6–78.6%), and 50–52 correlated to a 5-year survival of 84.7% (95% CI = 79.6–89.8%). The bootstrap method confirmed the reliability of the score. Prognostic factors significantly associated with survival on both UV and MV analyses were used to construct a valid scoring system that can be used to predict survival of NSCLC patients. Optimally, this score could be used when counseling patients, and designing future trials
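The scoring rule described above (each factor's 5-year survival rate divided by 10, summed into a total that maps to a survival band) can be written out as a small lookup. The band boundaries and survival figures come from the abstract; rounding each factor to a whole point is an assumption made here for illustration, and the function names are hypothetical.

```python
# (lowest total, highest total, 5-year survival in %) as reported in the abstract.
SURVIVAL_BANDS = [
    (32, 37, 8.3),
    (38, 43, 20.0),
    (44, 47, 48.3),
    (48, 49, 72.1),
    (50, 52, 84.7),
]


def factor_points(five_year_survival_pct):
    """Points for one prognostic factor: survival rate (%) divided by 10.

    Rounding to a whole number is an assumption; the abstract only says
    factor scores ranged from 1 to 7 points.
    """
    return round(five_year_survival_pct / 10)


def expected_survival(total_score):
    """Return the reported 5-year survival (%) for a total score, or None
    if the total falls outside the bands listed in the abstract."""
    for low, high, survival in SURVIVAL_BANDS:
        if low <= total_score <= high:
            return survival
    return None


survival = expected_survival(45)
```

Totals outside 32 to 52 return `None` rather than an extrapolated value, since the abstract reports no survival estimates for them.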
Intra- and inter-rater reliability of the Sollerman hand function test in patients with chronic stroke

DEFF Research Database (Denmark)

Brogårdh, Christina; Persson, Ann L; Sjölund, Bengt H

2007-01-01

PURPOSE: To examine whether the Sollerman hand function test is reliable in a test-retest situation in patients with chronic stroke. METHOD: Three independent examiners observed each patient at three experimental sessions; two days in week 1 (short-term test-retest) and one day in week 4 (long-term test-retest). A total of 24 patients with chronic stroke (mean age; 59.7 years, mean time since stroke onset 29.6 months) participated. The examiners simultaneously assessed the patients' ability to perform 20 subtests. Both ordinal data (generalized kappa) and total sum scores (Spearman's rank... test seems to be a reliable test in patients with chronic stroke, but we recommend that the same examiner evaluates a patient's hand function pre- and post-treatment.
Cross-cultural adaptation of Kerlan-Jobe Orthopaedic Clinic shoulder and elbow score: Reliability and validity in Turkish-speaking overhead athletes.

Science.gov (United States)

Turgut, Elif; Tunay, Volga Bayrakci

2018-03-09

Kerlan-Jobe Orthopaedic Clinic Shoulder and Elbow Score (KJOC-SES) is a subjective assessment tool to measure functional status of the upper extremities in overhead athletes. The aim was to translate and culturally adapt the KJOC-SES and to evaluate the psychometric properties of the Turkish version (KJOC-SES-Tr) in overhead athletes. The forward and back-translation method was followed. One hundred and twenty-three overhead athletes completed the KJOC-SES-Tr, the Disabilities of the Arm, Shoulder, and Hand (DASH), and the American Shoulder and Elbow Surgeons Evaluation Form (ASES). Participants were assigned to one of the following subgroups: asymptomatic (playing without pain) or symptomatic (playing with pain, or not playing due to pain). Internal consistency, reliability, construct validity, discriminant validity, and content validity of the KJOC-SES-Tr were tested. The test-retest reliability of the KJOC-SES-Tr was excellent with an intraclass correlation coefficient of 0.93. There was a strong correlation between the KJOC-SES-Tr and the DASH and the ASES, indicating that the construct validity was good for all participants. Results of the KJOC-SES-Tr significantly differed between different subgroups and categories of athletes. The floor and ceiling effects were acceptable for symptomatic athletes. The KJOC-SES-Tr was shown to be a valid, reliable tool to monitor the return to sports following injuries in athletes. Copyright © 2018 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
Standardization, Validity and Reliability Study of Gülhane Aphasia Test-2 (GAT-2)

Directory of Open Access Journals (Sweden)

İlknur Maviş

2007-04-01

OBJECTIVE: Gülhane Aphasia Test-2 (GAT-2) has been developed to show the presence of the language disorder ‘aphasia’ and to give the clinician indications of accompanying speech disorders such as apraxia and dysarthria. The aim of the study was to report the standardization, validity and reliability study of GAT-2. METHODS: Ten healthy individuals were tested initially for the pilot study. A total of 134 healthy individuals were included in the standardization study, and 30 individuals with aphasia and 11 individuals with right brain injury were included in the validation study. Between-group differences in GAT-2 scores and the effects of age, years of education, and sex were examined. GAT-2 cut-off scores were calculated from the scores of healthy individuals. GAT-2 test-retest reliability and inter-observer reliability were calculated. RESULTS: Healthy individuals’ GAT-2 scores were significantly different from the GAT-2 scores of aphasic patients, but not from those of right brain injured patients. Healthy individuals’ GAT-2 scores were not affected by sex or age but were affected by years of education, so cut-off scores were calculated according to this variable. GAT-2 scores of aphasic patients were not affected by age, sex or years of education. Test-retest and inter-observer reliability and internal consistency results showed that GAT-2 is a highly reliable aphasia screening test. CONCLUSION: GAT-2 was found to be a standardized, highly reliable and valid aphasia test for Turkish stroke patients with aphasia
Cross-cultural adaptation and validation of the Turkish version of Oxford hip score.

Science.gov (United States)

Tuğay, Baki Umut; Tuğay, Nazan; Güney, Hande; Hazar, Zeynep; Yüksel, İnci; Atilla, Bülent

2015-06-01

The purpose of this study was to translate the Oxford hip score (OHS) into Turkish and to evaluate the psychometric properties by testing the internal consistency, reproducibility, construct validity, and responsiveness in patients with hip osteoarthritis (OA). The Oxford hip score was translated and culturally adapted according to the guidelines in the literature. Seventy patients (mean age 61.45 ± 9.29 years) with hip osteoarthritis participated in the study. Patients completed the Turkish Oxford hip score (OHS-TR), the Short-Form 36 (SF-36), and Western Ontario and McMaster Universities Index (WOMAC). Internal consistency was tested using Cronbach's α coefficient. Patients completed the OHS-TR questionnaire twice in 7 days to determine the reproducibility. Correlation between the total results of both tests was determined by the Pearson correlation coefficient and intraclass correlation coefficient (ICC). Validity was assessed by calculating the Pearson correlation coefficient between the OHS-TR and WOMAC and SF-36 scores. Floor and ceiling effects were analyzed. The internal consistency was high (Cronbach's α 0.93). The construct validity showed a significant correlation between the OHS-TR and WOMAC and related SF-36 domains (p < 0.001). The ICCs ranged between 0.80 and 0.99. There was no floor or ceiling effect in total OHS-TR score. The OHS-TR questionnaire is valid, reliable, and responsive for the Turkish-speaking patients with hip OA.
Interrater reliability assessment using the Test of Gross Motor Development-2.

Science.gov (United States)

Barnett, Lisa M; Minto, Christine; Lander, Natalie; Hardy, Louise L

2014-11-01

The aim was to examine interrater reliability of the object control subtest from the Test of Gross Motor Development-2 by live observation in a school field setting. Reliability Study--cross sectional. Raters were rated on their ability to agree on (1) the raw total for the six object control skills; (2) each skill performance and (3) the skill components. Agreement for the object control subtest and the individual skills was assessed by an intraclass correlation (ICC) and a kappa statistic assessed for skill component agreement. A total of 37 children (65% girls) aged 4-8 years (M = 6.2, SD = 0.8) were assessed in six skills by two raters; equating to 222 skill tests. Interrater reliability was excellent for the object control subtest (ICC = 0.93), and for individual skills, highest for the dribble (ICC = 0.94) followed by strike (ICC = 0.85), overhand throw (ICC = 0.84), underhand roll (ICC = 0.82), kick (ICC = 0.80) and the catch (ICC = 0.71). The strike and the throw had more components with less agreement. Even though the overall subtest score and individual skill agreement was good, some skill components had lower agreement, suggesting these may be more problematic to assess. This may mean some skill components need to be specified differently in order to improve component reliability. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
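For component-level agreement between two raters, a kappa statistic corrects the raw agreement rate for the agreement expected by chance. The abstract does not name the kappa variant; the sketch below assumes Cohen's kappa for two raters, with each skill component coded as present (1) or absent (0). The function name and ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items.

    rater_a, rater_b: equal-length sequences of category labels.
    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    chance = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (observed - chance) / (1 - chance)


# Hypothetical component ratings (1 = component shown, 0 = not shown).
a = [1, 1, 1, 0, 1, 0, 1, 1]
b = [1, 1, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(a, b)
```

Kappa can be low even when raw agreement is high if nearly all children show (or fail to show) a component, which is one reason component-level agreement can look weaker than skill-level ICCs.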
MRI inter-reader and intra-reader reliabilities for assessing injury morphology and posterior ligamentous complex integrity of the spine according to the thoracolumbar injury classification system and severity score

International Nuclear Information System (INIS)

Lee, Guen Young; Lee, Joon Woo; Choi, Seung Woo; Lim, Hyun Jin; Sun, Hye Young; Kang, Yu Suhn; Kang, Heung Sik; Chai, Jee Won; Kim, Su Jin

2015-01-01

To evaluate spine magnetic resonance imaging (MRI) inter-reader and intra-reader reliabilities using the thoracolumbar injury classification system and severity score (TLICS) and to analyze the effects of reader experience on reliability and the possible reasons for discordant interpretations. Six radiologists (two senior, two junior radiologists, and two residents) independently scored 100 MRI examinations of thoracolumbar spine injuries to assess injury morphology and posterior ligamentous complex (PLC) integrity according to the TLICS. Inter-reader and intra-reader agreements were determined and analyzed according to the number of years of radiologist experience. Inter-reader agreement between the six readers was moderate (k = 0.538 for the first and 0.537 for the second review) for injury morphology and fair to moderate (k = 0.440 for the first and 0.389 for the second review) for PLC integrity. No significant difference in inter-reader agreement was observed according to the number of years of radiologist experience. Intra-reader agreements showed a wide range (k = 0.538-0.822 for injury morphology and 0.423-0.616 for PLC integrity). Agreement was achieved in 44 for the first and 45 for the second review about injury morphology, as well as in 41 for the first and 38 for the second review of PLC integrity. A positive correlation was detected between injury morphology score and PLC integrity. The reliability of MRI for assessing thoracolumbar spinal injuries according to the TLICS was moderate for injury morphology and fair to moderate for PLC integrity, which may not be influenced by radiologist experience
Test-retest reliability and predictive validity of the Implicit Association Test in children.

Science.gov (United States)

Rae, James R; Olson, Kristina R

2018-02-01

The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many factors simultaneously (lag-time between testing administrations, domain, etc.), it is difficult to discern what factors may explain variability in existing test-retest reliability and predictive validity estimates. Across five studies (total N = 519; ages 6- to 11-years-old), we manipulated two factors that have varied in previous developmental research: lag-time and domain. An internal meta-analysis of these studies revealed that, across three different methods of analyzing the data, mean test-retest (rs of .48, .38, and .34) and predictive validity (rs of .46, .20, and .10) effect sizes were significantly greater than zero. While lag-time did not moderate the magnitude of test-retest coefficients, whether we observed domain differences in test-retest reliability and predictive validity estimates was contingent on other factors, such as how we scored the IAT or whether we included estimates from a unique sample (i.e., a sample containing gender typical and gender diverse children). Recommendations are made for developmental researchers that utilize the IAT in their research. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Reliability of the Dissociative Trance Disorder Interview Schedule: A preliminary report.

Science.gov (United States)

Ross, Colin A; Somer, Eli; Goode, Caitlin

2018-01-01

One hundred inpatients in a hospital-based Trauma Program in the USA were interviewed with the Dissociative Trance Disorder Interview Schedule (DTDIS). There were no significant differences for the DTDIS total score or any of the subscale scores on test-retest: all t-values comparing the two administrations of the DTDIS were below 0.7, and all p-values were above 0.5. Cronbach's alpha for the US sample was 0.966 and for the Israeli sample it was 0.971. The findings indicate that the DTDIS has good reliability and may be suitable for use in cross-cultural research; however, the results require replication by independent researchers in a variety of cultures and languages, and in both clinical and nonclinical samples.
The reliability of the Glasgow Coma Scale: a systematic review.

Science.gov (United States)

Reith, Florence C M; Van den Brande, Ruben; Synnot, Anneliese; Gruen, Russell; Maas, Andrew I R

2016-01-01

The Glasgow Coma Scale (GCS) provides a structured method for assessment of the level of consciousness. Its derived sum score is applied in research and adopted in intensive care unit scoring systems. Controversy exists on the reliability of the GCS. The aim of this systematic review was to summarize evidence on the reliability of the GCS. A literature search was undertaken in MEDLINE, EMBASE and CINAHL. Observational studies that assessed the reliability of the GCS, expressed by a statistical measure, were included. Methodological quality was evaluated with the consensus-based standards for the selection of health measurement instruments checklist and its influence on results considered. Reliability estimates were synthesized narratively. We identified 52 relevant studies that showed significant heterogeneity in the type of reliability estimates used, patients studied, setting and characteristics of observers. Methodological quality was good (n = 7), fair (n = 18) or poor (n = 27). In good quality studies, kappa values were ≥0.6 in 85%, and all intraclass correlation coefficients indicated excellent reliability. Poor quality studies showed lower reliability estimates. Reliability for the GCS components was higher than for the sum score. Factors that may influence reliability include education and training, the level of consciousness and type of stimuli used. Only 13% of studies were of good quality and inconsistency in reported reliability estimates was found. Although the reliability was adequate in good quality studies, further improvement is desirable. From a methodological perspective, the quality of reliability studies needs to be improved. From a clinical perspective, a renewed focus on training/education and standardization of assessment is required.
Funding Medical Research Projects: Taking into Account Referees' Severity and Consistency through Many-Faceted Rasch Modeling of Projects' Scores.

Science.gov (United States)

Tesio, Luigi; Simone, Anna; Grzeda, Mariuzs T; Ponzio, Michela; Dati, Gabriele; Zaratin, Paola; Perucca, Laura; Battaglia, Mario A

2015-01-01

The funding policy of research projects often relies on scores assigned by a panel of experts (referees). The non-linear nature of raw scores and the severity and inconsistency of individual raters may generate unfair numeric project rankings. Rasch measurement (many-facets version, MFRM) provides a valid alternative to scoring. MFRM was applied to the scores achieved by 75 research projects on multiple sclerosis sent in response to a previous annual call by FISM (Italian Foundation for Multiple Sclerosis). This made it possible to simulate, a posteriori, the impact of MFRM on the funding scenario. The applications were each scored by 2 to 4 independent referees (total = 131) on a 10-item, 0-3 rating scale called FISM-ProQual-P. The rotation plan assured "connection" of all pairs of projects through at least 1 shared referee. The questionnaire satisfactorily fulfilled the stringent criteria of Rasch measurement for psychometric quality (unidimensionality, reliability and data-model fit). Arbitrarily, 2 acceptability thresholds were set at a raw score of 21/30 and at the equivalent Rasch measure of 61.5/100, respectively. When the cut-off was switched from score to measure, 8 out of 18 acceptable projects had to be rejected, while 15 rejected projects became eligible for funding. Some referees, of various severity, were grossly inconsistent (z-std fit indexes less than -1.9 or greater than 1.9). The FISM-ProQual-P questionnaire seems a valid and reliable scale. MFRM may help the decision-making process for allocating funds to MS research projects but also in other fields. In repeated assessment exercises it can help the selection of reliable referees. Their severity can be steadily calibrated, thus obviating the need to connect them with other referees assessing the same projects.
Reliability and validity of the Turkish version of the Structured Clinical Interview for DSM-IV Dissociative Disorders (SCID-D): a preliminary study.

Science.gov (United States)

Kundakçi, Turgut; Sar, Vedat; Kiziltan, Emre; Yargiç, Ilhan L; Tutkun, Hamdi

2014-01-01

A total of 34 consecutive patients with dissociative identity disorder or dissociative disorder not otherwise specified were evaluated using the Turkish version of the Structured Clinical Interview for DSM-IV Dissociative Disorders (SCID-D). They were compared with a matched control group composed of 34 patients who had a nondissociative psychiatric disorder. Interrater reliability was evaluated by 3 clinicians who assessed videotaped interviews conducted with 5 dissociative and 5 nondissociative patients. All subjects who were previously diagnosed by clinicians as having a dissociative disorder were identified as positive, and all subjects who were previously diagnosed as not having a dissociative disorder were identified as negative. The scores of the main symptom clusters and the total score of the SCID-D differentiated dissociative patients from the nondissociative group. There were strong correlations between the SCID-D and the Dissociative Experiences Scale total and subscale scores. These results are promising for the validity and reliability of the Turkish version of the SCID-D. However, as the present study was conducted on a predominantly female sample with very severe dissociation, these findings should not be generalized to male patients, to dissociative disorders other than dissociative identity disorder, or to broader clinical or nonclinical populations.
Validation of the Simplified Motor Score in patients with traumatic ...

African Journals Online (AJOL)

Background. This study used data from a large prospectively entered database to assess the efficacy of the motor score (M score) component of the Glasgow Coma Scale (GCS) and the Simplified Motor Score (SMS) in predicting overall outcome in patients with traumatic brain injury (TBI). Objective. To safely and reliably ...
Reliability of a Retail Food Store Survey and Development of an Accompanying Retail Scoring System to Communicate Survey Findings and Identify Vendors for Healthful Food and Marketing Initiatives

Science.gov (United States)

Ghirardelli, Alyssa; Quinn, Valerie; Sugerman, Sharon

2011-01-01

Objective: To develop a retail grocery instrument with weighted scoring to be used as an indicator of the food environment. Participants/Setting: Twenty six retail food stores in low-income areas in California. Intervention: Observational. Main Outcome Measure(s): Inter-rater reliability for grocery store survey instrument. Description of store…

Is Total Femur Replacement a Reliable Treatment Option for Patients With Metastatic Carcinoma of the Femur?

Science.gov (United States)

Sevelda, Florian; Waldstein, Wenzel; Panotopoulos, Joannis; Kaider, Alexandra; Funovics, Philipp Theodor; Windhager, Reinhard

2018-05-01

The majority of metastatic bone lesions to the femoral bone can be treated without surgery or with minimally invasive intramedullary nailing. In rare patients with extensive metastatic disease to the femur, total femur replacement may be the only surgical alternative to amputation; however, little is known about this approach. In a highly selected small group of patients with metastatic carcinoma of the femur, we asked: (1) What was the patient survivorship after this treatment? (2) What was the implant survivorship free from all-cause revision and amputation, and what complications were associated with this treatment? (3) What functional outcomes were achieved by patients after total femur replacement for this indication? Eleven patients (three men, eight women) with a mean age of 64 years (range, 41-78 years) received total femur replacements between 1986 and 2016; none were lost to followup. The most common primary disease was breast cancer. In general, during this period, our indications for this procedure were extensive metastatic disease precluding internal fixation or isolated proximal or distal femur replacement, and an anticipated lifespan exceeding 6 months. Our contraindication for this procedure during this time was expected lifespan less than 6 months. Patient survival was assessed by Kaplan-Meier analysis; implant survival free from revision surgery and amputation were assessed by competing risk analysis. Function was determined preoperatively and 6 to 12 weeks postoperatively with the Musculoskeletal Tumor Society (MSTS) score normalized to a 100-point scale, with higher scores representing better function from a longitudinally maintained institutional database. Eleven patients died at a median of 5 months (range, 1-31 months) after surgery. One-year revision-free and limb survival were 82% (95% CI, 51%-98%) and 91% (95% CI, 61%-99%), respectively. Reasons for reoperation were hip dislocation, infection and local recurrence in one patient each. The
Reliability of Videoconferencing Administration of a Communication Questionnaire to People With Traumatic Brain Injury and Their Close Others.

Science.gov (United States)

Rietdijk, Rachael; Power, Emma; Brunner, Melissa; Togher, Leanne

To compare in-person with videoconferencing administration of a communication questionnaire for people with traumatic brain injury (TBI) and their close others. Repeated-measures design with randomized order of administration. Twenty adults with severe TBI and their close others. Both participants with TBI and their close others completed the La Trobe Communication Questionnaire (LCQ) via interview with a clinician, once via Skype and once during a home visit. Total LCQ score and time taken for completion. There were no significant differences between videoconferencing and in-person conditions in the total scores or time taken to complete the questionnaire. Videoconferencing-based administration of the LCQ is as reliable and efficient as in-person administration.
Pharmacokinetic-pharmacodynamic modeling of antipsychotic drugs in patients with schizophrenia Part I : The use of PANSS total score and clinical utility

NARCIS (Netherlands)

Reddy, Venkatesh Pilla; Kozielska, Magdalena; Suleiman, Ahmed Abbas; Johnson, Martin; Vermeulen, An; Liu, Jing; de Greef, Rik; Groothuis, Geny M. M.; Danhof, Meindert; Proost, Johannes H.

Background: To develop a pharmacokinetic-pharmacodynamic (PK-PD) model using individual-level data of Positive and Negative Syndrome Scale (PANSS) total score to characterize the antipsychotic drug effect taking into account the placebo effect and dropout rate. In addition, a clinical utility (CU)
Reliability of Modern Scores to Predict Long-Term Mortality After Isolated Aortic Valve Operations.

Science.gov (United States)

Barili, Fabio; Pacini, Davide; D'Ovidio, Mariangela; Ventura, Martina; Alamanni, Francesco; Di Bartolomeo, Roberto; Grossi, Claudio; Davoli, Marina; Fusco, Danilo; Perucci, Carlo; Parolari, Alessandro

2016-02-01

Contemporary scores for estimating perioperative death have been proposed to also predict long-term death. The aim of the study was to evaluate the performance of the updated European System for Cardiac Operative Risk Evaluation II, The Society of Thoracic Surgeons Predicted Risk of Mortality score, and the Age, Creatinine, Left Ventricular Ejection Fraction score for predicting long-term mortality in a contemporary cohort of isolated aortic valve replacement (AVR). We also sought to develop for each score a simple algorithm based on predicted perioperative risk to predict long-term survival. Complete data on 1,444 patients who underwent isolated AVR in a 7-year period were retrieved from three prospective institutional databases and linked with the Italian Tax Register Information System. Data were evaluated with performance analyses and time-to-event semiparametric regression. Survival was 83.0% ± 1.1% at 5 years and 67.8 ± 1.9% at 8 years. Discrimination and calibration of all three scores both worsened for prediction of death at 1 year and 5 years. Nonetheless, a significant relationship was found between long-term survival and quartiles of scores (p System for Cardiac Operative Risk Evaluation II, 1.34 (95% CI, 1.28 to 1.40) for the Society of Thoracic Surgeons score, and 1.08 (95% CI, 1.06 to 1.10) for the Age, Creatinine, Left Ventricular Ejection Fraction score. The predicted risk generated by European System for Cardiac Operative Risk Evaluation II, The Society of Thoracic Surgeons score, and Age, Creatinine, Left Ventricular Ejection Fraction scores cannot also be considered a direct estimate of the long-term risk for death. Nonetheless, the three scores can be used to derive an estimate of long-term risk of death in patients who undergo isolated AVR with the use of a simple algorithm. Copyright © 2016 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
Development of the Siriraj clinical asthma score.

Science.gov (United States)

Vichyanond, Pakit; Veskitkul, Jittima; Rienmanee, Nuanphong; Pacharn, Punchama; Jirapongsananuruk, Orathai; Visitsunthorn, Nualanong

2013-09-01

Acute asthmatic attack in children commonly occurs despite the introduction of effective controllers such as inhaled corticosteroids and leukotriene modifiers. Treatment of acute asthmatic attack requires proper evaluation of attack severity and appropriate selection of medical therapy. In children, measurement of lung function is difficult during acute attack and thus clinical asthma scoring may aid the physician in making further decisions regarding treatment and admission. We enrolled 70 children with acute asthmatic attack with age range from 1 to 12 years (mean ± SD = 51.5 ± 31.8 months) into the study. Twelve selected asthma severity items were assessed by 2 independent observers prior to administration of salbutamol nebulization (up to 3 doses at 20-minute intervals). Decision for further therapy and admission was made by the emergency department physician. Three different scoring systems were constructed from items with best validity. Sensitivity, specificity and accuracy of these scores were assessed. Inter-rater reliability was assessed for each score. Review of previous scoring systems was also conducted and reported. Three severity items had poor validity, i.e., cyanosis, depressed cerebral function, and I:E ratio (p > 0.05). Three items had poor inter-rater reliability, i.e., breath sound quality, air entry, and I:E ratio. These items were omitted and three new clinical scores were constructed from the remaining items. The clinical scoring system comprising retractions, dyspnea, O2 saturation, respiratory rate and wheezing (range of score 0-10) gave the best accuracy and inter-rater variability and was chosen for clinical use as the Siriraj Clinical Asthma Score (SCAS). A Clinical Asthma Score that is simple, relatively easy to administer and with good validity and variability is essential for treatment of acute asthma in children. Several good candidate scores have been introduced in the past. We described the development of the Siriraj Clinical Asthma Score (SCAS) in
Reliability of the CARE rule and the HEART score to rule out an acute coronary syndrome in non-traumatic chest pain patients.

Science.gov (United States)

Moumneh, Thomas; Richard-Jourjon, Vanessa; Friou, Emilie; Prunier, Fabrice; Soulie-Chavignon, Caroline; Choukroun, Jacques; Mazet-Guilaumé, Betty; Riou, Jérémie; Penaloza, Andréa; Roy, Pierre-Marie

2018-03-02

In patients consulting in the Emergency Department for chest pain, a HEART score ≤ 3 has been shown to rule out an acute coronary syndrome (ACS) with a low risk of major adverse cardiac event (MACE) occurrence. A negative CARE rule (≤ 1) that stands for the first four elements of the HEART score may have similar rule-out reliability without troponin assay requirement. We aim to prospectively assess the performance of the CARE rule and of the HEART score to predict MACE in a chest pain population. Prospective two-center non-interventional study. Patients admitted to the ED for non-traumatic chest pain were included, and followed-up at 6 weeks. The main study endpoint was the 6-week rate of MACE (myocardial infarction, coronary angioplasty, coronary bypass, and sudden unexplained death). 641 patients were included, of whom 9.5% presented a MACE at 6 weeks. The CARE rule was negative for 31.2% of patients, and none presented a MACE during follow-up [0, 95% confidence interval: (0.0-1.9)]. The HEART score was ≤ 3 for 63.0% of patients, and none presented a MACE during follow-up [0% (0.0-0.9)]. With an incidence below 2% in the negative group, the CARE rule seemed able to safely rule out a MACE without any biological test for one-third of patients with chest pain and the HEART score for another third with a single troponin assay.
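The rule-out argument above rests on the upper confidence limit when no events are observed. The abstract does not state how its intervals were computed. The sketch below assumes the exact (Clopper-Pearson) two-sided interval, whose upper limit with zero events in n patients reduces to 1 - (alpha/2)^(1/n). The group sizes are approximations back-calculated from the reported percentages of 641 patients, not counts given in the record.

```python
def zero_event_upper_bound(n, confidence=0.95):
    """Upper limit of the exact two-sided binomial CI when 0 of n have the event.

    With zero events, the Clopper-Pearson upper limit solves
    (1 - p) ** n = alpha / 2, which gives p = 1 - (alpha / 2) ** (1 / n).
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1 - confidence
    return 1 - (alpha / 2) ** (1 / n)


if __name__ == "__main__":
    # Approximate group sizes, assumed from 31.2% and 63.0% of 641 patients.
    approx_groups = {"CARE rule negative": 200, "HEART score <= 3": 404}
    for label, n in approx_groups.items():
        bound = zero_event_upper_bound(n)
        print(f"{label}: 0/{n} events, upper 95% limit = {bound:.2%}")
```

The smaller the event-free group, the wider the interval, which is why the two groups can share an observed rate of zero and still have different upper limits.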
Analysing relations between specific and total liking scores

DEFF Research Database (Denmark)

Menichelli, Elena; Kraggerud, Hilde; Olsen, Nina Veflen

2013-01-01

The objective of this article is to present a new statistical approach for the study of consumer liking. Total liking data are extended by incorporating liking for specific sensory properties. The approach combines different analyses for the purpose of investigating the most important aspects of liking and indicating which products are similarly or differently perceived by which consumers. A method based on the differences between total liking and the specific liking variables is proposed for studying both relative differences among products and individual consumer differences. Segmentation is also tested out in order to distinguish consumers with the strongest differences in their liking values. The approach is illustrated by a case study, based on cheese data. In the consumer test consumers were asked to evaluate their total liking, the liking for texture and the liking for odour/taste. (C...
A Categorical Instrument for Scoring Second Language Writing Skills.

Science.gov (United States)

Brown, James Dean; Bailey, Kathleen M.

1984-01-01

Discusses a study of the reliability of a categorical instrument for evaluating compositions written by upper intermediate university English as a second language students. The instrument tests organization, logical development of ideas, grammar, mechanics, and style. Results indicate that the scoring instrument is moderately reliable. (SED)
Validity and reliability of Turkish Caregiver Burden Scale among family caregivers of haemodialysis patients.

Science.gov (United States)

Cil Akinci, Ayse; Pinar, Rukiye

2014-02-01

To investigate the validity and reliability of the Caregiver Burden Scale in family members who provide primary care for haemodialysis patients. In Turkey, there is a need for a multi-dimensional instrument to evaluate the caregiver burden in people who provide care for patients with chronic diseases. A methodological study. The study sample consisted of 161 family members who provide primary care for haemodialysis patients. The forward-backward translation method was used to develop the Turkish Caregiver Burden Scale. The reliability was based on internal consistency investigated by Cronbach's alpha and item-total correlation. The factorial construct validity of the scale was tested with confirmatory factor analysis. By means of convergent and divergent validity, correlation between Caregiver Burden Scale and 36-Item Short Form Health Survey (SF-36) and correlation between Caregiver Burden Scale and the Maslach Burnout Scale were investigated. Cronbach's alpha and item-total correlations results suggested that there was good internal reliability. We found five underlying factors similar to original Scale's five-factor solution. The confirmatory factor analysis five-factor model represented an acceptable fit. Factor loadings were significant, with standardised loadings ranging from 0·43-0·81. By means of divergent validity, all sub-dimension scores and the total score of the Caregiver Burden Scale were negatively correlated with the SF-36, whereas there was a positive correlation with the emotional exhaustion and depersonalisation subscales of the Maslach Burnout Scale as expected. These results suggest that the Caregiver Burden Scale is a reliable and valid instrument which can be used with confidence in Turkish caregivers for haemodialysis patients to screen caregiver burden. The burden experienced by people who provide care for patients with chronic diseases can be evaluated with the Caregiver Burden Scale. Additionally, the Caregiver Burden Scale can be used
[Validating the Spanish version of the Nursing Activities Score].

Science.gov (United States)

Sánchez-Sánchez, M M; Arias-Rivera, S; Fraile-Gamo, M P; Thuissard-Vasallo, I J; Frutos-Vivar, F

2015-01-01

Validating workload scores ensures that they are appropriate for the purpose for which they were developed. To validate the Nursing Activities Score (NAS) Spanish version. Observational and prospective study. 1,045 patients who were admitted to a medical-surgical unit and a serious burns unit in 2006 were included. The nurse in charge assessed patient workloads by Nine Equivalent of Nursing Manpower use Score and NAS. To assess the internal consistency of the measurements of NAS, item-test correlations, Cronbach's α and Cronbach's α corrected by omitting each of the items were calculated. The intraobserver and interobserver reliability were assessed with the intraclass correlation coefficient by viewing recordings and Kappa (interobserver reliability) was estimated. For the analysis of internal validity, a factorial principal components analysis was performed. Convergent validity was assessed using the Spearman correlation coefficient values obtained from the Nine Equivalent of Nursing Manpower use Score and Spanish-NAS scales. For internal consistency, 164 questionnaires were analysed and a Cronbach's α of 0.373 was calculated. The intraclass correlation coefficient for intraobserver reliability estimate was 0.837 (95% CI: 0.466-0.950) and 0.662 (95% CI: 0.033-0.882) for interobserver reliability. The estimated kappa was 0.371. For internal validity, exploratory factor analysis showed that the first item explained 58.9% of the variance of the questionnaire. For convergent validity 1006 questionnaires were included and a Spearman correlation coefficient of 0.746 was observed. The psychometric properties of Spanish-NAS are acceptable. Copyright © 2014 Elsevier España, S.L.U. y SEEIUC. All rights reserved.
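For readers unfamiliar with the two internal-consistency statistics named in the abstract above, the following is a minimal sketch of Cronbach's alpha and of alpha recomputed with each item omitted in turn. It uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function names and the small score matrix are hypothetical, and nothing here reproduces the study's data or software.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows[r][i] is respondent r's score on item i."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item and of the per-respondent total score.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha recomputed after dropping each item in turn (needs >= 3 items)."""
    n_items = len(rows[0])
    return [
        cronbach_alpha([list(row[:i]) + list(row[i + 1:]) for row in rows])
        for i in range(n_items)
    ]


if __name__ == "__main__":
    # Hypothetical scores: 5 respondents, 4 items.
    hypothetical_scores = [
        [3, 4, 3, 1],
        [2, 2, 3, 4],
        [4, 5, 4, 2],
        [1, 2, 1, 3],
        [5, 4, 5, 1],
    ]
    print("alpha:", round(cronbach_alpha(hypothetical_scores), 3))
    print("alpha if item deleted:",
          [round(a, 3) for a in alpha_if_item_deleted(hypothetical_scores)])
```

An item whose removal raises alpha noticeably is one that correlates poorly with the rest of the scale.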
Reliability, construct and criterion validity of the KIDSCREEN-10 score: a short measure for children and adolescents’ well-being and health-related quality of life

Science.gov (United States)

Erhart, Michael; Rajmil, Luis; Herdman, Michael; Auquier, Pascal; Bruil, Jeanet; Power, Mick; Duer, Wolfgang; Abel, Thomas; Czemy, Ladislav; Mazur, Joanna; Czimbalmos, Agnes; Tountas, Yannis; Hagquist, Curt; Kilroe, Jean

2010-01-01

Background To assess the criterion and construct validity of the KIDSCREEN-10 well-being and health-related quality of life (HRQoL) score, a short version of the KIDSCREEN-52 and KIDSCREEN-27 instruments. Methods The child self-report and parent report versions of the KIDSCREEN-10 were tested in a sample of 22,830 European children and adolescents aged 8–18 and their parents (n = 16,237). Correlation with the KIDSCREEN-52 and associations with other generic HRQoL measures, physical and mental health, and socioeconomic status were examined. Score differences by age, gender, and country were investigated. Results Correlations between the 10-item KIDSCREEN score and KIDSCREEN-52 scales ranged from r = 0.24 to 0.72 (r = 0.27–0.72) for the self-report version (proxy-report version). Coefficients below r = 0.5 were observed for the KIDSCREEN-52 dimensions Financial Resources and Being Bullied only. Cronbach alpha was 0.82 (0.78), test–retest reliability was ICC = 0.70 (0.67) for the self- (proxy-)report version. Correlations between other children self-completed HRQoL questionnaires and KIDSCREEN-10 ranged from r = 0.43 to r = 0.63 for the KIDSCREEN children self-report and r = 0.22–0.40 for the KIDSCREEN parent proxy report. Known group differences in HRQoL between physically/mentally healthy and ill children were observed in the KIDSCREEN-10 self and proxy scores. Associations with self-reported psychosomatic complaints were r = −0.52 (−0.36) for the KIDSCREEN-10 self-report (proxy-report). Statistically significant differences in KIDSCREEN-10 self and proxy scores were found by socioeconomic status, age, and gender. Conclusions Our results indicate that the KIDSCREEN-10 provides a valid measure of a general HRQoL factor in children and adolescents, but the instrument does not represent well most of the single dimensions of the original KIDSCREEN-52. Test–retest reliability was slightly below a priori defined thresholds. PMID:20668950
Validity and reliability of a dental operator posture assessment instrument (PAI).

Science.gov (United States)

Branson, Bonnie G; Williams, Karen B; Bray, Kimberly Krust; Mcllnay, Sandy L; Dickey, Diana

2002-01-01

Basic operating posture is considered an important occupational health issue for oral health care clinicians. It is generally agreed that the physical posture of the operator, while providing care, should be such that all muscles are in a relaxed, well-balanced, and neutral position. Postures outside of this neutral position are likely to cause musculoskeletal discomfort. To date, the range of the neutral operator position has not been well-defined; nor have any specific instruments been identified that can quantitatively or semi-quantitatively assess dental operator posture. This paper reports on the development of an instrument that can be used to semi-quantitatively evaluate postural components. During the first phase of the study, an expert panel defined the basic parameters for acceptable, compromised, and harmful operator postures and established face validity of a posture assessment instrument (PAI). During the second phase, the PAI was tested for reliability using generalizability theory. Four raters tested the instrument for reliability. Overall, total PAI scores were similar amongst three of the raters, with the fourth rater's scores being slightly greater than the other three. The main effect of the rater on individual postural components was moderate, indicating that rater variance contributed to 11.9% of total variance. The PAI measures posture as it occurs and will have numerous applications when evaluating operator performance in the dental and dental hygiene education setting. Also, the PAI will prove useful when examining the effects of operator posture and musculoskeletal disorders.
Validity and reliability of global operative assessment of laparoscopic skills (GOALS) in novice trainees performing a laparoscopic cholecystectomy.

Science.gov (United States)

Kramp, Kelvin H; van Det, Marc J; Hoff, Christiaan; Lamme, Bas; Veeger, Nic J G M; Pierie, Jean-Pierre E N

2015-01-01

Global Operative Assessment of Laparoscopic Skills (GOALS) assessment has been designed to evaluate skills in laparoscopic surgery. A longitudinal blinded study of randomized video fragments was conducted to estimate the validity and reliability of GOALS in novice trainees. In total, 10 trainees each performed 6 consecutive laparoscopic cholecystectomies. Sixty procedures were recorded on video. Video fragments of (1) opening of the peritoneum; (2) dissection of Calot's triangle and achievement of critical view of safety; and (3) dissection of the gallbladder from the liver bed were blinded, randomized, and rated by 2 consultant surgeons using GOALS. Also, a grade was given for overall competence. The correlation of GOALS with live observation Objective Structured Assessment of Technical Skills (OSATS) scores was calculated. Construct validity was estimated using the Friedman 2-way analysis of variance by ranks and the Wilcoxon signed-rank test. The interrater reliability was calculated using the absolute and consistency agreement 2-way random-effects model intraclass correlation coefficient. A high correlation was found between mean GOALS score (r = 0.879, p = 0.021) and mean OSATS score. The GOALS score increased significantly across the 6 procedures (p = 0.002). The trainees performed significantly better on their sixth when compared with their first cholecystectomy (p = 0.004). The consistency agreement interrater reliability was 0.37 for the mean GOALS score (p = 0.002) and 0.55 for overall competence (p < 0.001) of the 3 video fragments. The validity observed in this randomized blinded longitudinal study supports the existing evidence that GOALS is a valid tool for assessment of novice trainees. A relatively low reliability was found in this study. Copyright © 2014 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
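The abstract above distinguishes absolute-agreement from consistency intraclass correlations. As a hedged sketch, the code below computes the single-measure forms from two-way ANOVA mean squares, using the formulas commonly labelled ICC(C,1) and ICC(A,1). The abstract does not state whether single or average measures were reported, so this is an illustration of the distinction rather than a reconstruction of the study's analysis. The function name and ratings are hypothetical.

```python
def icc_single(ratings):
    """Single-measure two-way ICCs; ratings[i][j] = rater j's score for subject i.

    Returns (consistency, absolute_agreement).
    """
    n = len(ratings)        # subjects
    k = len(ratings[0])     # raters
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Consistency ignores systematic rater offsets; absolute agreement does not.
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    absolute = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return consistency, absolute


if __name__ == "__main__":
    # Hypothetical scores from 2 raters on 5 video fragments.
    hypothetical_ratings = [[12, 14], [15, 16], [9, 12], [18, 19], [11, 15]]
    consistency, absolute = icc_single(hypothetical_ratings)
    print(f"consistency: {consistency:.3f}, absolute agreement: {absolute:.3f}")
```

When one rater scores systematically higher than another, the consistency coefficient stays high while the absolute-agreement coefficient drops, which is why the two are reported separately.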
Validity of GRE General Test scores and TOEFL scores for graduate admission to a technical university in Western Europe

Science.gov (United States)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the Master's programme grade point average (GGPA) with and without the addition of the undergraduate GPA (UGPA) and the TOEFL score, and of GRE scores for study completion and Master's thesis performance. GRE scores explained 20% of the variation in the GGPA, while additional 7% were explained by the TOEFL score and 3% by the UGPA. Contrary to common belief, the GRE quantitative reasoning score showed only little explanatory power. GRE scores were also weakly related to study progress but not to thesis performance. Nevertheless, GRE and TOEFL scores were found to be sensible admissions instruments. Rigorous methodology was used to obtain highly reliable results.
Validity, Reliability and Standardization Study of the Language Assessment Test for Aphasia

Directory of Open Access Journals (Sweden)

Bülent Toğram

2012-09-01

OBJECTIVE: Aphasia assessment is the first step towards a well-founded language therapy. Language tests need to consider cultural as well as typological linguistic aspects of a given language. This study was designed to determine the standardization, validity and reliability of the Language Assessment Test for Aphasia, which consists of eight subtests: spontaneous speech and language, auditory comprehension, repetition, naming, reading, grammar, speech acts, and writing. METHODS: The test was administered to 282 healthy participants and 92 aphasic participants in age-, education- and gender-matched groups. The validity of the test was investigated through analysis of content, structure and criterion-related validity. For reliability, internal consistency, stability and equivalence reliability were analysed. The influence of variables on healthy participants' subtest scores, test score and language score was examined. According to significant differences, norms and cut-off scores based on language score were determined. RESULTS: The group with aphasia performed markedly lower than healthy participants on subtest, test and language scores. The test scores of the healthy group were mostly affected by age and educational level but not by gender. According to significant differences, age and educational level for both groups were determined. Considering age and educational levels, the reference values for the cut-off scores were presented. CONCLUSION: The test was found to be a highly reliable and valid aphasia test for Turkish-speaking aphasic patients either in Turkey or in other Turkish communities around the world.
Reliability and Validity of the Dutch Physical Activity Questionnaires for Children (PAQ-C) and Adolescents (PAQ-A).

Science.gov (United States)

Bervoets, Liene; Van Noten, Caroline; Van Roosbroeck, Sofie; Hansen, Dominique; Van Hoorenbeeck, Kim; Verheyen, Els; Van Hal, Guido; Vankerckhoven, Vanessa

2014-01-01

This study was designed to validate the Dutch Physical Activity Questionnaires for Children (PAQ-C) and Adolescents (PAQ-A). After adjustment of the original Canadian PAQ-C and PAQ-A (i.e. translation/back-translation and evaluation by expert committee), content validity of both PAQs was assessed and calculated using item-level (I-CVI) and scale-level (S-CVI) content validity indexes. Inter-item and inter-rater reliability of 196 PAQ-C and 95 PAQ-A filled in by both children or adolescents and their parent, were evaluated. Inter-item reliability was calculated by Cronbach's alpha (α) and inter-rater reliability was examined by percent observed agreement and weighted kappa (κ). Concurrent validity of PAQ-A was examined in a subsample of 28 obese and 16 normal-weight children by comparing it with concurrently measured physical activity using a maximal cardiopulmonary exercise test for the assessment of peak oxygen uptake (VO2 peak). For both PAQs, I-CVI ranged 0.67-1.00. S-CVI was 0.89 for PAQ-C and 0.90 for PAQ-A. A total of 192 PAQ-C and 94 PAQ-A were fully completed by both child and parent. Cronbach's α was 0.777 for PAQ-C and 0.758 for PAQ-A. Percent agreement ranged 59.9-74.0% for PAQ-C and 51.1-77.7% for PAQ-A, and weighted κ ranged 0.48-0.69 for PAQ-C and 0.51-0.68 for PAQ-A. The correlation between total PAQ-A score and VO2 peak - corrected for age, gender, height and weight - was 0.516 (p = 0.001). Both PAQs have an excellent content validity, an acceptable inter-item reliability and a moderate to good strength of inter-rater agreement. In addition, total PAQ-A score showed a moderate positive correlation with VO2 peak. Both PAQs have an acceptable to good reliability and validity, however, further validity testing is recommended to provide a more complete assessment of both PAQs.
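The two inter-rater statistics reported above, percent observed agreement and weighted kappa, can be sketched as follows for paired child and parent answers on an ordinal item. This is a minimal illustration under stated assumptions: the function names are hypothetical, and linear disagreement weights are assumed because the abstract does not say which weighting scheme was used.

```python
def percent_agreement(rater_a, rater_b):
    """Share of paired ratings that are identical, as a percentage."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Cohen's weighted kappa for two raters on ordered categories.

    Uses disagreement weights |i - j| / (k - 1), squared when weights="quadratic".
    Assumes at least two categories and some expected disagreement.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    obs_dis = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp_dis = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - obs_dis / exp_dis


# Made-up paired answers on a 1-5 scale (not data from the study)
child = [1, 2, 3, 4, 5, 3]
parent = [1, 2, 3, 4, 5, 4]
kappa = weighted_kappa(child, parent, categories=[1, 2, 3, 4, 5])
agreement = percent_agreement(child, parent)
```

Unlike percent agreement, weighted kappa corrects for chance agreement and gives partial credit to near misses on the ordinal scale.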
Reliability and validity of the Turkish version of the Berg Balance Scale.

Science.gov (United States)

Sahin, Fusun; Yilmaz, Figen; Ozmaden, Asli; Kotevolu, Nurdan; Sahin, Tulay; Kuran, Banu

2008-01-01

The purpose of this study was to develop a Turkish version of the Berg Balance Scale (BBS) and assess its reliability and validity. Sixty healthy volunteers older than 65 years were included in the study. Subjects who had a lower extremity amputation, or were armchair-bound or bedridden, were excluded. After the translation process, the Turkish version of the scale was administered to each participant twice with an interval of 2 weeks. The intraclass correlation coefficient (ICC) was calculated to assess intra- and inter-observer reliability. Cronbach's alpha was calculated to evaluate internal consistency of the total BBS score. Interclass correlation coefficient was calculated to examine test-retest reliability. Convergent validity was assessed by correlating the scale with the Modified Barthel Index (MBI) and Timed Up and Go Test (TUG). Construct validity was assessed with factor analysis. The mean age of the participants was 77.00+/-5.67 years (range: 67-92 yrs). The ICC for intra- and inter-observer reliability was 0.98 (pr=0.67 pr=-0.75 p<0.0001, respectively). The Turkish version of the BBS is a reliable and valid scale to be used in balance assessment of Turkish older adults.
Good performance in Japan is proof of continuing safety and reliability improvement practice

International Nuclear Information System (INIS)

Sumi, Y.

1987-01-01

Nuclear power is a vital energy supply source for both security and economy for such countries as Japan whose sources of energy are dependent on imported materials. This is the very reason why Japan gives her national priority to the improvement of nuclear power safety and reliability. As of the end of 1986, total nuclear power capacity owned and operated by private utility companies in Japan amounted to 24521 MW with 32 units sharing -- 19% of the total generating capacity. Moreover, during 1986 these units scored a remarkably high capacity factor of 76.2% and shared almost 28% of the nationwide electric power production, thereby contributing to a considerable saving of imported sources of energy. This outstanding record has been achieved by the parties concerned who dedicated themselves to furthering nuclear plant safety and reliability improvement. In this connection, this paper summarizes those key factors contributing to the good nuclear power plant performance of the Kansai Electric Power Company
Administration and scoring variance on the ADAS-Cog.

Science.gov (United States)

Connor, Donald J; Sabbagh, Marwan N

2008-11-01

The Alzheimer's Disease Assessment Scale - Cognitive (ADAS-Cog) is the most commonly used primary outcome instrument in clinical trials for treatments of dementia. Variations in forms, administration procedures and scoring rules, along with rater turnover and intra-rater drift may decrease the reliability of the instrument. A survey of possible variations in the ADAS-Cog was administered to 26 volunteer raters at a clinical trials meeting. Results indicate notable protocol variations in the forms used, administration procedures, and scoring rules. Since change over time is used to determine treatment effect in clinical trials, standardizing the instrument's ambiguities and addressing common problems will greatly increase the instrument's reliability and thereby enhance its sensitivity to treatment effects.
Reliability of provocative tests of motion sickness susceptibility

Science.gov (United States)

Calkins, D. S.; Reschke, M. F.; Kennedy, R. S.; Dunlop, W. P.

1987-01-01

Test-retest reliability values were derived from motion sickness susceptibility scores obtained from two successive exposures to each of three tests: (1) Coriolis sickness sensitivity test; (2) staircase velocity movement test; and (3) parabolic flight static chair test. The reliability of the three tests ranged from 0.70 to 0.88. Normalizing values from predictors with skewed distributions improved the reliability.

The Alcohol Use Disorders Identification Test (AUDIT): reliability and validity of the Greek version

Directory of Open Access Journals (Sweden)

Bratis Dimitris

2009-05-01

Background: Problems associated with alcohol abuse are recognised by the World Health Organization as a major health issue, which according to most recent estimations is responsible for 1.4% of the total world burden of morbidity and has been proven to increase mortality risk by 50%. Because of the size and severity of the problem, early detection is very important. This requires easy to use and specific tools. One of these is the Alcohol Use Disorders Identification Test (AUDIT). Aim: This study aims to standardise the questionnaire in a Greek population. Methods: AUDIT was translated and back-translated from its original language by two English-speaking psychiatrists. The tool contains 10 questions. A score ≥ 11 is an indication of serious abuse/dependence. In the study, 218 subjects took part: 128 were males and 90 females. The average age was 40.71 years (± 11.34). From the 218 individuals, 109 (75 male, 34 female) fulfilled the criteria for alcohol dependence according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV), and presented requesting admission; 109 subjects (53 male, 56 female) were healthy controls. Results: Internal reliability (Cronbach α) was 0.80 for the controls and 0.80 for the alcohol-dependent individuals. Controls had significantly lower average scores (t test P 8 was 0.98 and its specificity was 0.94 for the same score. For the alcohol-dependent sample 3% scored as false negatives and from the control group 1.8% scored false positives. In the alcohol-dependent sample there was no difference between males and females in their average scores (t test P > 0.05). Conclusion: The Greek version of AUDIT has increased internal reliability and validity. It detects 97% of the alcohol-dependent individuals and has a high sensitivity and specificity. AUDIT is easy to use, quick and reliable and can be very useful in detecting alcohol problems in sensitive populations.
The Alcohol Use Disorders Identification Test (AUDIT): reliability and validity of the Greek version.

Science.gov (United States)

Moussas, George; Dadouti, Georgia; Douzenis, Athanassios; Poulis, Evangelos; Tzelembis, Athanassios; Bratis, Dimitris; Christodoulou, Christos; Lykouras, Lefteris

2009-05-14

Problems associated with alcohol abuse are recognised by the World Health Organization as a major health issue, which according to most recent estimations is responsible for 1.4% of the total world burden of morbidity and has been proven to increase mortality risk by 50%. Because of the size and severity of the problem, early detection is very important. This requires easy to use and specific tools. One of these is the Alcohol Use Disorders Identification Test (AUDIT). This study aims to standardise the questionnaire in a Greek population. AUDIT was translated and back-translated from its original language by two English-speaking psychiatrists. The tool contains 10 questions. A score ≥ 11 is an indication of serious abuse/dependence. In the study, 218 subjects took part: 128 were males and 90 females. The average age was 40.71 years (± 11.34). From the 218 individuals, 109 (75 male, 34 female) fulfilled the criteria for alcohol dependence according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV), and presented requesting admission; 109 subjects (53 male, 56 female) were healthy controls. Internal reliability (Cronbach alpha) was 0.80 for the controls and 0.80 for the alcohol-dependent individuals. Controls had significantly lower average scores (t test P 8 was 0.98 and its specificity was 0.94 for the same score. For the alcohol-dependent sample 3% scored as false negatives and from the control group 1.8% scored false positives. In the alcohol-dependent sample there was no difference between males and females in their average scores (t test P > 0.05). The Greek version of AUDIT has increased internal reliability and validity. It detects 97% of the alcohol-dependent individuals and has a high sensitivity and specificity. AUDIT is easy to use, quick and reliable and can be very useful in detecting alcohol problems in sensitive populations.
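To make the sensitivity and specificity figures above concrete, the following sketch shows how both are derived from questionnaire scores and a reference diagnosis at a chosen cut-off. The function name, the cut-off value, and the scores are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, is_case, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff counts as a positive screen.

    `is_case` holds the reference diagnosis (True for dependent, False for control).
    """
    pairs = list(zip(scores, is_case))
    tp = sum(1 for s, c in pairs if c and s >= cutoff)       # cases flagged
    fn = sum(1 for s, c in pairs if c and s < cutoff)        # cases missed
    tn = sum(1 for s, c in pairs if not c and s < cutoff)    # controls cleared
    fp = sum(1 for s, c in pairs if not c and s >= cutoff)   # controls flagged
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores: four diagnosed cases followed by four controls
scores = [15, 12, 9, 7, 3, 2, 10, 1]
is_case = [True, True, True, True, False, False, False, False]
sens, spec = sensitivity_specificity(scores, is_case, cutoff=8)
```

The false-negative rate among cases is one minus sensitivity, which is how a 3% false-negative rate corresponds to detecting 97% of alcohol-dependent individuals.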
The scoring of movements in sleep.

Science.gov (United States)

Walters, Arthur S; Lavigne, Gilles; Hening, Wayne; Picchietti, Daniel L; Allen, Richard P; Chokroverty, Sudhansu; Kushida, Clete A; Bliwise, Donald L; Mahowald, Mark W; Schenck, Carlos H; Ancoli-Israel, Sonia

2007-03-15

The International Classification of Sleep Disorders (ICSD-2) has separated sleep-related movement disorders into simple, repetitive movement disorders (such as periodic limb movements in sleep [PLMS], sleep bruxism, and rhythmic movement disorder) and parasomnias (such as REM sleep behavior disorder and disorders of partial arousal, e.g., sleep walking, confusional arousals, night terrors). Many of the parasomnias are characterized by complex behaviors in sleep that appear purposeful, goal directed and voluntary but are outside the conscious awareness of the individual and therefore inappropriate. All of the sleep-related movement disorders described here have specific polysomnographic findings. For the purposes of developing and/or revising specifications and polysomnographic scoring rules, the AASM Scoring Manual Task Force on Movements in Sleep reviewed background literature and executed evidence grading of 81 relevant articles obtained by a literature search of published articles between 1966 and 2004. Subsequent evidence grading identified limited evidence for reliability and/or validity for polysomnographic scoring criteria for periodic limb movements in sleep, REM sleep behavior disorder, and sleep bruxism. Published scoring criteria for rhythmic movement disorder, excessive fragmentary myoclonus, and hypnagogic foot tremor/alternating leg muscle activation were empirical and based on descriptive studies. The literature review disclosed no published evidence defining clinical consequences of excessive fragmentary myoclonus or hypnagogic foot tremor/alternating leg muscle activation. Because of limited or absent evidence for reliability and/or validity, a standardized RAND/UCLA consensus process was employed for recommendation of specific rules for the scoring of sleep-associated movements.
The Neck Disability Index-Russian Language Version (NDI-RU): A Study of Validity and Reliability.

Science.gov (United States)

Bakhtadze, Maxim A; Vernon, Howard; Zakharova, Olga B; Kuzminov, Kirill O; Bolotov, Dmitry A

2015-07-15

Cross-cultural adaptation and psychometric testing. To perform a validated Russian translation and then to evaluate the validity and reliability of the Russian language version of the Neck Disability Index (NDI-RU). Neck pain is highly prevalent and can greatly affect daily activity. The Neck Disability Index (NDI) is the most frequently used scale for self-rating of disability due to neck pain. Its translated versions are applied in many countries. However, the Russian language version of the NDI has not been developed yet. Cross-cultural adaptation of the NDI-RU was performed according to established guidelines. Then, the NDI-RU was evaluated for content validity, concurrent criterion validity, internal consistency, test-retest reliability, factor structure, and minimum detectable change. Two hundred thirty-two patients took part in the study in total: 109 in validity (39.5 ± 10 yr), 123 in reliability (38.4 ± 11 yr; 80 in the test-retest phase). A culturally valid translation was achieved. NDI-RU total scores were distributed normally. Floor/ceiling effects were absent. Good values of Cronbach α were obtained for each item (from 0.80 to 0.84) and for the total NDI-RU (0.83). A 2-factor solution was found for the NDI-RU. The average interitem correlation coefficient was 0.53. Intraclass correlation coefficients for test-retest reliability coefficients ranged from 0.65 to 0.92 for different items and 0.91 for the total NDI-RU. Moderate correlation (Spearman rs = 0.62; P Russian language version of the Neck Disability Index resulted in a valid, reliable instrument that can be used both in clinical practice and scientific investigations. 1.
NIH Toolbox Cognitive Function Battery (CFB): Composite Scores of Crystallized, Fluid, and Overall Cognition

Science.gov (United States)

Akshoomoff, Natacha; Beaumont, Jennifer L.; Bauer, Patricia J.; Dikmen, Sureyya; Gershon, Richard; Mungas, Dan; Slotkin, Jerry; Tulsky, David; Weintraub, Sandra; Zelazzo, Philip; Heaton, Robert K.

2014-01-01

The NIH Toolbox Cognitive Function Battery (CFB) includes 7 tests covering 8 cognitive abilities considered to be important in adaptive functioning across the lifespan (from early childhood to late adulthood). Here we present data on psychometric characteristics in children (N = 208; ages 3–15 years) of a total summary score and composite scores reflecting two major types of cognitive abilities: “crystallized” (more dependent upon past learning experiences) and “fluid” (capacity for new learning and information processing in novel situations). Both types of cognition are considered important in everyday functioning, but are thought to be differently affected by brain health status throughout life, from early childhood through older adulthood. All three Toolbox composite scores showed excellent test-retest reliability, robust developmental effects across the childhood age range considered here, and strong correlations with established, “gold standard” measures of similar abilities. Additional preliminary evidence of validity includes significant associations between all three Toolbox composite scores and maternal reports of children’s health status and school performance. PMID:23952206
Assessing physiotherapists' communication skills for promoting patient autonomy for self-management: reliability and validity of the communication evaluation in rehabilitation tool.

Science.gov (United States)

Murray, Aileen; Hall, Amanda; Williams, Geoffrey C; McDonough, Suzanne M; Ntoumanis, Nikos; Taylor, Ian; Jackson, Ben; Copsey, Bethan; Hurley, Deirdre A; Matthews, James

2018-02-27

To assess the inter-rater reliability and concurrent validity of the Communication Evaluation in Rehabilitation Tool, which aims to externally assess physiotherapists' competency in using Self-Determination Theory-based communication strategies in practice. Audio recordings of initial consultations between 24 physiotherapists and 24 patients with chronic low back pain in four hospitals in Ireland were obtained as part of a larger randomised controlled trial. Three raters, all of whom had PhDs in psychology and expertise in motivation and physical activity, independently listened to the 24 audio recordings and completed the 18-item Communication Evaluation in Rehabilitation Tool. Inter-rater reliability between all three raters was assessed using intraclass correlation coefficients. Concurrent validity was assessed using Pearson's r correlations with a reference standard, the Health Care Climate Questionnaire. The total score for the Communication Evaluation in Rehabilitation Tool is an average of all 18 items. Total scores demonstrated good inter-rater reliability (Intraclass Correlation Coefficient (ICC) = 0.8) and concurrent validity with the Health Care Climate Questionnaire total score (range: r = 0.7-0.88). Item-level scores of the Communication Evaluation in Rehabilitation Tool identified five items that need improvement. Results provide preliminary evidence to support future use and testing of the Communication Evaluation in Rehabilitation Tool. Implications for Rehabilitation: Promoting patient autonomy is a learned skill, and while interventions exist to train clinicians in these skills, there are no tools to assess how well clinicians use these skills when interacting with a patient. The lack of robust assessment has severe implications regarding both the fidelity of clinician training packages and resulting outcomes for promoting patient autonomy. This study has developed a novel measurement tool Communication Evaluation in Rehabilitation Tool and a
Validity and reliability of the Bahasa Melayu version of the Migraine Disability Assessment questionnaire.

Science.gov (United States)

Shaik, Munvar Miya; Hassan, Norul Badriah; Tan, Huay Lin; Bhaskar, Shalini; Gan, Siew Hua

2014-01-01

The study was designed to determine the validity and reliability of the Bahasa Melayu version (MIDAS-M) of the Migraine Disability Assessment (MIDAS) questionnaire. Patients having migraine for more than six months attending the Neurology Clinic, Hospital Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia, were recruited. Standard forward and back translation procedures were used to translate and adapt the MIDAS questionnaire to produce the Bahasa Melayu version. The translated Malay version was tested for face and content validity. Validity and reliability testing were further conducted with 100 migraine patients (1st administration) followed by a retesting session 21 days later (2nd administration). A total of 100 patients between 15 and 60 years of age were recruited. The majority of the patients were single (66%) and students (46%). Cronbach's alpha values were 0.84 (1st administration) and 0.80 (2nd administration). The test-retest reliability for the total MIDAS score was 0.73, indicating that the MIDAS-M questionnaire is stable; for the five disability questions, the test-retest values ranged from 0.77 to 0.87. The MIDAS-M questionnaire is comparable with the original English version in terms of validity and reliability and may be used for the assessment of migraine in clinical settings.
Do medical students’ scores using different assessment instruments predict their scores in clinical reasoning using a computer-based simulation?

Directory of Open Access Journals (Sweden)

Fida M

2015-02-01

Mariam Fida (Department of Molecular Medicine, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Bahrain); Salah Eldin Kassab (Department of Medical Education, Faculty of Medicine, Suez Canal University, Ismailia, Egypt). Purpose: The development of clinical problem-solving skills evolves over time and requires structured training and background knowledge. Computer-based case simulations (CCS) have been used for teaching and assessment of clinical reasoning skills. However, previous studies examining the psychometric properties of CCS as an assessment tool have been controversial. Furthermore, studies reporting the integration of CCS into problem-based medical curricula have been limited. Methods: This study examined the psychometric properties of using CCS software (DxR Clinician) for assessment of medical students (n=130) studying in a problem-based, integrated multisystem module (Unit IX) during the academic year 2011–2012. Internal consistency reliability of CCS scores was calculated using Cronbach's alpha statistics. The relationships between students' scores in CCS components (clinical reasoning, diagnostic performance, and patient management) and their scores in other examination tools at the end of the unit, including multiple-choice questions, short-answer questions, objective structured clinical examination (OSCE), and real patient encounters, were analyzed using stepwise hierarchical linear regression. Results: Internal consistency reliability of CCS scores was high (α=0.862). Inter-item correlations between students' scores in different CCS components and their scores in CCS and other test items were statistically significant. Regression analysis indicated that OSCE scores predicted 32.7% and 35.1% of the variance in clinical reasoning and patient management scores, respectively (P<0.01). Multiple-choice question scores, however, predicted only 15.4% of the variance in diagnostic performance scores (P<0.01), while
The reliability and accuracy of two methods for proximal caries detection and depth on directly visible proximal surfaces: an in vitro study

DEFF Research Database (Denmark)

Ekstrand, K R; Alloza, Alvaro Luna; Promisiero, L

2011-01-01

This study aimed to determine the reliability and accuracy of the ICDAS and radiographs in detecting and estimating the depth of proximal lesions on extracted teeth. The lesions were visible to the naked eye. Three trained examiners scored a total of 132 sound/carious proximal surfaces from 106 p...
Internal consistency, test-retest reliability and measurement error of the self-report version of the social skills rating system in a sample of Australian adolescents.

Directory of Open Access Journals (Sweden)

Sharmila Vaz

The social skills rating system (SSRS) is used to assess social skills and competence in children and adolescents. While its characteristics based on United States samples (US) are published, corresponding Australian figures are unavailable. Using a 4-week retest design, we examined the internal consistency, retest reliability and measurement error (ME) of the SSRS secondary student form (SSF) in a sample of Year 7 students (N = 187), from five randomly selected public schools in Perth, western Australia. Internal consistency (IC) of the total scale and most subscale scores (except empathy) on the frequency rating scale was adequate to permit independent use. On the importance rating scale, most IC estimates for girls fell below the benchmark. Test-retest estimates of the total scale and subscales were insufficient to permit reliable use. ME of the total scale score (frequency rating) for boys was equivalent to the US estimate, while that for girls was lower than the US error. ME of the total scale score (importance rating) was larger than the error using the frequency rating scale. The study finding supports the idea of using multiple informants (e.g. teacher and parent reports), not just student as recommended in the manual. Future research needs to substantiate the clinical meaningfulness of the MEs calculated in this study by corroborating them against the respective Minimum Clinically Important Difference (MCID).
Internal consistency, test-retest reliability and measurement error of the self-report version of the social skills rating system in a sample of Australian adolescents.

Science.gov (United States)

Vaz, Sharmila; Parsons, Richard; Passmore, Anne Elizabeth; Andreou, Pantelis; Falkmer, Torbjörn

2013-01-01

The social skills rating system (SSRS) is used to assess social skills and competence in children and adolescents. While its characteristics based on United States samples (US) are published, corresponding Australian figures are unavailable. Using a 4-week retest design, we examined the internal consistency, retest reliability and measurement error (ME) of the SSRS secondary student form (SSF) in a sample of Year 7 students (N = 187), from five randomly selected public schools in Perth, western Australia. Internal consistency (IC) of the total scale and most subscale scores (except empathy) on the frequency rating scale was adequate to permit independent use. On the importance rating scale, most IC estimates for girls fell below the benchmark. Test-retest estimates of the total scale and subscales were insufficient to permit reliable use. ME of the total scale score (frequency rating) for boys was equivalent to the US estimate, while that for girls was lower than the US error. ME of the total scale score (importance rating) was larger than the error using the frequency rating scale. The study finding supports the idea of using multiple informants (e.g. teacher and parent reports), not just student as recommended in the manual. Future research needs to substantiate the clinical meaningfulness of the MEs calculated in this study by corroborating them against the respective Minimum Clinically Important Difference (MCID).
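Measurement error in studies of this kind is often summarised as a standard error of measurement (SEM), from which a minimal detectable change can be derived. The sketch below uses the common textbook formulas SEM = SD × sqrt(1 − r) and MDC95 = 1.96 × sqrt(2) × SEM. Whether this study computed its ME in exactly this way is an assumption, and the function names and input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM from the score standard deviation and a reliability coefficient (0 to 1)."""
    return sd * math.sqrt(1.0 - reliability)


def minimal_detectable_change(sem, z=1.96):
    """Smallest individual change exceeding measurement error at the given z level."""
    return z * math.sqrt(2.0) * sem


# Made-up inputs: SD of 10 scale points and a test-retest coefficient of 0.75
sem = standard_error_of_measurement(sd=10.0, reliability=0.75)
mdc95 = minimal_detectable_change(sem)
```

Lower retest reliability inflates both quantities, which is why weak test-retest estimates limit the interpretation of individual score changes.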
Coronary collateral circulation in patients with chronic coronary total occlusion; its relationship with cardiac risk markers and SYNTAX score.

Science.gov (United States)

Börekçi, A; Gür, M; Şeker, T; Baykan, A O; Özaltun, B; Karakoyun, S; Karakurt, A; Türkoğlu, C; Makça, I; Çaylı, M

2015-09-01

Compared to patients without a collateral supply, long-term cardiac mortality is reduced in patients with well-developed coronary collateral circulation (CCC). Cardiovascular risk markers, such as N-terminal pro-brain natriuretic peptide (NT-proBNP), high-sensitive C-reactive protein (hs-CRP) and high-sensitive cardiac troponin T (hs-cTnT) are independent predictors for cardiovascular mortality. The main goal of this study was to examine the relationship between CCC and cardiovascular risk markers. We prospectively enrolled 427 stable coronary artery disease patients with chronic total occlusion (mean age: 57.5±11.1 years). The patients were divided into two groups, according to their Rentrop scores: (a) poorly developed CCC group (Rentrop 0 and 1) and (b) well-developed CCC group (Rentrop 2 and 3). NT-proBNP, hs-CRP, hs-cTnT, uric acid and other biochemical markers were also measured. The SYNTAX score was calculated for all patients. The patients in the poorly developed CCC group had higher frequencies of diabetes and hypertension (prisk markers, such as NT-proBNP, hs-cTnT and hs-CRP are independently associated with CCC in stable coronary artery disease with chronic total occlusion. © The Author(s) 2014.
Reliability and validity of the Korean version of Pediatric Voice Handicap Index: in school age children.

Science.gov (United States)

Park, Sung Shin; Kwon, Tack-Kyun; Choi, Seong Hee; Lee, Won Yong; Hong, Young Hye; Jeong, Nyun Gi; Sung, Myung-Whun; Kim, Kwang Hyun

2013-01-01

The aim of this study was to assess the reliability and validity of the Pediatric Voice Handicap Index (pVHI) for cross-cultural adaptation of the Korean version with school age children. The questionnaire was translated into Korean and was completed by 101 Korean parents who have children with or without disordered voice. The Korean version-pVHI scores were obtained with 60 parents of normal children and 41 parents who have children with voice problems. Content validity was verified by five experienced speech-language pathologists with clinical specialization in voice disorders. Internal consistency was calculated through Cronbach's α coefficient and test-retest reliability of the Korean version-pVHI score was determined using Pearson product-moment correlation coefficients. Mann-Whitney U test was used to compare GRBAS with the Korean version-pVHI scores between normal and dysphonia group. The relationship between the parent-reported Korean version-pVHI total scores and perceptual ratings of voice quality from experts was investigated using Spearman correlation coefficients. The results showed that the Korean version-pVHI provided a high internal consistency (α=0.92) and test-retest reliability of its subscales: total (T) 0.97, functional (F) 0.90, physical (P) 0.95, emotional (E) 0.92. The Korean version-pVHI mean scores in the normal group were 1.28 (T), 0.62 (F), 0.35 (P) and 0.32 (E), respectively, whereas those in the group of children with dysphonia were 23.13 (T), 8.90 (F), 9.54 (P) and 4.93 (E). Significant differences in the Korean version-pVHI (T, F, P, E) and perceptual evaluation (grade, rough, breathy) between normal and dysphonia group were revealed (PKorean version-pVHI parameters (T) and perceptual measures (G) was exhibited in children with dysphonia. The subjective Korean version-pVHI can be an applicable and useful supplementary tool for evaluating parents' perception of their children's voice dysfunction, identifying
Psychometrics Matter in Health Behavior: A Long-term Reliability Generalization Study.

Science.gov (United States)

Pickett, Andrew C; Valdez, Danny; Barry, Adam E

2017-09-01

Despite numerous calls for increased understanding and reporting of reliability estimates, social science research, including the field of health behavior, has been slow to respond and adopt such practices. Therefore, we offer a brief overview of reliability and common reporting errors; we then perform analyses to examine and demonstrate the variability of reliability estimates by sample and over time. Using meta-analytic reliability generalization, we examined the variability of coefficient alpha scores for a well-designed, consistent, nationwide health study, covering a span of nearly 40 years. For each year and sample, reliability varied. Furthermore, reliability was predicted by a sample characteristic that differed among age groups within each administration. We demonstrated that reliability is influenced by the methods and individuals from which a given sample is drawn. Our work echoes previous calls that psychometric properties, particularly reliability of scores, are important and must be considered and reported before drawing statistical conclusions.
Reliability and validity of adapted French Canadian version of Scoliosis Research Society Outcomes Questionnaire (SRS-22) in Quebec.

Science.gov (United States)

Beauséjour, Marie; Joncas, Julie; Goulet, Lise; Roy-Beaudry, Marjolaine; Parent, Stefan; Grimard, Guy; Forcier, Martin; Lauriault, Sophie; Labelle, Hubert

2009-03-15

Prospective validation study of a cross-cultural adaptation of the Scoliosis Research Society (SRS) Outcomes Questionnaire. To provide a French Canadian version of the SRS Outcomes Questionnaire and to empirically test its response in healthy adolescents and adolescent idiopathic scoliosis (AIS) patients in Québec. The SRS Outcomes Questionnaire is widely used for the assessment of health-related quality of life in AIS patients. French translation and back-translation of the SRS-22 (SRS-22-fv) were done by an expert committee. Its reliability was measured using the coefficient of internal consistency, construct validity with a factorial analysis, concurrent validity by using the short form-12 and discriminant validity using ANOVA and multivariate linear regression, on 145 AIS patients, 44 patients with non clinically significant scoliosis (NCSS), and 64 healthy patients. The SRS-22-fv showed a good global internal consistency (AIS: Cronbach alpha = 0.86, NCSS: 0.81, and controls: 0.79) and in all of its domains for AIS patients. The factorial structure was coherent with the original questionnaire (47.4% of explained variance). High correlation coefficients were obtained between SRS-22-fv and short form-12 corresponding domains. Boys had higher scores than girls, scores worsened with age, and with increasing body mass index. Mean Total, Pain, Self-image, and Satisfaction scores, were correlated with Cobb angle. Adjusted regression models showed statistically significant differences between the AIS, NCSS, and control groups in the Total, Pain, and Function scores. The SRS-22-fv showed satisfactory reliability, factorial, concurrent, and discriminant validity. This study provides scores in a significant group of healthy adolescents and demonstrates a clear gradient in response between subjects with AIS, NCSS, and controls.
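Several records in this list report internal consistency as Cronbach's alpha. As a minimal sketch of how that coefficient is usually computed from an item-by-respondent score matrix, the following uses only the Python standard library. The function name `cronbach_alpha` and the toy data are illustrative assumptions and are not taken from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses the standard formula
        alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same number of items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 respondents x 3 items (not from any study above).
toy_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

if __name__ == "__main__":
    print(f"alpha = {cronbach_alpha(toy_scores):.3f}")
```

Because alpha depends on the item variances and the total-score variance in the sample at hand, the same questionnaire can yield different values in different groups, as the separate AIS, NCSS and control estimates above illustrate.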
Improving machinery reliability

CERN Document Server

Bloch, Heinz P

1998-01-01

This totally revised, updated and expanded edition provides proven techniques and procedures that extend machinery life, reduce maintenance costs, and achieve optimum machinery reliability. This essential text clearly describes the reliability improvement and failure avoidance steps practiced by best-of-class process plants in the U.S. and Europe.
Alternative Payment Models Should Risk-Adjust for Conversion Total Hip Arthroplasty: A Propensity Score-Matched Study.

Science.gov (United States)

McLawhorn, Alexander S; Schairer, William W; Schwarzkopf, Ran; Halsey, David A; Iorio, Richard; Padgett, Douglas E

2017-12-06

For Medicare beneficiaries, hospital reimbursement for nonrevision hip arthroplasty is anchored to either diagnosis-related group code 469 or 470. Under alternative payment models, reimbursement for care episodes is not further risk-adjusted. This study's purpose was to compare outcomes of primary total hip arthroplasty (THA) vs conversion THA to explore the rationale for risk adjustment for conversion procedures. All primary and conversion THAs from 2007 to 2014, excluding acute hip fractures and cancer patients, were identified in the National Surgical Quality Improvement Program database. Conversion and primary THA patients were matched 1:1 using propensity scores, based on preoperative covariates. Multivariable logistic regressions evaluated associations between conversion THA and 30-day outcomes. A total of 2018 conversions were matched to 2018 primaries. There were no differences in preoperative covariates. Conversions had longer operative times (148 vs 95 minutes, P reimbursement models shift toward bundled payment paradigms, conversion THA appears to be a procedure for which risk adjustment is appropriate. Copyright © 2017 Elsevier Inc. All rights reserved.
An examination of the interrater reliability between practitioners and researchers on the static-99.

Science.gov (United States)

Quesada, Stephen P; Calkins, Cynthia; Jeglic, Elizabeth L

2014-11-01

Many studies have validated the psychometric properties of the Static-99, the most widely used measure of sexual offender recidivism risk. However, much of this research relied on instrument coding completed by well-trained researchers. This study is the first to examine the interrater reliability (IRR) of the Static-99 between practitioners in the field and researchers. Using archival data from a sample of 1,973 formerly incarcerated sex offenders, field raters' scores on the Static-99 were compared with those of researchers. Overall, clinicians and researchers had excellent IRR on Static-99 total scores, with IRR coefficients ranging from "substantial" to "outstanding" for the 10 individual items of the scale. The most common causes of discrepancies were coding manual errors, followed by item subjectivity, inaccurate item scoring, and calculation errors. These results offer important data with regard to the frequency and perceived nature of scoring errors. © The Author(s) 2013.
Modified Tuck Jump Assessment: Reliability and Training of Raters

Directory of Open Access Journals (Sweden)

Craig A. Smith, Nicole J. Chimera, Monica R. Lininger, Meghan Warren

2017-09-01

We are writing with regard to “Intra- and inter-rater reliability of the modified tuck jump assessment,” by Fort-Vanmeerhaeghe et al. (2017) published in the Journal of Sports Science & Medicine. The authors reported on the reliability of the modified Tuck Jump Assessment (TJA). The purpose of the article was twofold: to introduce a new scoring methodology and to report on the interrater and intrarater reliability. The authors found the modified TJA to have excellent interrater reliability (ICC = 0.94, 95% CI = 0.88-0.97) and intrarater reliability (rater 1 ICC = 0.94, 95% CI = 0.88-0.9; rater 2 ICC = 0.96, 95% CI = 0.92-0.98) with experienced raters (n = 2) in a sample of 24 elite volleyball athletes. Overall, we found the study to be well conducted and valuable to the field of injury screening; however, the study did not adequately explain how the raters were trained in the modified TJA to improve consistency of scoring, or the modifications of the individual flaw “excessive contact noise at landing.” This information is necessary to improve the clinical utility of the TJA and direct future reliability studies. The TJA has been changed at least three times in the literature: from the initial introduction (Myer et al., 2006) to the most referenced and detailed protocol (Myer et al., 2011) to the publication under discussion (Fort-Vanmeerhaeghe et al., 2017). The initial test protocol was based upon clinical expertise and has evolved over time as new research emerged and problems arose with the original TJA. Initially, the TJA was scored on a visual analog scale (Myer et al., 2006), changed to a dichotomous scale (0 for no flaw or 1 for flaw present) (Myer et al., 2011) and most recently modified using an ordinal scale (Fort-Vanmeerhaeghe et al., 2017). A significant disparity in the reported interrater and intrarater reliability arose with the dichotomously scored TJA, between those involved in the development of the TJA (Herrington et al., 2013
Translation, adaptation and inter-rater reliability of the administration manual for the Fugl-Meyer assessment.

Science.gov (United States)

Michaelsen, Stella M; Rocha, André S; Knabben, Rodrigo J; Rodrigues, Luciano P; Fernandes, Claudia G C

2011-01-01

Recently, the reliability of the Brazilian version of the Fugl-Meyer Assessment (FMA) was assessed through the scoring given according to observations made by a single evaluator who applied the test. When different raters apply the scale, the reliability may depend on the interpretation given to the assessment sheet. In such cases, a clear administration manual is essential for ensuring homogeneity of application. To translate and adapt the French Canadian version of the FMA administration manual into Brazilian Portuguese and to evaluate the inter-rater reliability when different evaluators apply the FMA on the basis of the information contained in the manual. Eighteen adults (59±10 years) with chronic hemiparesis (38±35 months after a stroke) took part in this study. Eight patients participated in the first part of the study and 10 in the second part. Based on analyzing the results from part 1, an adapted version was developed, in which information and photos were added to illustrate the positions of the patient and evaluator. The inter-rater reliability was assessed using the intraclass correlation coefficient (ICC). The reliability of the FMA based on the adapted version of the manual was excellent for the total motor scores for the upper limbs (ICC=0.98) and lower limbs (ICC=0.90), as well as for movement sense (ICC=0.98) and upper and lower-limb passive range of motion (ICC=0.84 and 0.90, respectively). The reliability was moderate for tactile sensitivity (0.75). The joint pain assessment presented low reliability. The results showed that, except for pain assessment, application of the FMA based on the adapted version of the application manual for Brazilian Portuguese presented adequate inter-rater reliability.

Evaluation of Scoring Skills and Non Scoring Skills in the Brazilian SuperLeague Women’s Volleyball

Directory of Open Access Journals (Sweden)

Aluizio Otávio Gouvêa Ferreira Oliveira

2016-09-01

This study analyzed all the games (n=253) from the 2011/2012 and 2012/2013 Seasons of Brazilian SuperLeague Women’s Volleyball, to identify the game-related factors that discriminate in favor of winning and losing teams. In the 2011/2012 Season, the Total Shares Setting (TAL) and Total Points Attack (TPA) were factors that discriminated in favor of a defeat. The factors that determined the victory were the Total Shares Serve (TAS), Total Shares Defense (TAD), Total Shares Reception (TAR) and Total Defense Excellent (TDE). In the 2012/2013 Season, the factor (TAD) most often discriminated in favor of victory and the factor that led to defeat was the Total Points Made (TPF). The scoring skills (TPA) and (TPF) discriminated against the final outcome of the game, but surprisingly are associated with defeat and the (TAS) supposed to victory. The non-scoring skills (TAD), (TAR) and (TDE) discriminate the end result of the game and this may be associated with the victory. The non-scoring skill (TAL) determines the outcome of the game and is supposedly associated with the defeat.
Differences in pain measures by mini-mental state examination scores of residents in aged care facilities: examining the usability of the Abbey pain scale-Japanese version.

Science.gov (United States)

Takai, Yukari; Yamamoto-Mitani, Noriko; Ko, Ayako; Heilemann, Marysue V

2014-03-01

The validity and reliability of the Abbey Pain Scale-Japanese version (APS-J) have been examined. However, the range of cognitive levels for which the APS-J can be accurately used in older adults has not been investigated. This study aimed to examine the differences between total/item scores of the APS-J and Mini-Mental State Examination (MMSE) scores of residents in aged care facilities who self-reported the presence or absence of pain. This descriptive study included 252 residents in aged care facilities. Self-reported pain, MMSE scores, and item/total APS-J scores for pain intensity were collected. The MMSE scores were used to create four groups on the basis of the cognitive impairment level. Self-reports of pain and the APS-J scores were compared with different MMSE score groups. The total APS-J score for pain intensity as well as scores for individual items such as "vocalization" and "facial expression" were significantly higher in those who reported pain than in those reporting no pain across all MMSE groups. The total APS-J score and item scores for "vocalization," "change in body language," and "behavioral changes" showed significant differences in the four MMSE groups. Pain intensity tended to be overestimated by the APS-J, especially among those with low MMSE scores. The APS-J can be used to assess pain intensity in residents despite their cognitive levels. However, caution is required when using it to compare scores among older adults with different cognitive capacity because of the possibility of overestimation of pain among residents with low cognitive capacity. Copyright © 2014 American Society for Pain Management Nursing. Published by Elsevier Inc. All rights reserved.
Validity and reliability of the Mastication Observation and Evaluation (MOE) instrument.

Science.gov (United States)

Remijn, Lianne; Speyer, Renée; Groen, Brenda E; van Limbeek, Jacques; Nijhuis-van der Sanden, Maria W G

2014-07-01

The Mastication Observation and Evaluation (MOE) instrument was developed to allow objective assessment of a child's mastication process. It contains 14 items and was developed over three Delphi rounds. The present study concerns the further development of the MOE using the COSMIN (Consensus based Standard for the Selection of Measurement Instruments) and investigated the instrument's internal consistency, inter-observer reliability, construct validity and floor and ceiling effects. Consumption of three bites of bread and biscuit was evaluated using the MOE. Data of 59 healthy children (6-48 mths) and 38 children (bread) and 37 children (biscuit) with cerebral palsy (24-72 mths) were used. Four items were excluded before analysis due to zero variance. Principal Components Analysis showed one factor with 8 items. Internal consistency was >0.70 (Cronbach's alpha) for both food consistencies and for both groups of children. Inter-observer reliability varied from 0.51 to 0.98 (weighted Gwet's agreement coefficient). The total MOE scores for both groups showed normal distribution for the population. There were no floor or ceiling effects. The revised MOE now contains 8 items that (a) have a consistent concept for mastication and can be scored on a 4-point scale with sufficient reliability and (b) are sensitive to stages of chewing development in young children. The removed items are retained as part of a criterion referenced list within the MOE. Copyright © 2014 Elsevier Ltd. All rights reserved.
An ultrasound score for knee osteoarthritis

DEFF Research Database (Denmark)

Riecke, B F; Christensen, R.; Torp-Pedersen, S

2014-01-01

OBJECTIVE: To develop standardized musculoskeletal ultrasound (MUS) procedures and scoring for detecting knee osteoarthritis (OA) and test the MUS score's ability to discern various degrees of knee OA, in comparison with plain radiography and the 'Knee injury and Osteoarthritis Outcome Score' (KOOS......) domains as comparators. METHOD: A cross-sectional study of MUS examinations in 45 patients with knee OA. Validity, reliability, and reproducibility were evaluated. RESULTS: MUS examination for knee OA consists of five separate domains assessing (1) predominantly morphological changes in the medial...... coefficients ranging from 0.75 to 0.97 for the five domains. Construct validity was confirmed with statistically significant correlation coefficients (0.47-0.81, P knee OA. In comparison with standing radiographs...
Adaptation and Assessment of Reliability and Validity of the Greek Version of the Ohkuma Questionnaire for Dysphagia Screening

Science.gov (United States)

Papadopoulou, Soultana L.; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam

2016-01-01

Introduction: The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective: The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods: Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way Anova test. Results: According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way Anova test in the two measurement times showed statistically significant correlation in both measurements (p = 0.02 and p = 0.016). Conclusion: The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients. PMID:28050209
Reliability and validity of a Swedish language version of the Resilience Scale.

Science.gov (United States)

Nygren, Björn; Randström, Kerstin Björkman; Lejonklou, Anna K; Lundman, Beril

2004-01-01

The purpose of this study was to test the reliability and validity of the Swedish language version of the Resilience Scale (RS). Participants were 142 adults between 19-85 years of age. Internal consistency reliability, stability over time, and construct validity were evaluated using Cronbach's alpha, principal components analysis with varimax rotation and correlations with scores on the Sense of Coherence Scale (SOC) and the Rosenberg Self-Esteem Scale (RSE). The mean score on the RS was 142 (SD = 15). The possible scores on the RS range from 25 to 175, and scores higher than 146 are considered high. The test-retest correlation was .78. Correlations with the SOC and the RSE were .41 (p Self and Life emerged as components from the principal components analysis. These findings provide evidence for the reliability and validity of the Swedish language version of the RS.
USING A TOTAL QUALITY STRATEGY IN A NEW PRACTICAL APPROACH FOR IMPROVING THE PRODUCT RELIABILITY IN AUTOMOTIVE INDUSTRY

Directory of Open Access Journals (Sweden)

Cristiano Fragassa

2014-09-01

In this paper a Total Quality Management strategy is proposed, refined and used with the aim of improving the quality of large-mass industrial products far beyond the technical specifications demanded at the end-customer level. This approach combines standard and non-standard tools used for Reliability, Availability and Maintainability analysis. The procedure also establishes a stricter correlation between theoretical evaluation methods and experimental evidence as part of a modern integrated method for strengthening quality in design and process. A commercial intake manifold, widely available on the market, is used as a test case for the validation of the methodology. As an additional general result, the research underlines the impact of Total Quality Management and its tools on the development of innovation.
The Berg Balance Scale has high intra- and inter-rater reliability but absolute reliability varies across the scale: a systematic review.

Science.gov (United States)

Downs, Stephen; Marquez, Jodie; Chiarelli, Pauline

2013-06-01

What is the intra-rater and inter-rater relative reliability of the Berg Balance Scale? What is the absolute reliability of the Berg Balance Scale? Does the absolute reliability of the Berg Balance Scale vary across the scale? Systematic review with meta-analysis of reliability studies. Any clinical population that has undergone assessment with the Berg Balance Scale. Relative intra-rater reliability, relative inter-rater reliability, and absolute reliability. Eleven studies involving 668 participants were included in the review. The relative intrarater reliability of the Berg Balance Scale was high, with a pooled estimate of 0.98 (95% CI 0.97 to 0.99). Relative inter-rater reliability was also high, with a pooled estimate of 0.97 (95% CI 0.96 to 0.98). A ceiling effect of the Berg Balance Scale was evident for some participants. In the analysis of absolute reliability, all of the relevant studies had an average score of 20 or above on the 0 to 56 point Berg Balance Scale. The absolute reliability across this part of the scale, as measured by the minimal detectable change with 95% confidence, varied between 2.8 points and 6.6 points. The Berg Balance Scale has a higher absolute reliability when close to 56 points due to the ceiling effect. We identified no data that estimated the absolute reliability of the Berg Balance Scale among participants with a mean score below 20 out of 56. The Berg Balance Scale has acceptable reliability, although it might not detect modest, clinically important changes in balance in individual subjects. The review was only able to comment on the absolute reliability of the Berg Balance Scale among people with moderately poor to normal balance. Copyright © 2013 Australian Physiotherapy Association. Published by .. All rights reserved.
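The review above reports absolute reliability as the minimal detectable change with 95% confidence (MDC95). As a hedged illustration, the conventional route from a relative reliability coefficient to MDC95 goes through the standard error of measurement (SEM). The abstract does not say which formula the included studies used, and the standard deviation in the worked line is a hypothetical value chosen only to make the arithmetic easy to follow.

```latex
% Conventional definitions (assumed; not stated in the abstract):
%   SD  = between-subject standard deviation of the scores
%   ICC = relative reliability coefficient
\mathrm{SEM} = \mathrm{SD}\,\sqrt{1-\mathrm{ICC}}, \qquad
\mathrm{MDC}_{95} = 1.96\,\sqrt{2}\,\mathrm{SEM}

% Worked example with a hypothetical SD = 5 points and ICC = 0.98:
\mathrm{SEM} = 5\sqrt{0.02} \approx 0.71, \qquad
\mathrm{MDC}_{95} = 1.96 \times \sqrt{2} \times 0.71 \approx 1.96 \text{ points}
```

Under these definitions MDC95 scales with the sample's SD, so a high relative reliability does not by itself guarantee a small MDC95 when scores vary widely between subjects.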
MODIFIED ALVARADO SCORING AS A DIAGNOSTIC TOOL IN ACUTE APPENDICITIS- A PROSPECTIVE STUDY

Directory of Open Access Journals (Sweden)

V. K. Arun Kumar

2017-02-01

BACKGROUND Acute appendicitis is among the commonest community-acquired intra-abdominal infections. Acute appendicitis and its associated complications are a significant source of morbidity and sometimes mortality. The Modified Alvarado Scoring System (MASS) has been reported to be a cheap and quick diagnostic tool in patients with acute appendicitis. Diagnostic accuracy has been observed when the scores were applied to various populations and clinical settings. The purpose of this study was to evaluate the diagnostic value of the Modified Alvarado Scoring System in patients with acute appendicitis in our setting. The aim of the study is to evaluate the efficacy of the modified Alvarado score as a diagnostic tool in acute appendicitis, as the diagnosis of appendicitis depends on the onset of symptoms and the subjective interpretation of the physical examination. MATERIALS AND METHODS This was a prospective study carried out in Pondicherry Institute of Medical Science during the period of November 2013 to May 2015. This study was done on 50 patients diagnosed with acute appendicitis and admitted in General Surgery. RESULTS In this study, there were a total of 50 patients who were taken up for surgery based on clinical and radiological diagnosis. In our study, the modified Alvarado score applied to adult patients with acute appendicitis had a sensitivity of only 60% and a specificity of only 40%, showing that it was not efficient in diagnosing acute appendicitis. The positive predictive value in our study was 80%, which is marginally lower than the 87.5% reported in the literature. The negative appendicectomy rate in this study was 12%. CONCLUSION Alvarado score is a non-invasive, safe diagnostic procedure, which is simple, fast, reliable and repeatable; it can be used in all conditions, without expensive and complicated supportive diagnostic methods. Alvarado score increases the diagnostic certainty of clinical examination in diagnosis of
Validation of the FOUR Score (Spanish Version) in acute stroke: an interobserver variability study.

Science.gov (United States)

Idrovo, Luis; Fuentes, Blanca; Medina, Josmarlin; Gabaldón, Laura; Ruiz-Ares, Gerardo; Abenza, María José; Aguilar-Amat, María José; Martínez-Sánchez, Patricia; Rodríguez, Luis; Cazorla, Rubén; Martínez, Marta; Tafur, Alfonso; Wijdicks, Eelco F M; Diez-Tejedor, Exuperio

2010-01-01

Methods to assess impaired consciousness in acute stroke typically include the Glasgow Coma Scale (GCS), but the verbal component has limitations in aphasic or intubated patients. The FOUR (Full Outline of UnResponsiveness) score, a new coma scale, evaluates 4 components: eye and motor responses, brainstem reflexes and respiration. We aimed to study the interobserver variability of the FOUR score in acute stroke patients. We prospectively enrolled consecutive patients with acute stroke admitted from February to July 2008 to the stroke unit of our Neurology Department. Patients were evaluated by neurology residents and nurses using the FOUR score and the GCS. For both scales, we obtained paired and total weighted kappa values (Kw) and intraclass correlation coefficients (ICC). NIH stroke scale was also recorded on admission. We obtained a total of 75 paired evaluations in 60 patients (41 cerebral infarctions, 15 cerebral hemorrhages and 4 transient ischemic attacks). Thirty-three (55%) patients were alert, 17 (28.3%) drowsy and 10 (16.7%) stuporous or comatose. The overall rater agreement was excellent in the FOUR score (Kw 0.93; 95% CI 0.89-0.97) with an ICC of 0.94 (95% CI 0.91-0.96) and in the GCS (Kw 0.96; 95% CI 0.94-0.98) with an ICC of 0.96 (95% CI 0.93-0.97). A good correlation was found between the FOUR score and the GCS (rho 0.83; p FOUR score and the NIH stroke scale (rho -0.78; p FOUR score is a reliable scale for evaluating the level of consciousness in acute stroke patients, showing a good correlation with the GCS and the NIH stroke scale. Copyright 2010 S. Karger AG, Basel.
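The study above summarizes rater agreement on ordinal coma scales with weighted kappa (Kw). As a minimal sketch of how a weighted kappa can be computed for two raters, the function below uses disagreement weights |i - j| / (k - 1) raised to a power (1 for linear, 2 for quadratic). The abstract does not state which weighting scheme was used, and the function name, parameters and sample ratings are all hypothetical.

```python
from collections import Counter


def weighted_kappa(rater1, rater2, categories, power=1):
    """Weighted kappa for two raters scoring the same subjects.

    categories: ordered list of all possible scores.
    power: 1 for linear weights, 2 for quadratic weights (an assumption;
    choose the scheme your protocol specifies).
    """
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must score the same non-empty set of subjects")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)
    observed = Counter((index[a], index[b]) for a, b in zip(rater1, rater2))
    marg1 = Counter(index[a] for a in rater1)
    marg2 = Counter(index[b] for b in rater2)
    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            obs_disagreement += weight * observed[(i, j)] / n
            # Expected proportion under independence of the two raters.
            exp_disagreement += weight * (marg1[i] / n) * (marg2[j] / n)
    if exp_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - obs_disagreement / exp_disagreement


# Hypothetical ratings on a 0-4 ordinal scale (not data from the study).
nurse = [4, 4, 3, 2, 0, 1, 4, 3]
resident = [4, 3, 3, 2, 0, 2, 4, 3]

if __name__ == "__main__":
    print(f"Kw = {weighted_kappa(nurse, resident, categories=[0, 1, 2, 3, 4]):.2f}")
```

With weights of this form, a disagreement of one scale point is penalized less than a disagreement of several points, which is why the weighted statistic is preferred over unweighted kappa for ordinal scores.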
Measuring illness beliefs in patients with lower extremity injuries: reliability and validity of the Dutch version of the Somatic Pre-Occupation and Coping questionnaire (SPOC-NL).

Science.gov (United States)

Reininga, Inge H F; Brouwer, Sandra; Dijkstra, Anita; Busse, Jason W; Ebrahim, Shanil; Wendt, Klaus W; El Moumni, Mostafa

2015-02-01

Positive coping strategies, illness perceptions and recovery expectations are associated with better clinical outcomes and earlier return to work after injuries. The Somatic Pre-Occupation and Coping (SPOC) questionnaire captures illness beliefs and coping towards recovery of physical function and return to work after surgical treatment of tibial shaft fractures. The aim of this study was to translate and culturally adapt the SPOC into Dutch (SPOC-NL) and evaluate its reliability and validity in patients with lower extremity injuries. The SPOC-NL contains four subscales: Somatic complaints, Coping, Energy, and Optimism. Patients treated for lower extremity injuries (N=106) completed the SPOC-NL, Short Form-36 and Short Musculoskeletal Function Assessment (SMFA-NL) questionnaire, and reported their current work status and self-perceived work ability. To assess test-retest reliability, 56 patients completed the SPOC-NL for a second time two weeks after the first administration of the SPOC-NL. We calculated Cronbach's Alpha, intraclass correlation coefficients (ICCs) and G coefficients to measure internal consistency and overall reliability, and used the Bland and Altman method to assess bias between test and retest SPOC-NL scores. To determine construct validity, we explored 16 a priori hypotheses regarding correlations between SPOC-NL scores and subscale scores and SF-36, SMFA-NL, work status and work ability. Internal consistency was good to excellent, with Cronbach's Alpha values ranging between 0.79 and 0.94 and G coefficients ranging between 0.77 and 0.95. Test-retest reliability was also good, since high ICCs (0.72-0.91) and G coefficients (0.82-0.94) were found. Construct validity of the SPOC-NL was good, as 75% of the predefined hypotheses were confirmed. Compared to participants who were on sick leave or receiving disability benefits, participants with a paid job had significantly higher scores on the total score and the subscales Somatic complaints and
Reliability and validity of internalized stigmatization scale in psoriasis

Directory of Open Access Journals (Sweden)

Erkan Alpsoy

2015-03-01

Background and design. Internalized stigma involves endorsing negative feelings and beliefs such as insignificance, shame and withdrawal triggered by applying these negative stereotypes to oneself. The Internalized Stigma Scale has not been applied to psoriasis patients. We aimed to evaluate the reliability and validity of the Internalized Stigma Scale in psoriasis patients. Materials and Methods. 100 consecutive, volunteer psoriasis patients (48 female, 52 male; aged 40.59±15.44 years) were enrolled in the study. PASI and BSA were evaluated by a physician (A.B.). Patients responded contemporaneously to the Psoriasis Internalized Stigma Scale (PISS), DQoL, and Perceived Health Status (PHS, a single-item self-rated general health question), for which Likert scores 1, 2, and 3 were classified as “from fair to very poor”, and 4 and 5 as “good”. Results. Cronbach's alpha coefficient of the PISS subscales was 0.83 for alienation, 0.70 for stereotype endorsement, 0.70 for perceived discrimination, 0.84 for social withdrawal and 0.68 for stigma resistance. The same value was 0.89 for the total scale. Mean PISS and DQoL scores were 58.8±12.6 and 10.0±9.4, respectively. PISS was significantly correlated with the patients' DQoL scores (r=0.726, p=0.001). PISS was also significantly correlated with disease duration (r=0.209, p=0.047). There was no significant relationship between PASI or BSA and PISS. Mean DQoL scores in patients reporting their PHS as “from fair to very poor” and “good” were 12.1±7.3 and 5.0±4.3, respectively. The mean PISS value in patients reporting their PHS as “from fair to very poor” was significantly higher than in patients reporting their PHS as “good” (p=0.001). Conclusion. PISS can be used as a reliable and valid tool in assessing internalized stigmatization in psoriasis patients. Our results indicate a high level of stigmatization in psoriasis patients. Low DQoL scores show a correlation with increased levels of
Reliability and validity of 12-item Short-Form health survey (SF-12) for the health status of Chinese community elderly population in Xujiahui district of Shanghai.

Science.gov (United States)

Shou, Juan; Ren, Limin; Wang, Haitang; Yan, Fei; Cao, Xiaoyun; Wang, Hui; Wang, Zhiliang; Zhu, Shanzhu; Liu, Yao

2016-04-01

The 12-item Short-Form Health Survey (SF-12) is the abridged practical version of SF-36. This cross-sectional study aimed to assess the reliability and validity of SF-12 for the health status of Chinese community elderly population. The Chinese community elderly people in Xujiahui district of Shanghai were investigated. The internal consistency reliability was assessed using Cronbach's alpha and split-half reliability coefficients. Construct validity was analyzed using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Spearman's correlation coefficient (ρ) was used for the evaluation of criterion, convergent, and discriminant validity with Spearman's ρ ≥ 0.4 as satisfactory. Comparisons of the SF-12 summary scores among populations that differed in demographics were performed for discriminant validity. Total 1343 individuals aged ≥60 and reliability coefficient (0.812) reflected satisfactory internal consistency reliability of SF-12. EFA extracted a two-factor model (physical and mental health). About 60.7 % of the total variance was explained by the two factors. CFA showed that the two-factor solution provided a good fit to the data. Good convergent validity and discriminant validity of SF-12 were proved by the correlation analyses (Spearman's ρ > 0.4) and the comparisons of the SF-12 summary scores among populations (P 0.4, P reliability and validity in measuring health status of Chinese community elderly population in Xujiahui district of Shanghai.
Reliability And Validity Of Turkish Version Of Motor Activity Log-28

Directory of Open Access Journals (Sweden)

Burcu Ersöz Hüseyinsinoğlu

2011-06-01

OBJECTIVE: The aim of this study was to adapt the Motor Activity Log-28 (MAL-28) into Turkish and probe the reliability and validity of this questionnaire in stroke patients. METHODS: Following the translation of the MAL-28 into Turkish, its reliability and construct validity were examined in 30 stroke patients. For the reliability study, patients were interviewed twice within a three-day period, during which no rehabilitative activities were undertaken. The test-retest reliability was determined by using the intra-class correlation coefficient (ICC) and Spearman correlation coefficient (r); internal consistency was determined by Cronbach's alpha (α). The construct validity was examined by comparing the MAL-28 Quality Of Movement (QOM) scale and Amount Of Use (AOU) scale with Wolf Motor Function Test (WMFT) Performance Time (PT) and Functional Ability (FA) scores. Furthermore, item-to-scale correlations of the AOU and QOM scales were determined and the correlation between total scores of the two scales was examined. RESULTS: The Turkish version of the MAL-28 AOU and QOM scales was reliable (ICC scores were 0.97 and 0.96, respectively) and internally consistent (Cronbach’s α value was 0.96 for both scales). Test-retest reliability was supported (AOU, r=0.94; QOM, r=0.93). WMFT FA scores were correlated with both scales (r=0.63). Correlations between WMFT PT and the AOU and QOM scales were -0.56 and -0.55. The AOU and QOM scales were highly correlated (r=0.95). CONCLUSION: The findings indicate that the Turkish version of MAL-28 is reliable and valid in individuals with stroke. Further investigation of its responsiveness is needed before using that version as a primary measurement in clinical trials
First quality score for referral letters in gastroenterology—a validation study

Science.gov (United States)

Eskeland, Sigrun Losada; Brunborg, Cathrine; Seip, Birgitte; Wiencke, Kristine; Hovde, Øistein; Owen, Tanja; Skogestad, Erik; Huppertz-Hauss, Gert; Halvorsen, Fred-Arne; Garborg, Kjetil; Aabakken, Lars; de Lange, Thomas

2016-01-01

Objective: To create and validate an objective and reliable score to assess referral quality in gastroenterology. Design: An observational multicentre study. Setting and participants: 25 gastroenterologists participated in selecting variables for a Thirty Point Score (TPS) for quality assessment of referrals to gastroenterology specialist healthcare for 9 common indications. From May to September 2014, 7 hospitals from the South-Eastern Norway Regional Health Authority participated in collecting and scoring 327 referrals to a gastroenterologist. Main outcome measure: Correlation between the TPS and a visual analogue scale (VAS) for referral quality. Results: The 327 referrals had an average TPS of 13.2 (range 1–25) and an average VAS of 4.7 (range 0.2–9.5). The reliability of the score was excellent, with an intra-rater intraclass correlation coefficient (ICC) of 0.87 and inter-rater ICC of 0.91. The overall correlation between the TPS and the VAS was moderate (r=0.42), and ranged from fair to substantial for the various indications. Mean agreement was good (ICC=0.47, 95% CI (0.34 to 0.57)), ranging from poor to good. Conclusions: The TPS is reliable, objective and shows good agreement with the subjective VAS. The score may be a useful tool for assessing referral quality in gastroenterology, particularly important when evaluating the effect of interventions to improve referral quality. PMID:27855107
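Several records in this listing report intra-rater or inter-rater reliability as an intraclass correlation coefficient without saying which ICC form was used. As a minimal sketch, assuming a complete subjects-by-raters table and the single-measure two-way formulas (absolute agreement and consistency), the computation could look like the following. The function name and the ratings table are illustrative only and are not taken from any study listed here.

```python
from statistics import mean


def icc_two_way(ratings):
    """Single-measure two-way ICCs for a complete subjects x raters table.

    ratings: list of rows, one row per subject, one column per rater.
    Returns (absolute_agreement, consistency).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")

    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Absolute agreement penalises systematic differences between raters.
    agreement = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    # Consistency ignores a constant offset between raters.
    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    return agreement, consistency


# Illustrative ratings: 6 subjects scored by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
agreement, consistency = icc_two_way(example)
print(round(agreement, 2), round(consistency, 2))
```

In this illustrative table the raters rank subjects similarly but differ in overall level, so the consistency coefficient comes out well above the absolute-agreement coefficient. This is one reason the ICC form matters when comparing values across studies.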
Validity and reliability of the Spanish-language version of the self-administered Leeds Assessment of Neuropathic Symptoms and Signs (S-LANSS) pain scale.

Science.gov (United States)

López-de-Uralde-Villanueva, I; Gil-Martínez, A; Candelas-Fernández, P; de Andrés-Ares, J; Beltrán-Alacreu, H; La Touche, R

2016-12-08

The self-administered Leeds Assessment of Neuropathic Symptoms and Signs (S-LANSS) scale is a tool designed to identify patients with pain with neuropathic features. To assess the validity and reliability of the Spanish-language version of the S-LANSS scale. Our study included a total of 182 patients with chronic pain to assess the convergent and discriminant validity of the S-LANSS; the sample was increased to 321 patients to evaluate construct validity and reliability. The validated Spanish-language version of the ID-Pain questionnaire was used as the criterion variable. All participants completed the ID-Pain, the S-LANSS, and the Numerical Rating Scale for pain. Discriminant validity was evaluated by analysing sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). Construct validity was assessed with factor analysis and by comparing the odds ratio of each S-LANSS item to the total score. Convergent validity and reliability were evaluated with Pearson's r and Cronbach's alpha, respectively. The optimal cut-off point for S-LANSS was ≥12 points (AUC=.89; sensitivity=88.7; specificity=76.6). Factor analysis yielded one factor; furthermore, all items contributed significantly to the positive total score on the S-LANSS (P<.05). The S-LANSS showed a significant correlation with ID-Pain (r=.734, α=.71). The Spanish-language version of the S-LANSS is valid and reliable for identifying patients with chronic pain with neuropathic features. Copyright © 2016 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
A generic method for assignment of reliability scores applied to solvent accessibility predictions

DEFF Research Database (Denmark)

Petersen, Bent; Petersen, Thomas Nordahl; Andersen, Pernille

2009-01-01

: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best public available method, Real-SPINE. Both methods associate a reliability...... comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0.79 and 0.74 are obtained using our and the compared method, respectively. This tendency is true for any selected subset....
Gait Deviation Index, Gait Profile Score and Gait Variable Score in children with spastic cerebral palsy

DEFF Research Database (Denmark)

Rasmussen, Helle Mätzke; Nielsen, Dennis Brandborg; Pedersen, Niels Wisbech

2015-01-01

Abstract The Gait Deviation Index (GDI) and Gait Profile Score (GPS) are the most used summary measures of gait in children with cerebral palsy (CP). However, the reliability and agreement of these indices have not been investigated, limiting their clinimetric quality for research and clinical...... to good reliability with ICCs of 0.4–0.7. The agreement for the GDI and the logarithmically transformed GPS, in terms of the standard error of measurement as a percentage of the grand mean (SEM%) varied from 4.1 to 6.7%, whilst the smallest detectable change in percent (SDC%) ranged from 11.3 to 18...
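The truncated abstract above reports agreement both as a standard error of measurement (SEM%) and as a smallest detectable change (SDC%), but does not state how one was derived from the other. A common convention, assumed here rather than taken from the record, is SDC = 1.96 × √2 × SEM. A minimal sketch:

```python
import math


def smallest_detectable_change(sem, z=1.96):
    """SDC under the assumed convention SDC = z * sqrt(2) * SEM.

    sem: standard error of measurement, in any unit (or in percent).
    z:   normal quantile for the chosen confidence level (1.96 for 95%).
    The result is in the same unit as sem.
    """
    if sem < 0:
        raise ValueError("SEM must be non-negative")
    return z * math.sqrt(2) * sem


# Applying the assumed formula to the SEM% range quoted in the abstract.
for sem_pct in (4.1, 6.7):
    print(sem_pct, round(smallest_detectable_change(sem_pct), 1))
```

Under this assumption the quoted SEM% values of 4.1 and 6.7 map to roughly 11.4 and 18.6, close to the SDC% range given in the abstract. The small difference may come from rounding of the reported SEM%.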
Reliability and validity of the visual analogue scale for disability in patients with chronic musculoskeletal pain.

Science.gov (United States)

Boonstra, Anne M; Schiphorst Preuper, Henrica R; Reneman, Michiel F; Posthumus, Jitze B; Stewart, Roy E

2008-06-01

To determine the reliability and concurrent validity of a visual analogue scale (VAS) for disability as a single-item instrument measuring disability in chronic pain patients was the objective of the study. For the reliability study a test-retest design and for the validity study a cross-sectional design was used. A general rehabilitation centre and a university rehabilitation centre was the setting for the study. The study population consisted of patients over 18 years of age, suffering from chronic musculoskeletal pain; 52 patients in the reliability study, 344 patients in the validity study. Main outcome measures were as follows. Reliability study: Spearman's correlation coefficients (rho values) of the test and retest data of the VAS for disability; validity study: rho values of the VAS disability scores with the scores on four domains of the Short-Form Health Survey (SF-36) and VAS pain scores, and with Roland-Morris Disability Questionnaire scores in chronic low back pain patients. Results were as follows: in the reliability study rho values varied from 0.60 to 0.77; and in the validity study rho values of VAS disability scores with SF-36 domain scores varied from 0.16 to 0.51, with Roland-Morris Disability Questionnaire scores from 0.38 to 0.43 and with VAS pain scores from 0.76 to 0.84. The conclusion of the study was that the reliability of the VAS for disability is moderate to good. Because of a weak correlation with other disability instruments and a strong correlation with the VAS for pain, however, its validity is questionable.
The Assumption of a Reliable Instrument and Other Pitfalls to Avoid When Considering the Reliability of Data

Science.gov (United States)

Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K.

2012-01-01

The purpose of this article is to help researchers avoid common pitfalls associated with reliability including incorrectly assuming that (a) measurement error always attenuates observed score correlations, (b) different sources of measurement error originate from the same source, and (c) reliability is a function of instrumentation. To accomplish our purpose, we first describe what reliability is and why researchers should care about it with focus on its impact on effect sizes. Second, we review how reliability is assessed with comment on the consequences of cumulative measurement error. Third, we consider how researchers can use reliability generalization as a prescriptive method when designing their research studies to form hypotheses about whether or not reliability estimates will be acceptable given their sample and testing conditions. Finally, we discuss options that researchers may consider when faced with analyzing unreliable data. PMID:22518107

Validity and cross-cultural adaptation of the Persian version of the Oxford Elbow Score.

Science.gov (United States)

Ebrahimzadeh, Mohammad H; Kachooei, Amir Reza; Vahedi, Ehsan; Moradi, Ali; Mashayekhi, Zeinab; Hallaj-Moghaddam, Mohammad; Azami, Mehran; Birjandinejad, Ali

2014-01-01

Oxford Elbow Score (OES) is a patient-reported questionnaire used to assess outcomes after elbow surgery. The aim of this study was to validate and adapt the OES into Persian language. After forward-backward translation of the OES into Persian, a total number of 92 patients after elbow surgeries completed the Persian OES along with the Persian DASH and SF-36. To assess test-retest reliability, 31 randomly selected patients (34%) completed the Persian OES again after three days while abstaining from all forms of therapeutic regimens. Reliability of the Persian OES was assessed by measuring intraclass correlation coefficient (ICC) for test-retest reliability and Cronbach's alpha for internal consistency. Spearman's correlation coefficient was used to test the construct validity. Cronbach's alpha coefficient was 0.92 showing excellent reliability. Cronbach's alpha for function, pain, and social-psychological subscales was 0.95, 0.86, and 0.85, respectively. Intraclass correlation coefficient (ICC) was 0.85 for the overall questionnaire and 0.90, 0.76, and 0.75 for function, pain, and social-psychological subscales, respectively. Construct validity was confirmed as the Spearman correlation between OES and DASH was 0.80. Persian OES is a valid and reliable patient-reported outcome measure to assess postsurgical elbow status in Persian speaking population.
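Most of the questionnaire studies in this listing summarise internal consistency with Cronbach's alpha. A minimal sketch of the usual formula, alpha = k/(k−1) × (1 − Σ item variances / variance of total score), is shown below. The function name and the response data are hypothetical and are not drawn from any of the studies.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents x items table of scores.

    responses: list of rows, one per respondent, each with k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")

    items = list(zip(*responses))                      # one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers from 4 respondents to a 2-item scale.
answers = [[2, 3], [4, 4], [3, 5], [5, 5]]
print(round(cronbach_alpha(answers), 3))
```

Alpha rises both with stronger inter-item correlation and with the number of items, so subscale values and total-scale values are not directly comparable.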
RIPASA score: a new diagnostic score for diagnosis of acute appendicitis

International Nuclear Information System (INIS)

Butt, M.Q.

2014-01-01

Objective: To determine the usefulness of RIPASA score for the diagnosis of acute appendicitis using histopathology as a gold standard. Study Design: Cross-sectional study. Place and Duration of Study: Department of General Surgery, Combined Military Hospital, Kohat, from September 2011 to March 2012. Methodology: A total of 267 patients were included in this study. RIPASA score was assessed. The diagnosis of appendicitis was made clinically aided by routine sonography of abdomen. After appendicectomies, resected appendices were sent for histopathological examination. The 15 parameters and the scores generated were age (less than 40 years = 1 point; greater than 40 years = 0.5 point), gender (male = 1 point; female = 0.5 point), Right Iliac Fossa (RIF) pain (0.5 point), migration of pain to RIF (0.5 point), nausea and vomiting (1 point), anorexia (1 point), duration of symptoms (less than 48 hours = 1 point; more than 48 hours = 0.5 point), RIF tenderness (1 point), guarding (2 points), rebound tenderness (1 point), Rovsing's sign (2 points), fever (1 point), raised white cell count (1 point), negative urinalysis (1 point) and foreign national registration identity card (1 point). The optimal cut-off threshold score from the ROC was 7.5. Sensitivity analysis was done. Results: Out of 267 patients, 156 (58.4%) were male while the remaining 111 patients (41.6%) were female, with a mean age of 23.5 ± 9.1 years. Sensitivity of RIPASA score was 96.7%, specificity 93.0%, diagnostic accuracy was 95.1%, positive predictive value was 94.8% and negative predictive value was 95.54%. Conclusion: RIPASA score at a cut-off total score of 7.5 was a useful tool to diagnose appendicitis in equivocal cases of pain. (author)
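The sensitivity, specificity, predictive values and accuracy reported above all derive from a 2×2 table of test result against the gold standard. The abstract does not give the underlying counts, so the sketch below uses hypothetical counts purely to show the arithmetic. The function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among the diseased
        "specificity": tn / (tn + fp),   # true negatives among the healthy
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts, NOT the study's data.
metrics = diagnostic_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the disease is in the sample studied.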
Can an arthroplasty risk score predict bundled care events after total joint arthroplasty?

Directory of Open Access Journals (Sweden)

Blair S. Ashley, MD

2018-03-01

Background: The validated Arthroplasty Risk Score (ARS) predicts the need for postoperative triage to an intensive care setting. We hypothesized that the ARS may also predict hospital length of stay (LOS), discharge disposition, and episode-of-care cost (EOCC). Methods: We retrospectively reviewed a series of 704 patients undergoing primary total hip and knee arthroplasty over 17 months. Patient characteristics, 90-day EOCC, LOS, and readmission rates were compared before and after ARS implementation. Results: ARS implementation was associated with fewer patients going to a skilled nursing or rehabilitation facility after discharge (63% vs 74%, P = .002). There was no difference in LOS, EOCC, readmission rates, or complications. While the adoption of the ARS did not change the mean EOCC, ARS >3 was predictive of high EOCC outlier (odds ratio 2.65, 95% confidence interval 1.40-5.01, P = .003). Increased ARS correlated with increased EOCC (P = .003). Conclusions: Implementation of the ARS was associated with increased disposition to home. It was predictive of high EOCC and should be considered in risk adjustment variables in alternative payment models. Keywords: Bundled payments, Risk stratification, Arthroplasty
Further Simplification of the Simple Erosion Narrowing Score With Item Response Theory Methodology.

Science.gov (United States)

Oude Voshaar, Martijn A H; Schenk, Olga; Ten Klooster, Peter M; Vonkeman, Harald E; Bernelot Moens, Hein J; Boers, Maarten; van de Laar, Mart A F J

2016-08-01

To further simplify the simple erosion narrowing score (SENS) by removing scored areas that contribute the least to its measurement precision according to analysis based on item response theory (IRT) and to compare the measurement performance of the simplified version to the original. Baseline and 18-month data of the Combinatietherapie Bij Reumatoide Artritis (COBRA) trial were modeled using longitudinal IRT methodology. Measurement precision was evaluated across different levels of structural damage. SENS was further simplified by omitting the least reliably scored areas. Discriminant validity of SENS and its simplification were studied by comparing their ability to differentiate between the COBRA and sulfasalazine arms. Responsiveness was studied by comparing standardized change scores between versions. SENS data showed good fit to the IRT model. Carpal and feet joints contributed the least statistical information to both erosion and joint space narrowing scores. Omitting the joints of the foot reduced measurement precision for the erosion score in cases with below-average levels of structural damage (relative efficiency compared with the original version ranged 35-59%). Omitting the carpal joints had minimal effect on precision (relative efficiency range 77-88%). Responsiveness of a simplified SENS without carpal joints closely approximated the original version (i.e., all Δ standardized change scores were ≤0.06). Discriminant validity was also similar between versions for both the erosion score (relative efficiency = 97%) and the SENS total score (relative efficiency = 84%). Our results show that the carpal joints may be omitted from the SENS without notable repercussion for its measurement performance. © 2016, American College of Rheumatology.
Reliability and Validity of a Nepalese Version of the Oral Health Impact Profile for Edentulous Subjects.

Science.gov (United States)

Shrestha, Bidhan; Niraula, Surya Raj; Parajuli, Prakash K; Suwal, Pramita; Singh, Raj Kumar

2018-06-01

To assess the reliability and to validate the translated Nepalese version of the Oral Health Impact Profile (OHIP-EDENT-N) in Nepalese edentulous subjects. The international guidelines for translation and cross-cultural adaptation of OHIP-EDENT were followed, and a Nepalese version of the questionnaire was adapted for this study. Eighty-eight completely edentulous subjects were then selected for the study and completed the questionnaire. The reliability of the OHIP-EDENT-N was evaluated using internal consistency. Validity was assessed as construct and convergent validity. Construct validity was determined using exploratory factor analysis (EFA). The correlation between OHIP-EDENT-N subscale scores and the global question was investigated to test the convergent validity. Cronbach's alpha for the total score of OHIP-EDENT-N was 0.78. Construct validity was assessed by factor analysis: 70.196% of the variance was accounted for by five factors extracted from the factor analysis. Factor loadings above 0.40 were noted for all items. In terms of convergent validity, significant correlations could be established between OHIP-EDENT-N and global questions. This study has been able to establish the reliability and validity of the OHIP-EDENT-N, and OHIP-EDENT-N can be considered a reliable tool to assess the oral health related quality of life in the Nepalese edentulous population. © 2016 by the American College of Prosthodontists.
Training less-experienced faculty improves reliability of skills assessment in cardiac surgery.

Science.gov (United States)

Lou, Xiaoying; Lee, Richard; Feins, Richard H; Enter, Daniel; Hicks, George L; Verrier, Edward D; Fann, James I

2014-12-01

Previous work has demonstrated high inter-rater reliability in the objective assessment of simulated anastomoses among experienced educators. We evaluated the inter-rater reliability of less-experienced educators and the impact of focused training with a video-embedded coronary anastomosis assessment tool. Nine less-experienced cardiothoracic surgery faculty members from different institutions evaluated 2 videos of simulated coronary anastomoses (1 by a medical student and 1 by a resident) at the Thoracic Surgery Directors Association Boot Camp. They then underwent a 30-minute training session using an assessment tool with embedded videos to anchor rating scores for 10 components of coronary artery anastomosis. Afterward, they evaluated 2 videos of a different student and resident performing the task. Components were scored on a 1 to 5 Likert scale, yielding an average composite score. Inter-rater reliabilities of component and composite scores were assessed using intraclass correlation coefficients (ICCs) and overall pass/fail ratings with kappa. All components of the assessment tool exhibited improvement in reliability, with 4 (bite, needle holder use, needle angles, and hand mechanics) improving the most from poor (ICC range, 0.09-0.48) to strong (ICC range, 0.80-0.90) agreement. After training, inter-rater reliabilities for composite scores improved from moderate (ICC, 0.76) to strong (ICC, 0.90) agreement, and for overall pass/fail ratings, from poor (kappa = 0.20) to moderate (kappa = 0.78) agreement. Focused, video-based anchor training facilitates greater inter-rater reliability in the objective assessment of simulated coronary anastomoses. Among raters with less teaching experience, such training may be needed before objective evaluation of technical skills. Published by Elsevier Inc.
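The pass/fail agreement in the abstract above is reported as kappa, a chance-corrected agreement statistic. The study involved nine raters and does not say which kappa variant was used. As a simplified sketch, assuming only two raters, Cohen's kappa can be computed as below. The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected by chance from each rater's label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(rater_a)

    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in labels) / n**2

    if p_expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical pass/fail ratings of four videos by two raters.
first = ["pass", "pass", "fail", "fail"]
second = ["pass", "fail", "fail", "fail"]
print(cohens_kappa(first, second))
```

With more than two raters, a multi-rater generalisation such as Fleiss' kappa would be the closer analogue. The two-rater form is used here only to keep the arithmetic visible.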
Validity and reliability of the Baecke questionnaire for the evaluation of habitual physical activity among people living with HIV/AIDS

Directory of Open Access Journals (Sweden)

Florindo Alex Antonio

2006-01-01

This study evaluates the validity and reliability of the Baecke questionnaire on habitual physical activity when applied to a population of HIV/AIDS subjects. Validity was determined by comparing measurements for 30 subjects of peak oxygen uptake, peak workload, and energy expenditure with scores for occupational physical activity (OPA), physical exercise in leisure (PEL), leisure and locomotion activities (LLA), and total score (TS). Reliability was determined by testing and retesting 29 subjects at intervals of 15-30 days. Validity was evaluated with the Pearson correlation, and reliability analyses were done using the intraclass correlation, paired Student t-test, and Bland-Altman methods. Peak VO2 and peak workload had significant correlations with PEL (r = 0.41 and r = 0.43, respectively). Energy expenditure had a significant correlation with OPA (r = 0.64). The intraclass coefficients were 0.70 or more for OPA, PEL and TS. There was no difference in OPA, PEL, LLA and TS between the two evaluations. The Bland-Altman methods showed that there was good agreement between the measurements for all habitual physical activity scores. Results show that the Baecke questionnaire is valid for the evaluation of habitual physical activity among people living with HIV/AIDS.
The PedsQL™ Present Functioning Visual Analogue Scales: preliminary reliability and validity

Directory of Open Access Journals (Sweden)

Varni James W

2006-10-01

Background: The PedsQL™ Present Functioning Visual Analogue Scales (PedsQL™ VAS) were designed as an ecological momentary assessment (EMA) instrument to rapidly measure present or at-the-moment functioning in children and adolescents. The PedsQL™ VAS assess child self-report and parent-proxy report of anxiety, sadness, anger, worry, fatigue, and pain utilizing six developmentally appropriate visual analogue scales based on the well-established Varni/Thompson Pediatric Pain Questionnaire (PPQ) Pain Intensity VAS format. Methods: The six-item PedsQL™ VAS was administered to 70 pediatric patients ages 5–17 and their parents upon admittance to the hospital environment (Time 1: T1) and again two hours later (Time 2: T2). It was hypothesized that the PedsQL™ VAS Emotional Distress Summary Score (anxiety, sadness, anger, worry) and the fatigue VAS would demonstrate moderate to large effect size correlations with the PPQ Pain Intensity VAS, and that patient-parent concordance would increase over time. Results: Test-retest reliability was demonstrated from T1 to T2 in the large effect size range. Internal consistency reliability was demonstrated for the PedsQL™ VAS Total Symptom Score (patient self-report: T1 alpha = .72, T2 alpha = .80; parent proxy-report: T1 alpha = .80, T2 alpha = .84) and Emotional Distress Summary Score (patient self-report: T1 alpha = .74, T2 alpha = .73; parent proxy-report: T1 alpha = .76, T2 alpha = .81). As hypothesized, the Emotional Distress Summary Score and Fatigue VAS were significantly correlated with the PPQ Pain VAS in the medium to large effect size range, and patient and parent concordance increased from T1 to T2. Conclusion: The results demonstrate preliminary test-retest and internal consistency reliability and construct validity of the PedsQL™ Present Functioning VAS instrument for both pediatric patient self-report and parent proxy-report. Further field testing is required to extend these initial
Validity and reliability of a modified English version of the physical activity questionnaire for adolescents.

Science.gov (United States)

Aggio, Daniel; Fairclough, Stuart; Knowles, Zoe; Graves, Lee

2016-01-01

Adaptation of physical activity self-report questionnaires is sometimes required to reflect the activity behaviours of diverse populations. The processes used to modify self-report questionnaires though are typically underreported. This two-phased study used a formative approach to investigate the validity and reliability of the Physical Activity Questionnaire for Adolescents (PAQ-A) in English youth. Phase one examined test content and response process validity and subsequently informed a modified version of the PAQ-A. Phase two assessed the validity and reliability of the modified PAQ-A. In phase one, focus groups (n = 5) were conducted with adolescents (n = 20) to investigate test content and response processes of the original PAQ-A. Based on evidence gathered in phase one, a modified version of the questionnaire was administered to participants (n = 169, 14.5 ± 1.7 years) in phase two. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and intra-class correlations, respectively. Spearman correlations were used to assess associations between modified PAQ-A scores and accelerometer-derived physical activity, self-reported fitness and physical activity self-efficacy. Phase one revealed that the original PAQ-A was unrepresentative for English youth and that item comprehension varied. Contextual and population/cultural-specific modifications were made to the PAQ-A for use in the subsequent phase. In phase two, modified PAQ-A scores had acceptable internal consistency (α = 0.72) and test-retest reliability (ICC = 0.78). Modified PAQ-A scores were significantly associated with objectively assessed moderate-to-vigorous physical activity (r = 0.39), total physical activity (r = 0.42), self-reported fitness (r = 0.35), and physical activity self-efficacy (r = 0.32) (p ≤ 0.01). The modified PAQ-A had acceptable internal consistency and test-retest reliability. Modified PAQ-A scores
The reliability, validity, and applicability of an English language version of the Mini-ICF-APP.

Science.gov (United States)

Molodynski, Andrew; Linden, Michael; Juckel, George; Yeeles, Ksenija; Anderson, Catriona; Vazquez-Montes, Maria; Burns, Tom

2013-08-01

This study aimed at establishing the validity and reliability of an English language version of the Mini-ICF-APP. One hundred and five patients under the care of secondary mental health care services were assessed using the Mini-ICF-APP and several well-established measures of functioning and symptom severity. 47 (45 %) patients were interviewed on two occasions to ascertain test-retest reliability and 50 (48 %) were interviewed by two researchers simultaneously to determine the instrument's inter-rater reliability. Occupational and sick leave status were also recorded to assess construct validity. The Mini-ICF-APP was found to have substantial internal consistency (Cronbach's α 0.869-0.912) and all 13 items correlated highly with the total score. Analysis also showed that the Mini-ICF-APP had good test-retest (ICC 0.832) and inter-rater (ICC 0.886) reliability. No statistically significant association with length of sick leave was found, but the unemployed scored higher on the Mini-ICF-APP than those in employment (mean 18.4, SD 9.1 vs. 9.4, SD 6.4, p Mini-ICF-APP correlated highly with the other measures of illness severity and functioning considered in the study. The English version of the Mini-ICF-APP is a reliable and valid measure of disorders of capacity as defined by the International Classification of Functioning. Further work is necessary to establish whether the scale could be divided into subscales, which would allow the instrument to more sensitively measure an individual's specific impairments.
Empirical scoring functions for advanced protein-ligand docking with PLANTS.

Science.gov (United States)

Korb, Oliver; Stützle, Thomas; Exner, Thomas E

2009-01-01

In this paper we present two empirical scoring functions, PLANTS(CHEMPLP) and PLANTS(PLP), designed for our docking algorithm PLANTS (Protein-Ligand ANT System), which is based on ant colony optimization (ACO). They are related, regarding their functional form, to parts of already published scoring functions and force fields. The parametrization procedure described here was able to identify several parameter settings showing an excellent performance for the task of pose prediction on two test sets comprising 298 complexes in total. Up to 87% of the complexes of the Astex diverse set and 77% of the CCDC/Astex clean listnc (noncovalently bound complexes of the clean list) could be reproduced with root-mean-square deviations of less than 2 Å with respect to the experimentally determined structures. A comparison with the state-of-the-art docking tool GOLD clearly shows that this is, especially for the druglike Astex diverse set, an improvement in pose prediction performance. Additionally, optimized parameter settings for the search algorithm were identified, which can be used to balance pose prediction reliability and search speed.
Evaluation of a Lameness Scoring System for Dairy Cows

DEFF Research Database (Denmark)

Thomsen, P T; Munksgaard, L; Tøgersen, F A

2008-01-01

Lameness is a major problem in dairy production both in terms of reduced production and compromised animal welfare. A 5-point lameness scoring system was developed based on previously published systems, but optimized for use under field conditions. The scoring system included the words "in most...... categories by different observers before or after training. In conclusion, the results suggest that the lameness categories were not equidistant and the scoring system has reasonable reliability in terms of intra- and interobserver agreement...
Assessing Reliability of Two Versions of Vocabulary Levels Tests in Iranian Context

Directory of Open Access Journals (Sweden)

Aso Bayazidi

2017-02-01

This study examined the equivalence and reliability of the two versions of the Vocabulary Levels Test in an Iranian context. This study was motivated by the fact that the Vocabulary Levels Test is increasingly being used in Iran for both research and pedagogical purposes without having been checked for validity and reliability in this context. The equivalence and reliability of the two versions of the test were examined through the parallel-form approach to reliability in Classical True Score theory. Seventy-five intermediate learners of English as a foreign language at the Iran Language Institute took the two versions of the test, with a one-week interval between the two administrations, in a counterbalanced fashion. To examine the equivalence of the two versions, the means and variances of the scores obtained for the two tests were compared using paired-sample t-test and one-way ANOVA, respectively. The results of the analyses indicated that the difference between the means of the two versions was significant, and the two versions cannot be considered parallel forms. To assess the reliability of the two versions, the correlation between the scores obtained from them was estimated using the Pearson product-moment correlation. The results of the analyses showed that the two versions are highly correlated and are reliable tests. It is concluded that the two versions should not be treated as equivalent in longitudinal and gain score studies.
Five times sit-to-stand test in subjects with total knee replacement: Reliability and relationship with functional mobility tests.

Science.gov (United States)

Medina-Mirapeix, Francesc; Vivo-Fernández, Iván; López-Cañizares, Juan; García-Vidal, José A; Benítez-Martínez, Josep Carles; Del Baño-Aledo, María Elena

2018-01-01

The objective was to determine the inter-observer and test-retest reliability of the "Five-repetition sit-to-stand" (5STS) test in patients with total knee replacement (TKR), and to explore the correlation between the 5STS and two mobility tests. A reliability study was conducted among 24 outpatients with TKR (mean age 72.13, S.D. 10.67; 50% were women). They were recruited from a traumatology unit of a public hospital via convenience sampling. A physiotherapist and a trauma physician assessed each patient at the same time. The same physiotherapist performed a second 5STS measurement 45-60 min after the first one. Reliability was assessed with intraclass correlation coefficients (ICCs) and Bland-Altman plots. The Pearson coefficient was calculated to assess the correlation between the 5STS, the timed up and go test (TUG) and four-meter gait speed (4MGS). ICCs for inter-observer and test-retest reliability of the 5STS were 0.998 (95% confidence interval [CI], 0.995-0.999) and 0.982 (95% CI, 0.959-0.992), respectively. The inter-observer Bland-Altman plot showed limits between -0.82 and 1.06 with a mean of 0.11 and no heteroscedasticity within the data. The test-retest Bland-Altman plot showed limits between -1.76 and 4.16, a mean of 1.20 and heteroscedasticity within the data. The Pearson correlation coefficient revealed significant correlation between the 5STS and TUG (r=0.7, p...... test-retest reliability when it is used in people with TKR, and also significant correlation with other functional mobility tests. These findings support the use of the 5STS as an outcome measure in the TKR population. Copyright © 2017 Elsevier B.V. All rights reserved.
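The Bland-Altman limits reported in the abstract above follow the usual 95% limits-of-agreement construction: the mean of the paired differences plus or minus 1.96 times their standard deviation. The sketch below illustrates that calculation only; the function name and the timing values are hypothetical and are not data from the study.

```python
import statistics


def bland_altman_limits(first, second):
    """Return (mean difference, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the paired differences
    half_width = 1.96 * sd_diff        # conventional 95% limits of agreement
    return mean_diff, mean_diff - half_width, mean_diff + half_width


# Hypothetical 5STS times in seconds from two raters (illustrative values only).
rater_a = [12.1, 15.4, 10.8, 18.2, 14.0]
rater_b = [12.0, 15.9, 10.5, 18.0, 13.6]

bias, lower, upper = bland_altman_limits(rater_a, rater_b)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

By construction the mean difference sits midway between the two limits, which gives a quick consistency check on any reported triple of values.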
An examination of the RCMAS-2 scores across gender, ethnic background, and age in a large Asian school sample.

Science.gov (United States)

Ang, Rebecca P; Lowe, Patricia A; Yusof, Noradlin

2011-12-01

The present study investigated the factor structure, reliability, convergent and discriminant validity, and U.S. norms of the Revised Children's Manifest Anxiety Scale, Second Edition (RCMAS-2; C. R. Reynolds & B. O. Richmond, 2008a) scores in a Singapore sample of 1,618 school-age children and adolescents. Although there were small statistically significant differences in the average RCMAS-2 T scores found across various demographic groupings, on the whole, the U.S. norms appear adequate for use in the Asian Singapore sample. Results from item bias analyses suggested that biased items detected had small effects and were counterbalanced across gender and ethnicity, and hence, their relative impact on test score variation appears to be minimal. Results of factor analyses on the RCMAS-2 scores supported the presence of a large general anxiety factor, the Total Anxiety factor, and the 5-factor structure found in U.S. samples was replicated. Both the large general anxiety factor and the 5-factor solution were invariant across gender and ethnic background. Internal consistency estimates ranged from adequate to good, and 2-week test-retest reliability estimates were comparable to previous studies. Evidence providing support for convergent and discriminant validity of the RCMAS-2 scores was also found. Taken together, findings provide additional cross-cultural evidence of the appropriateness and usefulness of the RCMAS-2 as a measure of anxiety in Asian Singaporean school-age children and adolescents.
Reliability and validity of the Japanese version of the simplified nutritional appetite questionnaire in community-dwelling older adults.

Science.gov (United States)

Nakatsu, Nobuyuki; Sawa, Ryuichi; Misu, Shogo; Ueda, Yuya; Ono, Rei

2015-12-01

To translate the Simplified Nutritional Appetite Questionnaire (SNAQ) into Japanese, and assess its reliability and validity in Japanese community-dwelling older adults. A total of 84 community-dwelling older adults aged 65 years or older were included in the present study, and those with a Mini-Mental State Examination score of ...... validity was evaluated by measuring the Pearson's correlation coefficient between the SNAQ and Mini-Nutritional Assessment Short-Form scores, Geriatric Depression Scale scores, walking speed test, chair-stand test, hand grip strength test, or the Timed Up and Go test. The mean score of the Japanese version of the SNAQ was 15.5, with a Cronbach's alpha coefficient of 0.545 and intraclass correlation coefficient of 0.754. Factor analysis showed a single factor with 50.0% explained variance. The SNAQ was significantly associated with the Mini-Nutritional Assessment Short-Form, Geriatric Depression Scale, walking speed test, chair-stand test and the Timed Up and Go test. The handgrip strength test did not show a significant association with the SNAQ. The Japanese version of the SNAQ had sufficient reliability and validity. Furthermore, the SNAQ (Japanese version) is useful for evaluating the appetite of community-dwelling older adults in Japan. Geriatr Gerontol Int 2015; 15: 1264-1269. © 2014 Japan Geriatrics Society.
Validity and Reliability of the Bahasa Melayu Version of the Migraine Disability Assessment Questionnaire

Directory of Open Access Journals (Sweden)

Munvar Miya Shaik

2014-01-01

Background. The study was designed to determine the validity and reliability of the Bahasa Melayu version (MIDAS-M) of the Migraine Disability Assessment (MIDAS) questionnaire. Methods. Patients having migraine for more than six months attending the Neurology Clinic, Hospital Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia, were recruited. Standard forward and back translation procedures were used to translate and adapt the MIDAS questionnaire to produce the Bahasa Melayu version. The translated Malay version was tested for face and content validity. Validity and reliability testing were further conducted with 100 migraine patients (1st administration) followed by a retesting session 21 days later (2nd administration). Results. A total of 100 patients between 15 and 60 years of age were recruited. The majority of the patients were single (66%) and students (46%). Cronbach’s alpha values were 0.84 (1st administration) and 0.80 (2nd administration). The test-retest reliability for the total MIDAS score was 0.73, indicating that the MIDAS-M questionnaire is stable; for the five disability questions, the test-retest values ranged from 0.77 to 0.87. Conclusion. The MIDAS-M questionnaire is comparable with the original English version in terms of validity and reliability and may be used for the assessment of migraine in clinical settings.
SIGI: score-based identification of genomic islands

Directory of Open Access Journals (Sweden)

Merkl Rainer

2004-03-01

Background. Genomic islands can be observed in many microbial genomes. These stretches of DNA have a conspicuous composition with regard to sequence or encoded functions. Genomic islands are assumed to be frequently acquired via horizontal gene transfer. For the analysis of genome structure and the study of horizontal gene transfer, it is necessary to reliably identify and characterize these islands. Results. A scoring scheme on codon frequencies, Score_G1G2(cdn) = log(f_G2(cdn) / f_G1(cdn)), was utilized. To analyse genes of a species G1 and to test their relatedness to species G2, scores were determined by applying the formula to log-odds derived from mean codon frequencies of the two genomes. A non-redundant set of nearly 400 codon usage tables comprising microbial species was derived; its members were used alternatively at position G2. Genes having at least one score value above a species-specific and dynamically determined cut-off value were analysed further. By means of cluster analysis, genes were identified that comprise clusters of statistically significant size. These clusters were predicted as genomic islands. Finally and individually for each of these genes, the taxonomical relation among those species responsible for significant scores was interpreted. The validity of the approach and its limitations were made plausible by an extensive analysis of natural genes and synthetic ones aimed at modelling the process of gene amelioration. Conclusions. The method allows reliable identification of genomic islands and of the likely origin of alien genes.
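The codon log-odds score in the abstract above can be illustrated with a short sketch. The per-codon function follows the stated formula, the log of the codon's frequency in G2 over its frequency in G1. Summing the per-codon scores over a gene is an assumption made here for illustration, and the frequency tables are hypothetical; none of this is the SIGI implementation.

```python
import math


def codon_score(codon, freq_g1, freq_g2):
    """Log-odds score of one codon: log(f_G2(cdn) / f_G1(cdn))."""
    return math.log(freq_g2[codon] / freq_g1[codon])


def gene_score(codons, freq_g1, freq_g2):
    """Sum of per-codon log-odds over a gene (aggregation rule assumed)."""
    return sum(codon_score(c, freq_g1, freq_g2) for c in codons)


# Hypothetical mean codon frequencies for two genomes (illustrative values only).
freq_g1 = {"GCT": 0.10, "GCC": 0.40}
freq_g2 = {"GCT": 0.20, "GCC": 0.20}

gene = ["GCT", "GCT", "GCC"]
print(gene_score(gene, freq_g1, freq_g2))
```

A positive total means the gene's codon usage looks more like G2 than G1, which is the kind of signal the abstract describes comparing against a cut-off value.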
Stroke Impact Scale 3.0: Reliability and Validity Evaluation of the Korean Version.

Science.gov (United States)

Choi, Seong Uk; Lee, Hye Sun; Shin, Joon Ho; Ho, Seung Hee; Koo, Mi Jung; Park, Kyoung Hae; Yoon, Jeong Ah; Kim, Dong Min; Oh, Jung Eun; Yu, Se Hwa; Kim, Dong A

2017-06-01

To establish the reliability and validity of the Korean version of the Stroke Impact Scale (K-SIS) 3.0. A total of 70 post-stroke patients were enrolled. All subjects were evaluated for general characteristics, Mini-Mental State Examination (MMSE), the National Institutes of Health Stroke Scale (NIHSS), Modified Barthel Index, and Hospital Anxiety and Depression Scale (HADS). The SF-36 and K-SIS 3.0 assessed their health-related quality of life. Statistical analysis after evaluation determined the reliability and validity of the K-SIS 3.0. A total of 70 patients (mean age, 54.97 years) participated in this study. Internal consistency of the SIS 3.0 (Cronbach's alpha) was obtained, and all domains had good coefficients, above the threshold of 0.70. Test-retest reliability of the SIS 3.0 was assessed by correlation (Spearman's rho) of the same domain scores obtained on the first and second assessments. Results were above 0.5, with the exception of social participation and mobility. Concurrent validity of the K-SIS 3.0 was assessed using the SF-36 and other scales with the same or similar domains. Each domain of the K-SIS 3.0 had a positive correlation with the corresponding similar domain of the SF-36 and other scales (HADS, MMSE, and NIHSS). The newly developed K-SIS 3.0 showed high inter-intra reliability and test-retest reliabilities, together with high concurrent validity with the original and various other scales, for patients with stroke. The K-SIS 3.0 can therefore be used for stroke patients, to assess their health-related quality of life and treatment efficacy.
Key performance indicators score (KPIs-score) based on clinical and laboratorial parameters can establish benchmarks for internal quality control in an ART program.

Science.gov (United States)

Franco, José G; Petersen, Claudia G; Mauri, Ana L; Vagnini, Laura D; Renzi, Adriana; Petersen, Bruna; Mattila, M C; Comar, Vanessa A; Ricci, Juliana; Dieamant, Felipe; Oliveira, João Batista A; Baruffi, Ricardo L R

2017-06-01

KPIs have been employed for internal quality control (IQC) in ART. However, clinical KPIs (C-KPIs) such as age, AMH and number of oocytes collected are never added to laboratory KPIs (L-KPIs), such as fertilization rate and morphological quality of the embryos for analysis, even though the final endpoint is the evaluation of clinical pregnancy rates. This paper analyzed if a KPIs-score strategy with clinical and laboratorial parameters could be used to establish benchmarks for IQC in ART cycles. In this prospective cohort study, 280 patients (36.4±4.3 years) underwent ART. The total KPIs-score was obtained by the analysis of age, AMH (AMH Gen II ELISA/pre-mixing modified, Beckman Coulter Inc.), number of metaphase-II oocytes, fertilization rates and morphological quality of the embryonic lot. The total KPIs-score (C-KPIs+L-KPIs) was correlated with the presence or absence of clinical pregnancy. The relationship between the C-KPIs and L-KPIs scores was analyzed to establish quality standards, to increase the performance of clinical and laboratorial processes in ART. The logistic regression model (LRM), with respect to pregnancy and total KPIs-score (280 patients/102 clinical pregnancies), yielded an odds ratio of 1.24 (95%CI = 1.16-1.32). There was also a significant difference (p...... clinical pregnancies (total KPIs-score=20.4±3.7) and the group without clinical pregnancies (total KPIs-score=15.9±5). Clinical pregnancy probabilities (CPP) can be obtained using the LRM (prediction key) with the total KPIs-score as a predictor variable. The mean C-KPIs and L-KPIs scores obtained in the pregnancy group were 11.9±2.9 and 8.5±1.7, respectively. Routinely, in all cases where the C-KPIs score was ≥9 and the L-KPIs score obtained after the procedure was ≤6, a revision of the laboratory procedure was performed to assess quality standards. This total KPIs-score could set up benchmarks for clinical pregnancy. Moreover, IQC can use C-KPIs and L-KPIs scores to detect problems

Scoring sacroiliac joints by magnetic resonance imaging. A multiple-reader reliability experiment

DEFF Research Database (Denmark)

Landewe, Robert B.M.; Hermann, Kay Geert A; Van Der Heijde, Desiree M.F.M

2005-01-01

Magnetic resonance imaging (MRI) of the sacroiliac (SI) joints and the spine is increasingly important in the assessment of inflammatory activity and structural damage in clinical trials with patients with ankylosing spondylitis (AS). We investigated inter-reader reliability and sensitivity...
Modification site localization scoring integrated into a search engine.

Science.gov (United States)

Baker, Peter R; Trinidad, Jonathan C; Chalkley, Robert J

2011-07-01

Large proteomic data sets identifying hundreds or thousands of modified peptides are becoming increasingly common in the literature. Several methods for assessing the reliability of peptide identifications both at the individual peptide or data set level have become established. However, tools for measuring the confidence of modification site assignments are sparse and are not often employed. A few tools for estimating phosphorylation site assignment reliabilities have been developed, but these are not integral to a search engine, so require a particular search engine output for a second step of processing. They may also require use of a particular fragmentation method and are mostly only applicable for phosphorylation analysis, rather than post-translational modifications analysis in general. In this study, we present the performance of site assignment scoring that is directly integrated into the search engine Protein Prospector, which allows site assignment reliability to be automatically reported for all modifications present in an identified peptide. It clearly indicates when a site assignment is ambiguous (and if so, between which residues), and reports an assignment score that can be translated into a reliability measure for individual site assignments.
Achilles tendon Total Rupture Score at 3 months can predict patients' ability to return to sport 1 year after injury

DEFF Research Database (Denmark)

Hansen, Maria Swennergren; Christensen, Marianne; Budolfsen, Thomas

2016-01-01

PURPOSE: To investigate how the Achilles tendon Total Rupture Score (ATRS) at 3 months and 1 year after injury is associated with a patient's ability to return to work and sports as well as to investigate whether sex and age influence ATRS after 3 months and 1 year. METHOD: This is a retrospectiv...
Evaluation of Factorial Validity and Reliability of a Food Behavior Checklist for Low-Income Filipinos.

Science.gov (United States)

Suzuki, Asuka; Choi, So Yung; Lim, Eunjung; Tauyan, Socorro; Banna, Jinan C

To examine factorial validity, test-retest reliability, and internal consistency of a Tagalog-language food behavior checklist (FBC) for a low-income Filipino population. Participants (n = 160) completed the FBC on 2 occasions 3 weeks apart. Factor structure was examined using principal component analysis. For internal consistency, Cronbach α was calculated. For test-retest reliability, Spearman correlation or intraclass correlation coefficient (ICC) was calculated between scores at the 2 points. All but 1 item loaded on 6 factors: fruit and vegetable quantity, fruit and vegetable variety, fast food, sweetened beverage, healthy fat, and diet quality. Cronbach α was .75 for the total scale (range, .39-.76 for subscales). Spearman correlation was 0.78 (ICC, 0.79) for the total scale (range, 0.66-0.80 [ICC, 0.68-0.80] for subscales). The FBC demonstrated adequate factorial validity, test-retest reliability, and internal consistency. With additional testing, the FBC may be used to evaluate the US Department of Agriculture's nutrition education programs for Tagalog speakers. Copyright © 2017 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
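Cronbach's α, used for internal consistency in the record above and in several others in this listing, can be computed from an items-by-respondents table as k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below is a minimal illustration; the function name and the response data are hypothetical and are not taken from any of the studies.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [statistics.variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items (illustrative values only).
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(responses), 3))
```

When every item ranks respondents identically, the total-score variance is as large as it can be relative to the item variances and α reaches 1. Weakly related items push α toward 0.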
Screening applicants for risk of poor academic performance: a novel scoring system using preadmission grade point averages and graduate record examination scores.

Science.gov (United States)

Luce, David

2011-01-01

The purpose of this study was to develop an effective screening tool for identifying physician assistant (PA) program applicants at highest risk for poor academic performance. Prior to reviewing applications for the class of 2009, a retrospective analysis of preadmission data took place for the classes of 2006, 2007, and 2008. A single composite score was calculated for each student who matriculated (number of subjects, N=228) incorporating the total undergraduate grade point average (UGPA), the science GPA (SGPA), and the three component Graduate Record Examination (GRE) scores: verbal (GRE-V), quantitative (GRE-Q), analytical (GRE-A). Individual applicant scores for each of the five parameters were ranked in descending quintiles. Each applicant's five quintile scores were then added, yielding a total quintile score ranging from 25, which indicated an excellent performance, to 5, which indicated poorer performance. Thirteen of the 228 students had academic difficulty (dismissal, suspension, or one-quarter on academic warning or probation). Twelve of the 13 students having academic difficulty had a preadmission total quintile score 12 (range, 6-14). In response to this descriptive analysis, when selecting applicants for the class of 2009, the admissions committee used the total quintile score for screening applicants for interviews. Analysis of correlations in preadmission, graduate, and postgraduate performance data for the classes of 2009-2013 will continue and may help identify those applicants at risk for academic difficulty. Establishing a threshold total quintile score of applicant GPA and GRE scores may significantly decrease the number of entering PA students at risk for poor academic performance.
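The total quintile score described above can be sketched as follows: each of the five parameters is ranked across applicants, the top fifth earns 5 points and the bottom fifth earns 1, and the five point values are summed to give a total between 5 and 25. The field names, the tie handling (ties broken by input order) and the applicant data are assumptions for illustration, not the author's procedure.

```python
PARAMETERS = ["ugpa", "sgpa", "gre_v", "gre_q", "gre_a"]  # hypothetical field names


def quintile_points(values):
    """Assign 5 points to the top fifth of values, down to 1 for the bottom fifth."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    points = [0] * n
    for rank, idx in enumerate(order):
        points[idx] = 5 - (rank * 5) // n
    return points


def total_quintile_scores(applicants):
    """Sum each applicant's quintile points over the five parameters (range 5-25)."""
    per_param = [quintile_points([a[p] for a in applicants]) for p in PARAMETERS]
    return [sum(col[i] for col in per_param) for i in range(len(applicants))]


# Hypothetical applicants (illustrative values only).
applicants = [
    {"ugpa": 3.9, "sgpa": 3.8, "gre_v": 600, "gre_q": 700, "gre_a": 5.0},
    {"ugpa": 3.1, "sgpa": 2.9, "gre_v": 450, "gre_q": 520, "gre_a": 3.5},
    {"ugpa": 3.5, "sgpa": 3.6, "gre_v": 560, "gre_q": 610, "gre_a": 4.5},
    {"ugpa": 2.8, "sgpa": 2.7, "gre_v": 400, "gre_q": 480, "gre_a": 3.0},
    {"ugpa": 3.3, "sgpa": 3.2, "gre_v": 500, "gre_q": 560, "gre_a": 4.0},
]
print(total_quintile_scores(applicants))
```

A screening threshold would then be applied to these totals. The threshold itself should come from the program's own outcome data, as in the study.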
[Circadian rhythm : Influence on Epworth Sleepiness Scale score].

Science.gov (United States)

Herzog, M; Bedorf, A; Rohrmeier, C; Kühnel, T; Herzog, B; Bremert, T; Plontke, S; Plößl, S

2017-02-01

The Epworth Sleepiness Scale (ESS) is frequently used to determine daytime sleepiness in patients with sleep-disordered breathing. It is still unclear whether different levels of alertness induced by the circadian rhythm influence the ESS score. The aim of this study is to investigate the influence of circadian rhythm-dependent alertness on ESS performance. In a monocentric prospective noninterventional observation study, 97 patients with suspected sleep-disordered breathing were investigated with respect to daytime sleepiness in temporal relationship to polysomnographic examination and treatment. The Karolinska Sleepiness Scale (KSS) and the Stanford Sleepiness Scale (SSS) served as references for the detection of present sleepiness at three different measurement times (morning, noon, evening), prior to and following a diagnostic polysomnography night as well as after a continuous positive airway pressure (CPAP) titration night (9 measurements in total). The KSS, SSS, and ESS were performed at these times in a randomized order. The KSS and SSS scores revealed a circadian rhythm-dependent curve with increased sleepiness at noon and in the evening. Following a diagnostic polysomnography night, the scores were increased compared to the measurements prior to the night. After the CPAP titration night, sleepiness in the morning was reduced. KSS and SSS reflect the changes in alertness induced by the circadian rhythm. The ESS score was altered neither by the intra-daily nor by the inter-daily changes in the level of alertness. According to the present data, the ESS serves as a reliable instrument to detect the level of daytime sleepiness independently of the circadian rhythm-dependent level of alertness.
First quality score for referral letters in gastroenterology-a validation study.

Science.gov (United States)

Eskeland, Sigrun Losada; Brunborg, Cathrine; Seip, Birgitte; Wiencke, Kristine; Hovde, Øistein; Owen, Tanja; Skogestad, Erik; Huppertz-Hauss, Gert; Halvorsen, Fred-Arne; Garborg, Kjetil; Aabakken, Lars; de Lange, Thomas

2016-10-08

To create and validate an objective and reliable score to assess referral quality in gastroenterology. An observational multicentre study. 25 gastroenterologists participated in selecting variables for a Thirty Point Score (TPS) for quality assessment of referrals to gastroenterology specialist healthcare for 9 common indications. From May to September 2014, 7 hospitals from the South-Eastern Norway Regional Health Authority participated in collecting and scoring 327 referrals to a gastroenterologist. Correlation between the TPS and a visual analogue scale (VAS) for referral quality. The 327 referrals had an average TPS of 13.2 (range 1-25) and an average VAS of 4.7 (range 0.2-9.5). The reliability of the score was excellent, with an intra-rater intraclass correlation coefficient (ICC) of 0.87 and inter-rater ICC of 0.91. The overall correlation between the TPS and the VAS was moderate (r=0.42), and ranged from fair to substantial for the various indications. Mean agreement was good (ICC=0.47, 95% CI (0.34 to 0.57)), ranging from poor to good. The TPS is reliable, objective and shows good agreement with the subjective VAS. The score may be a useful tool for assessing referral quality in gastroenterology, particularly important when evaluating the effect of interventions to improve referral quality. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
THE INTRA- AND INTER-RATER RELIABILITY OF THE SOCCER INJURY MOVEMENT SCREEN (SIMS).

Science.gov (United States)

McCunn, Robert; Aus der Fünten, Karen; Govus, Andrew; Julian, Ross; Schimpchen, Jan; Meyer, Tim

2017-02-01

The growing volume of movement screening research reveals a belief among practitioners and researchers alike that movement quality may have an association with injury risk. However, existing movement screening tools have not considered the sport-specific movement and injury patterns relevant to soccer. The present study introduces the Soccer Injury Movement Screen (SIMS), which has been designed specifically for use within soccer. Furthermore, the purpose of the present study was to assess the intra- and inter-rater reliability of the SIMS and determine its suitability for use in further research. The study utilized a test-retest design to discern reliability. Twenty-five (11 males, 14 females) healthy, recreationally active university students (age 25.5 ± 4.0 years, height 171 ± 9 cm, weight 64.7 ± 12.6 kg) agreed to participate. The SIMS contains five sub-tests: the anterior reach, single-leg deadlift, in-line lunge, single-leg hop for distance and tuck jump. Each movement was scored out of 10 points and summed to produce a composite score out of 50. The anterior reach and single-leg hop for distance were scored in real-time while the remaining tests were filmed and scored retrospectively. Three raters conducted the SIMS with each participant on three occasions separated by an average of three and a half days (minimum one day, maximum seven days). Rater 1 re-scored the filmed movements for all participants on all occasions six months later to establish the 'pure' intra-rater (intra-occasion) reliability for those movements. Intraclass correlation coefficient (ICC) values for intra- and inter-rater composite score reliability ranged from 0.66-0.72 and 0.79-0.86 respectively. Weighted kappa values representing the intra- and inter-rater reliability of the individual sub-tests ranged from 0.35-0.91 indicating fair to almost perfect agreement. Establishing the reliability of the SIMS is a prerequisite for further research seeking to investigate
Psychometric properties of a Swedish translation of the VISA-P outcome score for patellar tendinopathy.

Science.gov (United States)

Frohm, Anna; Saartok, Tönu; Edman, Gunnar; Renström, Per

2004-12-18

Self-administered patient outcome scores are increasingly recommended for evaluation of primary outcome in clinical studies. The VISA-P score, developed at the Victorian Institute of Sport Assessment in Melbourne, Australia, is a questionnaire developed for patients with patellar tendinopathy in which the patients assess severity of symptoms, function and ability to participate in sport. The aim of this study was to translate the questionnaire into Swedish and to study the reliability and validity of the translated questionnaire and resultant scores. The questionnaire was translated into Swedish according to internationally recommended guidelines for cross-cultural adaptation of self-report measures. The reliability and validity were tested in three different populations. The populations used were healthy students (n = 17), members of the Swedish male national basketball team (n = 17), considered as a population at risk, and a group of non-surgically treated patients (n = 17) with clinically diagnosed patellar tendinopathy. The questionnaire was completed by 51 subjects altogether. The translated VISA-P questionnaire showed very good test-retest reliability (ICC = 0.97). The mean (+/- SD) of the VISA-P score, at both the first and second test occasions, was highest in the healthy student group, 83 (+/- 13) and 81 (+/- 15), respectively. The score of the basketball players was 79 (+/- 24) and 80 (+/- 23), while the patient group scored significantly (p < 0.05) lower, 48 (+/- 20) and 52 (+/- 19). The translated version of the VISA-P questionnaire was linguistically and culturally equivalent to the original version. The translated score showed good reliability.
Psychometric properties of a Swedish translation of the VISA-P outcome score for patellar tendinopathy

Directory of Open Access Journals (Sweden)

Edman Gunnar

2004-12-01

Background. Self-administered patient outcome scores are increasingly recommended for evaluation of primary outcome in clinical studies. The VISA-P score, developed at the Victorian Institute of Sport Assessment in Melbourne, Australia, is a questionnaire developed for patients with patellar tendinopathy in which the patients assess severity of symptoms, function and ability to participate in sport. The aim of this study was to translate the questionnaire into Swedish and to study the reliability and validity of the translated questionnaire and resultant scores. Methods. The questionnaire was translated into Swedish according to internationally recommended guidelines for cross-cultural adaptation of self-report measures. The reliability and validity were tested in three different populations. The populations used were healthy students (n = 17), members of the Swedish male national basketball team (n = 17), considered as a population at risk, and a group of non-surgically treated patients (n = 17) with clinically diagnosed patellar tendinopathy. The questionnaire was completed by 51 subjects altogether. Results. The translated VISA-P questionnaire showed very good test-retest reliability (ICC = 0.97). The mean (± SD) of the VISA-P score, at both the first and second test occasions, was highest in the healthy student group, 83 (± 13) and 81 (± 15), respectively. The score of the basketball players was 79 (± 24) and 80 (± 23), while the patient group scored significantly (p < 0.05) lower, 48 (± 20) and 52 (± 19). Conclusions. The translated version of the VISA-P questionnaire was linguistically and culturally equivalent to the original version. The translated score showed good reliability.
The role of test-retest reliability in measuring individual and group differences in executive functioning.

Science.gov (United States)

Paap, Kenneth R; Sawi, Oliver

2016-12-01

Studies testing for individual or group differences in executive functioning can be compromised by unknown test-retest reliability. Test-retest reliabilities across an interval of about one week were obtained from performance in the antisaccade, flanker, Simon, and color-shape switching tasks. There is a general trade-off between the greater reliability of single mean RT measures, and the greater process purity of measures based on contrasts between mean RTs in two conditions. The individual differences in RT model recently developed by Miller and Ulrich was used to evaluate the trade-off. Test-retest reliability was statistically significant for 11 of the 12 measures, but was of moderate size, at best, for the difference scores. The test-retest reliabilities for the Simon and flanker interference scores were lower than those for switching costs. Standard practice evaluates the reliability of executive-functioning measures using split-half methods based on data obtained in a single day. Our test-retest measures of reliability are lower, especially for difference scores. These reliability measures must also take into account possible day effects that classical test theory assumes do not occur. Measures based on single mean RTs tend to have acceptable levels of reliability and convergent validity, but are "impure" measures of specific executive functions. The individual differences in RT model shows that the impurity problem is worse than typically assumed. However, the "purer" measures based on difference scores have low convergent validity that is partly caused by deficiencies in test-retest reliability. Copyright © 2016 Elsevier B.V. All rights reserved.
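The lower reliability of difference scores noted above has a standard classical test theory expression: when the two component measures are positively correlated, much of the true-score variance cancels in the subtraction while the error variance does not. The sketch below uses that textbook formula, which assumes uncorrelated measurement errors. It is not the Miller and Ulrich individual-differences model used in the paper, and the input values are hypothetical.

```python
def difference_score_reliability(r_xx, r_yy, r_xy, sd_x=1.0, sd_y=1.0):
    """Classical test theory reliability of the difference score D = X - Y.

    r_xx, r_yy: reliabilities of the two component scores
    r_xy:       correlation between the two component scores
    sd_x, sd_y: standard deviations of the component scores
    """
    covariance_term = 2 * r_xy * sd_x * sd_y
    true_variance = sd_x**2 * r_xx + sd_y**2 * r_yy - covariance_term
    observed_variance = sd_x**2 + sd_y**2 - covariance_term
    return true_variance / observed_variance


# Hypothetical inputs: two mean-RT measures, each with reliability 0.80,
# correlated 0.60 with one another (illustrative values only).
print(difference_score_reliability(0.80, 0.80, 0.60))
```

With equal variances, the result falls as the correlation between the two conditions rises, even though each condition's own reliability is unchanged.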
Reliable and Valid Assessment of Point-of-care Ultrasonography

DEFF Research Database (Denmark)

Todsen, Tobias; Tolsgaard, Martin Grønnebæk; Olsen, Beth Härstedt

2015-01-01

physicians' OSAUS scores with diagnostic accuracy. RESULTS: The generalizability coefficient was high (0.81) and a D-study demonstrated that 1 assessor and 5 cases would result in similar reliability. The construct validity of the OSAUS scale was supported by a significant difference in the mean scores......OBJECTIVE: To explore the reliability and validity of the Objective Structured Assessment of Ultrasound Skills (OSAUS) scale for point-of-care ultrasonography (POC US) performance. BACKGROUND: POC US is increasingly used by clinicians and is an essential part of the management of acute surgical...... conditions. However, the quality of performance is highly operator-dependent. Therefore, reliable and valid assessment of trainees' ultrasonography competence is needed to ensure patient safety. METHODS: Twenty-four physicians, representing novices, intermediates, and experts in POC US, scanned 4 different...
Computed tomography for the detection of distal radioulnar joint instability: normal variation and reliability of four CT scoring systems in 46 patients

Energy Technology Data Exchange (ETDEWEB)

Wijffels, Mathieu; Krijnen, Pieta; Schipper, Inger [Leiden University Medical Center, Department of Surgery-Trauma Surgery, P.O. Box 9600, Leiden (Netherlands); Stomp, Wouter; Reijnierse, Monique [Leiden University Medical Center, Department of Radiology, P.O. Box 9600, Leiden (Netherlands)

2016-11-15

The diagnosis of distal radioulnar joint (DRUJ) instability is clinically challenging. Computed tomography (CT) may aid in the diagnosis, but the reliability and normal variation for DRUJ translation on CT have not been established in detail. The aim of this study was to evaluate inter- and intraobserver agreement and normal ranges of CT scoring methods for determination of DRUJ translation in both posttraumatic and uninjured wrists. Patients with a conservatively treated, unilateral distal radius fracture were included. CT scans of both wrists were evaluated independently, by two readers using the radioulnar line method, subluxation ratio method, epicenter method and radioulnar ratio method. The inter- and intraobserver agreement was assessed and normal values were determined based on the uninjured wrists. Ninety-two wrist CTs (mean age: 56.5 years, SD: 17.0, mean follow-up 4.2 years, SD: 0.5) were evaluated. Interobserver agreement was best for the epicenter method [ICC = 0.73, 95 % confidence interval (CI) 0.65-0.79]. Intraobserver agreement was almost perfect for the radioulnar line method (ICC = 0.82, 95 % CI 0.77-0.87). Each method showed a wide normal range for normal DRUJ translation. Normal range for the epicenter method is -0.35 to -0.06 in pronation and -0.11 to 0.19 in supination. DRUJ translation on CT in pro- and supination can be reliably evaluated in both normal and posttraumatic wrists, however with large normal variation. The epicenter method seems the most reliable. Scanning of both wrists might be helpful to prevent the radiological overdiagnosis of instability. (orig.)
The Reliability of Turkish "Basic Life Support" and "Cardiac Massage" Videos Uploaded to Websites.

Science.gov (United States)

Elicabuk, Hayri; Yaylacı, Serpil; Yilmaz, Atakan; Hatipoglu, Celile; Kaya, F Gokhan; Serinken, Mustafa

2016-02-01

This study examined the reliability of Turkish cardiac massage and Basic Life Support (BLS) videos uploaded to three websites (YouTube, Google, and Yahoo) after publication of the 2010 cardiopulmonary resuscitation (CPR) guideline, and their conformity with that guideline. The three websites were searched using the keywords "cardiac massage" and "basic life support". Videos uploaded between January 2011 and July 2014 were analyzed and scored by two experienced emergency specialists. A total of 1126 videos were obtained; the researchers excluded 1029 of them (91.4%), and 97 videos met the study criteria. Although most of the videos were found through Google, most of the videos that met the criteria came from YouTube (n=65, 67.0%). One fourth of the videos (24.7%) were not consistent with the 2010 CPR guideline. AED use was mentioned in few of the videos (14.4%). The median score of the videos was 5 (IQR: 4-6). The rate and scores of the videos uploaded by an official institution or association were significantly higher than those of the others (p=0.007 and 0.006, respectively). Moreover, scores of guideline-compatible videos uploaded by an official institution or association and medical personnel were also higher (p=0.001). Overall, the data obtained in this study indicate that Turkish videos on BLS and cardiac massage were not reliable. It is promising that videos with high follow-up rates were also scored higher.
Reliability and validity of the Parenting Scale of Inconsistency.

Science.gov (United States)

Yoshizumi, Takahiro; Murase, Satomi; Murakami, Takashi; Takai, Jiro

2006-08-01

The purposes of the present study were to develop a Parenting Scale of Inconsistency and to evaluate its initial reliability and validity. The 12 items assess the inconsistency among parents' moods, behaviors, and attitudes toward children. In the primary study, 517 participants completed three measures: the new Parenting Scale of Inconsistency, the Parental Bonding Instrument, and the Depression Scale of the General Health Questionnaire. The Parenting Scale of Inconsistency had good test-retest reliability of .85 and internal consistency of .88 (Cronbach coefficient alpha). Construct validity was good as Inconsistency scores were significantly correlated with the Care and Overprotection scores of the Parental Bonding Instrument and with the Depression scores. Moreover, Inconsistency scores' relation with a dimension of parenting style distinct from Care and Overprotection suggested that the Parenting Scale of Inconsistency had factorial validity. This scale seems a potential measure for examining the relationships between inconsistent parenting and the mental health of children.
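Several records in this listing report internal consistency as Cronbach's alpha. As a rough illustration of what that coefficient measures, the sketch below applies the textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy responses are hypothetical and are not taken from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for items and for total scores.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer all items")

    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 4 respondents, 3 items (illustration only).
toy_responses = [
    [1.0, 2.0, 1.0],
    [2.0, 3.0, 2.0],
    [3.0, 3.0, 4.0],
    [4.0, 5.0, 4.0],
]
toy_alpha = cronbach_alpha(toy_responses)
```

When every item gives identical scores, the item variances sum to total variance divided by k, and the formula returns exactly 1.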
Test Reliability at the Individual Level

Science.gov (United States)

Hu, Yueqin; Nesselroade, John R.; Erbacher, Monica K.; Boker, Steven M.; Burt, S. Alexandra; Keel, Pamela K.; Neale, Michael C.; Sisk, Cheryl L.; Klump, Kelly

2016-01-01

Reliability has a long history as one of the key psychometric properties of a test. However, a given test might not measure people equally reliably. Test scores from some individuals may have considerably greater error than others. This study proposed two approaches using intraindividual variation to estimate test reliability for each person. A simulation study suggested that the parallel tests approach and the structural equation modeling approach recovered the simulated reliability coefficients. Then in an empirical study, where forty-five females were measured daily on the Positive and Negative Affect Schedule (PANAS) for 45 consecutive days, separate estimates of reliability were generated for each person. Results showed that reliability estimates of the PANAS varied substantially from person to person. The methods provided in this article apply to tests measuring changeable attributes and require repeated measures across time on each individual. This article also provides a set of parallel forms of PANAS. PMID:28936107
Risk score to predict gastrointestinal bleeding after acute ischemic stroke.

Science.gov (United States)

Ji, Ruijun; Shen, Haipeng; Pan, Yuesong; Wang, Penglian; Liu, Gaifen; Wang, Yilong; Li, Hao; Singhal, Aneesh B; Wang, Yongjun

2014-07-25

Gastrointestinal bleeding (GIB) is a common and often serious complication after stroke. Although several risk factors for post-stroke GIB have been identified, no reliable or validated scoring system is currently available to predict GIB after acute stroke in routine clinical practice or clinical trials. In the present study, we aimed to develop and validate a risk model (acute ischemic stroke associated gastrointestinal bleeding score, the AIS-GIB score) to predict in-hospital GIB after acute ischemic stroke. The AIS-GIB score was developed from data in the China National Stroke Registry (CNSR). Eligible patients in the CNSR were randomly divided into derivation (60%) and internal validation (40%) cohorts. External validation was performed using data from the prospective Chinese Intracranial Atherosclerosis Study (CICAS). Independent predictors of in-hospital GIB were obtained using multivariable logistic regression in the derivation cohort, and β-coefficients were used to generate point scoring system for the AIS-GIB. The area under the receiver operating characteristic curve (AUROC) and the Hosmer-Lemeshow goodness-of-fit test were used to assess model discrimination and calibration, respectively. A total of 8,820, 5,882, and 2,938 patients were enrolled in the derivation, internal validation and external validation cohorts. The overall in-hospital GIB after AIS was 2.6%, 2.3%, and 1.5% in the derivation, internal, and external validation cohort, respectively. An 18-point AIS-GIB score was developed from the set of independent predictors of GIB including age, gender, history of hypertension, hepatic cirrhosis, peptic ulcer or previous GIB, pre-stroke dependence, admission National Institutes of Health stroke scale score, Glasgow Coma Scale score and stroke subtype (Oxfordshire). The AIS-GIB score showed good discrimination in the derivation (0.79; 95% CI, 0.764-0.825), internal (0.78; 95% CI, 0.74-0.82) and external (0.76; 95% CI, 0.71-0.82) validation cohorts
Reliability and validity of the Malay translated version of diabetes quality of life for youth questionnaire

Directory of Open Access Journals (Sweden)

Jamaiyah H

2013-05-01

Introduction: Many studies reported poorer quality of life (QoL) in youth with diabetes compared to healthy peers. One of the tools used is the Diabetes Quality of Life for Youth (DQoLY) questionnaire in English. A validated instrument in Malay is needed to assess the perception of QoL among youth with diabetes in Malaysia. Objective: To translate the modified version, i.e., the DQoLY questionnaire, into Malay and determine its reliability and validity. Methods: Translation and back-translation were used. An expert panel reviewed the translated version for conceptual and content equivalence. The final version was then administered to youths with type 1 diabetes mellitus from the universities and Ministry of Health hospitals between August 2006 and September 2007. Reliability was analysed using Cronbach’s alpha, while validity was confirmed using concurrent validity (HbA1c and self-rated health score). Results: A total of 82 youths with type 1 diabetes (38 males) aged 10-18 years were enrolled from eight hospitals. The reliability of overall questionnaire was 0.917, and the reliabilities of the three domains ranged from 0.832 to 0.867. HbA1c was positively correlated with worry (p=0.03). The self-rated health score was found to have significant negative correlation with the “satisfaction” (p=0.013) and “impact” (p=0.007) domains. Conclusion: The Malay translated version of DQoLY questionnaire was reliable and valid to be used among youths with type 2 diabetes in Malaysia.
Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed

DEFF Research Database (Denmark)

Kottner, Jan; Audigé, Laurent; Brorson, Stig

2011-01-01

Results of reliability and agreement studies are intended to provide information about the amount of error inherent in any diagnosis, score, or measurement. The level of reliability and agreement among users of scales, instruments, or classifications is widely unknown. Therefore, there is a need ......, standards, or guidelines for reporting reliability and agreement in the health care and medical field are lacking. The objective was to develop guidelines for reporting reliability and agreement studies....
The six-item Clock Drawing Test – reliability and validity in mild Alzheimer’s disease

DEFF Research Database (Denmark)

Jørgensen, Kasper; Kristensen, Maria K; Waldemar, Gunhild

2015-01-01

This study presents a reliable, short and practical version of the Clock Drawing Test (CDT) for clinical use and examines its diagnostic accuracy in mild Alzheimer's disease versus elderly nonpatients. Clock drawings from 231 participants were scored independently by four clinical neuropsychologists blind to diagnostic classification. The interrater agreement of individual scoring criteria was analyzed and items with poor or moderate reliability were excluded. The classification accuracy of the resulting scoring system - the six-item CDT - was examined. We explored the effect of further...

Development and reliability of a multi-modality scoring system for evaluation of disease progression in pre-clinical models of osteoarthritis: celecoxib may possess disease-modifying properties.

Science.gov (United States)

Panahifar, A; Jaremko, J L; Tessier, A G; Lambert, R G; Maksymowych, W P; Fallone, B G; Doschak, M R

2014-10-01

We sought to develop a comprehensive scoring system for evaluation of pre-clinical models of osteoarthritis (OA) progression, and use this to evaluate two different classes of drugs for management of OA. Post-traumatic OA (PTOA) was surgically induced in skeletally mature rats. Rats were randomly divided in three groups receiving either glucosamine (high dose of 192 mg/kg) or celecoxib (clinical dose) or no treatment. Disease progression was monitored utilizing micro-magnetic resonance imaging (MRI), micro-computed tomography (CT) and histology. Pertinent features such as osteophytes, subchondral sclerosis, joint effusion, bone marrow lesion (BML), cysts, loose bodies and cartilage abnormalities were included in designing a sensitive multi-modality based scoring system, termed the rat arthritis knee scoring system (RAKSS). Overall, an inter-observer correlation coefficient (ICC) of greater than 0.750 was achieved for each scored feature. None of the treatments prevented cartilage loss, synovitis, joint effusion, or sclerosis. However, celecoxib significantly reduced osteophyte development compared to placebo. Although signs of inflammation such as synovitis and joint effusion were readily identified at 4 weeks post-operation, we did not detect any BML. We report the development of a sensitive and reliable multi-modality scoring system, the RAKSS, for evaluation of OA severity in pre-clinical animal models. Using this scoring system, we found that celecoxib prevented enlargement of osteophytes in this animal model of PTOA, and thus it may be useful in preventing OA progression. However, it did not show any chondroprotective effect using the recommended dose. In contrast, high dose glucosamine had no measurable effects. Copyright © 2014 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
[Reliability and validity of the PAQ-A questionnaire to assess physical activity in Spanish adolescents].

Science.gov (United States)

Martínez-Gómez, David; Martínez-de-Haro, Vicente; Pozo, Tamara; Welk, Gregory J; Villagra, Ariel; Calle, Marisa E; Marcos, Ascensión; Veiga, Oscar L

2009-01-01

Questionnaires are feasible instruments to assess physical activity (PA) in large samples. The aim of the current study was to evaluate the reliability and validity of the PAQ-A questionnaire in Spanish adolescents using the measurement of PA by accelerometer as criterion. In a sample of 82 adolescents, aged 12 to 17 years, 1-week PAQ-A test-retest was administered. Reliability was analyzed by the Intraclass Correlation Coefficient (ICC) and the internal consistency by the Cronbach's alpha Coefficient. Two hundred thirty-two adolescents, aged 13-17 years, completed the PAQ-A and wore the ActiGraph GT1M accelerometer during 7-days. The PAQ-A was compared against total PA and moderate to vigorous PA (MVPA) obtained by the accelerometer. Test-retest reliability showed ICC = 0.71 for the final score of PAQ-A. Internal consistency was alpha = 0.65 in the first self-report, alpha = 0.67 in the retest in 82 adolescents sample, and alpha = 0.74 in the 232 adolescents sample. The PAQ-A was moderately correlated with total PA (rho = 0.39) and MVPA (rho= 0.34) assessed by the accelerometer. The PAQ-A obtained significantly moderate correlations in boys but not in girls against the accelerometer. The PAQ-A questionnaire shows an adequate reliability and a reasonable validity for assessing PA in Spanish adolescents.
Reliability and validity of the Farsi version of the standardized assessment of personality-abbreviated scale

Directory of Open Access Journals (Sweden)

Maryam Sepehri

2017-06-01

Introduction: A short screening tool for high-risk individuals with personality disorder (PD) is useful both for clinicians and researchers. The aim of this study was to assess the validity and reliability of the Farsi version of the Standardized Assessment of Personality-Abbreviated Scale (SAPAS). Methods: The original English version of the SAPAS questionnaire was translated into Farsi, and then, translated back into English by two professionals. A survey was then conducted using the questionnaire on 150 clients of primary health care centers in Tabriz, Iran. A total of 235 medical students were also studied for the reliability assessment of the questionnaire. The SAPAS was compared to the short form of Minnesota Multiphasic Personality Inventory (MMPI). The data analysis was performed using receiver operating characteristic (ROC) curve technique, operating characteristic for diagnostic efficacy, Cronbach's alpha, and test-retest for reliability evaluation. Results: We found an area under the curve (AUC) of 0.566 [95% confidence interval (CI): 0.455-0.677]; sensitivity of 0.89 and specificity of 0.26 at the cut-off score of 2 and higher. The total Cronbach's alpha coefficient was 0.38 and Cohen's kappa ranged between 0.5 and 0.8. Conclusion: The current study showed that the Farsi version of the SAPAS was relatively less efficient, in terms of validity and reliability, in the screening of PD in the population.
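The abstract above reports sensitivity and specificity at a cut-off score of 2 and higher. A minimal sketch of how such figures are derived from screening scores and a reference diagnosis follows; the function name and the toy data are hypothetical and do not reproduce the study's data or its reported values.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff count as screen-positive."""
    true_pos = false_neg = true_neg = false_pos = 0
    for score, positive in zip(scores, has_condition):
        flagged = score >= cutoff
        if positive and flagged:
            true_pos += 1
        elif positive:
            false_neg += 1
        elif flagged:
            false_pos += 1
        else:
            true_neg += 1

    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one case with and one without the condition")

    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical screening scores and reference diagnoses (illustration only).
toy_scores = [0, 1, 2, 3, 4, 2]
toy_diagnosis = [False, False, False, True, True, True]
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_diagnosis, cutoff=2)
```

A low cut-off flags more people, which tends to raise sensitivity at the expense of specificity, the pattern seen in the reported 0.89 and 0.26.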
Relative Merits of Four Methods for Scoring Cloze Tests.

Science.gov (United States)

Brown, James Dean

1980-01-01

Describes study comparing merits of exact answer, acceptable answer, clozentropy and multiple choice methods for scoring tests. Results show differences among reliability, mean item facility, discrimination and usability, but not validity. (BK)
Reliability and sensitivity to change of the Simple Erosion Narrowing Score compared with the Sharp-van der Heijde method for scoring radiographs in rheumatoid arthritis

NARCIS (Netherlands)

Dias, E. M.; Lukas, C.; Landewé, R.; Fatenejad, S.; van der Heijde, D.

2008-01-01

To compare the performance of a simplified scoring method for structural damage on radiographs of patients with rheumatoid arthritis (the Simple Erosion Narrowing Score or SENS) with the Sharp-van der Heijde Score (SHS) as reference. We used the radiographic data from the Trial of Etanercept and
Cross-cultural adaptation and validation of the persian version of the oxford knee score in patients with knee osteoarthritis.

Science.gov (United States)

Ebrahimzadeh, Mohammad Hosein; Makhmalbaf, Hadi; Birjandinejad, Ali; Soltani-Moghaddas, Seyed Hosein

2014-11-01

The Oxford Knee Score (OKS) is a short patient-reported outcome instrument that measures pain and physical activity related to knee osteoarthritis. The purpose of this study is to evaluate the construct validity and consistent reliability of the Persian version of the OKS. The case series consisted of 80 patients who were clinically diagnosed with having knee osteoarthritis. All patients were requested to fill in the Persian OKS and Short-Form 36 Health Survey (SF-36). Correlation analysis between the Persian versions of these two instruments was then carried out. The scores of the Persian SF-36 were used to evaluate convergent and divergent validity of the 12-item Persian OKS. From a total of 80 patients, 63 were female (79%) and the remaining 17 were male (21%) with a mean age of 52.2 years. In the present study, high Cronbach's alpha of 0.95 confirms excellent internal consistency of the Persian OKS scale similar to previous investigations. The results confirm that the Persian version of this instrument is valid and reliable, similar to its English index and its subsequent translations in different languages. The Persian OKS is a reliable instrument to evaluate knee function in patients with knee osteoarthritis and is a useful tool for outcome measurement in clinical research.
Numerical differences between Guttman's reliability coefficients and the GLB

NARCIS (Netherlands)

Oosterwijk, P.R.; van der Ark, L.A.; Sijtsma, K.; van der Ark, L.A.; Bolt, D.M; Wang, W.-C.; Douglas, J.A.; Wiberg, M.

2016-01-01

For samples smaller than 1000 and tests longer than ten items, the greatest lower bound (GLB) to the reliability is known to be biased and not recommended as a method to estimate test-score reliability. As a first step in finding alternative lower bounds under these conditions, we investigated the
The ageing males' symptoms scale for Chinese men: reliability, validation and applicability of the Chinese version.

Science.gov (United States)

Kong, X-b; Guan, H-t; Li, H-g; Zhou, Y; Xiong, C-l

2014-11-01

In this study, the ageing males' symptoms (AMS) scale was translated into Chinese following methodological recommendations for linguistic and cultural adaptation. To confirm the reliability, validation and applicability of the simplified Chinese version of the scale (CN-AMS) in older Chinese men, a free health screening for men older than 40 years was conducted. All participants completed a health questionnaire, which consisted of personal health information, AMS scale, the generic quality of life (QoL) instrument SF36 and the Beck Depression Inventory (BDI). The fasting blood samples of participants were collected on the day of completing the health questionnaire. Serum total testosterone (TT), albumin and sex hormone-binding globulin levels were measured and the level of free testosterone was calculated (calculated free testosterone, CFT). A total of 244 men (mean age: 52 ± 7.3 years, range: 40-79 years) were involved in the investigation and provided informed consent before their participation. The reliability of CN-AMS was analysed as internal consistency reliability (Cronbach's alpha was 0.91) as well as a 4-week-interval test-retest stability (Pearson's correlation was 0.83) and found to be good. The validation of CN-AMS was analysed as the internal structure analysis (Pearson's correlation between total score and each item score r = 0.48-0.75), total-domain-correlation (among the three domains r = 0.47-0.68, p < 0.01; domains with the total score r = 0.81-0.88, p < 0.01), and cross-validation with other scales (with SF36 r = -0.59, p < 0.01; with BDI r = 0.50, p < 0.01). Androgen deficiency (AD) was defined as the presence of three sexual symptoms (decreased frequency of morning erections, sexual thoughts and erectile dysfunction) in combination with TT < 11 nmol/L and CFT < 220 pmol/L, and the sensitivity and specificity for CN-AMS were 68.8 and 6.8% respectively. The CN-AMS had sufficient sensitivity in
Validity and reliability of a self-administered foot evaluation questionnaire (SAFE-Q).

Science.gov (United States)

Niki, Hisateru; Tatsunami, Shinobu; Haraguchi, Naoki; Aoki, Takafumi; Okuda, Ryuzo; Suda, Yasunori; Takao, Masato; Tanaka, Yasuhito

2013-03-01

The Japanese Society for Surgery of the Foot (JSSF) is developing a QOL questionnaire instrument for use in pathological conditions related to the foot and ankle. The main body of the outcome instrument (the Self-Administered Foot Evaluation Questionnaire, SAFE-Q version 2) consists of 34 questionnaire items, which provide five subscale scores (1: Pain and Pain-Related; 2: Physical Functioning and Daily Living; 3: Social Functioning; 4: Shoe-Related; and 5: General Health and Well-Being). In addition, the instrument has nine optional questionnaire items that provide a Sports Activity subscale score. The purpose of this study was to evaluate the test-retest reliability of the SAFE-Q. Version 2 of the SAFE-Q was administered to 876 patients and 491 non-patients, and the test-retest reliability was evaluated for 131 patients. In addition, the SF-36 questionnaire and the JSSF Scale scoring form were administered to all of the participants. Subscale scores were scaled such that the final sum of scores ranged between zero (least healthy) to 100 (healthiest). The intraclass correlation coefficients were larger than 0.7 for all of the scores. The means of the five subscale scores were between 60 and 75. The five subscales easily separated patients from non-patients. The coefficients for the correlations of the subscale scores with the scores on the JSSF Scale and the SF-36 subscales were all highly statistically significantly greater than zero (p valid and reliable. In the future, it will be beneficial to test the responsiveness of the SAFE-Q.
Reliability and Validity of a Turkish version of the Prenatal Breastfeeding Self-Efficacy Scale.

Science.gov (United States)

Aydin, Ayse; Pasinlioglu, Turkan

2018-05-18

This study aims to conduct a reliability and validity study of the Turkish version of the "Prenatal Breastfeeding Self-Efficacy Scale", which determines pregnant women's perception of breastfeeding self-efficacy in the prenatal period. This methodological research was carried out between December 2014 and May 2016 in maternity clinics of the Erzurum Nene Hatun Maternity Hospital and Atatürk University Research Hospital. The study population consisted of pregnant women admitted to the specified clinics for prenatal controls. The study was carried out with 326 pregnant women, who met the inclusion criteria and agreed to participate in the research without any sample selection. "Personal Information Form" and "Prenatal Breastfeeding Self-Efficacy Scale - Turkish Form" were used for data collection. The data were collected by the face-to-face interview method, and analyzed by SPSS 18 software. In the validity-reliability analysis of the scale, language and content validity, explanatory factor analysis, Cronbach's Alpha coefficient, item-total score correlation, and test-retest methods were used. Linguistic validity was verified by the translation-backtranslation of the Prenatal Breastfeeding Self-Efficacy Scale, then the necessary corrections were made according to the recommendations of the expert opinions, to ensure the content validity. As a result of the explanatory factor analysis, performed to determine the construct validity of the scale, a single factor structure was found, having factor loadings in the appropriate range (0.30-0.76). In the internal consistency analysis of the scale, Cronbach's Alpha was 0.86, and the item-total score correlations were between 0.23 and 0.65, and no item was removed from the scale. In order to test the time-invariance of the scale, the test-retest correlation value was found to be 0.94. The relationship between the two applications was determined to be statistically significant (p valid and reliable measurement instrument
The Screening Test for Emotional Problems-Parent Report (STEP-P): Studies of Reliability and Validity

Science.gov (United States)

Erford, Bradley T.; Alsamadi, Silvana C.

2012-01-01

Score reliability and validity of parent responses concerning their 10- to 17-year-old students were analyzed using the Screening Test for Emotional Problems-Parent Report (STEP-P), which assesses a variety of emotional problems classified under the Individuals with Disabilities Education Improvement Act. Score reliability, convergent, and…
Reliability of the ECHOWS Tool for Assessment of Patient Interviewing Skills.

Science.gov (United States)

Boissonnault, Jill S; Evans, Kerrie; Tuttle, Neil; Hetzel, Scott J; Boissonnault, William G

2016-04-01

History taking is an important component of patient/client management. Assessment of student history-taking competency can be achieved via a standardized tool. The ECHOWS tool has been shown to be valid with modest intrarater reliability in a previous study but did not demonstrate sufficient power to definitively prove its stability. The purposes of this study were: (1) to assess the reliability of the ECHOWS tool for student assessment of patient interviewing skills and (2) to determine whether the tool discerns between novice and experienced skill levels. A reliability and construct validity assessment was conducted. Three faculty members from the United States and Australia scored videotaped histories from standardized patients taken by students and experienced clinicians from each of these countries. The tapes were scored twice, 3 to 6 weeks apart. Reliability was assessed using intraclass correlation coefficients (ICCs) and repeated measures. Analysis of variance models assessed the ability of the tool to discern between novice and experienced skill levels. The ECHOWS tool showed excellent intrarater reliability (ICC [3,1]=.74-.89) and good interrater reliability (ICC [2,1]=.55) as a whole. The summary of performance (S) section showed poor interrater reliability (ICC [2,1]=.27). There was no statistical difference in performance on the tool between novice and experienced clinicians. A possible ceiling effect may occur when standardized patients are not coached to provide complex and obtuse responses to interviewer questions. Variation in familiarity with the ECHOWS tool and in use of the online training may have influenced scoring of the S section. The ECHOWS tool demonstrates excellent intrarater reliability and moderate interrater reliability. Sufficient training with the tool prior to student assessment is recommended. The S section must evolve in order to provide a more discerning measure of interviewing skills. © 2016 American Physical Therapy
Posterior probability of linkage and maximal lod score.

Science.gov (United States)

Génin, E; Martinez, M; Clerget-Darpoux, F

1995-01-01

To detect linkage between a trait and a marker, Morton (1955) proposed to calculate the lod score z(theta 1) at a given value theta 1 of the recombination fraction. If z(theta 1) reaches +3 then linkage is concluded. However, in practice, lod scores are calculated for different values of the recombination fraction between 0 and 0.5 and the test is based on the maximum value of the lod score Zmax. The impact of this deviation of the test on the probability that in fact linkage does not exist, when linkage was concluded, is documented here. This posterior probability of no linkage can be derived by using Bayes' theorem. It is less than 5% when the lod score at a predetermined theta 1 is used for the test. But, for a Zmax of +3, we showed that it can reach 16.4%. Thus, considering a composite alternative hypothesis instead of a single one decreases the reliability of the test. The reliability decreases rapidly when Zmax is less than +3. Given a Zmax of +2.5, there is a 33% chance that linkage does not exist. Moreover, the posterior probability depends not only on the value of Zmax but also jointly on the family structures and on the genetic model. For a given Zmax, the chance that linkage exists may then vary.
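The Bayes' theorem step mentioned above can be illustrated for the simple case of a lod score evaluated at one predetermined recombination fraction. The sketch below assumes a prior probability of linkage of 0.05, which is an illustrative value chosen here and not a figure stated in the abstract. It does not reproduce the authors' calculation for the maximized lod score Zmax, which depends on family structure and the genetic model.

```python
def posterior_no_linkage(lod, prior_linkage):
    """Posterior probability of no linkage given a lod score at a fixed theta.

    A lod score z corresponds to a likelihood ratio of 10**z in favour of
    linkage; Bayes' theorem combines that ratio with the prior.
    """
    if not 0.0 < prior_linkage < 1.0:
        raise ValueError("prior_linkage must lie strictly between 0 and 1")

    likelihood_ratio = 10.0 ** lod
    weight_linked = prior_linkage * likelihood_ratio
    weight_unlinked = 1.0 - prior_linkage
    return weight_unlinked / (weight_linked + weight_unlinked)


# Assumed prior of 0.05 (illustrative); lod score of +3 at a predetermined theta.
p_no_linkage = posterior_no_linkage(lod=3.0, prior_linkage=0.05)
```

Under that assumed prior the result is below 5%, in line with the abstract's statement for a lod score tested at a predetermined theta. The larger figures quoted for Zmax arise because maximizing over theta inflates the lod score under the null hypothesis.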
A Reliability Generalization Study of the Marlowe-Crowne Social Desirability Scale.

Science.gov (United States)

Beretvas, S, Natasha; Meyers, Jason L.; Leite, Walter L.

2002-01-01

Conducted a reliability generalization study of the Marlowe-Crowne Social Desirability Scale (D. Crowne and D. Marlowe, 1960). Analysis of 93 studies show that the predicted score reliability for male adolescents was 0.53, and reliability for men's responses was lower than for women's. Discusses the need for further analysis of the scale. (SLD)
The Adaptation of Acceptance of Couple Violence Scale into Turkish: Validity and Reliability Studies

Directory of Open Access Journals (Sweden)

Özcan SEZER

2008-01-01

This study investigates the validity and reliability of the Turkish adaptation of the Acceptance of Couple Violence Scale (ACVS). The data were obtained from 474 (M = 243, F = 231) high school students who were attending the 1st, 2nd and 3rd classes and came from middle socio-economic levels in Malatya. The Acceptance of Couple Violence Scale has 11 Likert-type items with a 4-point response format. The construct validity of the ACVS was examined using exploratory factor analysis with varimax rotation. A single independent factor with an eigenvalue over 1.00 was found. This factor explained 44% of the total variance. To test concurrent validity, correlations between scores on the ACVS and the Aggressiveness Questionnaire were calculated. There was a significant relationship between scores on the two scales (r = .61). The Cronbach alpha coefficient of the scale was .87; the test-retest correlation coefficient was r = .80. Item-total correlation coefficients varied between .52 and .71. Findings show that the ACVS can be used with an acceptable level of validity and reliability for high school students.
Reliability of Oronasal Fistula Classification.

Science.gov (United States)

Sitzman, Thomas J; Allori, Alexander C; Matic, Damir B; Beals, Stephen P; Fisher, David M; Samson, Thomas D; Marcus, Jeffrey R; Tse, Raymond W

2018-01-01

Objective Oronasal fistula is an important complication of cleft palate repair that is frequently used to evaluate surgical quality, yet reliability of fistula classification has never been examined. The objective of this study was to determine the reliability of oronasal fistula classification both within individual surgeons and between multiple surgeons. Design Using intraoral photographs of children with repaired cleft palate, surgeons rated the location of palatal fistulae using the Pittsburgh Fistula Classification System. Intrarater and interrater reliability scores were calculated for each region of the palate. Participants Eight cleft surgeons rated photographs obtained from 29 children. Results Within individual surgeons reliability for each region of the Pittsburgh classification ranged from moderate to almost perfect (κ = .60-.96). By contrast, reliability between surgeons was lower, ranging from fair to substantial (κ = .23-.70). Between-surgeon reliability was lowest for the junction of the soft and hard palates (κ = .23). Within-surgeon and between-surgeon reliability were almost perfect for the more general classification of fistula in the secondary palate (κ = .95 and κ = .83, respectively). Conclusions This is the first reliability study of fistula classification. We show that the Pittsburgh Fistula Classification System is reliable when used by an individual surgeon, but less reliable when used among multiple surgeons. Comparisons of fistula occurrence among surgeons may be subject to less bias if they use the more general classification of "presence or absence of fistula of the secondary palate" rather than the Pittsburgh Fistula Classification System.
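The κ values above are chance-corrected agreement statistics. The abstract does not say which kappa variant was used for eight raters, so the sketch below shows only the simplest two-rater case, Cohen's kappa, computed as (observed agreement - expected agreement) / (1 - expected agreement). The function name and the toy ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category to each subject."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must rate the same non-empty set of subjects")

    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[cat] * counts_b[cat] for cat in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of four photographs by two raters (illustration only).
toy_a = ["fistula", "fistula", "none", "none"]
toy_b = ["fistula", "none", "none", "none"]
toy_kappa = cohen_kappa(toy_a, toy_b)
```

With more than two raters, a multi-rater statistic such as Fleiss' kappa would be the usual choice; that extension is not shown here.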
Test-retest reliability and minimal detectable change scores for sit-to-stand-to-sit tests, the six-minute walk test, the one-leg heel-rise test, and handgrip strength in people undergoing hemodialysis.

Science.gov (United States)

Segura-Ortí, Eva; Martínez-Olmos, Francisco José

2011-08-01

Determining the relative and absolute reliability of outcomes of physical performance tests for people undergoing hemodialysis is necessary to discriminate between the true effects of exercise interventions and the inherent variability of this cohort. The aims of this study were to assess the relative reliability of sit-to-stand-to-sit tests (the STS-10, which measures the time [in seconds] required to complete 10 full stands from a sitting position, and the STS-60, which measures the number of repetitions achieved in 60 seconds), the Six-Minute Walk Test (6MWT), the one-leg heel-rise test, and the handgrip strength test and to calculate minimal detectable change (MDC) scores in people undergoing hemodialysis. This study was a prospective, nonexperimental investigation. Thirty-nine people undergoing hemodialysis at 2 clinics in Spain were contacted. Study participants performed the STS-10 (n=37), the STS-60 (n=37), and the 6MWT (n=36). At one of the settings, the participants also performed the one-leg heel-rise test (n=21) and the handgrip strength test (n=12) on both the right and the left sides. Participants attended 2 testing sessions 1 to 2 weeks apart. High intraclass correlation coefficients (≥.88) were found for all tests, suggesting good relative reliability. The MDC scores at 90% confidence intervals were as follows: 8.4 seconds for the STS-10, 4 repetitions for the STS-60, 66.3 m for the 6MWT, 3.4 kg for handgrip strength (force-generating capacity), 3.7 repetitions for the one-leg heel-rise test with the right leg, and 5.2 repetitions for the one-leg heel-rise test with the left leg. Limitations: A limited sample of patients was used in this study. The STS-10, STS-60, 6MWT, one-leg heel-rise test, and handgrip strength test are reliable outcome measures. The MDC scores at 90% confidence intervals for these tests will help to determine whether a change is due to error or to an intervention.
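The abstract reports MDC scores but not how they were derived. A minimal sketch of the conventional calculation follows, assuming the usual formulas SEM = SD × √(1 − ICC) and MDC90 = 1.645 × √2 × SEM; the function names and the input values are hypothetical and are not taken from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a test-retest ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.645):
    """MDC at the confidence level implied by z (1.645 for 90%).

    The sqrt(2) term accounts for measurement error in both sessions.
    """
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical inputs: SD of 10 units and ICC of 0.75 (not study data).
print(round(minimal_detectable_change(10.0, 0.75), 2))
```

Under these assumptions, a change smaller than the printed MDC cannot be distinguished from measurement error at the 90% level, which is the interpretation the abstract gives to its MDC scores.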
Influence of reliability of the relay protection to the whole reliability of electric power systems

International Nuclear Information System (INIS)

Stojanovski, Ljupcho I.

2001-01-01

Analyses of the reliability of electric power systems to date have very rarely taken into consideration the influence of the reliability of the relay protection elements; in other words, these analyses assume that the reliability of the protection equals one. This work attempts, by modelling individual types of protection of the elements of high-voltage systems, to calculate the influence of the reliability of the relay protection on the total reliability of high-voltage systems. (Author)
Validation of microsatellite instability histology scores with Bethesda guidelines in hereditary nonpolyposis colorectal cancer

Directory of Open Access Journals (Sweden)

Mustafa Kaya

2017-01-01

Conclusions: The MSI scoring systems, MsPath, and PathScore, are reliable systems and effectively correlated with BG for predicting patients who need advanced analysis techniques because of the risk of HNPCC.
Psychometric Properties of Raw and Scale Scores on Mixed-Format Tests

Science.gov (United States)

Kolen, Michael J.; Lee, Won-Chan

2011-01-01

This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…

Modified personal interviews: resurrecting reliable personal interviews for admissions?

Science.gov (United States)

Hanson, Mark D; Kulasegaram, Kulamakan Mahan; Woods, Nicole N; Fechtig, Lindsey; Anderson, Geoff

2012-10-01

Traditional admissions personal interviews provide flexible faculty-student interactions but are plagued by low inter-interview reliability. Axelson and Kreiter (2009) retrospectively showed that multiple independent sampling (MIS) may improve reliability of personal interviews; thus, the authors incorporated MIS into the admissions process for medical students applying to the University of Toronto's Leadership Education and Development Program (LEAD). They examined the reliability and resource demands of this modified personal interview (MPI) format. In 2010-2011, LEAD candidates submitted written applications, which were used to screen for participation in the MPI process. Selected candidates completed four brief (10-12 minutes) independent MPIs each with a different interviewer. The authors blueprinted MPI questions to (i.e., aligned them with) leadership attributes, and interviewers assessed candidates' eligibility on a five-point Likert-type scale. The authors analyzed inter-interview reliability using the generalizability theory. Sixteen candidates submitted applications; 10 proceeded to the MPI stage. Reliability of the written application components was 0.75. The MPI process had overall inter-interview reliability of 0.79. Correlation between the written application and MPI scores was 0.49. A decision study showed acceptable reliability of 0.74 with only three MPIs scored using one global rating. Furthermore, a traditional admissions interview format would take 66% more time than the MPI format. The MPI format, used during the LEAD admissions process, achieved high reliability with minimal faculty resources. The MPI format's reliability and effective resource use were possible through MIS and employment of expert interviewers. MPIs may be useful for other admissions tasks.
Validation of the total dysphagia risk score (TDRS) in head and neck cancer patients in a conventional and a partially accelerated radiotherapy scheme

NARCIS (Netherlands)

Nevens, Daan; Deschuymer, Sarah; Langendijk, Johannes A.; Daisne, Jean -Francois; Duprez, Frederic; De Neve, Wilfried; Nuyts, Sandra

Background and purpose: A risk model, the total dysphagia risk score (TDRS), was developed to predict which patients are most at risk to develop grade >= 2 dysphagia at 6 months following radiotherapy (RT) for head and neck cancer. The purpose of this study was to validate this model at 6 months and
Psychometric Evaluation of the Lower Extremity Computerized Adaptive Test, the Modified Harris Hip Score, and the Hip Outcome Score.

Science.gov (United States)

Hung, Man; Hon, Shirley D; Cheng, Christine; Franklin, Jeremy D; Aoki, Stephen K; Anderson, Mike B; Kapron, Ashley L; Peters, Christopher L; Pelt, Christopher E

2014-12-01

The applicability and validity of many patient-reported outcome measures in the high-functioning population are not well understood. To compare the psychometric properties of the modified Harris Hip Score (mHHS), the Hip Outcome Score activities of daily living subscale (HOS-ADL) and sports (HOS-sports), and the Lower Extremity Computerized Adaptive Test (LE CAT). The hypothesis was that all instruments would perform well but that the LE CAT would show superiority psychometrically because a combination of CAT and a large item bank allows for a high degree of measurement precision. Cohort study (diagnosis); Level of evidence, 2. Data were collected from 472 advanced-age, active participants from the Huntsman World Senior Games in 2012. Validity evidence was examined through item fit, dimensionality, monotonicity, local independence, differential item functioning, person raw score to measure correlation, and instrument coverage (i.e., ceiling and floor effects), and reliability evidence was examined through Cronbach alpha and person separation index. All instruments demonstrated good item fit, unidimensionality, monotonicity, local independence, and person raw score to measure correlations. The HOS-ADL had high ceiling effects of 36.02%, and the mHHS had ceiling effects of 27.54%. The LE CAT had ceiling effects of 8.47%, and the HOS-sports had no ceiling effects. None of the instruments had any floor effects. The mHHS had a very low Cronbach alpha of 0.41 and an extremely low person separation index of 0.08. Reliabilities for the LE CAT were excellent and for the HOS-ADL and HOS-sports were good. The LE CAT showed better psychometric properties overall than the HOS-ADL, HOS-sports, and mHHS for the senior population. The mHHS demonstrated pronounced ceiling effects and poor reliabilities that should be of concern. The high ceiling effects for the HOS-ADL were also of concern. The LE CAT was superior in all psychometric aspects examined in this study. Future
Specific algorithm method of scoring the Clock Drawing Test applied in cognitively normal elderly

Directory of Open Access Journals (Sweden)

Liana Chaves Mendes-Santos

Full Text Available The Clock Drawing Test (CDT) is an inexpensive, fast and easily administered measure of cognitive function, especially in the elderly. This instrument is a popular clinical tool widely used in screening for cognitive disorders and dementia. The CDT can be applied in different ways and scoring procedures also vary. OBJECTIVE: The aims of this study were to analyze the performance of elderly individuals on the CDT and to evaluate the inter-rater reliability of the CDT scored using a specific algorithm method adapted from Sunderland et al. (1989). METHODS: We analyzed the CDT of 100 cognitively normal elderly individuals aged 60 years or older. The CDT ("free-drawn") and Mini-Mental State Examination (MMSE) were administered to all participants. Six independent examiners scored the CDT of 30 participants to evaluate inter-rater reliability. RESULTS AND CONCLUSION: A score of 5 on the proposed algorithm ("Numbers in reverse order or concentrated"), equivalent to 5 points on the original Sunderland scale, was the most frequent (53.5%). The CDT specific algorithm method used had high inter-rater reliability (p<0.01), and the mean score ranged from 5.06 to 5.96. The high frequency of an overall score of 5 points may suggest the need to create more nuanced evaluation criteria, which are sensitive to differences in levels of impairment in visuoconstructive and executive abilities during aging.
Internal Structure of Mini-CEX Scores for Internal Medicine Residents: Factor Analysis and Generalizability

Science.gov (United States)

Cook, David A.; Beckman, Thomas J.; Mandrekar, Jayawant N.; Pankratz, V. Shane

2010-01-01

The mini-CEX is widely used to rate directly observed resident-patient encounters. Although several studies have explored the reliability of mini-CEX scores, the dimensionality of mini-CEX scores is incompletely understood. Objective: Explore the dimensionality of mini-CEX scores through factor analysis and generalizability analysis. Design:…
Reliability and validity of the upper-body dressing scale in Japanese patients with vascular dementia with hemiparesis.

Science.gov (United States)

Endo, Arisa; Suzuki, Makoto; Akagi, Atsumi; Chiba, Naoyuki; Ishizaka, Ikuyo; Matsunaga, Atsuhiko; Fukuda, Michinari

2015-03-01

The purpose of this study was to examine the reliability and validity of the Upper-body Dressing Scale (UBDS) for buttoned shirt dressing, which evaluates the learning process of new component actions of upper-body dressing in patients diagnosed with dementia and hemiparesis. This was a preliminary correlational study of concurrent validity and reliability in which 10 vascular dementia patients with hemiparesis were enrolled and assessed repeatedly by six occupational therapists by means of the UBDS and the dressing item of the Functional Independence Measure (FIM). Intraclass correlation coefficient was 0.97 for intra-rater reliability and 0.99 for inter-rater reliability. The level of correlation between UBDS score and FIM dressing item scores was -0.93. UBDS scores for paralytic hand passed into the sleeve and sleeve pulled up beyond the shoulder joint were worse than the scores for the other components of the task. The UBDS has good reliability and validity for vascular dementia patients with hemiparesis. Further research is needed to investigate the relation between UBDS score and the effect of intervention and to clarify sensitivity or responsiveness of the scale to clinical change. Copyright © 2014 John Wiley & Sons, Ltd.
The Personality Inventory for DSM-5 Brief Form: Evidence for Reliability and Construct Validity in a Sample of Community-Dwelling Italian Adolescents.

Science.gov (United States)

Fossati, Andrea; Somma, Antonella; Borroni, Serena; Markon, Kristian E; Krueger, Robert F

2017-07-01

To assess the reliability and construct validity of the Personality Inventory for DSM-5 Brief Form (PID-5-BF) among adolescents, 877 Italian high school students were administered the PID-5-BF. Participants were also administered the Measure of Disordered Personality Functioning (MDPF) as a criterion measure. In the full sample, Cronbach's alpha values for the PID-5-BF scales ranged from .59 (Detachment) to .77 (Psychoticism); in addition, all PID-5-BF scales showed mean interitem correlation values in the .22 to .40 range. The Cronbach's alpha value for the PID-5-BF total score was .83 (mean interitem r = .16). Although 2-month test-retest reliability could be assessed only in a small (n = 42) subsample of participants, all PID-5-BF scale scores showed adequate temporal stability, as indexed by intraclass r values ranging from .78 (Negative Affectivity) to .97 (Detachment), all ps < .001. Exploratory structural equation modeling analyses provided at least moderate support for the a priori model of PID-5-BF items. Multiple regression analyses showed that PID-5-BF scales predicted a nonnegligible amount of variance in MDPF Non-Cooperativeness, adjusted R² = .17, p < .001, and Non-Coping scales, adjusted R² = .32, p < .001. Similarly, the PID-5-BF total score was a significant predictor of both MDPF Non-Coping and Non-Cooperativeness scales.
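The alpha values and mean interitem correlations reported above can be related through the standardized alpha formula, α = k·r̄ / (1 + (k − 1)·r̄). The sketch below is illustrative only: the abstract does not state item counts, so k = 25 for the total score and k = 5 per scale are assumptions taken from the published PID-5-BF format, and the study may have reported raw rather than standardized alpha.

```python
def standardized_alpha(n_items, mean_interitem_r):
    """Standardized Cronbach's alpha from item count and mean interitem r."""
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    k, r = n_items, mean_interitem_r
    return k * r / (1.0 + (k - 1) * r)


# Assumed item counts (not stated in the abstract): 25 total, 5 per scale.
print(round(standardized_alpha(25, 0.16), 2))  # total score
print(round(standardized_alpha(5, 0.22), 2))   # lowest mean interitem r
print(round(standardized_alpha(5, 0.40), 2))   # highest mean interitem r
```

The formula also shows why the total score reaches a higher alpha than any single scale despite a lower mean interitem correlation: alpha grows with the number of items.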
Validity and Reliability of the Verbal Numerical Rating Scale for Children Aged 4 to 17 Years With Acute Pain.

Science.gov (United States)

Tsze, Daniel S; von Baeyer, Carl L; Pahalyants, Vartan; Dayan, Peter S

2018-06-01

The Verbal Numerical Rating Scale is the most commonly used self-report measure of pain intensity. It is unclear how the validity and reliability of the scale scores vary across children's ages. We aimed to determine the validity and reliability of the scale for children presenting to the emergency department across a comprehensive spectrum of age. This was a cross-sectional study of children aged 4 to 17 years. Children self-reported their pain intensity, using the Verbal Numerical Rating Scale and Faces Pain Scale-Revised at 2 serial assessments. We evaluated convergent validity (strong validity defined as correlation coefficient ≥0.60), agreement (difference between concurrent Verbal Numerical Rating Scale and Faces Pain Scale-Revised scores), known-groups validity (difference in score between children with painful versus nonpainful conditions), responsivity (decrease in score after analgesic administration), and reliability (test-retest at 2 serial assessments) in the total sample and subgroups based on age. We enrolled 760 children; 27 did not understand the Verbal Numerical Rating Scale and were removed. Of the remainder, Pearson correlations were strong to very strong (0.62 to 0.96) in all years of age except 4 and 5 years, and agreement was strong for children aged 8 and older. Known-groups validity and responsivity were strong in all years of age. Reliability was strong in all age subgroups, including each year of age from 4 to 7 years. Convergent validity, known-groups validity, responsivity, and reliability of the Verbal Numerical Rating Scale were strong for children aged 6 to 17 years. Convergent validity was not strong for children aged 4 and 5 years. Our findings support the use of the Verbal Numerical Rating Scale for most children aged 6 years and older, but not for those aged 4 and 5 years. Copyright © 2017 American College of Emergency Physicians. Published by Elsevier Inc. All rights reserved.
Interrater and intrarater reliability of the Knosp scale for pituitary adenoma grading.

Science.gov (United States)

Mooney, Michael A; Hardesty, Douglas A; Sheehy, John P; Bird, Robert; Chapple, Kristina; White, William L; Little, Andrew S

2017-05-01

OBJECTIVE The goal of this study was to determine the interrater and intrarater reliability of the Knosp grading scale for predicting pituitary adenoma cavernous sinus (CS) involvement. METHODS Six independent raters (3 neurosurgery residents, 2 pituitary surgeons, and 1 neuroradiologist) participated in the study. Each rater scored 50 unique pituitary MRI scans (with contrast) of biopsy-proven pituitary adenoma. Reliabilities for the full scale were determined 3 ways: 1) using all 50 scans, 2) using scans with midrange scores versus end scores, and 3) using a dichotomized scale that reflects common clinical practice. The performance of resident raters was compared with that of faculty raters to assess the influence of training level on reliability. RESULTS Overall, the interrater reliability of the Knosp scale was "strong" (0.73, 95% CI 0.56-0.84). However, the percent agreement for all 6 reviewers was only 10% (26% for faculty members, 30% for residents). The reliability of the middle scores (i.e., average rated Knosp Grades 1 and 2) was "very weak" (0.18, 95% CI -0.27 to 0.56) and the percent agreement for all reviewers was only 5%. When the scale was dichotomized into tumors unlikely to have intraoperative CS involvement (Grades 0, 1, and 2) and those likely to have CS involvement (Grades 3 and 4), the reliability was "strong" (0.60, 95% CI 0.39-0.75) and the percent agreement for all raters improved to 60%. There was no significant difference in reliability between residents and faculty (residents 0.72, 95% CI 0.55-0.83 vs faculty 0.73, 95% CI 0.56-0.84). Intrarater reliability was moderate to strong and increased with the level of experience. CONCLUSIONS Although these findings suggest that the Knosp grading scale has acceptable interrater reliability overall, it raises important questions about the "very weak" reliability of the scale's middle grades. By dichotomizing the scale into clinically useful groups, the authors were able to address the poor
Standardized Total Average Toxicity Score: A Scale- and Grade-Independent Measure of Late Radiotherapy Toxicity to Facilitate Pooling of Data From Different Studies

Energy Technology Data Exchange (ETDEWEB)

Barnett, Gillian C., E-mail: gillbarnett@doctors.org.uk [University of Cambridge Department of Oncology, Oncology Centre, Cambridge (United Kingdom); Cancer Research-UK Centre for Genetic Epidemiology and Department of Oncology, Strangeways Research Laboratories, Cambridge (United Kingdom); West, Catharine M.L. [School of Cancer and Enabling Sciences, Manchester Academic Health Science Centre, University of Manchester, Christie Hospital, Manchester (United Kingdom); Coles, Charlotte E. [University of Cambridge Department of Oncology, Oncology Centre, Cambridge (United Kingdom); Pharoah, Paul D.P. [Cancer Research-UK Centre for Genetic Epidemiology and Department of Oncology, Strangeways Research Laboratories, Cambridge (United Kingdom); Talbot, Christopher J. [Department of Genetics, University of Leicester, Leicester (United Kingdom); Elliott, Rebecca M. [School of Cancer and Enabling Sciences, Manchester Academic Health Science Centre, University of Manchester, Christie Hospital, Manchester (United Kingdom); Tanteles, George A. [Department of Clinical Genetics, University Hospitals of Leicester, Leicester (United Kingdom); Symonds, R. Paul [Department of Cancer Studies and Molecular Medicine, University Hospitals of Leicester, Leicester (United Kingdom); Wilkinson, Jennifer S. [University of Cambridge Department of Oncology, Oncology Centre, Cambridge (United Kingdom); Dunning, Alison M. [Cancer Research-UK Centre for Genetic Epidemiology and Department of Oncology, Strangeways Research Laboratories, Cambridge (United Kingdom); Burnet, Neil G. [University of Cambridge Department of Oncology, Oncology Centre, Cambridge (United Kingdom); Bentzen, Soren M. [University of Wisconsin, School of Medicine and Public Health, Department of Human Oncology, Madison, WI (United States)

2012-03-01

Purpose: The search for clinical and biologic biomarkers associated with late radiotherapy toxicity is hindered by the use of multiple and different endpoints from a variety of scoring systems, hampering comparisons across studies and pooling of data. We propose a novel metric, the Standardized Total Average Toxicity (STAT) score, to try to overcome these difficulties. Methods and Materials: STAT scores were derived for 1010 patients from the Cambridge breast intensity-modulated radiotherapy trial and 493 women from University Hospitals of Leicester. The sensitivity of the STAT score to detect differences between patient groups, stratified by factors known to influence late toxicity, was compared with that of individual endpoints. Analysis of residuals was used to quantify the effect of these covariates. Results: In the Cambridge cohort, STAT scores detected differences (p < 0.00005) between patients attributable to breast volume, surgical specimen weight, dosimetry, acute toxicity, radiation boost to tumor bed, postoperative infection, and smoking (p < 0.0002), with no loss of sensitivity over individual toxicity endpoints. Diabetes (p = 0.017), poor postoperative surgical cosmesis (p = 0.0036), use of chemotherapy (p = 0.0054), and increasing age (p = 0.041) were also associated with increased STAT score. When the Cambridge and Leicester datasets were combined, STAT was associated with smoking status (p < 0.00005), diabetes (p = 0.041), chemotherapy (p = 0.0008), and radiotherapy boost (p = 0.0001). STAT was independent of the toxicity scale used and was able to deal with missing data. There were correlations between residuals of the STAT score obtained using different toxicity scales (r > 0.86, p < 0.00005 for both datasets). Conclusions: The STAT score may be used to facilitate the analysis of overall late radiation toxicity, from multiple trials or centers, in studies of possible genetic and nongenetic determinants of radiotherapy toxicity.
Reliability evaluation of linear multi-state consecutively-connected systems constrained by m consecutive and n total gaps

International Nuclear Information System (INIS)

Yu, Huan; Yang, Jun; Peng, Rui; Zhao, Yu

2016-01-01

This paper extends the linear multi-state consecutively-connected system (LMCCS) to the case of LMCCS-MN, where MN denotes the dual constraints of m consecutive gaps and n total gaps. All the nodes are distributed along a line and form a sequence. The distances between the adjacent nodes are usually non-uniform. The nodes except the last one can contain statistically independent multi-state connection elements (MCEs). Each MCE can provide a connection between the node at which it is located and the next nodes along the sequence. The LMCCS-MN fails if it meets either of the two constraints. The universal generating function technique is adopted to evaluate the system reliability. The optimal allocations of LMCCS-MN with two different types of failures are solved by genetic algorithm. Finally, two examples are given for the demonstration of the proposed model. - Highlights: • A new model of multi-state consecutively-connected system (LMCCS-MN) is proposed. • The non-uniform distributed nodes are involved in the proposed LMCCS-MN model. • An algorithm for system reliability evaluation is provided by the UGF method. • The computational complexity of the proposed algorithm is discussed in detail. • Optimal element allocation problem is formulated and solved.
Human reliability

International Nuclear Information System (INIS)

Bubb, H.

1992-01-01

This book resulted from the activity of Task Force 4.2 - 'Human Reliability'. This group was established on February 27th, 1986, at the plenary meeting of the Technical Reliability Committee of VDI, within the framework of the joint committee of VDI on industrial systems technology - GIS. It is composed of representatives of industry, representatives of research institutes, of technical control boards and universities, whose job it is to study how man fits into the technical side of the world of work and to optimize this interaction. In a total of 17 sessions, information from the part of ergonomy dealing with human reliability in using technical systems at work was exchanged, and different methods for its evaluation were examined and analyzed. The outcome of this work was systematized and compiled in this book. (orig.) [de
Perceptions of Organizational Politics Scale (POPS) Questionnaire into Turkish: A Validity and Reliability Study

Directory of Open Access Journals (Sweden)

Evrim EROL

2016-07-01

Full Text Available This study aimed to translate the Perceptions of Organizational Politics Scale (POPS) into Turkish and to examine the validity and reliability of the translated scale. The validity of the POPS was tested in terms of face, content and construct validity. The application was designed as a two-stage process. In the first stage, face and content validity were tested. In the second stage, evidence for the construct validity of the scale was sought by applying exploratory factor analysis (EFA) and then confirmatory factor analysis (CFA) to the data obtained. Item-total score correlations and the Cronbach alpha coefficient were used to determine the reliability of the scale. The validity and reliability study was conducted on data collected from 277 faculty members working in universities' faculties of education. The faculty members were selected by simple random sampling. The psychometric properties of the Turkish version of the Perceptions of Organizational Politics Scale showed that the scale has a satisfactory level of reliability and validity for the Turkish employee sample.
The European Portuguese WHOQOL-OLD module and the new facet Family/Family life: reliability and validity studies.

Science.gov (United States)

Vilar, Manuela; Sousa, Liliana B; Simões, Mário R

2016-09-01

The aim of this study was to examine the psychometric properties of the European Portuguese version of the World Health Organization Quality of Life-Older Adults Module (WHOQOL-OLD). The European Portuguese WHOQOL-OLD includes a new identified facet, Family/Family life. A convenience sample of older adults was recruited (N = 921). The assessment protocol included demographics, self-perceived health, depressive symptoms (GDS-30), cognitive function (ACE-R), daily life activities (IAFAI), health status (SF-12) and QoL (WHOQOL-Bref, EUROHIS-QOL-8 and WHOQOL-OLD). The internal consistency was excellent for the total 24-item WHOQOL-OLD original version and also for the final 28-item European Portuguese WHOQOL-OLD version. The test-retest reliability for total scores was good. The construct validity of the European Portuguese WHOQOL-OLD was supported in the correlation matrix analysis. The results indicated good convergent/divergent validity. The WHOQOL-OLD scores differentiated groups of older adults who were healthy/unhealthy and without/mild/severe depressive symptoms. The new facet, Family/Family life, presented evidence of good reliability and validity parameters. Comparatively to international studies, the European Portuguese WHOQOL-OLD version showed similar and/or better psychometric properties. The new facet, Family/Family life, introduces cross-cultural specificity to the study of QoL of older adults and generally improves the psychometric robustness of the WHOQOL-OLD.
Validation of interpersonal support evaluation list-12 (ISEL-12) scores among English- and Spanish-speaking Hispanics/Latinos from the HCHS/SOL Sociocultural Ancillary Study.

Science.gov (United States)

Merz, Erin L; Roesch, Scott C; Malcarne, Vanessa L; Penedo, Frank J; Llabre, Maria M; Weitzman, Orit B; Navas-Nacher, Elena L; Perreira, Krista M; Gonzalez, Franklyn; Ponguta, Liliana A; Johnson, Timothy P; Gallo, Linda C

2014-06-01

The Interpersonal Support Evaluation List-12 (ISEL-12; Cohen, Mermelstein, Kamarck, & Hoberman, 1985) is broadly employed as a short-form measure of the traditional ISEL, which measures functional (i.e., perceived) social support. The ISEL-12 can be scored by summing the items to create an overall social support score; three subscale scores representing appraisal, belonging, and tangible social support have also been proposed. Despite extensive use, studies of the psychometric properties of ISEL-12 scores have been limited, particularly among Hispanics/Latinos, the largest and fastest growing ethnic group in the United States. The current study investigated the reliability and structural and convergent validity of ISEL-12 scores using data from 5,313 Hispanics/Latinos who participated in the Hispanic Community Health Study/Study of Latinos Sociocultural Ancillary Study. Participants completed measures in English or Spanish and identified their ancestry as Dominican, Central American, Cuban, Mexican, Puerto Rican, or South American. Cronbach's alphas suggested adequate internal consistency for the total score for all languages and ancestry groups; coefficients for the subscale scores were not acceptable. Confirmatory factor analyses revealed that the one-factor and three-factor models fit the data equally well. Results from multigroup confirmatory factor analyses supported a similar one-factor structure with equivalent response patterns and variances between language groups and ancestry groups. Convergent validity analyses suggested that the total social support score related to scores of social network integration, life engagement, perceived stress, and negative affect (depression, anxiety) in the expected directions.
Cross-cultural adaptation and validation of the Italian version of the Kerlan-Jobe Orthopaedic Clinic Shoulder and Elbow score.

Science.gov (United States)

Merolla, Giovanni; Corona, Katia; Zanoli, Gustavo; Cerciello, Simone; Giannotti, Stefano; Porcellini, Giuseppe

2017-12-01

The Kerlan-Jobe Orthopaedic Clinic (KJOC) Shoulder and Elbow score is a reliable and sensitive tool to measure the performance of overhead athletes. The purpose of this study was to carry out a cross-cultural adaptation and validation of the KJOC questionnaire in Italian and to assess its reliability, validity, and responsiveness. Ninety professional athletes with a painful shoulder were included in this study and were assigned to the "injury group" (n = 32) or the "overuse group" (n = 58); 65 were managed conservatively and 25 were treated by arthroscopic surgery. To assess the reliability of the KJOC score, patients were asked to fill in the questionnaire at baseline and after 2 weeks. To test the construct validity, KJOC scores were compared to those obtained with the Italian version of the Disabilities of the Arm, Shoulder, and Hand (DASH) scale, and with the DASH sports/performing arts module. To test KJOC score responsiveness, the follow-up KJOC scores of the participants treated conservatively were compared to those of the patients treated by arthroscopic surgery. Statistical analysis demonstrated that the KJOC questionnaire is reliable in terms of the single items and the overall score (ICC 0.95-0.99); that it has high construct validity (r s = -0.697; p differences in shoulder function (p < 0.0001). The Italian version of the KJOC Shoulder and Elbow score performed in a similar way to the English version and demonstrated good validity, reliability, and responsiveness after conservative and surgical treatment. II.
The Veterans Affairs Cardiac Risk Score: Recalibrating the Atherosclerotic Cardiovascular Disease Score for Applied Use.

Science.gov (United States)

Sussman, Jeremy B; Wiitala, Wyndy L; Zawistowski, Matthew; Hofer, Timothy P; Bentley, Douglas; Hayward, Rodney A

2017-09-01

Accurately estimating cardiovascular risk is fundamental to good decision-making in cardiovascular disease (CVD) prevention, but risk scores developed in one population often perform poorly in dissimilar populations. We sought to examine whether a large integrated health system can use its electronic health data to better predict individual patients' risk of developing CVD. We created a cohort using all patients ages 45-80 who used Department of Veterans Affairs (VA) ambulatory care services in 2006 with no history of CVD, heart failure, or loop diuretics. Our outcome variable was new-onset CVD in 2007-2011. We then developed a series of recalibrated scores, including a fully refit "VA Risk Score-CVD (VARS-CVD)." We tested the different scores using standard measures of prediction quality. For the 1,512,092 patients in the study, the atherosclerotic cardiovascular disease risk score had discrimination similar to that of the VARS-CVD (c-statistic of 0.66 in men and 0.73 in women), but the atherosclerotic cardiovascular disease model had poor calibration, predicting 63% more events than observed. Calibration was excellent in the fully recalibrated VARS-CVD tool, but the simpler techniques tested proved less reliable. We found that local electronic health record data can be used to estimate CVD risk better than an established risk score based on research populations. Recalibration improved estimates dramatically, and the type of recalibration was important. Such tools can also easily be integrated into a health system's electronic health record and can be more readily updated.
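The two "standard measures of prediction quality" named in this abstract, discrimination (the c-statistic) and calibration (predicted versus observed events), can be sketched as follows. This is an illustrative sketch only, not the study's method: the function names `c_statistic` and `expected_observed_ratio` and the four-patient data set are assumptions invented for the example.

```python
def expected_observed_ratio(predicted_risks, outcomes):
    """Calibration-in-the-large: predicted event count / observed event count.

    A ratio of 1.63 would correspond to predicting 63% more events
    than were observed.
    """
    observed_events = sum(outcomes)
    if observed_events == 0:
        raise ValueError("no observed events; ratio is undefined")
    return sum(predicted_risks) / observed_events


def c_statistic(predicted_risks, outcomes):
    """Share of (case, non-case) pairs in which the case has the higher risk.

    Ties in predicted risk count as half a concordant pair.
    """
    cases = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_cases = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0
            elif case_risk == non_case_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Invented toy data: predicted risks and whether the event occurred (1) or not (0).
risks = [0.10, 0.40, 0.35, 0.80]
events = [0, 0, 1, 1]

discrimination = c_statistic(risks, events)            # 3 of 4 pairs concordant
calibration = expected_observed_ratio(risks, events)   # 1.65 predicted / 2 observed
```

The sketch shows why the two measures can diverge, as in the abstract: multiplying every predicted risk by a constant leaves the pair ordering, and so the c-statistic, unchanged while shifting the predicted-to-observed ratio.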
Reliability and validity of the Perceived Stress Scale-10 in Hispanic Americans with English or Spanish language preference.

Science.gov (United States)

Baik, Sharon H; Fox, Rina S; Mills, Sarah D; Roesch, Scott C; Sadler, Georgia Robins; Klonoff, Elizabeth A; Malcarne, Vanessa L

2017-01-01

This study examined the psychometric properties of the Perceived Stress Scale-10 among 436 community-dwelling Hispanic Americans with English or Spanish language preference. Multigroup confirmatory factor analysis examined the factorial invariance of the Perceived Stress Scale-10 across language groups. Results supported a two-factor model (negative, positive) with equivalent response patterns and item intercepts but different factor covariances across languages. Internal consistency reliability of the Perceived Stress Scale-10 total and subscale scores was good in both language groups. Convergent validity was supported by expected relationships of Perceived Stress Scale-10 scores to measures of anxiety and depression. These results support the use of the Perceived Stress Scale-10 among Hispanic Americans.
Large Sample Confidence Intervals for Item Response Theory Reliability Coefficients

Science.gov (United States)

Andersson, Björn; Xin, Tao

2018-01-01

In applications of item response theory (IRT), an estimate of the reliability of the ability estimates or sum scores is often reported. However, analytical expressions for the standard errors of the estimators of the reliability coefficients are not available in the literature and therefore the variability associated with the estimated reliability…
The reliability of tablet computers in depicting maxillofacial radiographic landmarks

Energy Technology Data Exchange (ETDEWEB)

Tadinada, Aditya; Mahdian, Mina; Sheth, Sonam; Chandhoke, Taranpreet K.; Gopalakrishna, Aadarsh; Potluri, Anitha; Yadav, Sumit [University of Connecticut School of Dental Medicine, Farmington (United States)]

2015-09-15

This study was performed to evaluate the reliability of the identification of anatomical landmarks in panoramic and lateral cephalometric radiographs on a standard medical grade picture archiving communication system (PACS) monitor and a tablet computer (iPad 5). A total of 1000 radiographs, including 500 panoramic and 500 lateral cephalometric radiographs, were retrieved from the de-identified dataset of the archive of the Section of Oral and Maxillofacial Radiology of the University of Connecticut School of Dental Medicine. Major radiographic anatomical landmarks were independently reviewed by two examiners on both displays. The examiners initially reviewed ten panoramic and ten lateral cephalometric radiographs using each imaging system to verify interoperator agreement in landmark identification. The images were scored on a four-point scale reflecting the diagnostic image quality and exposure level of the images. Statistical analysis showed no significant difference between the two displays regarding the visibility and clarity of the landmarks in either the panoramic or cephalometric radiographs. Tablet computers can reliably show anatomical landmarks in panoramic and lateral cephalometric radiographs.