inter-observer reliability study: Topics by WorldWideScience.org

Sample records for inter-observer reliability study

Exploring Differences in Measurement and Reporting of Classroom Observation Inter-Rater Reliability

Science.gov (United States)

Wilhelm, Anne Garrison; Gillespie Rouse, Amy; Jones, Francesca

2018-01-01

Although inter-rater reliability is an important aspect of using observational instruments, it has received little theoretical attention. In this article, we offer some guidance for practitioners and consumers of classroom observations so that they can make decisions about inter-rater reliability, both for study design and in the reporting of data…
Inter- and intra-observer reliability of masking in plantar pressure measurement analysis.

Science.gov (United States)

Deschamps, K; Birch, I; Mc Innes, J; Desloovere, K; Matricali, G A

2009-10-01

Plantar pressure measurement is an important tool in gait analysis. Manual placement of small masks (masking) is increasingly used to calculate plantar pressure characteristics. Little is known concerning the reliability of manual masking. The aim of this study was to determine the reliability of masking on 2D plantar pressure footprints, in a population with forefoot deformity (i.e. hallux valgus). Using a random repeated-measure design, four observers identified the third metatarsal head on a peak-pressure barefoot footprint, using a small mask. Subsequently, the location of all five metatarsal heads was identified, using the same size of masks and the same protocol. The 2D positional variation of the masks and the peak pressure (PP) and pressure time integral (PTI) values of each mask were calculated. For single-masking the lowest inter-observer reliability was found for the distal-proximal direction, causing a clear, adverse impact on the reliability of the pressure characteristics (PP and PTI). In the medial-lateral direction the inter-observer reliability could be scored as high. Intra-observer reliability was better and could be scored as high or good for both directions, with a correlated improved reliability of the pressure characteristics. Reliability of multi-masking showed a similar pattern, but overall values tended to be lower. Therefore, small sized masking in order to define pressure characteristics in the forefoot should be done with care.
Cardiac valve calcifications on low-dose unenhanced ungated chest computed tomography: inter-observer and inter-examination reliability, agreement and variability

International Nuclear Information System (INIS)

Hamersvelt, Robbert W. van; Willemink, Martin J.; Takx, Richard A.P.; Eikendal, Anouk L.M.; Budde, Ricardo P.J.; Leiner, Tim; Jong, Pim A. de; Mol, Christian P.; Isgum, Ivana

2014-01-01

To determine inter-observer and inter-examination variability for aortic valve calcification (AVC) and mitral valve and annulus calcification (MC) in low-dose unenhanced ungated lung cancer screening chest computed tomography (CT). We included 578 lung cancer screening trial participants who were examined by CT twice within 3 months to follow indeterminate pulmonary nodules. On these CTs, AVC and MC were measured in cubic millimetres. One hundred CTs were examined by five observers to determine the inter-observer variability. Reliability was assessed by kappa statistics (κ) and intra-class correlation coefficients (ICCs). Variability was expressed as the mean difference ± standard deviation (SD). Inter-examination reliability was excellent for AVC (κ = 0.94, ICC = 0.96) and MC (κ = 0.95, ICC = 0.90). Inter-examination variability was 12.7 ± 118.2 mm³ for AVC and 31.5 ± 219.2 mm³ for MC. Inter-observer reliability ranged from κ = 0.68 to κ = 0.92 for AVC and from κ = 0.20 to κ = 0.66 for MC. Inter-observer ICC was 0.94 for AVC and ranged from 0.56 to 0.97 for MC. Inter-observer variability ranged from -30.5 ± 252.0 mm³ to 84.0 ± 240.5 mm³ for AVC and from -95.2 ± 210.0 mm³ to 303.7 ± 501.6 mm³ for MC. AVC can be quantified with excellent reliability on ungated unenhanced low-dose chest CT, but manual detection of MC can be subject to substantial inter-observer variability. Lung cancer screening CT may be used for detection and quantification of cardiac valve calcifications. (orig.)
Reliability of radiographic observations recorded on a proforma measured using inter- and intra-observer variation: a preliminary study.

Science.gov (United States)

Saunders, M B; Gulabivala, K; Holt, R; Kahan, R S

2000-05-01

The aim of this preliminary study was to test the reliability of radiographic evaluation of features of endodontic interest using a newly devised data collection system. Twelve endodontic MSc postgraduate students and one specialist endodontist examined sample radiographs derived from a random selection of 42 patients seen previously on an Endodontic New Patient Clinic (EDI). Each student examined a random selection of 8-9 roots on periapical radiographs of single- and multirooted teeth, with and without previous root canal therapy and 3-4 dental panoramic tomograms (DPTs). A total of 100 roots were examined. A proforma was used to record observations on 67 radiographic features using predefined criteria. Intra-observer agreement was tested by asking the students to re-examine the radiographs. The principal investigator and the specialist endodontist examined the same radiographs and devised a Gold Standard using the same criteria. This was compared with the student assessments to determine inter-observer variation. The postgraduates then attended a revision session on the use of the form. Each student subsequently examined 8-9 different roots from the pool of radiographs. A further assessment of inter-observer variation was made by comparing these observations with the Gold Standard. Of the 67 radiographic features, only 25 had sufficient response to allow statistical analysis. Kappa values for intra- and inter-observer variation were estimated. These varied depending on the particular radiographic feature being assessed. Fifteen out of 25 intra-observer recordings showed 'good' or 'very good' Kappa agreement, but only three out of 25 inter-observer observations achieved 'good' or 'very good' values. Inter-observer variation was improved following the revision session with 16 out of 25 observations achieving 'good' or 'very good' Kappa agreement. Modification to the proforma, the criteria used, and training for radiographic assessment were considered necessary to
Cardiac valve calcifications on low-dose unenhanced ungated chest computed tomography: inter-observer and inter-examination reliability, agreement and variability

Energy Technology Data Exchange (ETDEWEB)

Hamersvelt, Robbert W. van; Willemink, Martin J.; Takx, Richard A.P.; Eikendal, Anouk L.M.; Budde, Ricardo P.J.; Leiner, Tim; Jong, Pim A. de [University Medical Center Utrecht, Department of Radiology, Utrecht (Netherlands)]; Mol, Christian P.; Isgum, Ivana [University Medical Center Utrecht, Image Sciences Institute, Utrecht (Netherlands)]

2014-07-15

To determine inter-observer and inter-examination variability for aortic valve calcification (AVC) and mitral valve and annulus calcification (MC) in low-dose unenhanced ungated lung cancer screening chest computed tomography (CT). We included 578 lung cancer screening trial participants who were examined by CT twice within 3 months to follow indeterminate pulmonary nodules. On these CTs, AVC and MC were measured in cubic millimetres. One hundred CTs were examined by five observers to determine the inter-observer variability. Reliability was assessed by kappa statistics (κ) and intra-class correlation coefficients (ICCs). Variability was expressed as the mean difference ± standard deviation (SD). Inter-examination reliability was excellent for AVC (κ = 0.94, ICC = 0.96) and MC (κ = 0.95, ICC = 0.90). Inter-examination variability was 12.7 ± 118.2 mm³ for AVC and 31.5 ± 219.2 mm³ for MC. Inter-observer reliability ranged from κ = 0.68 to κ = 0.92 for AVC and from κ = 0.20 to κ = 0.66 for MC. Inter-observer ICC was 0.94 for AVC and ranged from 0.56 to 0.97 for MC. Inter-observer variability ranged from -30.5 ± 252.0 mm³ to 84.0 ± 240.5 mm³ for AVC and from -95.2 ± 210.0 mm³ to 303.7 ± 501.6 mm³ for MC. AVC can be quantified with excellent reliability on ungated unenhanced low-dose chest CT, but manual detection of MC can be subject to substantial inter-observer variability. Lung cancer screening CT may be used for detection and quantification of cardiac valve calcifications. (orig.)
Inter-observer reliability assessments in time motion studies: the foundation for meaningful clinical workflow analysis.

Science.gov (United States)

Lopetegui, Marcelo A; Bai, Shasha; Yen, Po-Yin; Lai, Albert; Embi, Peter; Payne, Philip R O

2013-01-01

Understanding clinical workflow is critical for researchers and healthcare decision makers. Current workflow studies tend to oversimplify and underrepresent the complexity of clinical workflow. Continuous observation time motion studies (TMS) could enhance clinical workflow studies by providing rich quantitative data required for in-depth workflow analyses. However, methodological inconsistencies have been reported in continuous observation TMS, potentially reducing the validity of TMS' data and limiting their contribution to the general state of knowledge. We believe that a cornerstone in standardizing TMS is to ensure the reliability of the human observers. In this manuscript we review the approaches for inter-observer reliability assessment (IORA) in a representative sample of TMS focusing on clinical workflow. We found that IORA is an uncommon practice, inconsistently reported, and often uses methods that provide partial and overestimated measures of agreement. Since a comprehensive approach to IORA is yet to be proposed and validated, we provide initial recommendations for IORA reporting in continuous observation TMS.
Inter-Observer and Intra-Observer Reliability of Clinical Assessments in Knee Osteoarthritis

Science.gov (United States)

Maricar, Nasimah; Callaghan, Michael J; Parkes, Matthew J; Felson, David T; O’Neill, Terence W

2016-01-01

Background Clinical examination of the knee is subject to measurement error. The aim of this analysis was to determine inter- and intra-observer reliability of commonly used clinical tests in patients with knee osteoarthritis (OA). Methods We studied subjects with symptomatic knee OA who were participants in an open-label clinical trial of intra-articular steroid therapy. Following standardisation of the clinical test procedures, two clinicians assessed 25 subjects independently at the same visit, and the same clinician assessed 88 subjects over an interval period of 2–10 weeks; in both cases prior to the steroid intervention. Clinical examination included assessment of bony enlargement, crepitus, quadriceps wasting, knee effusion, joint-line and anserine tenderness and knee range of movement (ROM). Intra-class correlation coefficients (ICC), estimated kappa (κ), weighted kappa (κω) and Bland and Altman plots were used to determine inter- and intra-observer levels of agreement. Results Using Landis and Koch criteria, inter-observer kappa scores were moderate for patellofemoral joint (κ=0.53) and anserine tenderness (κ=0.48); good for bony enlargement (κ=0.66), quadriceps wasting (κ=0.78), crepitus (κ=0.78), medial tibiofemoral joint tenderness (κ=0.76), and effusion assessed by ballottement (κ=0.73) and bulge sign (κω=0.78); and excellent for lateral tibiofemoral joint tenderness (κ=1.00), flexion (ICC=0.97) and extension (ICC=0.87) ROM. Intra-observer kappa scores were moderate for lateral tibiofemoral joint tenderness (κ=0.60), good for crepitus (κ=0.78), effusion assessed by ballottement test (κ=0.77), patellofemoral joint (κ=0.66), medial tibiofemoral joint (κ=0.64) and anserine (κ=0.73) tenderness and excellent for effusion assessed by bulge sign (κω=0.83), bony enlargement (κ=0.98), quadriceps wasting (κ=0.83), flexion (ICC=0.99) and extension (ICC=0.96) ROM. Conclusion Among individuals with symptomatic knee OA, the reliability of clinical examination of the
Assessment of disabilities in stroke patients with apraxia: internal consistency and inter-observer reliability

NARCIS (Netherlands)

van Heugten, CM; Dekker, J; Deelman, BG; Stehmann-Saris, JC; Kinebanian, A

1999-01-01

In this paper the internal consistency and inter-observer reliability of the assessment of disabilities in stroke patients with apraxia is presented. Disabilities were assessed by means of observation of activities of daily living (ADL). The study was conducted at occupational therapy departments in
Inter- and intra- observer reliability of risk assessment of repetitive work without an explicit method.

Science.gov (United States)

Eliasson, Kristina; Palm, Peter; Nyman, Teresia; Forsman, Mikael

2017-07-01

A common way to conduct practical risk assessments is to observe a job and report the observed long-term risks for musculoskeletal disorders. The aim of this study was to evaluate the inter- and intra-observer reliability of ergonomists' risk assessments without the support of an explicit risk assessment method. Twenty-one experienced ergonomists assessed the risk level (low, moderate, high risk) of eight upper body regions, as well as the global risk of 10 video recorded work tasks. Intra-observer reliability was assessed by having nine of the ergonomists repeat the procedure at least three weeks after the first assessment. The ergonomists based their risk assessments on their own experience and knowledge. The statistical parameters of reliability included agreement in %, kappa, linearly weighted kappa, intraclass correlation and Kendall's coefficient of concordance. The average inter-observer agreement of the global risk was 53% and the corresponding weighted kappa (Kw) was 0.32, indicating fair reliability. The intra-observer agreement was 61% and 0.41 (Kw). This study indicates that risk assessments of the upper body, without the use of an explicit observational method, have non-acceptable reliability. It is therefore recommended to use systematic risk assessment methods to a higher degree. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Inter-observer reliability of DSM-5 substance use disorders.

Science.gov (United States)

Denis, Cécile M; Gelernter, Joel; Hart, Amy B; Kranzler, Henry R

2015-08-01

Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence concerning the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Inter-Observer Reliability of DSM-5 Substance Use Disorders

Science.gov (United States)

Denis, Cécile M.; Gelernter, Joel; Hart, Amy B.; Kranzler, Henry R.

2015-01-01

Aims Although studies have examined the impact of changes made in DSM-5 on the estimated prevalence of substance use disorder (SUD) diagnoses, there is limited evidence of the reliability of DSM-5 SUDs. We evaluated the inter-observer reliability of four DSM-5 SUDs in a sample in which we had previously evaluated the reliability of DSM-IV diagnoses, allowing us to compare the two systems. Methods Two different interviewers each assessed 173 subjects over a 2-week period using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). Using the percent agreement and kappa (κ) coefficient, we examined the reliability of DSM-5 lifetime alcohol, opioid, cocaine, and cannabis use disorders, which we compared to that of SSADDA-derived DSM-IV SUD diagnoses. We also assessed the effect of additional lifetime SUD and lifetime mood or anxiety disorder diagnoses on the reliability of the DSM-5 SUD diagnoses. Results Reliability was good to excellent for the four disorders, with κ values ranging from 0.65 to 0.94. Agreement was consistently lower for SUDs of mild severity than for moderate or severe disorders. DSM-5 SUD diagnoses showed greater reliability than DSM-IV diagnoses of abuse or dependence or dependence only. Co-occurring SUD and lifetime mood or anxiety disorders exerted a modest effect on the reliability of the DSM-5 SUD diagnoses. Conclusions For alcohol, opioid, cocaine and cannabis use disorders, DSM-5 criteria and diagnoses are at least as reliable as those of DSM-IV. PMID:26048641
Improved radiograph measurement inter-observer reliability by use of statistical shape models

Energy Technology Data Exchange (ETDEWEB)

Pegg, E.C.; Mellon, S.J.; Salmon, G.; Alvand, A.; Pandit, H.; Murray, D.W.; Gill, H.S. [University of Oxford, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Windmill Road, Oxford OX3 7LD (United Kingdom)]

2012-10-15

Pre- and post-operative radiographs of patients undergoing joint arthroplasty are often examined for a variety of purposes including preoperative planning and patient assessment. This work examines the feasibility of using active shape models (ASM) to semi-automate measurements from post-operative radiographs for the specific case of the Oxford™ Unicompartmental Knee. Measurements of the proximal tibia and the position of the tibial tray were made using the ASM model and manually. Data were obtained by four observers and one observer took four sets of measurements to allow assessment of the inter- and intra-observer reliability, respectively. The parameters measured were the tibial tray angle, the tray overhang, the tray size, the sagittal cut position, the resection level and the tibial width. Results demonstrated improved reliability (average of 27% and 11.2% increase for intra- and inter-reliability, respectively) and equivalent accuracy (p > 0.05 for compared data values) for all of the measurements using the ASM model, with the exception of the tray overhang (p = 0.0001). Less time (15 s) was required to take measurements using the ASM model compared with manual measurements, which was significant. These encouraging results indicate that semi-automated measurement techniques could improve the reliability of radiographic measurements.
Improved radiograph measurement inter-observer reliability by use of statistical shape models

International Nuclear Information System (INIS)

Pegg, E.C.; Mellon, S.J.; Salmon, G.; Alvand, A.; Pandit, H.; Murray, D.W.; Gill, H.S.

2012-01-01

Pre- and post-operative radiographs of patients undergoing joint arthroplasty are often examined for a variety of purposes including preoperative planning and patient assessment. This work examines the feasibility of using active shape models (ASM) to semi-automate measurements from post-operative radiographs for the specific case of the Oxford™ Unicompartmental Knee. Measurements of the proximal tibia and the position of the tibial tray were made using the ASM model and manually. Data were obtained by four observers and one observer took four sets of measurements to allow assessment of the inter- and intra-observer reliability, respectively. The parameters measured were the tibial tray angle, the tray overhang, the tray size, the sagittal cut position, the resection level and the tibial width. Results demonstrated improved reliability (average of 27% and 11.2% increase for intra- and inter-reliability, respectively) and equivalent accuracy (p > 0.05 for compared data values) for all of the measurements using the ASM model, with the exception of the tray overhang (p = 0.0001). Less time (15 s) was required to take measurements using the ASM model compared with manual measurements, which was significant. These encouraging results indicate that semi-automated measurement techniques could improve the reliability of radiographic measurements.
IRR (Inter-Rater Reliability) of a COP (Classroom Observation Protocol)--A Critical Appraisal

Science.gov (United States)

Rui, Ning; Feldman, Jill M.

2012-01-01

Notwithstanding broad utility of COPs (classroom observation protocols), there has been limited documentation of the psychometric properties of even the most popular COPs. This study attempted to fill this void by closely examining the item and domain-level IRR (inter-rater reliability) of a COP that was used in a federally funded striving readers…
Inter-rater reliability of an observation-based ergonomics assessment checklist for office workers.

Science.gov (United States)

Pereira, Michelle Jessica; Straker, Leon Melville; Comans, Tracy Anne; Johnston, Venerina

2016-12-01

To establish the inter-rater reliability of an observation-based ergonomics assessment checklist for computer workers. A 37-item (38-item if a laptop was part of the workstation) comprehensive observational ergonomics assessment checklist comparable to government guidelines and up to date with empirical evidence was developed. Two trained practitioners assessed full-time office workers performing their usual computer-based work and evaluated the suitability of workstations used. Practitioners assessed each participant consecutively. The order of assessors was randomised, and the second assessor was blinded to the findings of the first. Unadjusted kappa coefficients between the raters were obtained for the overall checklist and subsections that were formed from question-items relevant to specific workstation equipment. Twenty-seven office workers were recruited. The inter-rater reliability between two trained practitioners achieved moderate to good reliability for all except one checklist component. This checklist has mostly moderate to good reliability between two trained practitioners. Practitioner Summary: This reliable ergonomics assessment checklist for computer workers was designed using accessible government guidelines and supplemented with up-to-date evidence. Employers in Queensland (Australia) can fulfil legislative requirements by using this reliable checklist to identify and subsequently address potential risk factors for work-related injury to provide a safe working environment.
Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial

Directory of Open Access Journals (Sweden)

Kevin A. Hallgren

2012-02-01

Many research designs require the assessment of inter-rater reliability (IRR) to demonstrate consistency among observational ratings provided by multiple coders. However, many studies use incorrect statistical procedures, fail to fully report the information necessary to interpret their results, or do not address how IRR affects the power of their subsequent analyses for hypothesis testing. This paper provides an overview of methodological issues related to the assessment of IRR with a focus on study design, selection of appropriate statistics, and the computation, interpretation, and reporting of some commonly used IRR statistics. Computational examples include SPSS and R syntax for computing Cohen's kappa and intra-class correlations to assess IRR.
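
For readers who want to see what the kappa statistic named in this abstract measures, the following is a minimal Python sketch of unweighted Cohen's kappa for two raters. It is not taken from the paper, which provides SPSS and R syntax; the function name `cohens_kappa` and the example ratings are illustrative assumptions. Kappa is computed as (observed agreement - chance agreement) / (1 - chance agreement), with chance agreement derived from each rater's marginal category frequencies.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters who coded the same items.

    Hypothetical helper for illustration; not the paper's SPSS/R syntax.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Proportion of items on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from the raters' marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Made-up ratings: the raters agree on 3 of 4 items.
example = cohens_kappa(["yes", "yes", "no", "no"], ["yes", "no", "no", "no"])
```

In this made-up example the observed agreement is 0.75 and the chance agreement is 0.5, so kappa is 0.5, noticeably lower than the raw percent agreement.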
Inter-observer and inter-examination variability of manual vertebral bone attenuation measurements on computed tomography

International Nuclear Information System (INIS)

Pompe, Esther; Lammers, Jan-Willem J.; Jong, Pim A. de; Jong, Werner U. de; Takx, Richard A.P.; Eikendal, Anouk L.M.; Willemink, Martin J.; Mohamed Hoesein, Firdaus A.A.; Oudkerk, Matthijs; Budde, Ricardo P.J.

2016-01-01

To determine inter-observer and inter-examination variability of manual attenuation measurements of the vertebrae in low-dose unenhanced chest computed tomography (CT). Three hundred and sixty-seven lung cancer screening trial participants who underwent baseline and repeat unenhanced low-dose CT after 3 months because of an indeterminate lung nodule were included. The CT attenuation value of the first lumbar vertebra (L1) was measured in all CTs by one observer to obtain inter-examination reliability. Six observers performed measurements in 100 randomly selected CTs to determine agreement with limits of agreement and Bland-Altman plots and reliability with intraclass correlation coefficients (ICCs). Reclassification analyses were performed using a threshold of 110 HU to define osteoporosis. Inter-examination reliability was excellent with an ICC of 0.92 (p < 0.001). Inter-examination limits of agreement ranged from -26 to 28 HU with a mean difference of 1 ± 14 HU. Inter-observer reliability ICCs ranged from 0.70 to 0.91. Inter-examination variability led to 11.2 % reclassification of participants and inter-observer variability led to 22.1 % reclassification. Vertebral attenuation values can be manually quantified with good to excellent inter-examination and inter-observer reliability on unenhanced low-dose chest CT. This information is valuable for early detection of osteoporosis on low-dose chest CT. (orig.)
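
The limits of agreement mentioned in this abstract are conventionally the Bland-Altman limits: the mean of the paired differences plus or minus 1.96 times their standard deviation. The Python sketch below illustrates that convention under the assumption that this is the definition the authors used; the function name `limits_of_agreement` and the sample values are hypothetical and do not come from the study. As a rough check, a mean difference of 1 HU with a standard deviation of 14 HU gives limits of about -26 to 28 HU, in line with the figures reported.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, z=1.96):
    """Return (lower limit, bias, upper limit) for paired measurements.

    Hypothetical helper for illustration; assumes the conventional
    Bland-Altman definition (bias +/- z * SD of the paired differences).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")

    # Difference between the two measurements of each subject.
    diffs = [a - b for a, b in zip(first, second)]

    bias = mean(diffs)     # systematic difference between the two series
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias - z * spread, bias, bias + z * spread


# Made-up attenuation values (HU) from two examinations of four subjects.
lower, bias, upper = limits_of_agreement([10, 12, 14, 16], [9, 13, 13, 17])
```

Unlike an ICC, which is unitless, the limits are expressed in the measurement's own units, which is why the abstract reports both.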
Inter-observer and intra-observer reliability in the radiographic diagnosis of avascular necrosis of the femoral head following reconstructive hip surgery in children with cerebral palsy.

Science.gov (United States)

Hesketh, Kim; Sankar, Wudbhav; Joseph, Benjamin; Narayanan, Unni; Mulpuri, Kishore

2016-04-01

The incidence of avascular necrosis (AVN) following reconstructive hip surgery in cerebral palsy (CP) ranges from 0 to 69 % in the current literature. The purpose of this study was to determine the inter- and intra-observer reliability of radiographically diagnosing AVN in children with CP after hip surgery. A retrospective review of 65 children with CP who had reconstructive hip surgery between 2009 and 2012 at BC Children's Hospital was completed. Anterior-posterior and lateral radiographs were presented to four pediatric orthopaedic surgeons over two rounds. Surgeons were asked to review the set of unidentified radiographs and comment 'yes' or 'no' for the presence of AVN. Two weeks later the same set of radiographs was sent in a different order and the surgeons were again asked to comment on AVN. Inter- and intra-observer reliability was determined using kappa statistics. The intra-observer reliability ranged from 0.65 to 0.88 with an average score of 0.76. Inter-observer reliability showed greater variability, ranging from 0.41 to 0.77 with an average score of 0.56 across all surgeons. Although the intra-rater reliability produced a strength of "good" and the inter-rater reliability a strength of "moderate" agreement, the variability within these scores is clinically important as it demonstrates the difficulty in identifying AVN. This may explain the variability in AVN that is reported in the literature. Further education and research in the diagnosis of AVN in children with CP who have undergone reconstructive hip surgery are needed.
Inter-rater reliability of PATH observations for assessment of ergonomic risk factors in hospital work.

Science.gov (United States)

Park, Jung-Keun; Boyer, Jon; Tessler, Jamie; Casey, Jeffrey; Schemm, Linda; Gore, Rebecca; Punnett, Laura

2009-07-01

This study examined the inter-rater reliability of expert observations of ergonomic risk factors by four analysts. Ten jobs were observed at a hospital using a newly expanded version of the PATH method (Buchholz et al. 1996), to which selected upper extremity exposures had been added. Two of the four raters simultaneously observed each worker onsite for a total of 443 observation pairs containing 18 categorical exposure items each. For most exposure items, kappa coefficients were 0.4 or higher. For some items, agreement was higher both for the jobs with less rapid hand activity and for the analysts with a higher level of ergonomic job analysis experience. These upper extremity exposures could be characterised reliably with real-time observation, given adequate experience and training of the observers. The revised version of PATH is applicable to the analysis of jobs where upper extremity musculoskeletal strain is of concern.

Scoring haemophilic arthropathy on X-rays: improving inter- and intra-observer reliability and agreement using a consensus atlas

Energy Technology Data Exchange (ETDEWEB)

Foppen, Wouter; Schaaf, Irene C. van der; Beek, Frederik J.A. [University Medical Center Utrecht, Department of Radiology (Netherlands)]; Verkooijen, Helena M. [University Medical Center Utrecht, Department of Radiology (Netherlands); University Medical Center Utrecht, Julius Center for Health Sciences and Primary Care, Utrecht (Netherlands)]; Fischer, Kathelijn [University Medical Center Utrecht, Julius Center for Health Sciences and Primary Care, Utrecht (Netherlands); University Medical Center Utrecht, Van Creveldkliniek, Department of Hematology, Utrecht (Netherlands)]

2016-06-15

The radiological Pettersson score (PS) is widely applied for classification of arthropathy to evaluate costly haemophilia treatment. This study aims to assess and improve inter- and intra-observer reliability and agreement of the PS. Two series of X-rays (bilateral elbows, knees, and ankles) of 10 haemophilia patients (120 joints) with haemophilic arthropathy were scored by three observers according to the PS (maximum score 13/joint). Subsequently, (dis-)agreement in scoring was discussed until consensus. Example images were collected in an atlas. Thereafter, a second series of 120 joints was scored using the atlas. One observer rescored the second series after three months. Reliability was assessed by intraclass correlation coefficients (ICC), agreement by limits of agreement (LoA). Median Pettersson score at joint level (PS_joint) of affected joints was 6 (interquartile range 3-9). Using the consensus atlas, inter-observer reliability of the PS_joint improved significantly from 0.94 (95 % confidence interval (CI) 0.91-0.96) to 0.97 (CI 0.96-0.98). LoA improved from ±1.7 to ±1.1 for the PS_joint. Therefore, true differences in arthropathy were differences in the PS_joint of >2 points. Intra-observer reliability of the PS_joint was 0.98 (CI 0.97-0.98), intra-observer LoA were ±0.9 points. Reliability and agreement of the PS improved by using a consensus atlas. (orig.)
Reliability of Alberta Infant Motor Scale Using Recorded Video Observations Among the Preterm Infants in India: A Reliability Study

Directory of Open Access Journals (Sweden)

Veena Kirthika S

2017-10-01

Background: Assessment of motor function is a vital characteristic of infant development. The Alberta Infant Motor Scale (AIMS) is considered to be one of the tools available for screening developmental delays, but this scale was formulated using western samples. Every country has its own ethnic and cultural background, and various differences are observed in culture and ethnicity. Therefore, there is a need to establish reliability for the use of the AIMS in the south Indian population. Purpose: To find the intra-rater and inter-rater reliability of the Alberta Infant Motor Scale (AIMS) in preterm infants using recorded video observations in an Indian population. Method: 30 preterm infants in three age groups, 0-3 months (10 infants), 4-7 months (10 infants), and 8-18 months (10 infants), were recruited for this reliability study. The AIMS was administered to the preterm infants and the performance was videotaped. The performance was then rescored by the same therapist, immediately from the video and in another two consecutive months, to estimate intra-rater reliability using ICC (3,1), two-way mixed effects model. For reporting inter-rater reliability, the AIMS was scored by three different raters, using ICC (2,k), two-way random effects model, and by two other therapists to examine the inter- and intra-rater reliability. Results: The two-way mixed effects model for intra-rater reliability of the AIMS gave ICC (3,1) = 0.99, and the two-way random effects model for inter-rater reliability of the AIMS gave ICC (2,k) = 0.96. Conclusion: The AIMS has excellent intra- and inter-rater reliability using recorded video observations among preterm infants in India.
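The two ICC forms named in this abstract can be sketched from the mean squares of a two-way ANOVA without replication. The version below assumes the usual Shrout-Fleiss definitions, ICC(3,1) = (MSR - MSE) / (MSR + (k - 1) MSE) and ICC(2,k) = (MSR - MSE) / (MSR + (MSC - MSE) / n), for n subjects and k raters; the function name and the ratings table are illustrative and are not data from the study.

```python
def icc_two_way(ratings):
    """Return (ICC(3,1), ICC(2,k)) for a subjects-by-raters table.

    ratings: list of rows, one row per subject, one column per rater.
    Uses mean squares for subjects (MSR), raters (MSC) and residual (MSE).
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))

    icc_3_1 = (msr - mse) / (msr + (k - 1) * mse)
    icc_2_k = (msr - mse) / (msr + (msc - mse) / n)
    return icc_3_1, icc_2_k


# Illustrative ratings: 6 subjects scored by 4 raters (not study data).
table = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc31, icc2k = icc_two_way(table)
print(f"ICC(3,1) = {icc31:.2f}, ICC(2,k) = {icc2k:.2f}")
```

ICC(3,1) ignores systematic differences between raters (consistency of a single fixed rater), whereas ICC(2,k) penalises them and describes the reliability of the mean of k raters, so the two values are not directly comparable.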
Measurement of transplanted pancreatic volume using computed tomography: reliability by intra- and inter-observer variability

International Nuclear Information System (INIS)

Lundqvist, Eva; Segelsjoe, Monica; Magnusson, Anders; Andersson, Anna; Biglarnia, Ali-Reza

2012-01-01

Background Unlike other solid organ transplants, pancreas allografts can undergo a substantial decrease in baseline volume after transplantation. This phenomenon has not been well characterized, as there are insufficient data on reliable and reproducible volume assessments. We hypothesized that characterization of pancreatic volume by means of computed tomography (CT) could be a useful method for clinical follow-up in pancreas transplant patients. Purpose To evaluate the feasibility and reliability of pancreatic volume assessment using CT scan in transplanted patients. Material and Methods CT examinations were performed on 21 consecutive patients undergoing pancreas transplantation. Volume measurements were carried out by two observers tracing the pancreatic contours in all slices. The observers performed the measurements twice for each patient. Differences in volume measurement were used to evaluate intra- and inter-observer variability. Results The intra-observer variability for the pancreatic volume measurements of Observers 1 and 2 was found to be in almost perfect agreement, with an intraclass correlation coefficient (ICC) of 0.90 (0.77-0.96) and 0.99 (0.98-1.0), respectively. Regarding inter-observer validity, the ICCs for the first and second measurements were 0.90 (range, 0.77-0.96) and 0.95 (range, 0.85-0.98), respectively. Conclusion CT volumetry is a reliable and reproducible method for measurement of transplanted pancreatic volume
Intra-rater and inter-rater reliability of a medical record abstraction study on transition of care after childhood cancer.

Directory of Open Access Journals (Sweden)

Micòl E Gianinazzi

The abstraction of data from medical records is a widespread practice in epidemiological research. However, studies using this means of data collection rarely report reliability. Within the Transition after Childhood Cancer Study (TaCC), which is based on a medical record abstraction, we conducted a second independent abstraction of data with the aim to assess (a) intra-rater reliability of one rater at two time points; (b) the possible learning effects between these two time points compared to a gold-standard; and (c) inter-rater reliability. Within the TaCC study we conducted a systematic medical record abstraction in the 9 Swiss clinics with pediatric oncology wards. In a second phase we selected a subsample of medical records in 3 clinics to conduct a second independent abstraction. We then assessed intra-rater reliability at two time points, the learning effect over time (comparing each rater at two time-points with a gold-standard) and the inter-rater reliability of a selected number of variables. We calculated percentage agreement and Cohen's kappa. For the assessment of the intra-rater reliability we included 154 records (80 for rater 1; 74 for rater 2). For the inter-rater reliability we could include 70 records. Intra-rater reliability was substantial to excellent (Cohen's kappa 0.6-0.8) with an observed percentage agreement of 75%-95%. In all variables learning effects were observed. Inter-rater reliability was substantial to excellent (Cohen's kappa 0.70-0.83) with high agreement ranging from 86% to 100%. Our study showed that data abstracted from medical records are reliable. Investigating intra-rater and inter-rater reliability can give confidence to draw conclusions from the abstracted data and increase data quality by minimizing systematic errors.
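The two statistics named in this abstract, percentage agreement and Cohen's kappa, can be sketched as follows. Kappa corrects the observed agreement p_o for the agreement p_e expected by chance from each rater's category frequencies: kappa = (p_o - p_e) / (1 - p_e). The function name and the abstracted values are hypothetical and are not data from the TaCC study.

```python
from collections import Counter


def agreement_and_kappa(rater_1, rater_2):
    """Return (observed agreement, Cohen's kappa) for two raters' labels."""
    n = len(rater_1)
    p_observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n

    # Chance agreement: product of the raters' marginal proportions,
    # summed over all categories used by either rater.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(rater_1) | set(rater_2)
    p_expected = sum(counts_1[c] * counts_2[c] for c in categories) / n ** 2

    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical abstraction of one yes/no variable from 50 records.
first_abstraction = ["yes"] * 20 + ["yes"] * 5 + ["no"] * 10 + ["no"] * 15
second_abstraction = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15

p_o, kappa = agreement_and_kappa(first_abstraction, second_abstraction)
print(f"agreement = {p_o:.0%}, kappa = {kappa:.2f}")
```

In this toy example the raters agree on 70% of records, but half of that agreement is expected by chance, which is why kappa is noticeably lower than the raw percentage.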
Inter-operator and inter-device agreement and reliability of the SEM Scanner.

Science.gov (United States)

Clendenin, Marta; Jaradeh, Kindah; Shamirian, Anasheh; Rhodes, Shannon L

2015-02-01

The SEM Scanner is a medical device designed for use by healthcare providers as part of pressure ulcer prevention programs. The objective of this study was to evaluate the inter-rater and inter-device agreement and reliability of the SEM Scanner. Thirty-one (31) volunteers free of pressure ulcers or broken skin at the sternum, sacrum, and heels were assessed with the SEM Scanner. Each of three operators utilized each of three devices to collect readings from four anatomical sites (sternum, sacrum, left and right heels) on each subject for a total of 108 readings per subject collected over approximately 30 min. For each combination of operator-device-anatomical site, three SEM readings were collected. Inter-operator and inter-device agreement and reliability were estimated. Over the course of this study, more than 3000 SEM Scanner readings were collected. Agreement between operators was good with mean differences ranging from -0.01 to 0.11. Inter-operator and inter-device reliability exceeded 0.80 at all anatomical sites assessed. The results of this study demonstrate the high reliability and good agreement of the SEM Scanner across different operators and different devices. Given the limitations of current methods to prevent and detect pressure ulcers, the SEM Scanner shows promise as an objective, reliable tool for assessing the presence or absence of pressure-induced tissue damage such as pressure ulcers. Copyright © 2015 Bruin Biometrics, LLC. Published by Elsevier Ltd.. All rights reserved.
Reliability of Ultrasound Diameter Measurements in Patients with a Small Asymptomatic Popliteal Artery Aneurysm: An Intra- and Inter-observer Agreement Study.

Science.gov (United States)

Zwiers, I; Hoogland, C M T; Mackaay, A J C

2016-03-01

In this study the intra- and inter-observer variability of ultrasound measurements of the diameter of the popliteal artery were tested in a group of patients under surveillance for a small (diameter 10-20 mm), asymptomatic popliteal artery aneurysm (PAA). From a group of patients under ultrasound surveillance for bilateral, asymptomatic PAAs, 13 consecutive patients agreed to participate in the study and provided informed consent. The maximum diameter of the popliteal arteries was assessed by a vascular technologist. The same assessment was repeated by a second vascular technologist, unaware of the results of the first measurement. After a week, this protocol was repeated. The intra- and inter-observer reliability of this measurement was calculated using intra-class correlation coefficients (ICCs) and Bland and Altman plots. Of the 10 patients with bilateral and three patients with unilateral PAA, 12 completed the 2 week protocol. A total of 86 measurements were analyzed. The mean diameter of the popliteal arteries was 13.5 ± 3.4 mm. The ICC for the intra-observer reliability of observer 1 was 0.96 (95% CI 0.92-0.99), p .47. The absolute magnitude of the systematic error of both observers was less than 0.135 mm (median 0.00). Ultrasound measurement of the maximum diameter of the popliteal artery is reproducible; hence, it is suitable for making a clinical treatment decision. Its use for surveillance of small, asymptomatic PAAs is justified. Copyright © 2015 European Society for Vascular Surgery. Published by Elsevier Ltd. All rights reserved.
Inter-rater reliability of a modified version of Delitto et al.’s classification-based system for low back pain : a pilot study

NARCIS (Netherlands)

Apeldoorn, Adri T.; van Helvoirt, Hans; Ostelo, Raymond W.; Meihuizen, Hanneke; Kamper, Steven J.; van Tulder, Maurits W.; de Vet, Henrica C W

2016-01-01

Study design: Observational inter-rater reliability study. Objectives: To examine: (1) the inter-rater reliability of a modified version of Delitto et al.’s classification-based algorithm for patients with low back pain; (2) the influence of different levels of familiarity with the system; and (3)
Inter- and intra-operator reliability and repeatability of shear wave elastography in the liver: a study in healthy volunteers.

Science.gov (United States)

Hudson, John M; Milot, Laurent; Parry, Craig; Williams, Ross; Burns, Peter N

2013-06-01

This study assessed the reproducibility of shear wave elastography (SWE) in the liver of healthy volunteers. Intra- and inter-operator reliability and repeatability were quantified in three different liver segments in a sample of 15 subjects, scanned during four independent sessions (two scans on day 1, two scans 1 wk later) by two operators. A total of 1440 measurements were made. Reproducibility was assessed using the intra-class correlation coefficient (ICC) and a repeated measures analysis of variance. The shear wave speed was measured and used to estimate Young's modulus using the Supersonics Imagine Aixplorer. The median Young's modulus measured through the inter-costal space was 5.55 ± 0.74 kPa. The intra-operator reliability was better for same-day evaluations (ICC = 0.91) than the inter-operator reliability (ICC = 0.78). Intra-observer agreement decreased when scans were repeated on a different day. Inter-session repeatability was between 3.3% and 9.9% for intra-day repeated scans, compared with 6.5%-12% for inter-day repeated scans. No significant difference was observed in subjects with a body mass index greater or less than 25 kg/m². Copyright © 2013 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Inter-rater reliability of shoulder measurements in middle-aged women.

Science.gov (United States)

De Groef, A; Van Kampen, M; Vervloesem, N; Clabau, E; Christiaens, M-R; Neven, P; Geraerts, I; Struyf, F; Devoogdt, N

2017-06-01

To investigate inter-rater reliability of a set of shoulder measurements including inclinometry [shoulder range of motion (ROM)], acromion-table distance and pectoralis minor muscle length (static scapular positioning), upward rotation with two inclinometers (scapular kinematics) and pain pressure thresholds (muscle tenderness) in middle-aged women. Observational study. Thirty symptom-free middle-aged women (first cohort) were measured by two raters. All measurements with an intraclass correlation coefficient (ICC) below 0.75 were retested after an additional training period in a second cohort of 30 symptom-free middle-aged women. Inter-rater reliability of all variables was measured with the ICC (95% confidence interval) and standard error of measurement (SEM). Acromion-table distance (ICC=0.91, SEM 0.22 to 0.28% of body length), pectoralis minor muscle length (ICC=0.91, SEM 0.16% of body length), pain pressure thresholds (ICC=0.78 to 0.85, SEM 0.39 to 0.70kg) and abduction ROM (ICC=0.77, SEM 5°) showed good to excellent inter-rater reliability in the first cohort. After an additional training period, forward flexion ROM showed good inter-rater reliability (ICC=0.83, SEM 5°), scapular upward rotation in resting position showed moderate reliability (ICC=0.52, SEM 2°), and other scaption angles showed weak reliability (ICC=0.26 to 0.43, SEM 3 to 8°). In a battery of clinical tools to evaluate factors contributing to shoulder pain, static scapular positioning and pressure pain thresholds were found to have good to excellent inter-rater reliability in middle-aged women. Additional training is recommended for measurements with a gravity inclinometer. Copyright © 2016 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
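The abstract pairs each ICC with a standard error of measurement (SEM) but does not say how the SEM was derived. A minimal sketch, assuming the widely used relation SEM = SD × sqrt(1 − ICC), where SD is the between-subject standard deviation of the measurement, is shown below; the function name and the input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM in the units of the measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


# Hypothetical example: range of motion with SD = 10 degrees and ICC = 0.75.
sem = standard_error_of_measurement(sd=10.0, icc=0.75)
print(f"SEM = {sem:.1f} degrees")
```

Because the SEM is expressed in the original units, it shows how much of a measured change could be due to rater error alone, which the unitless ICC cannot convey.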
Inter-rater and intra-rater reliability of a movement control test in shoulder.

Science.gov (United States)

Rajasekar, S; Bangera, Rakshith K; Sekaran, Padmanaban

2017-07-01

Movement faults are commonly observed in patients with musculoskeletal pain. The Kinetic Medial Rotation Test (KMRT) is a movement control test used to identify movement faults of the scapula and gleno-humeral joints during arm movement. Objective tests such as the KMRT need to be reliable and valid for the results to be applied across different clinical settings and patient populations. The primary objective of the present study was to determine the intra-rater and inter-rater reliability of the KMRT in subjects with and without shoulder pain. Sixty subjects were included in this study based on specific inclusion and exclusion criteria. Two musculoskeletal physiotherapists with different levels of clinical experience performed the tests. The intra-rater reliability was tested in twenty asymptomatic subjects by a single assessor at two-week intervals. An equal number of subjects with and without shoulder pain were tested by both assessors to determine the inter-rater reliability. Both components of the KMRT, the Gleno-Humeral Anterior Translation (GHAT) and the Scapular Forward Tilt (SCFT), were tested. The Kappa values for inter-rater reliability of the GHAT and SCFT were K = 0.68 and K = 0.65, respectively, in subjects with shoulder pain. In asymptomatic subjects, the inter-rater reliability of GHAT was K = 0.61 and SCFT was K = 0.85. Intra-rater reliability ranged from K = 0.66 for GHAT to K = 0.87 for SCFT. Our study found substantial agreement in inter-rater reliability of the KMRT in subjects with shoulder pain, whereas substantial to near perfect agreement was found in intra-rater and inter-rater reliability of the KMRT in subjects without shoulder pain. Copyright © 2017 Elsevier Ltd. All rights reserved.
Inter-Rater Reliability of Cyclotorsion Measurements Using Fundus Photography.

Science.gov (United States)

Dysli, Muriel; Kanku, Madeleine; Traber, Ghislaine L

2018-04-01

The foveo-papillary angle (FPA) on fundus photographs is the accepted standard for the measurement of ocular cyclotorsion. We assessed the inter-rater reliability of this method in healthy subjects and in patients with trochlear nerve palsies. In this methodological study, fundus photographs of healthy subjects and of patients with trochlear nerve palsies were made with a fundus camera (Zeiss Fundus Camera FF 450 plus, Jena, Germany). Three independent observers measured the FPA on the fundus photographs of all subjects in synedra View (synedra View 16, Version 16.0.0.11, Innsbruck, Austria). One hundred and four eyes of 52 subjects (26 healthy controls and 26 patients) were assessed. The mean FPA of the healthy controls was 5.80 degrees (°) [± 0.44 standard error of the mean (SEM)] compared to 11.55° (± 0.80 SEM) for patients with trochlear nerve palsies. The inter-rater reliability of all measured FPAs showed an intraclass correlation coefficient (ICC) of 0.98 (95% CI 0.97 - 0.98). The inter-rater reliability of objective cyclotorsion measurements using fundus photographs was very high. Georg Thieme Verlag KG Stuttgart · New York.
Inter-observer reliability of animal-based welfare indicators included in the Animal Welfare Indicators welfare assessment protocol for dairy goats.

Science.gov (United States)

Vieira, A; Battini, M; Can, E; Mattiello, S; Stilwell, G

2018-01-08

This study was conducted within the context of the Animal Welfare Indicators (AWIN) project and the underlying scientific motivation for the development of the study was the scarcity of data regarding inter-observer reliability (IOR) of welfare indicators, particularly given the importance of reliability as a further step for developing on-farm welfare assessment protocols. The objective of this study is therefore to evaluate IOR of animal-based indicators (at group and individual-level) of the AWIN welfare assessment protocol (prototype) for dairy goats. In the design of the study, two pairs of observers, one in Portugal and another in Italy, visited 10 farms each and applied the AWIN prototype protocol. Farms in both countries were visited between January and March 2014, and all the observers received the same training before the farm visits were initiated. Data collected during farm visits, and analysed in this study, include group-level and individual-level observations. The results of our study allow us to conclude that most of the group-level indicators presented the highest IOR level ('substantial', 0.85 to 0.99) in both field studies, pointing to a usable set of animal-based welfare indicators that were therefore included in the first level of the final AWIN welfare assessment protocol for dairy goats. Inter-observer reliability of individual-level indicators was lower, but the majority of them still reached 'fair to good' (0.41 to 0.75) and 'excellent' (0.76 to 1) levels. In the paper we explore reasons for the differences found in IOR between the group and individual-level indicators, including how the number of individual-level indicators to be assessed on each animal and the restraining method may have affected the results. Furthermore, we discuss the differences found in the IOR of individual-level indicators in both countries: the Portuguese pair of observers reached a higher level of IOR, when compared with the Italian observers. We argue how the
Intra and inter-rater reliability study of pelvic floor muscle dynamometric measurements

Directory of Open Access Journals (Sweden)

Natalia M. Martinho

2015-04-01

OBJECTIVE: The aim of this study was to evaluate the intra- and inter-rater reliability of pelvic floor muscle (PFM) dynamometric measurements for maximum and average strengths, as well as endurance. METHOD: A convenience sample of 18 nulliparous women, without any urogynecological complaints, aged between 19 and 31 (mean age of 25.4±3.9), participated in this study. They were evaluated using a pelvic floor dynamometer based on load cell technology. The dynamometric evaluations were repeated in three successive sessions: two on the same day with a rest period of 30 minutes between them, and the third on the following day. All participants were evaluated twice in each session; first by examiner 1 followed by examiner 2. The vaginal dynamometry data were analyzed using three parameters: maximum strength, average strength, and endurance. The Intraclass Correlation Coefficient (ICC) was applied to estimate the PFM dynamometric measurement reliability, considering a good level as being above 0.75. RESULTS: The intra- and inter-rater analyses showed good reliability for maximum strength (ICCintra-rater1=0.96, ICCintra-rater2=0.95, and ICCinter-rater=0.96), average strength (ICCintra-rater1=0.96, ICCintra-rater2=0.94, and ICCinter-rater=0.97), and endurance (ICCintra-rater1=0.88, ICCintra-rater2=0.86, and ICCinter-rater=0.92) dynamometric measurements. CONCLUSIONS: The PFM dynamometric measurements showed good intra- and inter-rater reliability for maximum strength, average strength and endurance, which demonstrates that this is a reliable device that can be used in clinical practice.
Inter-observer variability between radiologists reporting on cerebellopontine angle tumours on magnetic resonance imaging.

Science.gov (United States)

Teh, S R; Ranguis, S; Fagan, P

2017-01-01

Studies demonstrate the significance of intra- and inter-observer variability when measuring cerebellopontine angle tumours on magnetic resonance imaging, with measured differences as high as 2 mm. To determine intra- and inter-observer measurement variability of cerebellopontine angle tumours in a specialised institution. The magnetic resonance imaging maximal diameter of 12 randomly selected cerebellopontine angle tumours were independently measured by 4 neuroradiologists at a tertiary referral centre using a standard definition for maximal tumour diameter. Average deviation and intraclass correlation were subsequently calculated. Inter-observer difference averaged 0.33 ± 0.04 mm (range, 0.0-0.8 mm). Intra-observer measurements were more consistent than inter-observer measurements, with differences averaging 0.17 mm (95 per cent confidence interval = 0.27-0.06, p = 0.002). Inter-observer reliability was 0.99 (95 per cent confidence interval = 0.97-0.99), suggesting high reliability between the readings. The use of a standard definition for maximal tumour volume provided high reliability amongst radiologists' readings. To avoid oversizing tumours, it is recommended that conservative monitoring be conducted by the same institution with thin slice magnetic resonance imaging scans.
Inter- and intra-rater reliability of nasal auscultation in daycare children.

Science.gov (United States)

Santos, Rita; Silva Alexandrino, Ana; Tomé, David; Melo, Cristina; Mesquita Montes, António; Costa, Daniel; Pinto Ferreira, João

2018-02-01

The aim of this study was to assess nasal auscultation's intra- and inter-rater reliability and to analyze ear and respiratory clinical condition according to nasal auscultation. Cross-sectional study performed in 125 children aged up to 3 years old attending daycare centers. Nasal auscultation, tympanometry and Paediatric Respiratory Severity Score (PRSS) were applied to all children. Nasal sounds were classified by an expert panel in order to determine nasal auscultation's intra and inter- rater reliability. The classification of nasal sounds was assessed against tympanometric and PRSS values. Nasal auscultation revealed substantial inter-rater (K=0.75) and intra-rater (K=0.69; K=0.61 and K=0.72) reliability. Children with a "non-obstructed" classification revealed a lower peak pressure (t=-3.599, Pauscultation revealed substantial intra- and inter-rater reliability. Nasal auscultation exhibited important differences according to ear and respiratory clinical conditions. Nasal auscultation in pediatrics seems to be an original topic as well as a simple method that can be used to identify early signs of nasopharyngeal obstruction.
Inter-rater reliability of case-note audit: a systematic review.

Science.gov (United States)

Lilford, Richard; Edwards, Alex; Girling, Alan; Hofer, Timothy; Di Tanna, Gian Luca; Petty, Jane; Nicholl, Jon

2007-07-01

The quality of clinical care is often assessed by retrospective examination of case-notes (charts, medical records). Our objective was to determine the inter-rater reliability of case-note audit. We conducted a systematic review of the inter-rater reliability of case-note audit. Analysis was restricted to 26 papers reporting comparisons of two or three raters making independent judgements about the quality of care. Sixty-six separate comparisons were possible, since some papers reported more than one measurement of reliability. Mean kappa values ranged from 0.32 to 0.70. These may be inflated due to publication bias. Measured reliabilities were found to be higher for case-note reviews based on explicit, as opposed to implicit, criteria and for reviews that focused on outcome (including adverse effects) rather than process errors. We found an association between kappa and the prevalence of errors (poor quality care), suggesting alternatives such as tetrachoric and polychoric correlation coefficients be considered to assess inter-rater reliability. Comparative studies should take into account the relationship between kappa and the prevalence of the events being measured.
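The dependence of kappa on prevalence noted in this abstract can be illustrated with two 2×2 tables that have identical observed agreement but different proportions of "error" ratings. This is a sketch with hypothetical counts, not data from the review; the function name and cell labels are assumptions made for the example.

```python
def kappa_from_2x2(both_yes, only_rater_1, only_rater_2, both_no):
    """Cohen's kappa from the four cells of a two-rater yes/no table."""
    n = both_yes + only_rater_1 + only_rater_2 + both_no
    p_observed = (both_yes + both_no) / n

    # Marginal proportion of "yes" for each rater.
    p_yes_1 = (both_yes + only_rater_1) / n
    p_yes_2 = (both_yes + only_rater_2) / n
    p_expected = p_yes_1 * p_yes_2 + (1 - p_yes_1) * (1 - p_yes_2)

    return (p_observed - p_expected) / (1 - p_expected)


# Both tables have 90% observed agreement over 100 case-notes.
common_errors = kappa_from_2x2(40, 5, 5, 50)   # errors flagged in ~45% of notes
rare_errors = kappa_from_2x2(5, 5, 5, 85)      # errors flagged in ~10% of notes

print(f"kappa, common errors: {common_errors:.2f}")
print(f"kappa, rare errors:   {rare_errors:.2f}")
```

When the rated event is rare, chance agreement on "no error" is already high, so the same raw agreement yields a much lower kappa; this is the behaviour that motivates comparing kappa values only alongside the prevalence of the events being measured.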
Validity and inter-rater reliability of medio-lateral knee motion observed during a single-limb mini squat

DEFF Research Database (Denmark)

Ageberg, Eva; Bennell, Kim L; Hunt, Michael A

2010-01-01

Muscle function may influence the risk of knee injury and outcomes following injury. Clinical tests, such as a single-limb mini squat, resemble conditions of daily life and are easy to administer. Fewer squats per 30 seconds indicate poorer function. However, the quality of movement, such as the medio-lateral knee motion, may also be important. The aim was to validate an observational clinical test of assessing the medio-lateral knee motion, using a three-dimensional (3-D) motion analysis system. In addition, the inter-rater reliability was evaluated.
Inter-arch digital model vs. manual cast measurements: Accuracy and reliability.

Science.gov (United States)

Kiviahde, Heikki; Bukovac, Lea; Jussila, Päivi; Pesonen, Paula; Sipilä, Kirsi; Raustia, Aune; Pirttiniemi, Pertti

2017-06-28

The purpose of this study was to evaluate the accuracy and reliability of inter-arch measurements using digital dental models and conventional dental casts. Thirty sets of dental casts with permanent dentition were examined. Manual measurements were done with a digital caliper directly on the dental casts, and digital measurements were made on 3D models by two independent examiners. Intra-class correlation coefficients (ICC), a paired sample t-test or Wilcoxon signed-rank test, and Bland-Altman plots were used to evaluate intra- and inter-examiner error and to determine the accuracy and reliability of the measurements. The ICC values were generally good for manual and excellent for digital measurements. The Bland-Altman plots of all the measurements showed good agreement between the manual and digital methods and excellent inter-examiner agreement using the digital method. Inter-arch occlusal measurements on digital models are accurate and reliable and are superior to manual measurements.
Inter-rater and intra-rater reliability of the Bahasa Melayu version of Rose Angina Questionnaire.

Science.gov (United States)

Hassan, N B; Choudhury, S R; Naing, L; Conroy, R M; Rahman, A R A

2007-01-01

The objective of the study is to translate the Rose Questionnaire (RQ) into a Bahasa Melayu version and adapt it cross-culturally, and to measure its inter-rater and intra-rater reliability. This cross-sectional study was conducted in the respondents' homes or workplaces in Kelantan, Malaysia. One hundred respondents aged 30 and above with different socio-demographic status were interviewed for face validity. For each of inter-rater and intra-rater reliability, a sample of 150 respondents was interviewed. Inter-rater and intra-rater reliabilities were assessed by Cohen's kappa. The overall inter-rater agreement by the five pairs of interviewers at points one and two was 0.86, and intra-rater reliability by the five interviewers on the seven-item questionnaire at points one and two was 0.88, as measured by kappa coefficient. The translated Malay version of the RQ demonstrated almost perfect inter-rater and intra-rater reliability, and further validation, such as sensitivity and specificity analysis of this translated questionnaire, is highly recommended.

Inter-rater reliability for Movement Pattern Analysis (MPA): measuring patterning of behaviors versus discrete behavior counts as indicators of decision-making style

Directory of Open Access Journals (Sweden)

Brenda L Connors

2014-06-01

The unique yield of collecting observational data on human movement has received increasing attention in a number of domains, including the study of decision-making style. As such, interest has grown in the nuances of core methodological issues, including the best ways of assessing inter-rater reliability. In this paper we focus on one key topic – the distinction between establishing reliability for the patterning of behaviors as opposed to the computation of raw counts – and suggest that reliability for each be compared empirically rather than determined a priori. We illustrate by assessing inter-rater reliability for key outcome measures derived from Movement Pattern Analysis (MPA), an observational methodology that records body movements as indicators of decision-making style with demonstrated predictive validity. While reliability ranged from moderate to good for raw counts of behaviors reflecting each of two Overall Factors generated within MPA (Assertion and Perspective), inter-rater reliability for patterning (proportional indicators of each factor) was significantly higher and excellent (ICC = .89). Furthermore, patterning, as compared to raw counts, provided better prediction of observable decision-making process assessed in the laboratory. These analyses support the utility of using an empirical approach to inform the consideration of measuring discrete behavioral counts versus patterning of behaviors when determining inter-rater reliability of observable behavior. They also speak to the substantial reliability that may be achieved via application of theoretically grounded observational systems such as MPA that reveal thinking and action motivations via visible movement patterns.
High inter-tester reliability of the new mobility score in patients with hip fracture

DEFF Research Database (Denmark)

Kristensen, M.T.; Bandholm, T.; Foss, N.B.

2008-01-01

OBJECTIVE: To assess the inter-tester reliability of the New Mobility Score in patients with acute hip fracture. DESIGN: An inter-tester reliability study. SUBJECTS: Forty-eight consecutive patients with acute hip fracture at a median age of 84 (interquartile range, 76-89) years; 40 admitted from...
Intra- and inter-observer agreement and reliability of bone mineral density measurements around acetabular cup

DEFF Research Database (Denmark)

Mussmann, Bo Redder; Overgaard, Soren; Torfing, Trine

2017-01-01

...in measuring bone density (BMD) in complex anatomic structures which might be overcome using dual-energy computed tomography (DECT). Purpose: To test inter- and intra-observer agreement and reliability of in-house segmentation software measuring BMD adjacent to acetabular cup and to compare measurements performed... with single-energy CT (SECT) and DECT in cemented and cementless cups. Material and Methods: Twenty-four acetabular cups inserted in porcine hip specimens were scanned with SECT and DECT. Bone density was measured in a three-dimensional volume adjacent to the cup. Double measurements were performed... Results: BMD derived from SECT was approximately four times higher than that of DECT. In both scan modes, intraclass correlation coefficient (ICC) was >0.90 with no differences between repeated measurements, except for uncemented cups where a statistically significant difference of 11 mg/cm³ was found...
Test-re-test reliability and inter-rater reliability of a digital pelvic inclinometer in young, healthy males and females.

Science.gov (United States)

Beardsley, Chris; Egerton, Tim; Skinner, Brendon

2016-01-01

Objective. The purpose of this study was to investigate the reliability of a digital pelvic inclinometer (DPI) for measuring sagittal plane pelvic tilt in 18 young, healthy males and females. Method. The inter-rater reliability and test-re-test reliabilities of the DPI for measuring pelvic tilt in standing on both the right and left sides of the pelvis were measured by two raters carrying out two rating sessions of the same subjects, three weeks apart. Results. For measuring pelvic tilt, inter-rater reliability was designated as good on both sides (ICC = 0.81-0.88), test-re-test reliability within a single rating session was designated as good on both sides (ICC = 0.88-0.95), and test-re-test reliability between two rating sessions was designated as moderate on the left side (ICC = 0.65) and good on the right side (ICC = 0.85). Conclusion. Inter-rater reliability and test-re-test reliability within a single rating session of the DPI in measuring pelvic tilt were both good, while test-re-test reliability between rating sessions was moderate-to-good. Caution is required regarding the interpretation of the test-re-test reliability within a single rating session, as the raters were not blinded. Further research is required to establish validity.
Inter-rater reliability of direct observations of the physical and psychosocial working conditions in eldercare

DEFF Research Database (Denmark)

Karstad, Kristina; Rugulies, Reiner; Skotte, Jørgen

2018-01-01

The aim of the study was to develop and evaluate the reliability of the "Danish observational study of eldercare work and musculoskeletal disorders" (DOSES) observation instrument to assess physical and psychosocial risk factors for musculoskeletal disorders (MSD) in eldercare work. During 1.5 ye... is appropriate for assessing physical and psychosocial risk factors for MSD among eldercare workers.
3-D high-frequency endovaginal ultrasound of female urethral complex and assessment of inter-observer reliability

International Nuclear Information System (INIS)

Wieczorek, A.P.; Wozniak, M.M.; Stankiewicz, A.; Santoro, G.A.; Bogusiewicz, M.; Rechberger, T.

2012-01-01

Objectives: Assessment of the urethral complex and defining its morphological characteristics with 3-dimensional endovaginal ultrasonography with the use of high frequency rotational 360° transducer. Defining inter-observer reliability of the performed measurements. Materials and methods: Twenty-four asymptomatic, nulliparous females (aged 18–55, mean 32 years) underwent high-frequency (12 MHz) endovaginal ultrasound with rotational 360° and automated 3D data acquisition (type 2050, B-K Medical, Herlev, Denmark). Measurements of the urethral thickness, width and length, bladder neck-symphysis distance, intramural part of the urethra as well as rhabdosphincter thickness, width and length were taken by three investigators. Descriptive statistics for continuous data was performed. The results were given as mean values with standard deviation. The relationships among different variables were assessed with ANOVA for repeated measures factors, as well as T-test for dependent samples. Intraclass correlation (ICC) was calculated for each parameter. Intra- and interobserver reliability was assessed. Statistical significance was assigned to a P value of 0.8) and good reliability for rhabdosphincter measurements (ICC > 0.6) between all three investigators. Conclusions: Advanced EVUS provides detailed information on anatomy and morphology of the female urethral complex. Our results show that 360° rotational transducer with automated 3D acquisition, currently routinely used for proctological scanning is suitable for the reliable assessment of the urethral complex and can be applied in a routine diagnostics of pelvic floor disturbances in females.
Grant Peer Review: Improving Inter-Rater Reliability with Training.

Science.gov (United States)

Sattler, David N; McKnight, Patrick E; Naney, Linda; Mathis, Randy

2015-01-01

This study developed and evaluated a brief training program for grant reviewers that aimed to increase inter-rater reliability, rating scale knowledge, and effort to read the grant review criteria. Enhancing reviewer training may improve the reliability and accuracy of research grant proposal scoring and funding recommendations. Seventy-five Public Health professors from U.S. research universities watched the training video we produced and assigned scores to the National Institutes of Health scoring criteria proposal summary descriptions. For both novice and experienced reviewers, the training video increased scoring accuracy (the percentage of scores that reflect the true rating scale values), inter-rater reliability, and the amount of time reading the review criteria compared to the no video condition. The increase in reliability for experienced reviewers is notable because it is commonly assumed that reviewers--especially those with experience--have good understanding of the grant review rating scale. The findings suggest that both experienced and novice reviewers who had not received the type of training developed in this study may not have appropriate understanding of the definitions and meaning for each value of the rating scale and that experienced reviewers may overestimate their knowledge of the rating scale. The results underscore the benefits of and need for specialized peer reviewer training.
Reliability and criterion validity of an observation protocol for working technique assessments in cash register work.

Science.gov (United States)

Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina

2016-06-01

We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observations by one observer, such as often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend the protocol to be used for educational purposes only.
Translation, adaptation and inter-rater reliability of the administration manual for the Fugl-Meyer assessment.

Science.gov (United States)

Michaelsen, Stella M; Rocha, André S; Knabben, Rodrigo J; Rodrigues, Luciano P; Fernandes, Claudia G C

2011-01-01

Recently, the reliability of the Brazilian version of the Fugl-Meyer Assessment (FMA) was assessed through the scoring given according to observations made by a single evaluator who applied the test. When different raters apply the scale, the reliability may depend on the interpretation given to the assessment sheet. In such cases, a clear administration manual is essential for ensuring homogeneity of application. To translate and adapt the French Canadian version of the FMA administration manual into Brazilian Portuguese and to evaluate the inter-rater reliability when different evaluators apply the FMA on the basis of the information contained in the manual. Eighteen adults (59±10 years) with chronic hemiparesis (38±35 months after a stroke) took part in this study. Eight patients participated in the first part of the study and 10 in the second part. Based on analyzing the results from part 1, an adapted version was developed, in which information and photos were added to illustrate the positions of the patient and evaluator. The inter-rater reliability was assessed using the intraclass correlation coefficient (ICC). The reliability of the FMA based on the adapted version of the manual was excellent for the total motor scores for the upper limbs (ICC=0.98) and lower limbs (ICC=0.90), as well as for movement sense (ICC=0.98) and upper and lower-limb passive range of motion (ICC=0.84 and 0.90, respectively). The reliability was moderate for tactile sensitivity (0.75). The joint pain assessment presented low reliability. The results showed that, except for pain assessment, application of the FMA based on the adapted version of the application manual for Brazilian Portuguese presented adequate inter-rater reliability.
Test–re-test reliability and inter-rater reliability of a digital pelvic inclinometer in young, healthy males and females

Directory of Open Access Journals (Sweden)

Chris Beardsley

2016-03-01

Objective. The purpose of this study was to investigate the reliability of a digital pelvic inclinometer (DPI) for measuring sagittal plane pelvic tilt in 18 young, healthy males and females. Method. The inter-rater reliability and test–re-test reliabilities of the DPI for measuring pelvic tilt in standing on both the right and left sides of the pelvis were measured by two raters carrying out two rating sessions of the same subjects, three weeks apart. Results. For measuring pelvic tilt, inter-rater reliability was designated as good on both sides (ICC = 0.81–0.88), test–re-test reliability within a single rating session was designated as good on both sides (ICC = 0.88–0.95), and test–re-test reliability between two rating sessions was designated as moderate on the left side (ICC = 0.65) and good on the right side (ICC = 0.85). Conclusion. Inter-rater reliability and test–re-test reliability within a single rating session of the DPI in measuring pelvic tilt were both good, while test–re-test reliability between rating sessions was moderate-to-good. Caution is required regarding the interpretation of the test–re-test reliability within a single rating session, as the raters were not blinded. Further research is required to establish validity.
Intra and Inter-Rater Reliability of Screening for Movement Impairments: Movement Control Tests from The Foundation Matrix

Science.gov (United States)

Mischiati, Carolina R.; Comerford, Mark; Gosford, Emma; Swart, Jacqueline; Ewings, Sean; Botha, Nadine; Stokes, Maria; Mottram, Sarah L.

2015-01-01

Pre-season screening is well established within the sporting arena, and aims to enhance performance and reduce injury risk. With the increasing need to identify potential injury with greater accuracy, a new risk assessment process has been produced; The Performance Matrix (battery of movement control tests). As with any new method of objective testing, it is fundamental to establish whether the same results can be reproduced between examiners and by the same examiner on consecutive occasions. This study aimed to determine the intra-rater test re-test and inter-rater reliability of tests from a component of The Performance Matrix, The Foundation Matrix. Twenty participants were screened by two experienced musculoskeletal therapists using nine tests to assess the ability to control movement during specific tasks. Movement evaluation criteria for each test were rated as pass or fail. The therapists observed participants real-time and tests were recorded on video to enable repeated ratings four months later to examine intra-rater reliability (videos rated two weeks apart). Overall test percentage agreement was 87% for inter-rater reliability; 98% Rater 1, 94% Rater 2 for test re-test reliability; and 75% for real-time versus video. Intraclass-correlation coefficients (ICCs) were excellent between raters (0.81) and within raters (Rater 1, 0.96; Rater 2, 0.88) but poor for real-time versus video (0.23). Reliability for individual components of each test was more variable: inter-rater, 68-100%; intra-rater, 88-100% Rater 1, 75-100% Rater 2; and real-time versus video 31-100%. Cohen’s Kappa values for inter-rater reliability were 0.0-1.0; intra-rater 0.6-1.0 for Rater 1; -0.1-1.0 for Rater 2; and -0.1-1 for real-time versus video. It is concluded that both inter and intra-rater reliability of tests in The Foundation Matrix are acceptable when rated by experienced therapists. Recommendations are made for modifying some of the criteria to improve reliability where
Inter-rater reliability of AMSTAR is dependent on the pair of reviewers.

Science.gov (United States)

Pieper, Dawid; Jacobs, Anja; Weikert, Beate; Fishta, Alba; Wegewitz, Uta

2017-07-11

Inter-rater reliability (IRR) is mainly assessed based on only two reviewers of unknown expertise. The aim of this paper is to examine differences in the IRR of the Assessment of Multiple Systematic Reviews (AMSTAR) and R(evised)-AMSTAR depending on the pair of reviewers. Five reviewers independently applied AMSTAR and R-AMSTAR to 16 systematic reviews (eight Cochrane reviews and eight non-Cochrane reviews) from the field of occupational health. Responses were dichotomized and reliability measures were calculated by applying Holsti's method (r) and Cohen's kappa (κ) to all potential pairs of reviewers. Given that five reviewers participated in the study, there were ten possible pairs of reviewers. Inter-rater reliability varied for AMSTAR between r = 0.82 and r = 0.98 (median r = 0.88) using Holsti's method and κ = 0.41 and κ = 0.69 (median κ = 0.52) using Cohen's kappa and for R-AMSTAR between r = 0.77 and r = 0.89 (median r = 0.82) and κ = 0.32 and κ = 0.67 (median κ = 0.45) depending on the pair of reviewers. The same pair of reviewers yielded the highest IRR for both instruments. Pairwise Cohen's kappa reliability measures showed a moderate correlation between AMSTAR and R-AMSTAR (Spearman's ρ =0.50). The mean inter-rater reliability for AMSTAR was highest for item 1 (κ = 1.00) and item 5 (κ = 0.78), while lowest values were found for items 3, 8, 9 and 11, which showed only fair agreement. Inter-rater reliability varies widely depending on the pair of reviewers. There may be some shortcomings associated with conducting reliability studies with only two reviewers. Further studies should include additional reviewers and should probably also take account of their level of expertise.
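The pairwise design described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' analysis code: the reviewer labels and the dichotomized ratings below are hypothetical, Holsti's coefficient is taken in its usual form 2M / (N1 + N2), and Cohen's kappa is the standard unweighted two-rater statistic.

```python
from itertools import combinations
from statistics import median


def holsti(a, b):
    """Holsti's coefficient 2M / (N1 + N2), where M is the number of
    coding decisions on which the two raters agree."""
    matches = sum(x == y for x, y in zip(a, b))
    return 2 * matches / (len(a) + len(b))


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    categories = set(a) | set(b)
    # Agreement expected by chance, from each rater's marginal proportions.
    p_expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if p_expected == 1.0:
        return float("nan")  # undefined when both raters use one category only
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical dichotomized responses (1 = criterion met) of five reviewers
# on eleven items; these numbers are invented for illustration.
ratings = {
    "R1": [1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
    "R2": [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0],
    "R3": [1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    "R4": [1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    "R5": [1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0],
}

# Five reviewers give C(5, 2) = 10 distinct pairs.
pairs = list(combinations(ratings, 2))
pairwise = {
    (p, q): (holsti(ratings[p], ratings[q]), cohen_kappa(ratings[p], ratings[q]))
    for p, q in pairs
}

for (p, q), (r, k) in pairwise.items():
    print(f"{p}-{q}: Holsti r = {r:.2f}, kappa = {k:.2f}")

print("median r:", round(median(r for r, _ in pairwise.values()), 2))
print("median kappa:", round(median(k for _, k in pairwise.values()), 2))
```

Reporting the range and median over all pairs, rather than a single pair, is what exposes the reviewer-dependence that the abstract describes.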
Inter-rater reliability of nursing home quality indicators in the U.S

Directory of Open Access Journals (Sweden)

Roy Jason

2003-11-01

Background: In the US, Quality Indicators (QI's) profiling and comparing the performance of hospitals, health plans, nursing homes and physicians are routinely published for consumer review. We report the results of the largest study of inter-rater reliability done on nursing home assessments which generate the data used to derive publicly reported nursing home quality indicators. Methods: We sampled nursing homes in 6 states, selecting up to 30 residents per facility who were observed and assessed by research nurses on 100 clinical assessment elements contained in the Minimum Data Set (MDS) and compared these with the most recent assessment in the record done by facility nurses. Kappa statistics were generated for all data items and derived for 22 QI's over the entire sample and for each facility. Finally, facilities with many QI's with poor Kappa levels were compared to those with many QI's with excellent Kappa levels on selected characteristics. Results: A total of 462 facilities in 6 states were approached and 219 agreed to participate, yielding a response rate of 47.4%. A total of 5758 residents were included in the inter-rater reliability analyses, around 27.5 per facility. Patients resembled the traditional nursing home resident; only 43.9% were continent of urine and only 25.2% were rated as likely to be discharged within the next 30 days. Results of resident level comparative analyses reveal high inter-rater reliability levels (most items >.75). Using the research nurses as the "gold standard", we compared composite quality indicators based on their ratings with those based on facility nurses. All but two QI's have adequate Kappa levels and 4 QI's have average Kappa values in excess of .80. We found that 16% of participating facilities performed poorly (Kappa .75 on 12 or more QI's. No facility characteristics were related to reliability of the data on which QI's are based. Conclusion: While a few QI's being used for public reporting
Introducing a new definition of a near fall: intra-rater and inter-rater reliability.

Science.gov (United States)

Maidan, I; Freedman, T; Tzemah, R; Giladi, N; Mirelman, A; Hausdorff, J M

2014-01-01

Near falls (NFs) are more frequent than falls, and may occur before falls, potentially predicting fall risk. As such, identification of a NF is important. We aimed to assess intra and inter-rater reliability of the traditional definition of a NF and to demonstrate the potential utility of a new definition. To this end, 10 older adults, 10 idiopathic elderly fallers, and 10 patients with Parkinson's disease (PD) walked in an obstacle course while wearing a safety harness. All walks were videotaped. Forty-nine video segments were extracted to create 2 clips each of 8.48 min. Four raters scored each event using the traditional definition and, two weeks later, using the new definition. A fifth rater used only the new definition. Intra-rater reliability was determined using Kappa (K) statistics and inter-rater reliability was determined using ICC. Using the traditional definition, three raters had poor intra-rater reliability (K0.137) and one rater had moderate intra-rater reliability (K=0.624, pdefinition, inter-rater reliability between the four raters was moderate (ICC=0.667, pdefinition showed high intra-rater (K>0.601, pdefinition of NF is required. The results of the present study suggest that the proposed new definition increases intra and inter-rater reliability, a critical step for using NFs to quantify fall risk. Copyright © 2013 Elsevier B.V. All rights reserved.
The Berg Balance Scale has high intra- and inter-rater reliability but absolute reliability varies across the scale: a systematic review.

Science.gov (United States)

Downs, Stephen; Marquez, Jodie; Chiarelli, Pauline

2013-06-01

What is the intra-rater and inter-rater relative reliability of the Berg Balance Scale? What is the absolute reliability of the Berg Balance Scale? Does the absolute reliability of the Berg Balance Scale vary across the scale? Systematic review with meta-analysis of reliability studies. Any clinical population that has undergone assessment with the Berg Balance Scale. Relative intra-rater reliability, relative inter-rater reliability, and absolute reliability. Eleven studies involving 668 participants were included in the review. The relative intrarater reliability of the Berg Balance Scale was high, with a pooled estimate of 0.98 (95% CI 0.97 to 0.99). Relative inter-rater reliability was also high, with a pooled estimate of 0.97 (95% CI 0.96 to 0.98). A ceiling effect of the Berg Balance Scale was evident for some participants. In the analysis of absolute reliability, all of the relevant studies had an average score of 20 or above on the 0 to 56 point Berg Balance Scale. The absolute reliability across this part of the scale, as measured by the minimal detectable change with 95% confidence, varied between 2.8 points and 6.6 points. The Berg Balance Scale has a higher absolute reliability when close to 56 points due to the ceiling effect. We identified no data that estimated the absolute reliability of the Berg Balance Scale among participants with a mean score below 20 out of 56. The Berg Balance Scale has acceptable reliability, although it might not detect modest, clinically important changes in balance in individual subjects. The review was only able to comment on the absolute reliability of the Berg Balance Scale among people with moderately poor to normal balance. Copyright © 2013 Australian Physiotherapy Association. Published by .. All rights reserved.
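The minimal detectable change with 95% confidence (MDC95) mentioned above is conventionally derived from a relative reliability coefficient and the between-subject standard deviation. The sketch below assumes the commonly used formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; the abstract does not state which formula the included studies used, and the standard deviation in the example is a hypothetical value, not a figure from the review.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), with SD the between-subject standard deviation."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence: 1.96 * sqrt(2) * SEM.
    The sqrt(2) accounts for measurement error in both of the two scores
    being compared."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical input: SD of 10 points on the 0 to 56 scale (assumed for
# illustration), combined with the pooled inter-rater estimate of 0.97.
example_sd = 10.0
example_icc = 0.97
print(f"SEM   = {standard_error_of_measurement(example_sd, example_icc):.2f} points")
print(f"MDC95 = {mdc95(example_sd, example_icc):.2f} points")
```

Under these formulas, even a very high relative reliability can correspond to an MDC95 of several points when scores are widely spread, which is consistent with the review's caution about detecting modest changes in individual subjects.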
Face validity and inter-rater reliability of the Danish version of the modified-Yale Preoperative Anxiety Scale

DEFF Research Database (Denmark)

Skovby, Pernille; Rask, Charlotte Ulrikka; Dall, Rolf

2014-01-01

-YPAS to Danish cultural and linguistic conditions and to test face validity and inter-reliability in a clinical setting. Materials and methods The translation was performed in accordance with WHO guidelines. Face validity as well as linguistic difficulties of the Danish version was tested and solved in a focus...... of the m-YPAS as suitable and relevant, i.e. the face validity satisfactory. Inter-rater reliability analysis revealed that inter-observer agreement at induction 1 were good to very good (kw: 0.63–0.98) and at induction 2, the agreement was good to very good (kw: 0.72–0.96). ICC for the overall weighted...... anxiety score was in: induction 1:0.92 and induction 2: 0.92 Conclusion Standardized and validated assessment tools are needed to evaluate interventions aiming to reduce preoperative anxiety in children. The Danish m-YPAS had a satisfactory face validity and inter-reliability, based on a minor empirical...
Chest computed tomography-based scoring of thoracic sarcoidosis: Inter-rater reliability of CT abnormalities

Energy Technology Data Exchange (ETDEWEB)

Heuvel, D.A.V. den; Es, H.W. van; Heesewijk, J.P. van; Spee, M. [St. Antonius Hospital Nieuwegein, Department of Radiology, Nieuwegein (Netherlands); Jong, P.A. de [University Medical Center Utrecht, Department of Radiology, Utrecht (Netherlands); Zanen, P.; Grutters, J.C. [University Medical Center Utrecht, Division Heart and Lungs, Utrecht (Netherlands); St. Antonius Hospital Nieuwegein, Center of Interstitial Lung Diseases, Department of Pulmonology, Nieuwegein (Netherlands)

2015-09-15

To determine inter-rater reliability of sarcoidosis-related computed tomography (CT) findings that can be used for scoring of thoracic sarcoidosis. CT images of 51 patients with sarcoidosis were scored by five chest radiologists for various abnormal CT findings (22 in total) encountered in thoracic sarcoidosis. Using intra-class correlation coefficient (ICC) analysis, inter-rater reliability was analysed and reported according to the Guidelines for Reporting Reliability and Agreement Studies (GRRAS) criteria. A pre-specified sub-analysis was performed to investigate the effect of training. Scoring was trained in a distinct set of 15 scans in which all abnormal CT findings were represented. Median age of the 51 patients (36 men, 70 %) was 43 years (range 26 - 64 years). All radiographic stages were present in this group. ICC ranged from 0.91 for honeycombing to 0.11 for nodular margin (sharp versus ill-defined). The ICC was above 0.60 in 13 of the 22 abnormal findings. Sub-analysis for the best-trained observers demonstrated an ICC improvement for all abnormal findings and values above 0.60 for 16 of the 22 abnormalities. In our cohort, reliability between raters was acceptable for 16 thoracic sarcoidosis-related abnormal CT findings. (orig.)
Chest computed tomography-based scoring of thoracic sarcoidosis: Inter-rater reliability of CT abnormalities

International Nuclear Information System (INIS)

Heuvel, D.A.V. den; Es, H.W. van; Heesewijk, J.P. van; Spee, M.; Jong, P.A. de; Zanen, P.; Grutters, J.C.

2015-01-01

To determine inter-rater reliability of sarcoidosis-related computed tomography (CT) findings that can be used for scoring of thoracic sarcoidosis. CT images of 51 patients with sarcoidosis were scored by five chest radiologists for various abnormal CT findings (22 in total) encountered in thoracic sarcoidosis. Using intra-class correlation coefficient (ICC) analysis, inter-rater reliability was analysed and reported according to the Guidelines for Reporting Reliability and Agreement Studies (GRRAS) criteria. A pre-specified sub-analysis was performed to investigate the effect of training. Scoring was trained in a distinct set of 15 scans in which all abnormal CT findings were represented. Median age of the 51 patients (36 men, 70 %) was 43 years (range 26 - 64 years). All radiographic stages were present in this group. ICC ranged from 0.91 for honeycombing to 0.11 for nodular margin (sharp versus ill-defined). The ICC was above 0.60 in 13 of the 22 abnormal findings. Sub-analysis for the best-trained observers demonstrated an ICC improvement for all abnormal findings and values above 0.60 for 16 of the 22 abnormalities. In our cohort, reliability between raters was acceptable for 16 thoracic sarcoidosis-related abnormal CT findings. (orig.)
Inter-rater reliability in the classification of supraspinatus tendon tears using 3D ultrasound – a question of experience?

Directory of Open Access Journals (Sweden)

Giorgio Tamborrini

2016-09-01

Background: Three-dimensional (3D) ultrasound of the shoulder is characterized by a comparable accuracy to two-dimensional (2D) ultrasound. No studies investigating 2D versus 3D inter-rater reliability in the detection of supraspinatus tendon tears taking into account the level of experience of the raters have been carried out so far. Objectives: The aim of this study was to determine the inter-rater reliability in the analysis of 3D ultrasound image sets of the supraspinatus tendon between sonographers with different levels of experience. Patients and methods: Non-interventional, prospective, observational pilot study of 2309 images of 127 adult patients suffering from unilateral shoulder pain. 3D ultrasound image sets were scored by three raters independently. The intra- and inter-rater reliabilities were calculated. Results: There was an excellent intra-rater reliability of rater A in the overall classification of supraspinatus tendon tears (2D vs 3D κ = 0.892, pairwise reliability 93.81%; 3D scoring round 1 vs 3D scoring round 2 κ = 0.875, pairwise reliability 92.857%). The inter-rater reliability was only moderate compared to rater B on 3D (κ = 0.497, pairwise reliability 70.95%) and fair compared to rater C (κ = 0.238, pairwise reliability 42.38%). Conclusions: The reliability of 3D ultrasound of the supraspinatus tendon depends on the level of experience of the sonographer. Experience in 2D ultrasound does not seem to be sufficient for the analysis of 3D ultrasound imaging sets. Therefore, for a 3D ultrasound analysis new diagnostic criteria have to be established and taught even to experienced 2D sonographers to improve reproducibility.
Intra- and inter-observer variation in histological criteria used in age at death determination based on femoral cortical bone

DEFF Research Database (Denmark)

Lynnerup, N; Thomsen, J L; Frohlich, B

1998-01-01

been carried out dealing with the intra- and inter-observer error. Furthermore, when such studies have been completed, the statistical tools for assessing variability have not been adequate. This study presents the results of applying simple quantitative statistics on several counts of microscopic...... elements as observed on photographic images of cortical bone, in order to assess intra- and inter-observer error. Overall, substantial error was present at the level of identifying and counting secondary osteons, osteon fragments and Haversian canals. Only secondary osteons can be reliably identified...

Development and reliability testing of a food store observation form.

Science.gov (United States)

Rimkus, Leah; Powell, Lisa M; Zenk, Shannon N; Han, Euna; Ohri-Vachaspati, Punam; Pugach, Oksana; Barker, Dianne C; Resnick, Elissa A; Quinn, Christopher M; Myllyluoma, Jaana; Chaloupka, Frank J

2013-01-01

To develop a reliable food store observational data collection instrument to be used for measuring product availability, pricing, and promotion. Observational data collection. A total of 120 food stores (26 supermarkets, 34 grocery stores, 54 gas/convenience stores, and 6 mass merchandise stores) in the Chicago metropolitan statistical area. Inter-rater reliability for product availability, pricing, and promotion measures on a food store observational data collection instrument. Cohen's kappa coefficient and proportion of overall agreement for dichotomous variables and intra-class correlation coefficient for continuous variables. Inter-rater reliability, as measured by average kappa coefficient, was 0.84 for food and beverage product availability measures, 0.80 for interior store characteristics, and 0.70 for exterior store characteristics. For continuous measures, average intra-class correlation coefficient was 0.82 for product pricing measures; 0.90 for counts of fresh, frozen, and canned fruit and vegetable options; and 0.85 for counts of advertisements on the store exterior and property. The vast majority of measures demonstrated substantial or almost perfect agreement. Although some items may require revision, results suggest that the instrument may be used to reliably measure the food store environment. Copyright © 2013 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
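For the continuous measures (prices and counts), an intra-class correlation coefficient can be computed from a two-way ANOVA decomposition. The sketch below implements the single-rater, two-way random-effects, absolute-agreement form ICC(2,1) of Shrout and Fleiss; the abstract does not say which ICC form the authors used, so that choice is an assumption, and the price data are invented for illustration.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per rated target (e.g. a store),
            one column per rater; every row has the same length.
    """
    n = len(scores)       # number of targets
    k = len(scores[0])    # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-target mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical prices (in dollars) recorded by two raters for the same
# product in five stores.
prices = [
    [2.49, 2.49],
    [3.19, 3.29],
    [1.99, 1.99],
    [2.79, 2.69],
    [3.99, 3.99],
]
print(f"ICC(2,1) = {icc_2_1(prices):.3f}")
```

Because this form penalizes systematic offsets between raters as well as random disagreement, it is a stricter choice than a consistency-type ICC; a study that only needs raters to rank stores in the same order might reasonably pick a different form.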
Intra- and inter-observer reproducibility and generalizability of first trimester uterine artery pulsatility index by transabdominal and transvaginal ultrasound

NARCIS (Netherlands)

Marchi, Laura; Zwertbroek, Eva; Snelder, Judith; Kloosterman, Maaike; Bilardo, Caterina Maddalena

2016-01-01

Objectives The primary aim of the study was to assess intra-observer and inter-observer reproducibility and generalizability (general reliability) of first trimester Doppler measurements of uterine arteries (UtA) performed both transabdominally (TA) and transvaginally (TV). Secondary aims were to
Feasibility and Inter-Rater Reliability of Physical Performance Measures in Acutely Admitted Older Medical Patients

DEFF Research Database (Denmark)

Bodilsen, Ann Christine; Juul-Larsen, Helle Gybel; Petersen, Janne

2015-01-01

OBJECTIVE: Physical performance measures can be used to predict functional decline and increased dependency in older persons. However, few studies have assessed the feasibility or reliability of such measures in hospitalized older patients. Here we assessed the feasibility and inter-rater reliability of four simple measures of physical performance in acutely admitted older medical patients. DESIGN: During the first 24 hours of hospitalization, the following were assessed twice by different raters in 52 (≥ 65 years) patients admitted for acute medical illness: isometric hand grip strength, 4......, and 30-s chair stand were 8%, 7%, and 18%, and the SRD95% values were 22%, 17%, and 49%. CONCLUSION: In acutely admitted older medical patients, grip strength, gait speed, and the Cumulated Ambulation Score measurements were feasible and showed high inter-rater reliability when administered by different...
Inter-rater reliability of direct observations of the physical and psychosocial working conditions in eldercare

DEFF Research Database (Denmark)

Karstad, Kristina; Rugulies, Reiner; Skotte, Jørgen

2018-01-01

The aim of the study was to develop and evaluate the reliability of the "Danish observational study of eldercare work and musculoskeletal disorders" (DOSES) observation instrument to assess physical and psychosocial risk factors for musculoskeletal disorders (MSD) in eldercare work. During 1...... is appropriate for assessing physical and psychosocial risk factors for MSD among eldercare workers....
Intra- and inter-rater reliability of 3D passive intervertebral motion in subjects with nonspecific neck pain assessed by physical therapy students: A pilot study.

Science.gov (United States)

Rossettini, Giacomo; Rondoni, Angie; Lovato, Tommaso; Strobe, Marco; Verzè, Elisa; Vicentini, Marco; Testa, Marco

2016-06-03

Passive Intervertebral Movements (PIVMs) are commonly used to assess and treat patients with nonspecific neck pain. Only very few studies have investigated 3D movements until now. This study assessed intra- and inter-rater reliability of three-dimensional (3D) cervical PIVMs performed by physical therapy students in patients with nonspecific neck pain. Thirty-one patients, mean age 47.2 ± 7.2 years, were independently evaluated by 2 physical therapy students. The raters (A and B) assessed mobility, end-feel and pain provocation, performing the 3D cervical segmental side-bending test (3D CSSB) bilaterally from levels C2-C3 to C6-C7. Percentage agreement (raw, positive and negative), Cohen's kappa (95% CI), prevalence index and bias index were calculated to estimate intra- and inter-rater reliability. Intra-rater reliability showed kappa values ranging between fair and substantial (k 0.29-0.80) for pain provocation, mobility and end-feel, with percentage agreements between 61% and 90%. Inter-rater reliability presented kappa values ranging between fair and substantial (k 0.22-0.62) for pain provocation, mobility and end-feel, with percentage agreements between 61% and 80%. Intra-rater reliability of 3D PIVMs was superior to inter-rater reliability in patients with nonspecific neck pain. The most repeatable evaluation parameter was pain. However, the overall poor reliability suggests avoiding the use of these techniques alone to examine patients and measure their outcome. Further studies are needed to investigate the reliability of PIVMs in combination with other assessment procedures in symptomatic patients.
Ultrasound assessment for grading structural tendon changes in supraspinatus tendinopathy: an inter-rater reliability study

DEFF Research Database (Denmark)

Ingwersen, Kim Gordon; Hjarbæk, John; Eshøj, Henrik

2016-01-01

Aim To evaluate the inter-rater reliability of measuring structural changes in the tendon of patients, clinically diagnosed with supraspinatus tendinopathy (cases) and healthy participants (controls), on ultrasound (US) images captured by standardised procedures. Methods A total of 40 participant...
Inter-rater and intra-rater reliability of a clinical protocol for measuring turnout in collegiate dancers.

Science.gov (United States)

Greene, Amanda; Lasner, Andrea; Deu, Rajwinder; Oliphant, Seth; Johnson, Kenneth

2018-02-02

Reliable methods of measuring turnout in dancers and comparing active turnout (used in class) with functional (uncompensated) turnout are needed. Authors have suggested measurement techniques but there is no clinically useful, easily reproducible technique with established inter-rater and intra-rater reliability. We adapted a technique based on previous research, which is easily reproducible. We hypothesized excellent inter-rater and intra-rater reliability between experienced physical therapists (PTs) and a briefly trained faculty member from a university's department of dance. Thirty-two participants were recruited from the same dance department. Dancers' active and functional turnout was measured by each rater. We found that our technique for measuring active and functional turnout has excellent inter-rater and intra-rater reliability when performed by two experienced PTs and by one briefly trained university-level dance faculty member. For active turnout, inter-rater reliability was 0.78 among all raters and 0.82 among only the PT raters; intra-rater reliability was 0.82 among all raters and 0.85 among only the PT raters. For functional turnout, inter-rater reliability was 0.86 among all raters and 0.88 among only the PT raters; intra-rater reliability was 0.87 among all raters and 0.88 among only the PT raters. The measurement technique described provides a standardized protocol with excellent inter-rater and intra-rater reliability when performed by experienced PTs or by a briefly trained university-level dance faculty member.
Inter-rater reliability of three standardized functional tests in patients with low back pain

Science.gov (United States)

Tidstrand, Johan; Horneij, Eva

2009-01-01

Background Of all patients with low back pain, 85% are diagnosed as "non-specific lumbar pain". Lumbar instability has been described as one specific diagnosis, which several authors have characterized by delayed muscular responses, impaired postural control and impaired muscular coordination among these patients. This has mostly been measured and evaluated in a laboratory setting. There are few standardized and evaluated functional tests that examine functional muscular coordination and are also applicable in the non-laboratory setting. In ordinary clinical work, tests of functional muscular coordination should be easy to apply. The aim of the present study was therefore to standardize and examine the inter-rater reliability of three functional tests of muscular functional coordination of the lumbar spine in patients with low back pain. Methods Nineteen consecutive individuals, ten men and nine women, were included (mean age 42 years, SD ± 12 years). Two independent examiners assessed three tests: "single limb stance", "sitting on a Bobath ball with one leg lifted" and "unilateral pelvic lift" on the same occasion. The standardization procedure took altered positions of the spine or pelvis and compensatory movements of the free extremities into account. The inter-rater reliability was analyzed by Cohen's kappa coefficient (κ) and by percentage agreement. Results The inter-rater reliability for the right and the left leg respectively was: for the single limb stance very good (κ: 0.88–1.0), for sitting on a Bobath ball good (κ: 0.79) and very good (κ: 0.88) and for the unilateral pelvic lift: good (κ: 0.61) and moderate (κ: 0.47). Conclusion The present study showed good to very good inter-rater reliability for two standardized tests, that is, the single-limb stance and sitting on a Bobath-ball with one leg lifted. Inter-rater reliability for the unilateral pelvic lift test was moderate to good. Validation of the tests in their ability to evaluate lumbar
Inter-rater reliability of three standardized functional tests in patients with low back pain

Directory of Open Access Journals (Sweden)

Tidstrand Johan

2009-06-01

Background Of all patients with low back pain, 85% are diagnosed as "non-specific lumbar pain". Lumbar instability has been described as one specific diagnosis, which several authors have characterized by delayed muscular responses, impaired postural control and impaired muscular coordination among these patients. This has mostly been measured and evaluated in a laboratory setting. There are few standardized and evaluated functional tests that examine functional muscular coordination and are also applicable in the non-laboratory setting. In ordinary clinical work, tests of functional muscular coordination should be easy to apply. The aim of the present study was therefore to standardize and examine the inter-rater reliability of three functional tests of muscular functional coordination of the lumbar spine in patients with low back pain. Methods Nineteen consecutive individuals, ten men and nine women, were included (mean age 42 years, SD ± 12 years). Two independent examiners assessed three tests: "single limb stance", "sitting on a Bobath ball with one leg lifted" and "unilateral pelvic lift" on the same occasion. The standardization procedure took altered positions of the spine or pelvis and compensatory movements of the free extremities into account. The inter-rater reliability was analyzed by Cohen's kappa coefficient (κ) and by percentage agreement. Results The inter-rater reliability for the right and the left leg respectively was: for the single limb stance very good (κ: 0.88–1.0), for sitting on a Bobath ball good (κ: 0.79) and very good (κ: 0.88) and for the unilateral pelvic lift: good (κ: 0.61) and moderate (κ: 0.47). Conclusion The present study showed good to very good inter-rater reliability for two standardized tests, that is, the single-limb stance and sitting on a Bobath-ball with one leg lifted. Inter-rater reliability for the unilateral pelvic lift test was moderate to good. Validation of the tests in their
Inter- and intra-examiner reliability of footprint pattern analysis obtained from diabetics using the Harris mat.

Science.gov (United States)

Cisneros, Lígia de Loiola; Fonseca, Tiago H S; Abreu, Vivianni C

2010-01-01

High plantar pressure is a proven risk factor for ulceration among individuals with diabetes mellitus. The Harris and Beath footprinting mat is one of the tools used in screening for foot ulceration risk among these subjects. There are no reports in the literature on the reliability of footprint analysis using print pattern criteria. The aim of this study was to evaluate the inter- and intra-examiner reliability of the analysis of footprint patterns obtained using the Harris and Beath footprinting mat. Footprints were taken from 41 subjects using the footprinting mat. The images were subjected to analysis by three independent examiners. To investigate the intra-examiner reliability, the analysis was repeated by one of the examiners one week later. The weighted kappa coefficient was excellent (K(w) > 0.80) for the inter- and intra-examiner analyses for most of the points studied on both feet. The criteria for analyzing footprint patterns obtained using the Harris and Beath footprinting mat presented good reliability and high to excellent inter- and intra-examiner agreement. This method is reliable for analyses involving one or more examiners. Article registered in the Australian New Zealand Clinical Trials Registry (ANZCTR) under the number ACTRN12609000693224.
Development of a quality-assessment tool for experimental bruxism studies: reliability and validity.

Science.gov (United States)

Dawson, Andreas; Raphael, Karen G; Glaros, Alan; Axelsson, Susanna; Arima, Taro; Ernberg, Malin; Farella, Mauro; Lobbezoo, Frank; Manfredini, Daniele; Michelotti, Ambra; Svensson, Peter; List, Thomas

2013-01-01

To combine empirical evidence and expert opinion in a formal consensus method in order to develop a quality-assessment tool for experimental bruxism studies in systematic reviews. Tool development comprised five steps: (1) preliminary decisions, (2) item generation, (3) face-validity assessment, (4) reliability and discriminative validity assessment, and (5) instrument refinement. The kappa value and phi-coefficient were calculated to assess inter-observer reliability and discriminative ability, respectively. Following preliminary decisions and a literature review, a list of 52 items to be considered for inclusion in the tool was compiled. Eleven experts were invited to join a Delphi panel and 10 accepted. Four Delphi rounds reduced the preliminary tool, the Quality-Assessment Tool for Experimental Bruxism Studies (Qu-ATEBS), to 8 items: study aim, study sample, control condition or group, study design, experimental bruxism task, statistics, interpretation of results, and conflict of interest statement. Consensus among the Delphi panelists yielded good face validity. Inter-observer reliability was acceptable (k = 0.77). Discriminative validity was excellent (phi coefficient 1.0; P ... reviews of experimental bruxism studies, exhibits face validity, excellent discriminative validity, and acceptable inter-observer reliability. Development of quality assessment tools for many other topics in the orofacial pain literature is needed and may follow the described procedure.
Pre-operative Duplex Ultrasonography in Arteriovenous Fistula Creation: Intra- and Inter-observer Agreement.

Science.gov (United States)

Zonnebeld, Niek; Maas, Tommy M G; Huberts, Wouter; van Loon, Magda M; Delhaas, Tammo; Tordoir, Jan H M

2017-11-01

Although clinical guidelines on arteriovenous fistula (AVF) creation advocate minimum luminal arterial and venous diameters, assessed by duplex ultrasonography (DUS), the clinical value of routine DUS examination is under debate. DUS might be an insufficiently repeatable and/or reproducible imaging modality because of its operator dependency. The present study aimed to assess intra- and inter-observer agreement of DUS examination in support of AVF surgery planning. Ten end stage renal disease patients were included, to assess intra- and inter-observer agreement of pre-operative DUS measurements. All measurements were performed by two trained and experienced vascular technicians, blinded to measurement readings. From the routine DUS protocol, representative measurements (venous diameters, and arterial diameters and volume flow in the upper arm and forearm) were selected. For intra-observer agreement the measurements were performed in triplicate, with the probe released from the skin between each. Intraclass correlation coefficients were calculated for intra- and inter-observer agreement, and Bland-Altman plots used to graphically display mean measurement differences and limits of agreement. Ten patients (6 male, 59.4±19.7 years) consented to participate, and all predefined measurements were obtained. Intraclass correlation coefficients for intra-observer agreement of diameter measurements were at least 0.90 (95% CI 0.74-0.97; radial artery). Inter-observer agreement was at least 0.83 (0.46-0.96; lateral diameter upper arm cephalic vein). The Bland-Altman plots showed acceptable mean measurement differences and limits of agreement. In experienced hands, excellent intra- and inter-observer agreement can be reached for the discrete pre-operative DUS measurements advocated in clinical guidelines. DUS is therefore a reliable imaging modality to support AVF surgery planning. The content of DUS protocols, however, needs further standardisation. Copyright © 2017 European
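The Bland-Altman summary mentioned in this record (mean difference between observers and limits of agreement) can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the function name and the diameter values are hypothetical, and the limits use the conventional mean difference ± 1.96 × SD of the paired differences.

```python
from statistics import mean, stdev


def bland_altman_limits(observer_1, observer_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(observer_1) != len(observer_2) or len(observer_1) < 2:
        raise ValueError("need at least two paired measurements")
    # Paired differences between the two observers on the same subjects.
    diffs = [x - y for x, y in zip(observer_1, observer_2)]
    bias = mean(diffs)   # systematic difference between observers
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical vessel diameters (mm) measured by two observers.
obs_a = [2.1, 3.0, 2.6, 3.4, 2.9]
obs_b = [2.0, 3.2, 2.5, 3.3, 3.1]
bias, lower, upper = bland_altman_limits(obs_a, obs_b)
print(f"bias={bias:.3f} mm, limits of agreement=({lower:.3f}, {upper:.3f}) mm")
```

Whether the resulting limits are acceptable is a clinical judgment, which is why such plots are usually read alongside a coefficient such as the ICC.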
Test–re-test reliability and inter-rater reliability of a digital pelvic inclinometer in young, healthy males and females

OpenAIRE

Chris Beardsley; Tim Egerton; Brendon Skinner

2016-01-01

Objective. The purpose of this study was to investigate the reliability of a digital pelvic inclinometer (DPI) for measuring sagittal plane pelvic tilt in 18 young, healthy males and females. Method. The inter-rater reliability and test–re-test reliabilities of the DPI for measuring pelvic tilt in standing on both the right and left sides of the pelvis were measured by two raters carrying out two rating sessions of the same subjects, three weeks apart. Results. For measuring pel...
Development and inter-rater reliability of a standardized verbal instruction manual for the Chinese Geriatric Depression Scale-short form.

Science.gov (United States)

Wong, M T P; Ho, T P; Ho, M Y; Yu, C S; Wong, Y H; Lee, S Y

2002-05-01

The Geriatric Depression Scale (GDS) is a common screening tool for elderly depression in Hong Kong. This study aimed at (1) developing a standardized manual for the verbal administration and scoring of the GDS-SF, and (2) comparing the inter-rater reliability between the standardized and non-standardized verbal administration of GDS-SF. Two studies were reported. In Study 1, the process of developing the manual was described. In Study 2, we compared the inter-rater reliabilities of GDS-SF scores using the standardized verbal instructions and the traditional non-standardized administration. Results of Study 2 indicated that the standardized procedure in verbal administration and scoring improved the inter-rater reliabilities of GDS-SF. Copyright 2002 John Wiley & Sons, Ltd.
Validity and reliability of the Mastication Observation and Evaluation (MOE) instrument.

Science.gov (United States)

Remijn, Lianne; Speyer, Renée; Groen, Brenda E; van Limbeek, Jacques; Nijhuis-van der Sanden, Maria W G

2014-07-01

The Mastication Observation and Evaluation (MOE) instrument was developed to allow objective assessment of a child's mastication process. It contains 14 items and was developed over three Delphi rounds. The present study concerns the further development of the MOE using the COSMIN (Consensus based Standard for the Selection of Measurement Instruments) and investigated the instrument's internal consistency, inter-observer reliability, construct validity and floor and ceiling effects. Consumption of three bites of bread and biscuit was evaluated using the MOE. Data of 59 healthy children (6-48 mths) and 38 children (bread) and 37 children (biscuit) with cerebral palsy (24-72 mths) were used. Four items were excluded before analysis due to zero variance. Principal Components Analysis showed one factor with 8 items. Internal consistency was >0.70 (Cronbach's alpha) for both food consistencies and for both groups of children. Inter-observer reliability varied from 0.51 to 0.98 (weighted Gwet's agreement coefficient). The total MOE scores for both groups showed normal distribution for the population. There were no floor or ceiling effects. The revised MOE now contains 8 items that (a) have a consistent concept for mastication and can be scored on a 4-point scale with sufficient reliability and (b) are sensitive to stages of chewing development in young children. The removed items are retained as part of a criterion referenced list within the MOE. Copyright © 2014 Elsevier Ltd. All rights reserved.
Inter-examiner reliability of a standardized Ultra-sonographic method for classification of changes related to supraspinatus tendinopathy – a pilot study

DEFF Research Database (Denmark)

Larsen, Camilla Marie; Ingwersen, Kim Gordon; Hjarbæk, John

2015-01-01

Ingwersen KG1, 2, Hjarbaek J3, Eshøj H1, Larsen CM1, 4, Vobbe J5, Juul-Kristensen B1, 6 1Institute of Sports Science and Clinical Biomechanics......, University of Southern Denmark, Odense, Denmark. 2Physiotherapy Department, Hospital Lillebaelt, Vejle Hospital, Vejle, Denmark 3Department of Radiology, Musculoskeletal section, Odense University Hospital, Odense, Denmark 4Health Sciences Research Centre, University College Lillebaelt, Odense Denmark 5...... athletes. For optimizing rehabilitation to the different stages of tendinopathy (1) ultra-sonography (US) may be used. Reliability of such method for RT is lacking. Aims. To develop and test inter-examiner reliability of US for classifying RT. Materials and Methods. A three-phased standardized protocol...
Inter-rater and test-retest reliability of quality assessments by novice student raters using the Jadad and Newcastle-Ottawa Scales.

Science.gov (United States)

Oremus, Mark; Oremus, Carolina; Hall, Geoffrey B C; McKinnon, Margaret C

2012-01-01

Quality assessment of included studies is an important component of systematic reviews. The authors investigated inter-rater and test-retest reliability for quality assessments conducted by inexperienced student raters. Student raters received a training session on quality assessment using the Jadad Scale for randomised controlled trials and the Newcastle-Ottawa Scale (NOS) for observational studies. Raters were randomly assigned into five pairs and they each independently rated the quality of 13-20 articles. These articles were drawn from a pool of 78 papers examining cognitive impairment following electroconvulsive therapy to treat major depressive disorder. The articles were randomly distributed to the raters. Two months later, each rater re-assessed the quality of half of their assigned articles. McMaster Integrative Neuroscience Discovery and Study Program. 10 students taking McMaster Integrative Neuroscience Discovery and Study Program courses. The authors measured inter-rater reliability using κ and the intraclass correlation coefficient type 2,1 or ICC(2,1). The authors measured test-retest reliability using ICC(2,1). Inter-rater reliability varied by scale question. For the six-item Jadad Scale, question-specific κs ranged from 0.13 (95% CI -0.11 to 0.37) to 0.56 (95% CI 0.29 to 0.83). The ranges were -0.14 (95% CI -0.28 to 0.00) to 0.39 (95% CI -0.02 to 0.81) for the NOS cohort and -0.20 (95% CI -0.49 to 0.09) to 1.00 (95% CI 1.00 to 1.00) for the NOS case-control. For overall scores on the six-item Jadad Scale, ICC(2,1)s for inter-rater and test-retest reliability (accounting for systematic differences between raters) were 0.32 (95% CI 0.08 to 0.52) and 0.55 (95% CI 0.41 to 0.67), respectively. Corresponding ICC(2,1)s for the NOS cohort were -0.19 (95% CI -0.67 to 0.35) and 0.62 (95% CI 0.25 to 0.83), and for the NOS case-control, the ICC(2,1)s were 0.46 (95% CI -0.13 to 0.92) and 0.83 (95% CI 0.48 to 0.95). Inter-rater reliability was generally poor
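The ICC(2,1) reported in this record is the two-way random-effects, absolute-agreement, single-rater coefficient, which penalises systematic differences between raters. A minimal sketch of the usual mean-squares formula follows; the function name and the example scores are hypothetical, and the code assumes a complete subjects-by-raters table with no missing ratings.

```python
def icc_2_1(scores):
    """ICC(2,1) for a complete table: one row per subject, one column per rater."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares (subjects, raters, residual).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)              # between-subjects mean square
    jms = ss_cols / (k - 1)              # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical quality scores: five articles rated by two raters.
ratings = [[3, 4], [2, 2], [5, 4], [1, 3], [4, 5]]
print(f"ICC(2,1) = {icc_2_1(ratings):.2f}")
```

Because the rater mean square enters the denominator, two raters who rank articles identically but differ by a constant offset still obtain an ICC(2,1) below 1.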
High inter-tester reliability of the new mobility score in patients with hip fracture

DEFF Research Database (Denmark)

Kristensen, M.T.; Bandholm, T.; Foss, N.B.

2008-01-01

OBJECTIVE: To assess the inter-tester reliability of the New Mobility Score in patients with acute hip fracture. DESIGN: An inter-tester reliability study. SUBJECTS: Forty-eight consecutive patients with acute hip fracture at a median age of 84 (interquartile range, 76-89) years; 40 admitted from...... their own home and 8 from nursing homes to an acute orthopaedic hip fracture unit at a university hospital. METHODS: The New Mobility Score, which evaluates the prefracture functional level with a score from 0 (not able to walk at all) to 9 (fully independent), was assessed by 2 independent physiotherapists...... the prefracture functional level in patients with acute hip fracture. Publication date: 2008/7...
Inter-rater reliability of measures to characterize the tobacco retail environment in Mexico

Directory of Open Access Journals (Sweden)

Marissa G Hall

2015-11-01

Objective. To evaluate the inter-rater reliability of a data collection instrument to assess the tobacco retail environment in Mexico, after major marketing regulations were implemented. Materials and methods. In 2013, two data collectors independently evaluated 21 stores in two census tracts, through a data collection instrument that assessed the presence of price promotions, whether single cigarettes were sold, the number of visible advertisements, the presence of signage prohibiting the sale of cigarettes to minors, and characteristics of cigarette pack displays. We evaluated the inter-rater reliability of the collected data, through the calculation of metrics such as intraclass correlation coefficient, percent agreement, Cohen’s kappa and Krippendorff’s alpha. Results. Most measures demonstrated substantial or perfect inter-rater reliability. Conclusions. Our results indicate the potential utility of the data collection instrument for future point-of-sale research.
Standardization, Validity and Reliability Study of Gülhane Aphasia Test-2 (GAT-2)

Directory of Open Access Journals (Sweden)

İlknur Maviş

2007-04-01

OBJECTIVE: Gülhane Aphasia Test-2 (GAT-2) was developed to detect the language disorder aphasia and to alert the clinician to accompanying speech disorders such as apraxia and dysarthria. The aim of the study was to report the standardization, validity and reliability of GAT-2. METHODS: Ten healthy individuals were tested initially for the pilot study. A total of 134 healthy individuals were included in the standardization study, and 30 individuals with aphasia and 11 individuals with right brain injury were included in the validation study. Between-group differences in GAT-2 scores and the effects of age, years of education and sex were examined. GAT-2 cut-off scores were calculated from the scores of healthy individuals. GAT-2 test-retest reliability and inter-observer reliability were calculated. RESULTS: Healthy individuals’ GAT-2 scores were significantly different from those of aphasic patients, but not from those of right brain injured patients. Healthy individuals’ GAT-2 scores were not affected by sex or age but were affected by years of education, so cut-off scores were calculated according to education. GAT-2 scores of aphasic patients were not affected by age, sex or years of education. Test-retest reliability, inter-observer reliability and internal consistency results showed that GAT-2 is a highly reliable aphasia screening test. CONCLUSION: GAT-2 was found to be a standardized, highly reliable and valid aphasia test for Turkish stroke patients with aphasia.

THE INTRA- AND INTER-RATER RELIABILITY OF THE SOCCER INJURY MOVEMENT SCREEN (SIMS).

Science.gov (United States)

McCunn, Robert; Aus der Fünten, Karen; Govus, Andrew; Julian, Ross; Schimpchen, Jan; Meyer, Tim

2017-02-01

The growing volume of movement screening research reveals a belief among practitioners and researchers alike that movement quality may have an association with injury risk. However, existing movement screening tools have not considered the sport-specific movement and injury patterns relevant to soccer. The present study introduces the Soccer Injury Movement Screen (SIMS), which has been designed specifically for use within soccer. Furthermore, the purpose of the present study was to assess the intra- and inter-rater reliability of the SIMS and determine its suitability for use in further research. The study utilized a test-retest design to discern reliability. Twenty-five (11 males, 14 females) healthy, recreationally active university students (age 25.5 ± 4.0 years, height 171 ± 9 cm, weight 64.7 ± 12.6 kg) agreed to participate. The SIMS contains five sub-tests: the anterior reach, single-leg deadlift, in-line lunge, single-leg hop for distance and tuck jump. Each movement was scored out of 10 points and summed to produce a composite score out of 50. The anterior reach and single-leg hop for distance were scored in real-time while the remaining tests were filmed and scored retrospectively. Three raters conducted the SIMS with each participant on three occasions separated by an average of three and a half days (minimum one day, maximum seven days). Rater 1 re-scored the filmed movements for all participants on all occasions six months later to establish the 'pure' intra-rater (intra-occasion) reliability for those movements. Intraclass correlation coefficient (ICC) values for intra- and inter-rater composite score reliability ranged from 0.66-0.72 and 0.79-0.86 respectively. Weighted kappa values representing the intra- and inter-rater reliability of the individual sub-tests ranged from 0.35-0.91 indicating fair to almost perfect agreement. Establishing the reliability of the SIMS is a prerequisite for further research seeking to investigate
Inter-tester reliability of selected clinical tests for long-lasting temporomandibular disorders.

Science.gov (United States)

Julsvoll, Elisabeth Heggem; Vøllestad, Nina Køpke; Opseth, Gro; Robinson, Hilde Stendal

2017-09-01

Clinical tests used to examine patients with temporomandibular disorders vary in methodological quality, and some are not tested for reliability. The purpose of this cross-sectional study was to evaluate inter-tester reliability of clinical tests and a cluster of tests used to examine patients with long-lasting temporomandibular disorders. Forty patients with pain in the temporomandibular area treated by health professionals were included. They were between 18 and 70 years of age and had 65 symptomatic (33 right/32 left) and 15 asymptomatic joints. Two manual therapists examined all participants with selected tests. Percentage agreement and the kappa coefficient (k) with 95% confidence interval (CI) were used to evaluate the tests with categorical outcomes. For tests with continuous outcomes, the relative inter-tester reliability was assessed by the intraclass correlation coefficient (ICC 3,1, 95% CI) and the absolute reliability was calculated by the smallest detectable change (SDC). The best reliability among single tests was found for the dental stick test, the joint-sound test (k = 0.80-1.0) and range of mouth-opening (ICC 3,1 (95% CI) = 0.97 (0.95-0.98) and SDC = 4 mm). The reliability of the cluster of tests was excellent with both four and five positive tests out of seven. The reliability was good to excellent for the clinical tests and the cluster of tests when performed by experienced therapists. The tests are feasible for use in the clinical setting. They require no advanced equipment and are easy to perform.
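The record does not state how the smallest detectable change was derived. A common convention, assumed here, is SEM = SD × sqrt(1 − ICC) followed by SDC = 1.96 × sqrt(2) × SEM. The sketch below applies that convention; the function name and the standard deviation used in the example are hypothetical and are not values from the study.

```python
import math


def smallest_detectable_change(sd, icc, z=1.96):
    """Return (SEM, SDC) from a between-subject SD and a reliability coefficient."""
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("sd must be non-negative and icc must lie in [0, 1]")
    # Standard error of measurement: the part of the SD attributed to error.
    sem = sd * math.sqrt(1.0 - icc)
    # A change score contains the error of two measurements, hence sqrt(2).
    return sem, z * math.sqrt(2.0) * sem


# Hypothetical example: SD of 8 mm for mouth opening, ICC of 0.97.
sem, sdc = smallest_detectable_change(sd=8.0, icc=0.97)
print(f"SEM={sem:.2f} mm, SDC={sdc:.2f} mm")
```

The SDC is expressed in the unit of the measurement, which makes it easier to apply to an individual patient than a unit-free ICC.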
Test-retest, inter-assessor and intra-assessor reliability of the modified Touwen examination

NARCIS (Netherlands)

Peters, Lieke H. J.; Maathuis, Karel G. B.; Kouw, Eva; Hamming, Marjolein; Hadders-Algra, Mijna

Interest in the Touwen examination (1979) for the assessment of minor neurological dysfunction (MND) is growing. However, information on psychometric properties of this assessment is scarce. Therefore the present study aimed at assessing the test's test-retest, inter- and intra-assessor reliability.
Inter- and Intraexaminer Reliability in Identifying and Classifying Myofascial Trigger Points in Shoulder Muscles.

Science.gov (United States)

Nascimento, José Diego Sales do; Alburquerque-Sendín, Francisco; Vigolvino, Lorena Passos; Oliveira, Wandemberg Fortunato de; Sousa, Catarina de Oliveira

2018-01-01

To determine inter- and intraexaminer reliability of examiners without clinical experience in identifying and classifying myofascial trigger points (MTPs) in the shoulder muscles of subjects asymptomatic and symptomatic for unilateral subacromial impact syndrome (SIS). Within-day inter- and intraexaminer reliability study. Physical therapy department of a university. Fifty-two subjects participated in the study, 26 symptomatic and 26 asymptomatic for unilateral SIS. Two examiners, without experience for assessing MTPs, independent and blind to the clinical conditions of the subjects, assessed bilaterally the presence of MTPs (present or absent) in 6 shoulder muscles and classified them (latent or active) on the affected side of the symptomatic group. Each examiner performed the same assessment twice in the same day. Reliability was calculated through percentage agreement, prevalence- and bias-adjusted kappa (PABAK) statistics, and weighted kappa. Intraexaminer reliability in identifying MTPs for the symptomatic and asymptomatic groups was moderate to perfect (PABAK, .46-1 and .60-1, respectively). Interexaminer reliability was between moderate and almost perfect in the 2 groups (PABAK, .46-.92), except for the muscles of the symptomatic group, which were below these values. With respect to MTP classification, intraexaminer reliability was moderate to high for most muscles, but interexaminer reliability was moderate for only 1 muscle (weighted κ=.45), and between weak and reasonable for the rest (weighted κ=.06-.31). Intraexaminer reliability is acceptable in clinical practice to identify and classify MTPs. However, interexaminer reliability proved to be reliable only to identify MTPs, with the symptomatic side exhibiting lower values of reliability. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
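For a present/absent judgment by two examiners, the prevalence- and bias-adjusted kappa used in this record reduces to PABAK = 2 × p_o − 1, where p_o is the observed agreement. The sketch below computes it from a 2×2 table, together with the prevalence and bias indices that are often reported alongside it. The function name and cell counts are hypothetical, and taking absolute values for the two indices is a convention assumed here.

```python
def two_by_two_indices(a, b, c, d):
    """Agreement indices for a 2x2 table of two examiners.

    a: both positive, b: examiner 1 positive only,
    c: examiner 2 positive only, d: both negative.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    p_o = (a + d) / n  # observed agreement
    return {
        "agreement": p_o,
        "pabak": 2 * p_o - 1,                # kappa with balanced marginals assumed
        "prevalence_index": abs(a - d) / n,  # imbalance between agreed categories
        "bias_index": abs(b - c) / n,        # imbalance between disagreement cells
    }


# Hypothetical counts for one muscle assessed in 26 subjects.
result = two_by_two_indices(a=20, b=3, c=2, d=1)
print({name: round(value, 2) for name, value in result.items()})
```

A high prevalence index signals that ordinary kappa may be low despite high agreement, which is the situation PABAK is meant to address.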
Inter-rater and test–retest reliability of quality assessments by novice student raters using the Jadad and Newcastle–Ottawa Scales

Science.gov (United States)

Oremus, Carolina; Hall, Geoffrey B C; McKinnon, Margaret C

2012-01-01

Introduction Quality assessment of included studies is an important component of systematic reviews. Objective The authors investigated inter-rater and test–retest reliability for quality assessments conducted by inexperienced student raters. Design Student raters received a training session on quality assessment using the Jadad Scale for randomised controlled trials and the Newcastle–Ottawa Scale (NOS) for observational studies. Raters were randomly assigned into five pairs and they each independently rated the quality of 13–20 articles. These articles were drawn from a pool of 78 papers examining cognitive impairment following electroconvulsive therapy to treat major depressive disorder. The articles were randomly distributed to the raters. Two months later, each rater re-assessed the quality of half of their assigned articles. Setting McMaster Integrative Neuroscience Discovery and Study Program. Participants 10 students taking McMaster Integrative Neuroscience Discovery and Study Program courses. Main outcome measures The authors measured inter-rater reliability using κ and the intraclass correlation coefficient type 2,1 or ICC(2,1). The authors measured test–retest reliability using ICC(2,1). Results Inter-rater reliability varied by scale question. For the six-item Jadad Scale, question-specific κs ranged from 0.13 (95% CI −0.11 to 0.37) to 0.56 (95% CI 0.29 to 0.83). The ranges were −0.14 (95% CI −0.28 to 0.00) to 0.39 (95% CI −0.02 to 0.81) for the NOS cohort and −0.20 (95% CI −0.49 to 0.09) to 1.00 (95% CI 1.00 to 1.00) for the NOS case–control. For overall scores on the six-item Jadad Scale, ICC(2,1)s for inter-rater and test–retest reliability (accounting for systematic differences between raters) were 0.32 (95% CI 0.08 to 0.52) and 0.55 (95% CI 0.41 to 0.67), respectively. Corresponding ICC(2,1)s for the NOS cohort were −0.19 (95% CI −0.67 to 0.35) and 0.62 (95% CI 0.25 to 0.83), and for the NOS case–control, the ICC(2
Assessment of scapular positioning and function as future effect measure of shoulder interventions – an inter-examiner reliability study of the clinical assessment methods

DEFF Research Database (Denmark)

Larsen, Camilla Marie; Eshøj, Henrik; Ingwersen, Kim Gordon

2015-01-01

Assessment of scapular positioning and function as future effect measure of shoulder interventions – an inter-examiner reliability study of the clinical assessment methods Eshøj H1, Ingwersen KG1, Larsen CM1, 2, Søgaard K1, Juul-Kristensen B1, 3 1 University of Southern Denmark, Institute of Sports...... only been tested for intra-examiner reliability. The objective was to investigate the inter-examiner reliability of an extended battery of clinical tests for assessing scapular positioning and function. Methods A standardized three-phase protocol for clinical reliability studies was conducted...... coefficients (ICC) and kappa values were interpreted as: 0.0-0.40 (poor); 0.40-0.75 (fair to good); and 0.75-1.00 (good to excellent). Results A total of 41 subjects (23 males, yrs 25±9), were recruited among adult overhead athletes from the municipality of Odense, DK. Prevalence of the index condition was 54...
Approaches to describing inter-rater reliability of the overall clinical appearance of febrile infants and toddlers in the emergency department

Directory of Open Access Journals (Sweden)

Paul Walsh

2014-11-01

Objectives. To measure inter-rater agreement of overall clinical appearance of febrile children aged less than 24 months and to compare methods for doing so. Study Design and Setting. We performed an observational study of inter-rater reliability of the assessment of febrile children in a county hospital emergency department serving a mixed urban and rural population. Two emergency medicine healthcare providers independently evaluated the overall clinical appearance of children less than 24 months of age who had presented for fever. They recorded the initial ‘gestalt’ assessment of whether or not the child was ill appearing or if they were unsure. They then repeated this assessment after examining the child. Each rater was blinded to the other’s assessment. Our primary analysis was graphical. We also calculated Cohen’s κ, Gwet’s agreement coefficient and other measures of agreement and weighted variants of these. We examined the effect of time between exams and patient and provider characteristics on inter-rater agreement. Results. We analyzed 159 of the 173 patients enrolled. Median age was 9.5 months (lower and upper quartiles 4.9–14.6), 99/159 (62%) were boys and 22/159 (14%) were admitted. Overall 118/159 (74%) and 119/159 (75%) were classified as well appearing on initial ‘gestalt’ impression by both examiners. Summary statistics varied from 0.223 for weighted κ to 0.635 for Gwet’s AC2. Inter-rater agreement was affected by the time interval between the evaluations and the age of the child but not by the experience levels of the rater pairs. Classifications of ‘not ill appearing’ were more reliable than others. Conclusion. The inter-rater reliability of emergency providers’ assessment of overall clinical appearance was adequate when described graphically and by Gwet’s AC. Different summary statistics yield different results for the same dataset.
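
The closing remark of the abstract above, that different summary statistics give different answers on the same data, can be sketched as follows. The example compares Cohen's kappa with Gwet's first-order coefficient AC1 for two raters and three nominal categories. It is a simplification: the study reports the weighted AC2, whereas this sketch is unweighted, and the table of counts is invented for illustration.

```python
def kappa_and_ac1(table):
    """table[i][j] = subjects placed in category i by rater A and j by rater B."""
    q = len(table)
    n = sum(sum(row) for row in table)
    p_o = sum(table[k][k] for k in range(q)) / n
    row = [sum(table[k]) / n for k in range(q)]                       # rater A marginals
    col = [sum(table[i][k] for i in range(q)) / n for k in range(q)]  # rater B marginals

    # Cohen: chance agreement from the product of the two raters' marginals.
    pe_kappa = sum(row[k] * col[k] for k in range(q))
    # Gwet AC1: chance agreement from the averaged marginals.
    pi = [(row[k] + col[k]) / 2 for k in range(q)]
    pe_ac1 = sum(p * (1 - p) for p in pi) / (q - 1)

    kappa = (p_o - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_o - pe_ac1) / (1 - pe_ac1)
    return p_o, kappa, ac1


# Hypothetical counts; categories: well appearing, ill appearing, unsure.
counts = [[110, 9, 4],
          [8, 12, 3],
          [5, 3, 5]]
p_o, kappa, ac1 = kappa_and_ac1(counts)
print(f"agreement={p_o:.3f} kappa={kappa:.3f} AC1={ac1:.3f}")
```

With one dominant category, the two chance-agreement terms diverge, so the same table yields a noticeably lower kappa than AC1.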
Analyses of inter-rater reliability between professionals, medical students and trained school children as assessors of basic life support skills.

Science.gov (United States)

Beck, Stefanie; Ruhnke, Bjarne; Issleib, Malte; Daubmann, Anne; Harendza, Sigrid; Zöllner, Christian

2016-10-07

Training of lay-rescuers is essential to improve survival-rates after cardiac arrest. Multiple campaigns emphasise the importance of basic life support (BLS) training for school children. Trainings require a valid assessment to give feedback to school children and to compare the outcomes of different training formats. Considering these requirements, we developed an assessment of BLS skills using MiniAnne and tested the inter-rater reliability between professionals, medical students and trained school children as assessors. Fifteen professional assessors, 10 medical students and 111 trained school children (peers) assessed 1087 school children at the end of a CPR-training event using the new assessment format. Analyses of inter-rater reliability (intraclass correlation coefficient; ICC) were performed. Overall inter-rater reliability of the summative assessment was high (ICC = 0.84, 95% CI: 0.84 to 0.86, n = 889). The number of comparisons between peer-peer assessors (n = 303), peer-professional assessors (n = 339), and peer-student assessors (n = 191) was adequate to demonstrate high inter-rater reliability between peer- and professional-assessors (ICC: 0.76), peer- and student-assessors (ICC: 0.88) and peer- and other peer-assessors (ICC: 0.91). Systematic variation in rating of specific items was observed for three items between professional- and peer-assessors. Using this assessment and integrating peers and medical students as assessors gives the opportunity to assess hands-on skills of school children with high reliability.
Inter-rater reliability of diagnostic criteria for sacroiliac joint-, disc- and facet joint pain.

Science.gov (United States)

van Tilburg, Cornelis W J; Groeneweg, Johannes G; Stronks, Dirk L; Huygen, Frank J P M

2017-01-01

Several diagnostic criteria sets are described in the literature to identify low back pain subtypes, but very little is known about the inter-rater reliability of these criteria. We conducted a study to determine the reliability of diagnostic tests that point towards SI joint-, disc- or facet joint pain. Inter-rater reliability study alongside three randomized clinical trials. Multidisciplinary pain center of general hospital. Patients aged 18 or more with medical history and physical examination suggestive of sacroiliac joint-, disc- and facet joint pain on lumbar level. Using the most commonly used current diagnostic criteria, a physical examination was performed independently by three physicians (two pain physicians and one orthopedic surgeon). Inter-rater reliability (Kappa (κ) measure of agreement) and significance (p) between raters are presented. Strengths of agreement, indicated with κ values above 0.20, are presented in order of agreement. One hundred patients were included. None of the parameters from the physical investigation had κ values of more than 0.21 (fair) in all pairs of raters. Between two raters (C and D), there was an almost perfect agreement on three parameters, more specifically "Abnormal sensory and motor examination, hyperactive or diminished reflexes", "Sitting exam shows no reflex, motor or sensory signs in the legs" and "Straight leg raising (Lasègue) negative between 30 and 70 degrees of flexion". The "Drop test positive" parameters had moderate strength of agreement between raters A and D and fair strength between raters A and B. The "Digital interspinous pressure test positive" had moderate strength of agreement between raters C and D and fair strength of agreement between raters A and B as well as raters B and C. Three other parameters had a fair strength of agreement between two raters, all other parameters had a slight or poor strength of agreement. Inter-rater reliability, confidence intervals and significance of
Intra- and inter-rater reliability of the Knee Society Knee Score when used by two physiotherapists in patients post total knee arthroplasty

Directory of Open Access Journals (Sweden)

S. Gopal

2010-01-01

Background and Purpose: It has yet to be shown whether routine physiotherapy plays a role in the rehabilitation of patients post total knee arthroplasty (Rajan et al 2004). Physiotherapists should be using valid outcome measures to provide evidence of the benefit of their intervention. The aim of this study was to establish the intra- and inter-rater reliability of the Knee Society Knee Score, a scoring system developed by Insall et al (1989). The Knee Society Knee Score can be used to assess the integrity of the knee joint of patients undergoing total knee arthroplasty. Since the score involves clinical testing, the intra-rater reliability of the clinician should be established prior to using the scores as data in clinical research. Where multiple clinicians are involved, inter-rater reliability should also be established. Design: This was a correlation study. Subjects: A sample of thirty patients post total knee arthroplasty attending the arthroplasty clinic at Johannesburg Hospital between six weeks and twelve months postoperatively. Method: Recruited patients were evaluated twice with a time interval of one hour between each assessment. Statistical Analysis: The intra- and inter-rater reliability were estimated using the Intraclass Correlation Coefficient (ICC). Results: The intra-rater reliability showed excellent reliability (h = 0.95) for Examiner A and good reliability (h = 0.71) for Examiner B. The inter-rater reliability showed moderate reliability (h = 0.67 during test one and h = 0.66 during test two). Conclusion: The KSKS has good intra-rater reliability when tested within a period of one hour. The KSKS demonstrated moderate agreement for inter-rater reliability.
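
The abstract above does not state which form of the ICC was used. As a hedged sketch only, the following computes one common choice, the two-way random-effects, absolute-agreement, single-rater coefficient usually written ICC(2,1), from the mean squares of a subjects-by-raters table. The function name and the scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a complete table: ratings[i][j] = score of subject i by rater j."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))      # error mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical knee scores for five patients rated by two examiners.
scores = [[82, 80], [65, 70], [90, 88], [55, 60], [74, 73]]
print(round(icc_2_1(scores), 3))
```

Because the rater mean square enters the denominator, a systematic offset between examiners lowers this coefficient, unlike a plain correlation.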
Inter-rater reliability of healthcare professional skills' portfolio assessments: The Andalusian Agency for Healthcare Quality model

Directory of Open Access Journals (Sweden)

Antonio Almuedo-Paz

2014-07-01

This study aims to determine the reliability of assessment criteria used for a portfolio at the Andalusian Agency for Healthcare Quality (ACSA). Data: all competences certification processes, regardless of their discipline. Period: 2010-2011. Three types of tests are used: 368 certificates, 17,895 reports and 22,642 clinical practice reports (N = 3,010 candidates). The tests were evaluated in pairs by the ACSA team of raters using two categories: valid and invalid. Results: The percentage agreement in assessments of certificates was 89.9%, while for the reports of clinical practice was 85.1% and for clinical practice reports was 81.7%. The inter-rater agreement coefficients (kappa) ranged from 0.468 to 0.711. Discussion: The results of this study show that the inter-rater reliability of assessments varies from fair to good. Compared with other similar studies, the results put the reliability of the model in a comfortable position. Among the improvements incorporated, progressive automation of evaluations must be highlighted.
Reliability of joint count assessment in rheumatoid arthritis: a systematic literature review.

Science.gov (United States)

Cheung, Peter P; Gossec, Laure; Mak, Anselm; March, Lyn

2014-06-01

Joint counts are central to the assessment of rheumatoid arthritis (RA) but reliability is an issue. To evaluate the reliability and agreement of joint counts (intra-observer and inter-observer) by health care professionals (physicians, nurses, and metrologists) and patients in RA, and the impact of training and standardization on joint count reliability through a systematic literature review. Articles reporting joint count reliability or agreement in RA in PubMed, EMBase, and the Cochrane library between 1960 and 2012 were selected. Data were extracted regarding tender joint counts (TJCs) and swollen joint counts (SJCs) derived by physicians, metrologists, or patients for intra-observer and inter-observer reliability. In addition, methods and effects of training or standardization were extracted. Statistics expressing reliability such as intraclass correlation coefficients (ICCs) were extracted. Data analysis was primarily descriptive due to high heterogeneity. Twenty-eight studies on health care professionals (HCP) and 20 studies on patients were included. Intra-observer reliability for TJCs and SJCs was good for HCPs and patients (range of ICC: 0.49-0.98). Inter-observer reliability between HCPs for TJCs was higher than for SJCs (range of ICC: 0.64-0.88 vs. 0.29-0.98). Patient inter-observer reliability with HCPs as comparators was better for TJCs (range of ICC: 0.31-0.91) compared to SJCs (0.16-0.64). Nine studies (7 with HCPs and 2 with patients) evaluated consensus or training, with improvement in reliability of TJCs but conflicting evidence for SJCs. Intra- and inter-observer reliability was high for TJCs for HCPs and patients: among all groups, reliability was better for TJCs than SJCs. Inter-observer reliability of SJCs was poorer for patients than HCPs. Data were inconclusive regarding the potential for training to improve SJC reliability. Overall, the results support further evaluation for patient-reported joint counts as an outcome measure. © 2013
The inter-rater reliability and prognostic value of coma scales in Nepali children with acute encephalitis syndrome.

Science.gov (United States)

Ray, Stephen; Rayamajhi, Ajit; Bonnett, Laura J; Solomon, Tom; Kneen, Rachel; Griffiths, Michael J

2018-02-01

Background Acute encephalitis syndrome (AES) is a common cause of coma in Nepali children. The Glasgow coma scale (GCS) is used to assess the level of coma in these patients and predict outcome. Alternative coma scales may have better inter-rater reliability and prognostic value in encephalitis in Nepali children, but this has not been studied. The Adelaide coma scale (ACS), Blantyre coma scale (BCS) and the Alert, Verbal, Pain, Unresponsive scale (AVPU) are alternatives to the GCS which can be used. Methods Children aged 1-14 years who presented to Kanti Children's Hospital, Kathmandu with AES between September 2010 and November 2011 were recruited. All four coma scales (GCS, ACS, BCS and AVPU) were applied on admission, 48 h later and on discharge. Inter-rater reliability (unweighted kappa) was measured for each. Correlation and agreement between total coma score and outcome (Liverpool outcome score) was measured by Spearman's rank and Bland-Altman plot. The prognostic value of coma scales alone and in combination with physiological variables was investigated in a subgroup (n = 22). A multivariable logistic regression model was fitted by backward stepwise selection. Results Fifty children were recruited. Inter-rater reliability using the various scales was fair to moderate. However, the scales poorly predicted clinical outcome. Combining the scales with physiological parameters such as systolic blood pressure improved outcome prediction. Conclusion This is the first study to compare four coma scales in Nepali children with AES. The scales exhibited fair to moderate inter-rater reliability. However, the study is inadequately powered to answer the question on the relationship between coma scales and outcome. Further larger studies are required.
Inter-observer agreement in audit of quality of radiology requests and reports

International Nuclear Information System (INIS)

Stavem, K.; Foss, T.; Botnmark, O.; Andersen, O.K.; Erikssen, J.

2004-01-01

AIMS: To assess the quality of the imaging procedure requests and radiologists' reports using an auditing tool, and to assess the agreement between different observers of the quality parameters. MATERIALS AND METHODS: In an audit using a standardized scoring system, three observers reviewed request forms for 296 consecutive radiological examinations, and two observers reviewed a random sample of 150 of the corresponding radiologists' reports. We present descriptive statistics from the audit and pairwise inter-observer agreement, using the proportion agreement and kappa statistics. RESULTS: The proportion of acceptable item scores (0 or +1) was above 70% for all items except the requesting physician's bleep or extension number, legibility of the physician's name, or details about previous investigations. For pairs of observers, the inter-observer agreement was generally high, however, the corresponding kappa values were consistently low with only 14 of 90 ratings >0.60 and 6 >0.80 on the requests/reports. For the quality of the clinical information, the appropriateness of the request, and the requested priority/timing of the investigation items, the mean percentage agreement ranged 67-76, and the corresponding kappa values ranged 0.08-0.24. CONCLUSION: The inter-observer reliability of scores on the different items showed a high degree of agreement, although the kappa values were low, which is a well-known paradox. Current routines for requesting radiology examinations appeared satisfactory, although several problem areas were identified
Surgeon Reliability for the Assessment of Lumbar Spinal Stenosis on MRI: The Impact of Surgeon Experience.

Science.gov (United States)

Marawar, Satyajit V; Madom, Ian A; Palumbo, Mark; Tallarico, Richard A; Ordway, Nathaniel R; Metkar, Umesh; Wang, Dongliang; Green, Adam; Lavelle, William F

2017-01-01

The treating surgeon's visual assessment of axial MRI images to ascertain the degree of stenosis has a critical impact on surgical decision-making. The purpose of this study was to prospectively analyze the impact of surgeon experience on inter-observer and intra-observer reliability of assessing severity of spinal stenosis on MRIs by spine surgeons directly involved in surgical decision-making. Seven fellowship-trained spine surgeons reviewed MRI studies of 30 symptomatic patients with lumbar stenosis and graded the stenosis in the central canal, the lateral recess and the foramen at T12-L1 to L5-S1 as none, mild, moderate or severe. No specific instructions were provided as to what constituted mild, moderate, or severe stenosis. Two surgeons were "senior" (>fifteen years of practice experience); two were "intermediate" (>four years of practice experience), and three "junior" (<one year of practice experience). The concordance correlation coefficient (CCC) was calculated to assess inter-observer reliability. Seven MRI studies were duplicated and randomly re-read to evaluate intra-observer reliability. Surgeon experience was found to be a strong predictor of inter-observer reliability. Senior inter-observer reliability was significantly higher than that of the "junior" group when assessing central (p<0.001), foraminal (p=0.005) and lateral (p=0.001) stenosis. The senior group also showed significantly higher inter-observer reliability than the intermediate group when assessing foraminal stenosis (p=0.036). For intra-observer reliability the results were contrary to those found for inter-observer reliability. Inter-observer reliability of assessing stenosis on MRIs increases with surgeon experience. Lower intra-observer reliability values among the senior group, although not clearly explained, may be due to the small number of MRIs evaluated and the quality of the MRI images. Level of evidence: Level 3.
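
The concordance correlation coefficient named in the abstract above can be sketched for a single pair of readers as follows. This is Lin's two-rater form with population (1/n) moments. How the study pooled seven readers is not stated in the abstract, and the 0 to 3 coding of none/mild/moderate/severe and the grades below are assumptions made for illustration.

```python
def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for two paired series.

    Assumes equal lengths and that the ratings are not all identical
    (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Penalises both scatter and any shift between the readers' means.
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical grades (0 = none ... 3 = severe) from two surgeons.
surgeon_a = [0, 1, 2, 3, 2, 1, 3, 0]
surgeon_b = [0, 1, 3, 3, 2, 2, 3, 1]
print(round(concordance_cc(surgeon_a, surgeon_b), 3))
```

Unlike Pearson's r, the coefficient drops below 1 when one reader grades systematically higher than the other, even if the two sets of grades are perfectly correlated.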
The reliability of four widely used patellar height ratios.

Science.gov (United States)

van Duijvenbode, Dennis; Stavenuiter, Michel; Burger, Bart; van Dijke, Cees; Spermon, Jacco; Hoozemans, Marco

2016-03-01

The objective of this study was to evaluate the inter-observer reliability and the intra-observer reliability of four patellar height ratios: Insall-Salvati (IS), modified Insall-Salvati (MIS), Blackburne-Peel (BP) and Caton-Deschamps (CD). The patellar height ratios were assessed by four independent examiners using weight-bearing lateral knee radiographs in 30° flexion. Intra-class correlation coefficients and Fleiss' kappas were determined. The inter-observer reliability was excellent for the IS and moderate for the other ratios. When the ratio values were categorized, the inter-observer reliability was strong for the IS, moderate for the MIS and BP, and poor for the CD. The intra-observer reliability was excellent for the IS, MIS and CD, and strong for the BP. When the ratio values were categorized, the intra-observer reliability was strong for the IS and MIS, and moderate for the other ratios. Although the IS showed the best reliability, we advise using the MIS, as it showed the second-best reliability but is, according to the literature, associated with better validity.
A Study on the Reliability of Sasang Constitutional Body Trunk Measurement

Directory of Open Access Journals (Sweden)

Eunsu Jang

2012-01-01

Objective. Body trunk measurement for human plays an important diagnostic role not only in conventional medicine but also in Sasang constitutional medicine (SCM). The Sasang constitutional body trunk measurement (SCBTM) consists of the 5-widths and the 8-circumferences which are standard locations currently employed in the SCM society. This study suggests to what extent a comprehensive training can improve the reliability of the SCBTM. Methods. We recruited 10 male subjects and 5 male observers with no experience of anthropometric measurement. We conducted measurements twice before and after a comprehensive training. Relative technical error of measurement (%TEMs) was produced to assess intra and inter observer reliabilities. Results. Post-training intra-observer %TEMs of the SCBTM were 0.27% to 1.85% reduced from 0.27% to 6.26% in pre-training, respectively. Post-training inter-observer %TEMs of those were 0.56% to 1.66% reduced from 1.00% to 9.60% in pre-training, respectively. Post-training % total TEMs which represent the whole reliability were 0.68% to 2.18% reduced from maximum value of 10.18%. Conclusion. A comprehensive training makes the SCBTM more reliable, hence giving a sufficiently confident diagnostic tool. It is strongly recommended to give a comprehensive training in advance to take the SCBTM.
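
The relative technical error of measurement reported above can be sketched as follows, using the usual anthropometric definition: the TEM is the square root of the within-subject sum of squares divided by N(K - 1), and %TEM expresses it as a percentage of the grand mean. With K = 2 repeats this reduces to sqrt(sum of squared differences / 2N). The abstract does not give the exact computation used, so this is an assumption, and the circumference values are invented.

```python
import math


def relative_tem(measurements):
    """Return (TEM, %TEM) for measurements[i] = K repeated values on subject i."""
    n, k = len(measurements), len(measurements[0])
    # Within-subject sum of squares, summed over subjects.
    ss_within = sum(sum(v * v for v in row) - sum(row) ** 2 / k
                    for row in measurements)
    tem = math.sqrt(ss_within / (n * (k - 1)))
    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    return tem, 100 * tem / grand_mean


# Hypothetical chest circumferences (cm): three subjects, measured twice.
tem, pct = relative_tem([[100, 102], [90, 90], [80, 78]])
print(f"TEM={tem:.2f} cm, %TEM={pct:.2f}%")
```

The same function covers the intra-observer case (repeats by one observer) and the inter-observer case (one value per observer), depending on what the columns hold.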
A study on the reliability of sasang constitutional body trunk measurement.

Science.gov (United States)

Jang, Eunsu; Kim, Jong Yeol; Lee, Haejung; Kim, Honggie; Baek, Younghwa; Lee, Siwoo

2012-01-01

Objective. Body trunk measurement for human plays an important diagnostic role not only in conventional medicine but also in Sasang constitutional medicine (SCM). The Sasang constitutional body trunk measurement (SCBTM) consists of the 5-widths and the 8-circumferences which are standard locations currently employed in the SCM society. This study suggests to what extent a comprehensive training can improve the reliability of the SCBTM. Methods. We recruited 10 male subjects and 5 male observers with no experience of anthropometric measurement. We conducted measurements twice before and after a comprehensive training. Relative technical error of measurement (%TEMs) was produced to assess intra and inter observer reliabilities. Results. Post-training intra-observer %TEMs of the SCBTM were 0.27% to 1.85% reduced from 0.27% to 6.26% in pre-training, respectively. Post-training inter-observer %TEMs of those were 0.56% to 1.66% reduced from 1.00% to 9.60% in pre-training, respectively. Post-training % total TEMs which represent the whole reliability were 0.68% to 2.18% reduced from maximum value of 10.18%. Conclusion. A comprehensive training makes the SCBTM more reliable, hence giving a sufficiently confident diagnostic tool. It is strongly recommended to give a comprehensive training in advance to take the SCBTM.
High inter-observer agreement of observer-perceived pain assessment in the emergency department

DEFF Research Database (Denmark)

Hangaard, Martin Høhrmann; Malling, Brian; Mogensen, Christian Backer

2018-01-01

degree of inter-observer agreement. The aim of the present study was to assess the inter-observer agreement of perceived pain among emergency department nurses and to evaluate if it was influenced by predetermined factors like age and gender. Method: A project assistant randomly recruited two nurses, who...... of 0.05 and 95% limits of agreement of +/-1 category. Patient age, gender, localization of pain, examination room or presence of a significant other did not affect the inter-observer agreement. Conclusion: We found 70% agreement on pain category between the nurses and it is justified that nurse......Background: Triage is used to prioritize the patients in the emergency department. The majority of the triage systems include the patients' pain score to assess their level of acuity by using a combination of patient reported pain and observer-perceived pain; the latter therefore requires a certain...
Gait in children with cerebral palsy : observer reliability of Physician Rating Scale and Edinburgh Visual Gait Analysis Interval Testing scale

NARCIS (Netherlands)

Maathuis, KGB; van der Schans, CP; van Iperen, A; Rietman, HS; Geertzen, JHB

2005-01-01

The aim of this study was to test the inter- and intra-observer reliability of the Physician Rating Scale (PRS) and the Edinburgh Visual Gait Analysis Interval Testing (GAIT) scale for use in children with cerebral palsy (CP). Both assessment scales are quantitative observational scales, evaluating

Quality of nursing intensity data: inter-rater reliability of the patient classification after two decades in clinical use.

Science.gov (United States)

Liljamo, Pia; Kinnunen, Ulla-Mari; Ohtonen, Pasi; Saranto, Kaija

2017-09-01

The aim of this study was to measure the inter-rater reliability of the Oulu Patient Classification and to discuss existing methods of reliability testing. The Oulu Patient Classification, part of the RAFAELA® System, has been developed to assist nursing managers with the proper allocation of nursing resources. Due to the increased intensity of inpatient care during recent years, there is a need for the reliability testing of the classification, which has been in clinical use for 20 years. Retrospective statistical study. To test inter-rater reliability, a pair of nurses classified the same patients, without knowledge of each other's ratings, as a part of annually conducted standardization. Data on the parallel classifications (n = 19,997) was obtained from inpatient units (n = 32) with different specialties at a university hospital in Finland during 2010-2015. Parallel classification practices were also analysed. The reliability of the overall classification and its subareas were calculated using suitable statistical coefficients. Inter-rater reliability coefficients were a reliable or almost perfect means of considering the nursing intensity category and various practices, but there were detectable differences between subareas. The lowest agreement levels occurred in the subareas 'Planning and Coordination of Nursing Care' and 'Guiding of Care/Continued Care and Emotional Support'. There is a need to develop the descriptions of subareas and to clarify the related concepts. Precise nursing documentation can promote a high level of agreement and reliable results. The traditional overall proportion of agreement does not provide an adequate picture of reliability - weighted kappa coefficients should be used instead. © 2017 John Wiley & Sons Ltd.
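
The contrast drawn in the abstract above between the overall proportion of agreement and a weighted kappa can be sketched as follows for two raters and an ordinal scale. Linear weights are assumed here, since the abstract does not say which weighting scheme was used, and the cross-table of nursing intensity categories is invented for illustration.

```python
def proportion_and_weighted_kappa(table):
    """table[i][j] = patients placed in ordinal category i by nurse A and j by nurse B."""
    q = len(table)
    n = sum(sum(row) for row in table)
    row = [sum(table[i]) / n for i in range(q)]
    col = [sum(table[i][j] for i in range(q)) / n for j in range(q)]

    p_exact = sum(table[k][k] for k in range(q)) / n   # plain proportion of agreement
    p_o = p_e = 0.0
    for i in range(q):
        for j in range(q):
            w = 1 - abs(i - j) / (q - 1)               # linear weight: partial credit for near misses
            p_o += w * table[i][j] / n
            p_e += w * row[i] * col[j]
    return p_exact, (p_o - p_e) / (1 - p_e)


# Hypothetical cross-table for four ordered intensity categories.
counts = [[20, 5, 0, 0],
          [4, 30, 6, 0],
          [0, 5, 18, 3],
          [0, 0, 2, 7]]
p_exact, kappa_w = proportion_and_weighted_kappa(counts)
print(f"proportion agreement={p_exact:.2f} weighted kappa={kappa_w:.2f}")
```

The weighted coefficient corrects for chance and gives partial credit when the two ratings differ by a single category, which the raw proportion of agreement cannot do.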
High inter-rater reliability, agreement, and convergent validity of Constant score in patients with clavicle fractures

DEFF Research Database (Denmark)

Ban, Ilija; Troelsen, Anders; Kristensen, Morten Tange

2016-01-01

BACKGROUND: The Constant score (CS) has been the primary endpoint in most studies on clavicle fractures. However, the CS was not developed to assess patients with clavicle fractures. Our aim was to examine inter-rater reliability and agreement of the CS in patients with clavicle fractures...... standardized CS assessment at a mean of 6.8 weeks (SD, 1.0 weeks) after injury. Reliability and agreement of the CS were determined by 2 raters. The intraclass correlation coefficient (ICC2,1), standard error of measurement, minimal detectable change, Cronbach α coefficient, and Pearson correlation coefficient...... were estimated. RESULTS: Inter-rater reliability of the total CS was excellent (intraclass correlation coefficient, 0.94; 95% confidence interval, 0.88-0.97), with no systematic difference between the 2 raters (P = .75). The standard error of measurement (measurement error at the group level) was 4...
Inter-rater reliability of three musculoskeletal physical examination techniques used to assess motion in three planes while standing.

Science.gov (United States)

Prather, Heidi; Hunt, Devyani; Steger-May, Karen; Hayes, Marcie Harris; Knaus, Evan; Clohisy, John

2009-07-01

The objective of the study was to measure the reliability between examiners of 3 basic maneuvers of the Total Body Functional Profile physical examination test. The hypothesis was musculoskeletal health care providers of different disciplines could reliably use the 3 basic maneuvers as part of the musculoskeletal physical examination. A prospective observational study was conducted. Twenty-eight adult volunteers were measured on both the left and right side by 2 independent raters on a single occasion. The subjects were recruited through advertisements placed by the orthopedic department at a tertiary university. Twenty-eight volunteers were recruited and completed the study. The volunteers were between the ages of 18 and 51 years of age, had no symptoms in the lower extremity or spine, had no previous history of surgery or tumor involving the lower extremity, and no medical conditions that would preclude participation. On a single occasion, 2 examiners per 1 volunteer were blinded to their own and each others' measurements. Each examiner assessed the distance of frontal and sagittal plane lunge and angle of motion for transverse plane testing. Inter-rater agreement is expressed with intraclass correlation coefficients (ICCs) and corresponding 95% confidence intervals (CIs). The difference between raters is reported with 95% CIs. Baseline demographics, University of California Los Angeles (UCLA), and Harris hip questionnaires were completed by all participants. The UCLA and Harris hip scores showed no significant activity restrictions or pain limitations in all participants. The inter-rater reliability for sagittal, frontal, and transverse plane matrix testing was good with ICCs of 0.86 (95% CI 0.77-0.91), 0.90 (95% CI 0.84-0.94), and 0.85 (95% CI 0.75-0.91), respectively. The rater reliability between disciplines for transverse, sagittal, and frontal plane matrix testing was good with ICCs of 0.89 (95% CI 0.80-0.94), 0.88 (95% CI 0.79-0.94), and 0.90 (95% CI 0
Inter-expert and intra-expert reliability in sleep spindle scoring

DEFF Research Database (Denmark)

Wendt, Sabrina Lyngbye; Welinder, Peter; Sørensen, Helge Bjarup Dissing

2015-01-01

Objectives To measure the inter-expert and intra-expert agreement in sleep spindle scoring, and to quantify how many experts are needed to build a reliable dataset of sleep spindle scorings. Methods The EEG dataset was comprised of 400 randomly selected 115 s segments of stage 2 sleep from 110...... with higher reliability than the estimation of spindle duration. Reliability of sleep spindle scoring can be improved by using qualitative confidence scores, rather than a dichotomous yes/no scoring system. Conclusions We estimate that 2–3 experts are needed to build a spindle scoring dataset...... with ‘substantial’ reliability (κ: 0.61–0.8), and 4 or more experts are needed to build a dataset with ‘almost perfect’ reliability (κ: 0.81–1). Significance Spindle scoring is a critical part of sleep staging, and spindles are believed to play an important role in development, aging, and diseases of the nervous...
Stability of FDG-PET Radiomics features - An integrated analysis of test-retest and inter-observer variability

Energy Technology Data Exchange (ETDEWEB)

Leijenaar, Ralph T. H.; Carvalho, Sara; Rios Velazquez, Emmanuel [Dept. of Radiation Oncology (MAASTRO), GROW-School for Oncology and Developmental Biology, Maastricht Univ. Medical Center, Maastricht (Netherlands)] [and others

2013-10-15

Purpose: Besides basic measurements as maximum standardized uptake value (SUV_max) or SUV_mean derived from 18F-FDG positron emission tomography (PET) scans, more advanced quantitative imaging features (i.e. 'Radiomics' features) are increasingly investigated for treatment monitoring, outcome prediction, or as potential biomarkers. With these prospected applications of Radiomics features, it is a requisite that they provide robust and reliable measurements. The aim of our study was therefore to perform an integrated stability analysis of a large number of PET-derived features in non-small cell lung carcinoma (NSCLC), based on both a test-retest and an inter-observer setup. Methods: Eleven NSCLC patients were included in the test-retest cohort. Patients underwent repeated PET imaging within a one day interval, before any treatment was delivered. Lesions were delineated by applying a threshold of 50% of the maximum uptake value within the tumor. Twenty-three NSCLC patients were included in the inter-observer cohort. Patients underwent a diagnostic whole body PET-computed tomography (CT). Lesions were manually delineated based on fused PET-CT, using a standardized clinical delineation protocol. Delineation was performed independently by five observers, blinded to each other. Fifteen first order statistics, 39 descriptors of intensity volume histograms, eight geometric features and 44 textural features were extracted. For every feature, test-retest and inter-observer stability was assessed with the intra-class correlation coefficient (ICC) and the coefficient of variability, normalized to mean and range. Similarity between test-retest and inter-observer stability rankings of features was assessed with Spearman's rank correlation coefficient. Results: Results showed that the majority of assessed features had both a high test-retest (71%) and inter-observer (91%) stability in terms of their ICC. Overall, features more stable in repeated PET
Inter-observer variability in fetal biometric measurements.

Science.gov (United States)

Kilani, Rami; Aleyadeh, Wesam; Atieleh, Luay Abu; Al Suleimat, Abdul Mane; Khadra, Maysa; Hawamdeh, Hassan M

2018-02-01

To evaluate inter-observer variability and reproducibility of ultrasound measurements for fetal biometric parameters. A prospective cohort study was implemented in two tertiary care hospitals in Amman, Jordan; Prince Hamza Hospital and Albashir Hospital. 192 women with a singleton pregnancy at a gestational age of 18-36 weeks were the participants in the study. Transabdominal scans for fetal biometric parameter measurement were performed on study participants from the period of November 2014 to March 2015. Women who agreed to participate in the study were administered two ultrasound scans for head circumference, abdominal circumference and femur length. The correlation coefficient was calculated. Bland-Altman plots were used to analyze the degree of measurement agreement between observers. Limits of agreement ± 2 SD for the differences in fetal biometry measurements in proportions of the mean of the measurements were derived. Main outcome measures examine the reproducibility of fetal biometric measurements by different observers. High inter-observer inter-class correlation coefficient (ICC) was found for femur length (0.990) and abdominal circumference (0.996) where Bland-Altman plots showed high degrees of agreement. The highest degrees of agreement were noted in the measurement of abdominal circumference followed by head circumference. The lowest degree of agreement was found for femur length measurement. We used a paired-sample t-test and found that the mean difference between duplicate measurements was not significant (P > 0.05). Biometric fetal parameter measurements may be reproducible by different operators in the clinical setting with similar results. Fetal head circumference, abdominal circumference and femur length were highly reproducible. Large organized studies are needed to ensure accurate fetal measurements due to the important clinical implications of inaccurate measurements. Copyright © 2018. Published by Elsevier B.V.
Inter-Rater Reliability and Validity of the Australian Football League’s Kicking and Handball Tests

Science.gov (United States)

Cripps, Ashley J.; Hopper, Luke S.; Joyce, Christopher

2015-01-01

Talent identification tests used at the Australian Football League’s National Draft Combine assess the capacities of athletes to compete at a professional level. Tests created for the National Draft Combine are also commonly used for talent identification and athlete development in development pathways. The skills tests created by the Australian Football League required players to either handball (striking the ball with the hand) or kick to a series of 6 randomly generated targets. Assessors subjectively rate each skill execution giving a 0-5 score for each disposal. This study aimed to investigate the inter-rater reliability and validity of the skills tests at an adolescent sub-elite level. Male Australian footballers were recruited from sub-elite adolescent teams (n = 121, age = 15.7 ± 0.3 years, height = 1.77 ± 0.07 m, mass = 69.17 ± 8.08 kg). The coaches (n = 7) of each team were also recruited. Inter-rater reliability was assessed using Inter-class correlations (ICC) and Limits of Agreement statistics. Both the kicking (ICC = 0.96, p handball tests (ICC = 0.89, p handball test. Key points The skill tests created by the AFL demonstrated acceptable levels of relative and absolute inter-rater reliability. Both the AFL’s skills tests are able to differentiate between athletes' dominant and non-dominant limbs. However, only the kicking test could consistently differentiate between score outcomes over a range of Australian Football specific disposal distances. Both tests demonstrated poor concurrent validity, with no correlation found between coaches’ perceptions of technical skills and actual skill outcomes measured. PMID:26336356
A Turkish Version of the Critical-Care Pain Observation Tool: Reliability and Validity Assessment.

Science.gov (United States)

Aktaş, Yeşim Yaman; Karabulut, Neziha

2017-08-01

The study aim was to evaluate the validity and reliability of the Critical-Care Pain Observation Tool in critically ill patients. A repeated measures design was used for the study. A convenience sample of 66 patients who had undergone open-heart surgery in the cardiovascular surgery intensive care unit in Ordu, Turkey, was recruited for the study. The patients were evaluated by using the Critical-Care Pain Observation Tool at rest, during a nociceptive procedure (suctioning), and 20 minutes after the procedure while they were conscious and intubated after surgery. The Turkish version of the Critical-Care Pain Observation Tool has shown statistically acceptable levels of validity and reliability. Inter-rater reliability was supported by moderate-to-high-weighted κ coefficients (weighted κ coefficient = 0.55 to 1.00). For concurrent validity, significant associations were found between the scores on the Critical-Care Pain Observation Tool and the Behavioral Pain Scale scores. Discriminant validity was also supported by higher scores during suctioning (a nociceptive procedure) versus non-nociceptive procedures. The internal consistency of the Critical-Care Pain Observation Tool was 0.72 during a nociceptive procedure and 0.71 during a non-nociceptive procedure. The validity and reliability of the Turkish version of the Critical-Care Pain Observation Tool was determined to be acceptable for pain assessment in critical care, especially for patients who cannot communicate verbally. Copyright © 2016 American Society of PeriAnesthesia Nurses. Published by Elsevier Inc. All rights reserved.
Inter-day Reliability of the IDEEA Activity Monitor for Measuring Movement and Non-Movement Behaviors in Older Adults.

Science.gov (United States)

de la Cámara, Miguel Ángel; Higueras-Fresnillo, Sara; Martinez-Gomez, David; Veiga, Oscar L

2018-05-29

The inter-day reliability of the Intelligent Device for Energy Expenditure and Activity (IDEEA) has not been studied to date. The study purpose was to examine the inter-day variability and reliability on two consecutive days collected with the IDEEA, as well as to predict the number of days needed to provide a reliable estimate of several movement (walking and climbing stairs) and non-movement behaviors (lying, reclining, sitting) and standing in older adults. The sample included 126 older adults (74 women) who wore the IDEEA for 48 h. Results showed low variability between the two days, and reliability ranged from moderate (ICC=0.34) to high (ICC=0.80) for most of the movement and non-movement behaviors analyzed. The Bland-Altman plots showed high-to-moderate agreement between days, and the Spearman-Brown formula estimated that between 1.2 and 9.1 days of monitoring with the IDEEA are needed to achieve ICCs≥0.70 in older adults for sitting and climbing stairs, respectively.
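
The Spearman-Brown step mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the standard prophecy formula and a single-day reliability as input; the function names and the example value of 0.50 are hypothetical and are not taken from the study, whose exact inputs are not given in the abstract.

```python
def days_needed(single_day_icc: float, target_icc: float = 0.70) -> float:
    """Spearman-Brown prophecy: number of days k whose average reaches target_icc.

    Solves target = k*r / (1 + (k - 1)*r) for k, where r is the reliability
    of a single day of monitoring.
    """
    if not (0.0 < single_day_icc < 1.0 and 0.0 < target_icc < 1.0):
        raise ValueError("both reliabilities must lie strictly between 0 and 1")
    return target_icc * (1.0 - single_day_icc) / (single_day_icc * (1.0 - target_icc))


def reliability_of_mean(single_day_icc: float, k: float) -> float:
    """Forward form of the same formula: reliability of the mean of k days."""
    return k * single_day_icc / (1.0 + (k - 1.0) * single_day_icc)


if __name__ == "__main__":
    r = 0.50  # hypothetical single-day ICC, chosen only for illustration
    k = days_needed(r, target_icc=0.70)
    print(f"days needed: {k:.2f}")
    print(f"check, reliability of the {k:.2f}-day mean: {reliability_of_mean(r, k):.2f}")
```

The result is a fractional number of days; in practice it would normally be rounded up to whole monitoring days.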
Inter-rater reliability of data elements from a prototype of the Paul Coverdell National Acute Stroke Registry

Directory of Open Access Journals (Sweden)

Wehner Susan

2008-06-01

Abstract Background The Paul Coverdell National Acute Stroke Registry (PCNASR) is a U.S. based national registry designed to monitor and improve the quality of acute stroke care delivered by hospitals. The registry monitors care through specific performance measures, the accuracy of which depends in part on the reliability of the individual data elements used to construct them. This study describes the inter-rater reliability of data elements collected in Michigan's state-based prototype of the PCNASR. Methods Over a 6-month period, 15 hospitals participating in the Michigan PCNASR prototype submitted data on 2566 acute stroke admissions. Trained hospital staff prospectively identified acute stroke admissions, abstracted chart information, and submitted data to the registry. At each hospital 8 randomly selected cases were re-abstracted by an experienced research nurse. Inter-rater reliability was estimated by the kappa statistic for nominal variables, and intraclass correlation coefficient (ICC) for ordinal and continuous variables. Factors that can negatively impact the kappa statistic (i.e., trait prevalence and rater bias) were also evaluated. Results A total of 104 charts were available for re-abstraction. Excellent reliability (kappa or ICC > 0.75) was observed for many registry variables including age, gender, black race, hemorrhagic stroke, discharge medications, and modified Rankin Score. Agreement was at least moderate (i.e., 0.75 > kappa ≥ 0.40) for ischemic stroke, TIA, white race, non-ambulance arrival, hospital transfer and direct admit. However, several variables had poor reliability (kappa Conclusion The excellent reliability of many of the data elements supports the use of the PCNASR to monitor and improve care. However, the poor reliability for several variables, particularly time-related events in the emergency department, indicates the need for concerted efforts to improve the quality of data collection. Specific recommendations
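
The kappa statistic used for nominal variables, and the way trait prevalence can depress it, can be illustrated with a short sketch. It assumes the usual unweighted Cohen's kappa for two raters; the function name and the toy ratings are hypothetical and are not registry data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters over the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Balanced trait: 75% raw agreement.
    balanced_a = [1, 1, 1, 0, 0, 0, 1, 0]
    balanced_b = [1, 1, 0, 0, 0, 0, 1, 1]
    print(round(cohens_kappa(balanced_a, balanced_b), 3))

    # Highly prevalent trait: 80% raw agreement, yet kappa falls below zero
    # because almost all of that agreement is expected by chance.
    skewed_a = [1] * 9 + [0]
    skewed_b = [1] * 8 + [0, 1]
    print(round(cohens_kappa(skewed_a, skewed_b), 3))
```

The second case shows why raw percent agreement and kappa can diverge sharply when one category dominates.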
Reliability of routine clinical measurements of neonatal circumferences and research measurements of neonatal skinfold thicknesses: findings from the Born in Bradford study

Science.gov (United States)

West, Jane; Manchester, Ben; Wright, John; Lawlor, Debbie A; Waiblinger, Dagmar

2011-01-01

Summary West J, Manchester B, Wright J, Lawlor DA, Waiblinger D. Reliability of routine clinical measurements of neonatal circumferences and research measurements of neonatal skinfold thicknesses: findings from the Born in Bradford study. Paediatric and Perinatal Epidemiology 2011. Assessing neonatal size reliably is important for research and clinical practice. The aim of this study was to examine the reliability of routine clinical measurements of neonatal circumferences and of skinfold thicknesses assessed for research purposes. All measurements were undertaken on the same population of neonates born in a large maternity unit in Bradford, UK. Technical error of measurement (TEM), relative TEM and the coefficient of reliability are reported. Intra-observer TEMs for routine circumference measurements were all below 0.4 cm and were generally within ±2-times the mean. Inter-observer TEM ranged from 0.20 to 0.36 cm for head circumference, 0.19 to 0.39 cm for mid upper arm circumference and from 0.39 to 0.77 cm for abdominal circumference. Intra and inter-observer TEM for triceps skinfold thickness ranged from 0.22 to 0.35 mm and 0.15 to 0.54 mm, respectively. Subscapular skinfold thickness TEM values were 0.14 to 0.25 mm for intra-observer measurements and 0.17 to 0.63 mm for inter-observer measurements. Relative TEM values for routine circumferences were all below 4.00% but varied between 2.88% and 14.23% for research skinfold measurements. Reliability was mostly between 80% and 99% for routine circumference measurements and ≥70% for most research skinfold measurements. Routine clinical measurements of neonatal circumferences are reliably assessed in Bradford. Assessing skinfolds in neonates has variable reliability, but on the whole is good. The greater intra-observer, compared with inter-observer, reliability for both sets of measurements highlights the importance of having a minimal number of assessors whenever possible. PMID:21281329
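
The three statistics reported here can be sketched for the simplest case of two measurements per neonate. This assumes the commonly used anthropometric definitions (TEM as the square root of the summed squared differences over 2n, relative TEM as a percentage of the grand mean, and reliability as 1 minus TEM squared over the total variance); the abstract does not state the exact formulas the authors applied, and the function name and measurements below are hypothetical.

```python
import math
from statistics import mean, variance


def tem_summary(first, second):
    """Return (TEM, relative TEM in %, coefficient of reliability).

    `first` and `second` are paired measurements of the same subjects,
    e.g. by two observers or by one observer on two occasions.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(first)
    # Technical error of measurement for two measurements per subject.
    tem = math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)) / (2 * n))
    pooled = list(first) + list(second)
    relative_tem = 100.0 * tem / mean(pooled)
    # Share of total variance not attributable to measurement error.
    reliability = 1.0 - tem ** 2 / variance(pooled)
    return tem, relative_tem, reliability


if __name__ == "__main__":
    # Hypothetical head circumferences in cm, two observers.
    observer_1 = [34.0, 35.2, 33.1, 36.0]
    observer_2 = [34.2, 35.0, 33.3, 35.8]
    tem, rel, r = tem_summary(observer_1, observer_2)
    print(f"TEM = {tem:.2f} cm, relative TEM = {rel:.2f}%, R = {r:.3f}")
```

Because relative TEM divides by the mean, small absolute errors on small quantities such as skinfolds give larger percentages than similar errors on circumferences.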
Simulated patient training: Using inter-rater reliability to evaluate simulated patient consistency in nursing education.

Science.gov (United States)

MacLean, Sharon; Geddes, Fiona; Kelly, Michelle; Della, Phillip

2018-03-01

Simulated patients (SPs) are frequently used for training nursing students in communication skills. An acknowledged benefit of using SPs is the opportunity to provide a standardized approach by which participants can demonstrate and develop communication skills. However, relatively little evidence is available on how to best facilitate and evaluate the reliability and accuracy of SPs' performances. The aim of this study is to investigate the effectiveness of an evidence-based SP training framework to ensure standardization of SPs. The training framework was employed to improve inter-rater reliability of SPs. A quasi-experimental study was employed to assess SP post-training understanding of simulation scenario parameters using inter-rater reliability agreement indices. Two phases of data collection took place. Initially, in a trial phase, audio-visual (AV) recordings of two undergraduate nursing students completing a simulation scenario were rated by eight SPs using the Interpersonal Communication Assessments Scale (ICAS) and Quality of Discharge Teaching Scale (QDTS). In phase 2, eight SP raters and four nursing faculty raters independently evaluated students' (N=42) communication practices using the QDTS. Intraclass correlation coefficients (ICC) were >0.80 for both stages of the study in clinical communication skills. The results support the premise that if trained appropriately, SPs have a high degree of reliability and validity to both facilitate and evaluate student performance in nurse education. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.
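
An intraclass correlation for several raters scoring the same subjects can be sketched from a two-way ANOVA decomposition. The abstract does not say which ICC form was used, so the choice below of a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1), is an assumption; the function name and the ratings table are illustrative only.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a complete table: one row per subject, one column per rater."""
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for subjects (rows), raters (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Six subjects scored by four raters; a small illustrative table in which
    # raters rank subjects similarly but differ systematically in level.
    table = [
        [9, 2, 5, 8],
        [6, 1, 3, 2],
        [8, 4, 6, 8],
        [7, 1, 2, 6],
        [10, 5, 6, 9],
        [6, 2, 4, 7],
    ]
    print(round(icc_2_1(table), 2))
```

Because this form penalizes systematic differences between raters, it gives a lower value for such a table than a consistency-type ICC would.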
Reliability of visual and instrumental color matching.

Science.gov (United States)

Igiel, Christopher; Lehmann, Karl Martin; Ghinea, Razvan; Weyhrauch, Michael; Hangx, Ysbrand; Scheller, Herbert; Paravina, Rade D

2017-09-01

The aim of this investigation was to evaluate intra-rater and inter-rater reliability of visual and instrumental shade matching. Forty individuals with normal color perception participated in this study. The right maxillary central incisor of a teaching model was prepared and restored with 10 feldspathic all-ceramic crowns of different shades. A shade matching session consisted of the observer (rater) visually selecting the best match by using VITA classical A1-D4 (VC) and VITA Toothguide 3D Master (3D) shade guides and the VITA Easyshade Advance intraoral spectrophotometer (ES) to obtain both VC and 3D matches. Three shade matching sessions were held with 4 to 6 weeks between sessions. Intra-rater reliability was assessed based on the percentage of agreement for the three sessions for the same observer, whereas the inter-rater reliability was calculated as mean percentage of agreement between different observers. The Fleiss' Kappa statistical analysis was used to evaluate visual inter-rater reliability. The mean intra-rater reliability for the visual shade selection was 64(11) for VC and 48(10) for 3D. The corresponding ES values were 96(4) for both VC and 3D. The percentages of observers who matched the same shade with VC and 3D were 55(10) and 43(12), respectively, while corresponding ES values were 88(8) for VC and 92(4) for 3D. The results for visual shade matching exhibited a high to moderate level of inconsistency for both intra-rater and inter-rater comparisons. The VITA Easyshade Advance intraoral spectrophotometer exhibited significantly better reliability compared with visual shade selection. This study evaluates the ability of observers to consistently match the same shade visually and with a dental spectrophotometer in different sessions. The intra-rater and inter-rater reliability (agreement of repeated shade matching) of visual and instrumental tooth color matching strongly suggest the use of color matching instruments as a supplementary tool in
The intra- and inter-rater reliability of five clinical muscle performance tests in patients with and without neck pain

Science.gov (United States)

2013-01-01

Background This study investigates the reliability of muscle performance tests using cost- and time-effective methods similar to those used in clinical practice. When conducting reliability studies, great effort goes into standardising test procedures to facilitate a stable outcome. Therefore, several test trials are often performed. However, when muscle performance tests are applied in the clinical setting, clinicians often only conduct a muscle performance test once as repeated testing may produce fatigue and pain, thus variation in test results. We aimed to investigate whether cervical muscle performance tests, which have shown promising psychometric properties, would remain reliable when examined under conditions similar to those of daily clinical practice. Methods The intra-rater (between-day) and inter-rater (within-day) reliability was assessed for five cervical muscle performance tests in patients with (n = 33) and without neck pain (n = 30). The five tests were joint position error, the cranio-cervical flexion test, the neck flexor muscle endurance test performed in supine and in a 45°-upright position and a new neck extensor test. Results Intra-rater reliability ranged from moderate to almost perfect agreement for joint position error (ICC ≥ 0.48-0.82), the cranio-cervical flexion test (ICC ≥ 0.69), the neck flexor muscle endurance test performed in supine (ICC ≥ 0.68) and in a 45°-upright position (ICC ≥ 0.41) with the exception of a new test (neck extensor test), which ranged from slight to moderate agreement (ICC = 0.14-0.41). Likewise, inter-rater reliability ranged from moderate to almost perfect agreement for joint position error (ICC ≥ 0.51-0.75), the cranio-cervical flexion test (ICC ≥ 0.85), the neck flexor muscle endurance test performed in supine (ICC ≥ 0.70) and in a 45°-upright position (ICC ≥ 0.56). However, only slight to fair agreement was found for the neck extensor test (ICC
Observer reliability of CT angiography in the assessment of acute ischaemic stroke: data from the Third International Stroke Trial

International Nuclear Information System (INIS)

Mair, Grant; Farrall, Andrew J.; Sellar, Robin J.; Mollison, Daisy; Sakka, Eleni; Palmer, Jeb; Wardlaw, Joanna M.; Kummer, Ruediger von; Adami, Alessandro; White, Philip M.; Adams, Matthew E.; Yan, Bernard; Demchuk, Andrew M.; Ramaswamy, Rajesh; Rodrigues, Mark A.; Samji, Karim; Baird, Andrew J.; Boyd, Elena V.; Cohen, Geoff; Perry, David; Sandercock, Peter A.G.; Lindley, Richard

2015-01-01

CT angiography (CTA) is often used for assessing patients with acute ischaemic stroke. Only limited observer reliability data exist. We tested inter- and intra-observer reliability for the assessment of CTA in acute ischaemic stroke. We selected 15 cases from the Third International Stroke Trial (IST-3, ISRCTN25765518) with various degrees of arterial obstruction in different intracranial locations on CTA. To assess inter-observer reliability, seven members of the IST-3 expert image reading panel (>5 years experience reading CTA) and seven radiology trainees (<2 years experience) rated all 15 scans independently and blind to clinical data for: presence (versus absence) of any intracranial arterial abnormality (stenosis or occlusion), severity of arterial abnormality using relevant scales (IST-3 angiography score, Thrombolysis in Cerebral Infarction (TICI) score, Clot Burden Score), collateral supply and visibility of a perfusion defect on CTA source images (CTA-SI). Intra-observer reliability was assessed using independently repeated expert panel scan ratings. We assessed observer agreement with Krippendorff's-alpha (K-alpha). Among experienced observers, inter-observer agreement was substantial for the identification of any angiographic abnormality (K-alpha = 0.70) and with an angiography assessment scale (K-alpha = 0.60-0.66). There was less agreement for grades of collateral supply (K-alpha = 0.56) or for identification of a perfusion defect on CTA-SI (K-alpha = 0.32). Radiology trainees performed as well as expert readers when additional training was undertaken (neuroradiology specialist trainees). Intra-observer agreement among experts provided similar results (K-alpha = 0.33-0.72). For most imaging characteristics assessed, CTA has moderate to substantial observer agreement in acute ischaemic stroke. Experienced readers and those with specialist training perform best. (orig.)
Inter-rater reliability of the Sødring Motor Evaluation of Stroke patients (SMES).

Science.gov (United States)

Halsaa, K E; Sødring, K M; Bjelland, E; Finsrud, K; Bautz-Holter, E

1999-12-01

The Sødring Motor Evaluation of Stroke patients is an instrument for physiotherapists to evaluate motor function and activities in stroke patients. The rating reflects quality as well as quantity of the patient's unassisted performance within three domains: leg, arm and gross function. The inter-rater reliability of the method was studied in a sample of 30 patients admitted to a stroke rehabilitation unit. Three therapists were involved in the study; two therapists assessed the same patient on two consecutive days in a balanced design. Cohen's weighted kappa and McNemar's test of symmetry were used as measures of item reliability, and the intraclass correlation coefficient was used to express the reliability of the sumscores. For 24 out of 32 items the weighted kappa statistic was excellent (0.75-0.98), while 7 items had a kappa statistic within the range 0.53-0.74 (fair to good). The reliability of one item was poor (0.13). The intraclass correlation coefficient for the three sumscores was 0.97, 0.91 and 0.97. We conclude that the Sødring Motor Evaluation of Stroke patients is a reliable measure of motor function in stroke patients undergoing rehabilitation.
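
Cohen's weighted kappa, used here for item-level reliability on ordinal scores, can be sketched as below. The abstract does not state which weighting scheme was applied, so the linear and quadratic disagreement weights are assumptions, and the function name and toy scores are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories, quadratic=False):
    """Cohen's weighted kappa for ordinal scores coded 0 .. n_categories - 1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    power = 2 if quadratic else 1
    # Marginal score counts for each rater.
    marg_a = [0] * n_categories
    marg_b = [0] * n_categories
    for a, b in zip(rater_a, rater_b):
        marg_a[a] += 1
        marg_b[b] += 1
    # Mean weighted disagreement actually observed.
    observed = sum(abs(a - b) ** power for a, b in zip(rater_a, rater_b)) / n
    # Mean weighted disagreement expected if the raters scored independently.
    expected = sum(
        marg_a[i] * marg_b[j] * abs(i - j) ** power
        for i in range(n_categories)
        for j in range(n_categories)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when no disagreement is expected")
    return 1.0 - observed / expected


if __name__ == "__main__":
    therapist_1 = [0, 1, 2, 2, 1, 0]
    therapist_2 = [0, 1, 2, 1, 1, 1]
    print(round(weighted_kappa(therapist_1, therapist_2, 3), 3))
    print(round(weighted_kappa(therapist_1, therapist_2, 3, quadratic=True), 3))
```

Unlike the unweighted statistic, a one-step disagreement on the scale costs less than a two-step disagreement, which suits ordinal items.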
Trunk Muscle Size and Composition Assessment in Older Adults with Chronic Low Back Pain: An Intra-Examiner and Inter-Examiner Reliability Study.

Science.gov (United States)

Sions, Jaclyn Megan; Smith, Andrew Craig; Hicks, Gregory Evan; Elliott, James Matthew

2016-08-01

To evaluate intra- and inter-examiner reliability for the assessment of relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area, i.e., total cross-sectional area minus intramuscular fat, from T1-weighted magnetic resonance images obtained in older adults with chronic low back pain. Reliability study. n = 13 (69.3 ± 8.2 years old) After lumbar magnetic resonance imaging, two examiners produced relative cross-sectional area measurements of multifidi, erector spinae, psoas, and quadratus lumborum by tracing regions of interest just inside fascial borders. Pixel-intensity summaries were used to determine muscle-to-fat infiltration indices; relative muscle cross-sectional area was calculated. Intraclass correlation coefficients were used to estimate intra- and inter-examiner reliability; standard error of measurement was calculated. Intra-examiner intraclass correlation coefficient point estimates for relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area were excellent for multifidi and erector spinae across levels L2-L5 (ICC = 0.77-0.99). At L3, intra-examiner reliability was excellent for relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area for both psoas and quadratus lumborum (ICC = 0.81-0.99). Inter-examiner intraclass correlation coefficients ranged from poor to excellent for relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area. Assessment of relative cross-sectional area, muscle-to-fat infiltration indices, and relative muscle cross-sectional area in older adults with chronic low back pain can be reliably determined by one examiner from T1-weighted images. Such assessments provide valuable information, as muscle-to-fat infiltration indices and relative muscle cross-sectional area indicate that a substantial amount of
Inter-Rater and Test-Retest (Between-Sessions) Reliability of the 4-Skills Scan for Dutch Elementary School Children

Science.gov (United States)

van Kernebeek, Willem G.; de Schipper, Antoine W.; Savelsbergh, Geert J. P.; Toussaint, Huub M.

2018-01-01

In The Netherlands, the 4-Skills Scan is an instrument for physical education teachers to assess gross motor skills of elementary school children. Little is known about its reliability. Therefore, in this study the test-retest and inter-rater reliability was determined. Respectively, 624 and 557 Dutch 6- to 12-year-old children were analyzed for…
Three-dimensional Mass Measurement of Subsolid Pulmonary Nodules on Chest CT: Intra and Inter-observer Variability

Directory of Open Access Journals (Sweden)

Huiting LIU

2015-05-01

Background and objective Subsolid pulmonary nodules tend to exhibit considerably slower growth rates than solid lesions but nevertheless a higher probability of malignancy. The diagnosis of indeterminate nodules largely depends on growth evaluation during follow-up. Growth can manifest as an increase in size or as the appearance and/or subsequent increase of solid components. Mass reflects the product of volume and density and can be more sensitive in growth evaluation. However, its repeatability needs further validation. The purpose of this study is to assess the intra- and inter-observer variability of mass measurement for subsolid nodules. Methods 80 subsolid nodules in 44 patients were retrospectively enrolled. Both the volume and mass were measured independently by two blinded radiologists. Intra-observer and inter-observer variability were analyzed and compared using the Bland-Altman method, the intra-class correlation test and the Wilcoxon test. Results The software achieved satisfactory segmentation for 92.5% of nodules. Of them, 35% underwent manual modification. The 95% limits of agreement for intra-observer variability were -11.5% to 10.4% for mass and -8.4% to 8.8% for volume. The 95% limits of agreement for inter-observer variability were -17.4% to 19.3% for mass and -17.9% to 19.4% for volume. The intra-class correlation coefficients between volume and mass measurement were 0.95 and 0.93 (both P<0.001), and no significant differences (P=0.78, 0.09) were found for intra- and inter-observer variability. Manual modification of the segmentation worsened the repeatability of mass measurement despite reader satisfaction. Conclusion The repeatability of mass measurement does not differ significantly from that of volume measurement, and mass measurement may serve as a reliable method in the follow-up of subsolid nodules.
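
Limits of agreement expressed as percentages, as reported in this abstract, can be sketched with the Bland-Altman approach. This assumes each difference is divided by the mean of the paired measurements and that the limits are the mean relative difference plus or minus 1.96 standard deviations; the abstract does not spell out the computation, and the function name and values below are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement_percent(first, second):
    """Return (lower limit, bias, upper limit) of relative differences in percent."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    # Difference of each pair relative to the pair's own mean.
    relative = [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(first, second)]
    bias = mean(relative)
    spread = 1.96 * stdev(relative)
    return bias - spread, bias, bias + spread


if __name__ == "__main__":
    # Hypothetical nodule masses in mg from two readers.
    reader_1 = [105.0, 95.0, 210.0, 190.0]
    reader_2 = [100.0, 100.0, 200.0, 200.0]
    low, bias, high = limits_of_agreement_percent(reader_1, reader_2)
    print(f"bias {bias:.2f}%, 95% limits of agreement {low:.1f}% to {high:.1f}%")
```

Expressing differences relative to the pair mean keeps the limits comparable across nodules of very different size.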
Inter-Rater Reliability of Provider Interpretations of Irritable Bowel Syndrome Food and Symptom Journals.

Science.gov (United States)

Zia, Jasmine; Chung, Chia-Fang; Xu, Kaiyuan; Dong, Yi; Schenk, Jeanette M; Cain, Kevin; Munson, Sean; Heitkemper, Margaret M

2017-11-04

There are currently no standardized methods for identifying trigger food(s) from irritable bowel syndrome (IBS) food and symptom journals. The primary aim of this study was to assess the inter-rater reliability of providers' interpretations of IBS journals. A second aim was to describe whether these interpretations varied for each patient. Eight providers reviewed 17 IBS journals and rated how likely key food groups (fermentable oligo-di-monosaccharides and polyols, high-calorie, gluten, caffeine, high-fiber) were to trigger IBS symptoms for each patient. Agreement of trigger food ratings was calculated using Krippendorff's α-reliability estimate. Providers were also asked to write down recommendations they would give to each patient. Estimates of agreement of trigger food likelihood ratings were poor (average α = 0.07). Most providers gave similar trigger food likelihood ratings for over half the food groups. Four providers gave the exact same written recommendation(s) (range 3-7) to over half the patients. Inter-rater reliability of provider interpretations of IBS food and symptom journals was poor. Providers favored certain trigger food likelihood ratings and written recommendations. This supports the need for a more standardized method for interpreting these journals and/or more rigorous techniques to accurately identify personalized IBS food triggers.

Inter-Rater Reliability of Provider Interpretations of Irritable Bowel Syndrome Food and Symptom Journals

Directory of Open Access Journals (Sweden)

Jasmine Zia

2017-11-01

There are currently no standardized methods for identifying trigger food(s) from irritable bowel syndrome (IBS) food and symptom journals. The primary aim of this study was to assess the inter-rater reliability of providers’ interpretations of IBS journals. A second aim was to describe whether these interpretations varied for each patient. Eight providers reviewed 17 IBS journals and rated how likely key food groups (fermentable oligo-di-monosaccharides and polyols, high-calorie, gluten, caffeine, high-fiber) were to trigger IBS symptoms for each patient. Agreement of trigger food ratings was calculated using Krippendorff’s α-reliability estimate. Providers were also asked to write down recommendations they would give to each patient. Estimates of agreement of trigger food likelihood ratings were poor (average α = 0.07). Most providers gave similar trigger food likelihood ratings for over half the food groups. Four providers gave the exact same written recommendation(s) (range 3–7) to over half the patients. Inter-rater reliability of provider interpretations of IBS food and symptom journals was poor. Providers favored certain trigger food likelihood ratings and written recommendations. This supports the need for a more standardized method for interpreting these journals and/or more rigorous techniques to accurately identify personalized IBS food triggers.
The Surgical Safety Checklist and Teamwork Coaching Tools: a study of inter-rater reliability.

Science.gov (United States)

Huang, Lyen C; Conley, Dante; Lipsitz, Stu; Wright, Christopher C; Diller, Thomas W; Edmondson, Lizabeth; Berry, William R; Singer, Sara J

2014-08-01

The objective was to assess the inter-rater reliability (IRR) of two novel observation tools for measuring surgical safety checklist performance and teamwork. Surgical safety checklists can promote adherence to standards of care and improve teamwork in the operating room. Their use has been associated with reductions in mortality and other postoperative complications. However, checklist effectiveness depends on how well they are performed. Authors from the Safe Surgery 2015 initiative developed a pair of novel observation tools through literature review, expert consultation and end-user testing. In one South Carolina hospital participating in the initiative, two observers jointly attended 50 surgical cases and independently rated surgical teams using both tools. We used descriptive statistics to measure checklist performance and teamwork at the hospital. We assessed IRR by measuring percent agreement, Cohen's κ, and weighted κ scores. The overall percent agreement and κ between the two observers were 93% and 0.74 (95% CI 0.66 to 0.79), respectively, for the Checklist Coaching Tool and 86% and 0.84 (95% CI 0.77 to 0.90) for the Surgical Teamwork Tool. Percent agreement for individual sections of both tools was 79% or higher. Additionally, κ scores for six of eight sections on the Checklist Coaching Tool and for two of five domains on the Surgical Teamwork Tool achieved the desired 0.7 threshold. However, teamwork scores were high and variation was limited. There were no significant changes in the percent agreement or κ scores between the first 10 and last 10 cases observed. Both tools demonstrated substantial IRR and required limited training to use. These instruments may be used to observe checklist performance and teamwork in the operating room. However, further refinement and calibration of observer expectations, particularly in rating teamwork, could improve the utility of the tools. Published by the BMJ Publishing Group Limited. For permission to use (where not already
Supersonic shear imaging provides a reliable measurement of resting muscle shear elastic modulus

International Nuclear Information System (INIS)

Lacourpaille, Lilian; Hug, François; Bouillard, Killian; Nordez, Antoine; Hogrel, Jean-Yves

2012-01-01

The aim of the present study was to assess the reliability of shear elastic modulus measurements performed using supersonic shear imaging (SSI) in nine resting muscles (i.e. gastrocnemius medialis, tibialis anterior, vastus lateralis, rectus femoris, triceps brachii, biceps brachii, brachioradialis, adductor pollicis obliquus and abductor digiti minimi) of different architectures and typologies. Thirty healthy subjects were randomly assigned to the intra-session reliability (n = 20), inter-day reliability (n = 21) and the inter-observer reliability (n = 16) experiments. Muscle shear elastic modulus ranged from 2.99 (gastrocnemius medialis) to 4.50 kPa (adductor digiti minimi and tibialis anterior). On the whole, very good reliability was observed, with a coefficient of variation (CV) ranging from 4.6% to 8%, except for the inter-operator reliability of adductor pollicis obliquus (CV = 11.5%). The intraclass correlation coefficients were good (0.871 ± 0.045 for the intra-session reliability, 0.815 ± 0.065 for the inter-day reliability and 0.709 ± 0.141 for the inter-observer reliability). Both the reliability and the ease of use of SSI make it a potentially interesting technique that would be of benefit to fundamental, applied and clinical research projects that need an accurate assessment of muscle mechanical properties. (note)
A reliability study of the new sensors for movement analysis (SHARIF-HMIS).

Science.gov (United States)

Abedi, Mohen; Manshadi, Farideh Dehghan; Zavieh, Minoo Khalkhali; Ashouri, Sajad; Azimi, Hadi; Parnanpour, Mohamad

2016-04-01

SHARIF-HMIS is a new inertial sensor designed for movement analysis. The aim of the present study was to assess the inter-tester and intra-tester reliability of some kinematic parameters in different lumbar motions making use of this sensor. 24 healthy persons and 28 patients with low back pain participated in the current reliability study. The test was performed in five different lumbar motions consisting of lumbar flexion in 0, 15, and 30° in the right and left directions. For measuring inter-tester reliability, all the tests were carried out twice on the same day separately by two physiotherapists. Intra-tester reliability was assessed by reproducing the tests after 3 days by the same physiotherapist. The present study revealed satisfactory inter- and intra-tester reliability indices in different positions. ICCs for intra-tester reliability ranged from 0.65 to 0.98 and 0.59 to 0.81 for healthy and patient participants, respectively. Also, ICCs for inter-tester reliability ranged from 0.65 to 0.92 for the healthy and 0.65 to 0.87 for patient participants. In general, it can be inferred from the results that measuring the kinematic parameters in lumbar movements using inertial sensors enjoys acceptable reliability. Copyright © 2015 Elsevier Ltd. All rights reserved.
Observer reliability of CT angiography in the assessment of acute ischaemic stroke: data from the Third International Stroke Trial

Energy Technology Data Exchange (ETDEWEB)

Mair, Grant; Farrall, Andrew J.; Sellar, Robin J.; Mollison, Daisy; Sakka, Eleni; Palmer, Jeb; Wardlaw, Joanna M. [University of Edinburgh, Western General Hospital, Division of Neuroimaging Sciences, Edinburgh (United Kingdom); Kummer, Ruediger von [Dresden University Stroke Centre, University Hospital, Department of Neuroradiology, Dresden (Germany); Adami, Alessandro [Sacro Cuore-Don Calabria Hospital, Stroke Center, Department of Neurology, Negrar (Italy); White, Philip M. [Stroke Research Group, Newcastle upon Tyne (United Kingdom); Adams, Matthew E. [National Hospital for Neurology and Neurosurgery, Department of Neuroradiology, London (United Kingdom); Yan, Bernard [Royal Melbourne Hospital, Neurovascular Research Group, Parkville (Australia); Demchuk, Andrew M. [Calgary Stroke Program, Department of Clinical Neurosciences, Calgary (Canada); Ramaswamy, Rajesh; Rodrigues, Mark A.; Samji, Karim; Baird, Andrew J. [Royal Infirmary of Edinburgh, Department of Radiology, Edinburgh (United Kingdom); Boyd, Elena V. [Northwick Park Hospital, Department of Radiology, Harrow (United Kingdom); Cohen, Geoff; Perry, David; Sandercock, Peter A.G. [University of Edinburgh, Western General Hospital, Division of Clinical Neurosciences, Edinburgh (United Kingdom); Lindley, Richard [University of Sydney, Westmead Hospital Clinical School and The George Institute for Global Health, Sydney (Australia); Collaboration: The IST-3 Collaborative Group

2014-10-07

CT angiography (CTA) is often used for assessing patients with acute ischaemic stroke. Only limited observer reliability data exist. We tested inter- and intra-observer reliability for the assessment of CTA in acute ischaemic stroke. We selected 15 cases from the Third International Stroke Trial (IST-3, ISRCTN25765518) with various degrees of arterial obstruction in different intracranial locations on CTA. To assess inter-observer reliability, seven members of the IST-3 expert image reading panel (>5 years experience reading CTA) and seven radiology trainees (<2 years experience) rated all 15 scans independently and blind to clinical data for: presence (versus absence) of any intracranial arterial abnormality (stenosis or occlusion), severity of arterial abnormality using relevant scales (IST-3 angiography score, Thrombolysis in Cerebral Infarction (TICI) score, Clot Burden Score), collateral supply and visibility of a perfusion defect on CTA source images (CTA-SI). Intra-observer reliability was assessed using independently repeated expert panel scan ratings. We assessed observer agreement with Krippendorff's-alpha (K-alpha). Among experienced observers, inter-observer agreement was substantial for the identification of any angiographic abnormality (K-alpha = 0.70) and with an angiography assessment scale (K-alpha = 0.60-0.66). There was less agreement for grades of collateral supply (K-alpha = 0.56) or for identification of a perfusion defect on CTA-SI (K-alpha = 0.32). Radiology trainees performed as well as expert readers when additional training was undertaken (neuroradiology specialist trainees). Intra-observer agreement among experts provided similar results (K-alpha = 0.33-0.72). For most imaging characteristics assessed, CTA has moderate to substantial observer agreement in acute ischaemic stroke. Experienced readers and those with specialist training perform best. (orig.)
Improving inter-observer variability in the evaluation of ultrasonographic features of polycystic ovaries

Directory of Open Access Journals (Sweden)

Leswick David A

2008-07-01

Background: We recently reported poor inter-observer agreement in identifying and quantifying individual ultrasonographic features of polycystic ovaries. Our objective was to determine the effect of a training workshop on reducing inter-observer variation in the ultrasonographic evaluation of polycystic ovaries. Methods: Transvaginal ultrasound recordings from thirty women with polycystic ovary syndrome (PCOS) were evaluated by three radiologists and three reproductive endocrinologists both before and after an ultrasound workshop. The following endpoints were assessed: (1) follicle number per ovary (FNPO), (2) follicle number per single cross-section (FNPS), (3) largest follicle diameter, (4) ovarian volume, (5) follicle distribution pattern and (6) presence of a corpus luteum (CL). Lin's concordance correlation coefficients (rho) and kappa statistics for multiple raters (kappa) were used to assess level of inter-observer agreement (>0.80 good, 0.60 – 0.80 moderate/fair, ...). Results: Following the workshop, inter-observer agreement improved for the evaluation of FNPS (rho = 0.70, delta rho = +0.11), largest follicle diameter (rho = 0.77, delta rho = +0.10), ovarian volume (rho = 0.84, delta rho = +0.12), follicle distribution pattern (kappa = 0.80, delta kappa = +0.21) and presence of a CL (kappa = 0.87, delta kappa = +0.05). No improvement was evident for FNPO (rho = 0.54, delta rho = -0.01). Both radiologists and reproductive endocrinologists demonstrated improvement in scores (p ...). Conclusion: Reliability in evaluating ultrasonographic features of polycystic ovaries can be significantly improved following participation in a training workshop. If ultrasonographic evidence of polycystic ovaries is to be used as an objective measure in the diagnosis of PCOS, then standardized training modules should be implemented to unify the approach to evaluating polycystic ovarian morphology.
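As a rough illustration of the continuous-measure statistic named in this abstract, the following Python sketch computes Lin's concordance correlation coefficient for one pair of raters. It assumes the standard two-rater definition with population (1/n) variances; the study itself pooled six raters, and its exact multi-rater procedure is not given here. The function name `lin_ccc` and the follicle counts are hypothetical.

```python
from statistics import fmean


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for two raters' paired measurements.

    Uses population (1/n) variances and covariance. Not defined when both
    raters give constant, identical values (the denominator is then zero).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The (mean_x - mean_y)^2 term penalises systematic offset between raters.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical follicle counts from two observers (illustrative values only).
rater_a = [12, 18, 25, 9, 30]
rater_b = [14, 20, 27, 11, 32]  # same ordering, constant +2 offset

print(round(lin_ccc(rater_a, rater_b), 3))
```

Unlike a Pearson correlation, which would be exactly 1 for these two series, the concordance coefficient falls below 1 because one rater counts consistently higher than the other.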
Inter-rater reliability of direct observations of the physical and psychosocial working conditions in eldercare: An evaluation in the DOSES project

NARCIS (Netherlands)

Karstad, K. (Kristina); Rugulies, R. (Reiner); Skotte, J. (Jørgen); Munch, P.K. (Pernille Kold); Greiner, B.A. (Birgit A.); Burdorf, A. (Alex); Søgaard, K. (Karen); A. Holtermann (Andreas)

2018-01-01

The aim of the study was to develop and evaluate the reliability of the “Danish observational study of eldercare work and musculoskeletal disorders” (DOSES) observation instrument to assess physical and psychosocial risk factors for musculoskeletal disorders (MSD) in eldercare work.
Inter-rater reliability and stability of diagnoses of autism spectrum disorder in children identified through screening at a very young age

NARCIS (Netherlands)

van Daalen, Emma; Kemner, Chantal; Dietz, Claudine; Swinkels, Sophie H. N.; Buitelaar, Jan K.; van Engeland, Herman

2009-01-01

To examine the inter-rater reliability and stability of autism spectrum disorder (ASD) diagnoses made at a very early age in children identified through a screening procedure around 14 months of age. In a prospective design, preschoolers were recruited from a screening study for ASD. The inter-rater
Inter-Observer Agreement on Diffusion-Weighted Magnetic Resonance Imaging Interpretation for Diagnosis of Acute Ischemic Stroke Among Emergency Physicians

Directory of Open Access Journals (Sweden)

Deniz ORAY

2015-06-01

SUMMARY: Objectives: Diffusion-weighted magnetic resonance imaging (DW-MRI) is a highly sensitive tool for the detection of early ischemic stroke and is excellent at detecting small and early infarcts. Nevertheless, conflict may arise and judgments may differ among different interpreters. Inter-observer variability shows the systematic difference among different observers and is expressed as the kappa (κ) coefficient. In this study, we aimed to determine the inter-observer variability among emergency physicians in the use of DW-MRI for the diagnosis of acute ischemic stroke. Methods: Cranial DW-MRI images of 50 patients were interpreted in this retrospective observational cross-sectional study. Patients who underwent DW-MRI for a suspected acute ischemic stroke were included in the study, unless the scans were ordered by any of the reviewers or were absent from the system. The scans were blindly and randomly interpreted by four emergency physicians. Inter-observer agreement between reviewers was evaluated using Fleiss’ κ statistics. Results: The mean kappa values for high signal on diffusion-weighted images (DWI) and for reduction on apparent diffusion coefficient (ADC) were substantial (κ = 0.67) and moderate (κ = 0.60), respectively. The agreement for detection of the presence of ischemia and location was substantial (κ = 0.67). There were 18 false-positive and 4 false-negative evaluations of DWI, and 15 false-positive and 8 false-negative evaluations of ADC. Conclusions: Our data suggest that DW-MRI is reliable in screening for ischemic stroke when interpreted by emergency physicians in the emergency department. The levels of stroke identification and variability show that emergency physicians may have an acceptable level of agreement. Key words: Emergency department, diffusion weighted magnetic resonance imaging, inter-observer agreement, ischemic stroke
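For readers unfamiliar with the multi-rater statistic used in this record, here is a minimal Python sketch of Fleiss' kappa, assuming the textbook formulation in which every scan is rated by the same number of raters. The function name `fleiss_kappa` and the rating counts are hypothetical; they are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who put subject i in category j.
    Every row must sum to the same number of raters (at least 2).
    Not defined when all ratings fall in a single category.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Overall share of ratings that fall in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: proportion of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects      # observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical data: 6 scans, 4 raters, columns = [ischemia, no ischemia].
scan_counts = [[4, 0], [0, 4], [3, 1], [4, 0], [1, 3], [0, 4]]

print(round(fleiss_kappa(scan_counts), 2))
```

In this made-up table the raters split 3 to 1 on two of the six scans and agree fully on the rest, which is enough to pull kappa well below 1 even though most scans are rated unanimously.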
The reliability of routine anthropometric data collected by health workers: a cross-sectional study.

Science.gov (United States)

Johnson, William; Cameron, Noël; Dickson, Peter; Emsley, Stuart; Raynor, Pauline; Seymour, Claire; Wright, John

2009-03-01

Reliable data on child growth is a prerequisite for monitoring and improving child health. Despite the extensive resources invested in recording anthropometry, there has been little research into the reliability of these data. If these measurements are unreliable growth may be misreported, and health problems may go undetected. To assess the reliability of routine infant growth data, following anthropometric training of health workers responsible for collecting these data, in Bradford, UK. To determine whether being observed by an external administrator influenced reliability. A test-retest design was used. All health workers (n=192) responsible for growth monitoring in Bradford were included in the study, of which 36.5% (n=70) had complete data. Following training in basic anthropometry all health workers were asked to complete a test-retest study, using infants aged 0-2 years. Health workers took two recordings of weight, length, head circumference, and abdominal circumferences on five infants. A peer health worker recorded a third set of measurements on each infant. Twenty-two individuals were selected to be observed by an external administrator during data collection. Technical error of measurements (TEMs) were produced to assess intra-observer and inter-observer reliability. Differences between groups were tested to determine whether external observation influences reliability. None of the TEMs were excessively large, and coefficients of reliability ranged from 0.96 to 1.00. All intra-observer and inter-observer TEMs for the observed group were larger than those for the non-observed group. For example, the observed group's intra-observer TEMs for weight, length, abdominal circumference, and head circumference (46.18 g, 0.60 cm, 0.65 cm, 0.47 cm) were larger than the non-observed group's TEMS (9.14 g, 0.35 cm, 0.34 cm, 0.19 cm). 
TEMs for weight, abdominal circumference, and head circumference were significantly larger for the observed group, compared to the non-observed
Inter-observer and intra-observer agreement on interpretation of uroflowmetry curves of kindergarten children.

Science.gov (United States)

Chang, Shang-Jen; Yang, Stephen S D

2008-12-01

To evaluate the inter-observer and intra-observer agreement on the interpretation of uroflowmetry curves of children. Healthy kindergarten children were enrolled for evaluation of uroflowmetry. Uroflowmetry curves were classified as bell-shaped, tower, plateau, staccato and interrupted. Only the bell-shaped curves were regarded as normal. Two urodynamists evaluated the curves independently after reviewing the definitions of the different types of uroflowmetry curve. The senior urodynamist evaluated the curves twice 3 months apart. The final conclusion was made when consensus was reached. Agreement among observers was analyzed using kappa statistics. Of 190 uroflowmetry curves eligible for analysis, the intra-observer agreement in interpreting each type of curve and interpreting normalcy vs abnormality was good (kappa=0.71 and 0.68, respectively). Very good inter-observer agreement (kappa=0.81) on normalcy and good inter-observer agreement (kappa=0.73) on types of uroflowmetry were observed. Poor inter-observer agreement existed on the classification of specific types of abnormal uroflowmetry curves (kappa=0.07). Uroflowmetry is a good screening tool for normalcy of kindergarten children, while not a good tool to define the specific types of abnormal uroflowmetry.
Inter and intra-observer reliability in assessment of the position of the lateral sesamoid in determining the severity of hallux valgus.

Science.gov (United States)

Panchani, Sunil; Reading, Jonathan; Mehta, Jaysheel

2016-06-01

The position of the lateral sesamoid on standard dorso-plantar weight bearing radiographs, with respect to the lateral cortex of the first metatarsal, has been shown to correlate well with the degree of the hallux valgus angle. This study aimed to assess the inter- and intra-observer error of this new classification system. Five orthopaedic consultants and five trainee orthopaedic surgeons were recruited to assess and document the degree of displacement of the lateral sesamoid on 144 weight-bearing dorso-plantar radiographs on two separate occasions. The severity of hallux valgus was defined as normal (0%), mild (≤50%), moderate (51-≤99%) or severe (≥100%) depending on the percentage displacement of the lateral sesamoid body from the lateral cortical border of the first metatarsal. Consultant intra-observer variability showed good agreement between repeated assessment of the radiographs (mean Kappa=0.75). Intra-observer variability for trainee orthopaedic surgeons also showed good agreement with a mean Kappa=0.73. Intraclass correlations for consultants and trainee surgeons was also high. The new classification system of assessing the severity of hallux valgus shows high inter- and intra-observer variability with good agreement and reproducibility between surgeons of consultant and trainee grades. Copyright © 2015 Elsevier Ltd. All rights reserved.
Inter-rater reliability of the German version of the Nurses' Global Assessment of Suicide Risk scale.

Science.gov (United States)

Kozel, Bernd; Grieser, Manuela; Abderhalden, Christoph; Cutcliffe, John R

2016-10-01

In comparison to the general population, the suicide rates of psychiatric inpatient populations in Germany and Switzerland are very high. An important preventive contribution to the lowering of the suicide rates in mental health care is to ensure that the risk of suicide of psychiatric inpatients is assessed as accurately as possible. While risk-assessment instruments can serve an important function in determining such risk, very few have been translated to German. Therefore, in the present study, we reported on the German version of Nurses' Global Assessment of Suicide Risk (NGASR) scale. After translating the original instrument into German and pretesting the German version, we tested the inter-rater reliability of the instrument. Twelve video case studies were evaluated by 13 raters with the NGASR scale in a 'laboratory' trial. In each case, the observer's agreement was calculated for the single items, the overall scale, the risk levels, and the sum scores. The statistical data analysis was conducted with kappa and AC1 statistics for dichotomous (items, scale) scales. A high-to-very high observers' agreement (AC1: 0.62-1.00, kappa: 0.00-1.00) was determined for 16 items of the German version of the NGASR scale. We conclude that the German version of the NGASR scale is a reliable instrument for evaluating risk factors for suicide. A reliable application in the clinical practise appears to be enhanced by training in the use of the instrument and the right implementation instructions. © 2016 Australian College of Mental Health Nurses Inc.
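The abstract reports a much wider range for kappa than for AC1 on the same items. One well-known situation in which the two statistics diverge is a strongly skewed item, where nearly every case receives the same rating. The Python sketch below illustrates that situation for two raters and a dichotomous item, assuming the usual two-rater forms of Cohen's kappa and Gwet's AC1. The study used 13 raters, so this is a simplified illustration rather than its method, and the ratings and the function name `kappa_and_ac1` are hypothetical.

```python
def kappa_and_ac1(a, b):
    """Cohen's kappa and Gwet's AC1 for two raters' 0/1 ratings.

    Kappa is not defined when both raters use a single identical category
    for every case (its chance-agreement term then equals 1).
    """
    if len(a) != len(b) or not a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    pa1, pb1 = sum(a) / n, sum(b) / n             # share of 1s per rater

    # Cohen: chance agreement from each rater's own marginal proportions.
    pe_kappa = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    # Gwet: chance agreement from the mean prevalence across both raters.
    prevalence = (pa1 + pb1) / 2
    pe_ac1 = 2 * prevalence * (1 - prevalence)

    kappa = (p_o - pe_kappa) / (1 - pe_kappa)
    ac1 = (p_o - pe_ac1) / (1 - pe_ac1)
    return kappa, ac1


# Hypothetical item: 20 cases, almost all rated 1 by both raters,
# with a single disagreement in each direction.
rater_a = [1] * 18 + [1, 0]
rater_b = [1] * 18 + [0, 1]

kappa, ac1 = kappa_and_ac1(rater_a, rater_b)
print(round(kappa, 2), round(ac1, 2))
```

With 90% raw agreement, kappa here is slightly negative while AC1 is close to 0.9, because kappa's chance-agreement term is itself about 0.9 when one category dominates.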
Measurement of the Inter-Rater Reliability Rate Is Mandatory for Improving the Quality of a Medical Database: Experience with the Paulista Lung Cancer Registry.

Science.gov (United States)

Lauricella, Leticia L; Costa, Priscila B; Salati, Michele; Pego-Fernandes, Paulo M; Terra, Ricardo M

2018-06-01

Database quality measurement should be considered a mandatory step to ensure an adequate level of confidence in data used for research and quality improvement. Several metrics have been described in the literature, but no standardized approach has been established. We aimed to describe a methodological approach applied to measure the quality and inter-rater reliability of a regional multicentric thoracic surgical database (Paulista Lung Cancer Registry). Data from the first 3 years of the Paulista Lung Cancer Registry underwent an audit process with 3 metrics: completeness, consistency, and inter-rater reliability. The first 2 methods were applied to the whole data set, and the last method was calculated using 100 cases randomized for direct auditing. Inter-rater reliability was evaluated using percentage of agreement between the data collector and auditor and through calculation of Cohen's κ and intraclass correlation. The overall completeness per section ranged from 0.88 to 1.00, and the overall consistency was 0.96. Inter-rater reliability showed many variables with high disagreement (>10%). For numerical variables, intraclass correlation was a better metric than inter-rater reliability. Cohen's κ showed that most variables had moderate to substantial agreement. The methodological approach applied to the Paulista Lung Cancer Registry showed that completeness and consistency metrics did not sufficiently reflect the real quality status of a database. The inter-rater reliability associated with κ and intraclass correlation was a better quality metric than completeness and consistency metrics because it could determine the reliability of specific variables used in research or benchmark reports. This report can be a paradigm for future studies of data quality measurement. Copyright © 2018 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
Inter-rater reliability of kinesthetic measurements with the KINARM robotic exoskeleton.

Science.gov (United States)

Semrau, Jennifer A; Herter, Troy M; Scott, Stephen H; Dukelow, Sean P

2017-05-22

Kinesthesia (sense of limb movement) has been extremely difficult to measure objectively, especially in individuals who have survived a stroke. The development of valid and reliable measurements for proprioception is important to developing a better understanding of proprioceptive impairments after stroke and their impact on the ability to perform daily activities. We recently developed a robotic task to evaluate kinesthetic deficits after stroke and found that the majority (~60%) of stroke survivors exhibit significant deficits in kinesthesia within the first 10 days post-stroke. Here we aim to determine the inter-rater reliability of this robotic kinesthetic matching task. Twenty-five neurologically intact control subjects and 15 individuals with first-time stroke were evaluated on a robotic kinesthetic matching task (KIN). Subjects sat in a robotic exoskeleton with their arms supported against gravity. In the KIN task, the robot moved the subjects' stroke-affected arm at a preset speed, direction and distance. As soon as subjects felt the robot begin to move their affected arm, they matched the robot movement with the unaffected arm. Subjects were tested in two sessions on the KIN task: initial session and then a second session (within an average of 18.2 ± 13.8 h of the initial session for stroke subjects), which were supervised by different technicians. The task was performed both with and without the use of vision in both sessions. We evaluated intra-class correlations of spatial and temporal parameters derived from the KIN task to determine the reliability of the robotic task. We evaluated 8 spatial and temporal parameters that quantify kinesthetic behavior. We found that the parameters exhibited moderate to high intra-class correlations between the initial and retest conditions (Range, r-value = [0.53-0.97]). The robotic KIN task exhibited good inter-rater reliability. This validates the KIN task as a reliable, objective method for quantifying
Inter-rater and intrarater reliability of the South African Triage Scale in low-resource settings of Haiti and Afghanistan.

Science.gov (United States)

Dalwai, Mohammed; Tayler-Smith, Katie; Twomey, Michèle; Nasim, Masood; Popal, Abdul Qayum; Haqdost, Waliul Haq; Gayraud, Olivia; Cheréstal, Sophia; Wallis, Lee; Valles, Pola

2018-03-16

The South African Triage Scale (SATS) has demonstrated good validity in the EDs of Médecins Sans Frontières (MSF)-supported sites in Afghanistan and Haiti; however, corresponding reliability in these settings has not yet been reported on. This study set out to assess the inter-rater and intrarater reliability of the SATS in four MSF-supported EDs in Afghanistan and Haiti (two trauma-only EDs and two mixed (including both medical and trauma cases) EDs). Under classroom conditions between December 2013 and February 2014, ED nurses at each site assigned triage ratings to a set of context-specific vignettes (written case reports of ED patients). Inter-rater reliability was assessed by comparing triage ratings among nurses; intrarater reliability was assessed by asking the nurses to retriage 10 random vignettes from the original set and comparing these duplicate ratings. Inter-rater reliability was calculated using the unweighted kappa, linearly weighted kappa and quadratically weighted kappa (QWK) statistics, and the intraclass correlation coefficient (ICC). Intrarater reliability was calculated according to the percentage of exact agreement and the percentage of agreement allowing for one level of discrepancy in triage ratings. The correlation between years of nursing experience and reliability of the SATS was assessed based on comparison of ICCs and the respective 95% CIs. A total of 67 nurses agreed to participate in the study: In Afghanistan there were 19 nurses from Kunduz Trauma Centre and nine from Ahmed Shah Baba; in Haiti, there were 20 nurses from Martissant Emergency Centre and 19 from Tabarre Surgical and Trauma Centre. Inter-rater agreement was moderate across all sites (ICC range: 0.50-0.60; QWK range: 0.50-0.59) apart from the trauma ED in Haiti where it was moderate to substantial (ICC: 0.58; QWK: 0.61). Intrarater agreement was similar across the four sites (68%-74% exact agreement); when allowing for a one-level discrepancy in triage ratings
Variabilidade intra-avaliador e inter-avaliadores de medidas antropométricas (Intra-observer and inter-observer variability of anthropometric measures) - DOI: 10.4025/actascihealthsci.v29i1.98

Directory of Open Access Journals (Sweden)

Edna Regina de Oliveira

2007-12-01

The aim of this study was to evaluate the intra- and inter-observer variability of anthropometric measurements taken by three anthropometrists considered experienced, using the technical error of measurement (TEM). To this end, a sample of 21 healthy volunteers (25.7 ± 7.5 years; 12 men and 9 women) was selected. The measures considered were body weight (kg), stature (cm), circumferences (cm) of the relaxed right arm, abdomen, hip and thigh, and skinfold thickness (mm) at the triceps, subscapular, suprailiac, abdominal, thigh and medial calf sites. The measurements were made on 2 consecutive days, always in the afternoon, with the same equipment and the same volunteers. The results showed TEMs above the levels recommended for acceptability, for both intra-observer and inter-observer variability, suggesting the importance of and need for specific training of anthropometrists.
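The technical error of measurement referred to in this record can be sketched as follows. This Python example assumes the common two-measurement formula, TEM = sqrt(sum of squared differences / 2n), and expresses relative TEM as a percentage of the grand mean. The function names and the skinfold values are hypothetical, and the acceptability limits used by the authors are not reproduced here.

```python
from math import sqrt
from statistics import fmean


def tem(first, second):
    """Absolute technical error of measurement for paired repeat measurements."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty measurement lists")
    n = len(first)
    return sqrt(sum((a - b) ** 2 for a, b in zip(first, second)) / (2 * n))


def relative_tem(first, second):
    """TEM expressed as a percentage of the mean of all measurements."""
    grand_mean = fmean(list(first) + list(second))
    return 100 * tem(first, second) / grand_mean


# Hypothetical triceps skinfolds (mm) for 4 volunteers on two consecutive days.
day_1 = [10.0, 12.0, 15.0, 20.0]
day_2 = [11.0, 12.0, 14.0, 22.0]

print(round(tem(day_1, day_2), 2), round(relative_tem(day_1, day_2), 1))
```

Passing one observer's repeat measurements gives an intra-observer TEM; passing two observers' measurements of the same volunteers gives an inter-observer TEM under the same formula.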
Validity and inter-rater reliability of medio-lateral knee motion observed during a single-limb mini squat

Directory of Open Access Journals (Sweden)

Simic Milena

2010-11-01

Background: Muscle function may influence the risk of knee injury and outcomes following injury. Clinical tests, such as a single-limb mini squat, resemble conditions of daily life and are easy to administer. Fewer squats per 30 seconds indicate poorer function. However, the quality of movement, such as the medio-lateral knee motion, may also be important. The aim was to validate an observational clinical test of assessing the medio-lateral knee motion, using a three-dimensional (3-D) motion analysis system. In addition, the inter-rater reliability was evaluated. Methods: Twenty-five (17 women) non-injured participants (mean age 25.6 years, range 18-37) were included. Visual analysis of the medio-lateral knee motion, scored as knee-over-foot or knee-medial-to-foot by two raters, and 3-D kinematic data were collected simultaneously during a single-limb mini squat. Frontal plane 2-D peak tibial, thigh, and knee varus-valgus angles, and 3-D peak hip internal-external rotation, and knee varus-valgus angles were calculated. Results: Ten subjects were scored as having a knee-medial-to-foot position and 15 subjects a knee-over-foot position assessed by visual inspection. In 2-D, the peak tibial angle (mean 89.0 (SE 0.7) vs mean 86.3 (SE 0.4) degrees, p = 0.001) and peak thigh angle (mean 77.4 (SE 1.0) vs mean 81.2 (SE 0.5) degrees, p = 0.001) with respect to the horizontal, indicated that the knee was more medially placed than the ankle and thigh, respectively. Thus, the knee was in more valgus (mean 11.6 (SE 1.5) vs 5.0 (SE 0.8) degrees, p ... 0.90 and 96 between raters. Conclusions: Medio-lateral motion of the knee can reliably be assessed during a single-leg mini-squat. The test is valid in 2-D, while the actual movement, in 3-D, is mainly exhibited as increased internal hip rotation. The single-limb mini squat is feasible and easy to administer in the clinical setting and in research to address lower extremity movement quality.
Inter-rater reliability of the evaluation of muscular chains associated with posture alterations in scoliosis

Directory of Open Access Journals (Sweden)

Fortin Carole

2012-05-01

Abstract Background In the Global postural re-education (GPR) evaluation, posture alterations are associated with anterior or posterior muscular chain impairments. Our goal was to assess the reliability of the GPR muscular chain evaluation. Methods Design: Inter-rater reliability study. Fifty physical therapists (PTs) and two experts trained in GPR assessed the standing posture from photographs of five youths with idiopathic scoliosis using a posture analysis grid with 23 posture indices (PI). The PTs and experts indicated the muscular chain associated with posture alterations. The PTs were also divided into three groups according to their experience in GPR. Experts’ results (after consensus) were used to verify agreement between PTs and experts for muscular chain and posture assessments. We used Kappa coefficients (K) and the percentage of agreement (%A) to assess inter-rater reliability and intra-class coefficients (ICC) for determining agreement between PTs and experts. Results For the muscular chain evaluation, reliability was moderate to substantial for 12 PI for the PTs (%A: 56 to 82; K: 0.42 to 0.76) and perfect for 19 PI for the experts. For posture assessment, reliability was moderate to substantial for 12 PI for the PTs (%A > 60%; K: 0.42 to 0.75) and moderate to perfect for 18 PI for the experts (%A: 80 to 100; K: 0.55 to 1.00). The agreement between PTs and experts was good for most muscular chain evaluations (18 PI; ICC: 0.82 to 0.99) and PI (19 PI; ICC: 0.78 to 1.00). Conclusions The GPR muscular chain evaluation has good reliability for most posture indices. GPR evaluation should help guide physical therapists in targeting affected muscles for treatment of abnormal posture patterns.
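
The two agreement statistics named in this abstract, percentage of agreement and kappa, can be sketched for the simplest case of two raters. This is a minimal illustration only: it assumes Cohen's kappa (the abstract does not say which kappa variant was used), and the function names and the labels below are hypothetical, not data from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which two raters give the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every label either rater used.
    p_expected = sum(
        counts_a[label] * counts_b[label] for label in set(a) | set(b)
    ) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical muscular-chain labels for 10 posture indices (illustration only).
rater_1 = ["ant", "ant", "post", "post", "ant", "post", "ant", "ant", "post", "post"]
rater_2 = ["ant", "post", "post", "post", "ant", "post", "ant", "ant", "ant", "post"]

print(percent_agreement(rater_1, rater_2))
print(cohen_kappa(rater_1, rater_2))
```

With these made-up labels the raters agree on 8 of 10 items, while kappa is noticeably lower because half of that agreement would be expected by chance alone.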

The definition of radiological signs in gastric ulcer and assessment of their validity by inter-observer variation study.

Science.gov (United States)

Schulman, A; Simpkins, K C

1975-07-01

The initial aim was to program a computer with information on the frequency of radiological signs in benign and malignant gastric ulcers in order to obtain a percentage probability of benignancy or malignancy in succeeding ulcers in clinical practice. However, only four of the many signs described in gastric ulcer were confirmed to be of validity (i.e. reliable existence) by an inter-observer variation study using two observers and the films from 69 barium meal examinations. These were projection or non-projection of the in-profile ulcer, presence or absence of adjacent mucosal folds, good or poor definition of the in-face ulcer's edge, and extension of radiating folds to the in-face ulcer's edge. A few more remained unassessed due to insufficient numbers of relevant cases. It is concluded that: as defined in the literature the majority of radiological signs in this field are of uncertain existence; and the four that were found to be valid do not fully describe the important appearances that may be seen in benign and malignant ulcers and would be inadequate to differentiate them to a sufficiently high degree of probability.
Inter-observer variation of diagnosis of Alzheimer's disease by SPECT

International Nuclear Information System (INIS)

Oshima, Motoo; Machida, Kikuo; Koizumi, Kiyoshi

2001-01-01

SPECT shows a characteristic distribution in Alzheimer's disease. The purpose of this study was to define inter-observer variation in the diagnosis of Alzheimer's disease. Fifty-seven patients, including 19 with Alzheimer's disease, were collected from four institutions. A five-grade score was used to interpret SPECT in 18 regions. Ten nuclear medicine physicians interpreted SPECT with reference to MMSE and clinical information. Among the 57 cases, the 19 with Alzheimer's disease were selected for this study. Statistics were performed between SPECT score and MMSE score. In conclusion, inter-observer variation is present in SPECT interpretation. There was a good correlation between SPECT and MMSE for physicians specialized in brain SPECT. They were superior in interpretation not only to residents but also to other specialists. Education in the interpretation of brain SPECT appears important. (author)
Inter-rater reliability and stability of diagnoses of autism spectrum disorder in children identified through screening at a very young age.

Science.gov (United States)

van Daalen, Emma; Kemner, Chantal; Dietz, Claudine; Swinkels, Sophie H N; Buitelaar, Jan K; van Engeland, Herman

2009-11-01

To examine the inter-rater reliability and stability of autism spectrum disorder (ASD) diagnoses made at a very early age in children identified through a screening procedure around 14 months of age. In a prospective design, preschoolers were recruited from a screening study for ASD. The inter-rater reliability of the diagnosis of ASD was measured through an independent assessment of a randomly selected subsample of 38 patients by two other psychiatrists. The diagnoses at 23 months and 42 months of 131 patients, based on the clinical assessment and the diagnostic classifications of standardised instruments, were compared to evaluate stability of the diagnosis of ASD. Inter-rater reliability on a diagnosis of ASD versus non-ASD at 23 months was 87% with a weighted kappa of 0.74 (SE 0.11). The stability of the different diagnoses in the autism spectrum was 63% for autistic disorder, 54% for pervasive developmental disorder, not otherwise specified (PDD-NOS), and 91% for the whole category of ASD. Most diagnostic changes at 42 months were within the autism spectrum from autistic disorder to PDD-NOS and were mainly due to diminished symptom severity. Children who moved outside the ASD category at 42 months made significantly larger gains in cognitive and language skills than children with a stable ASD diagnosis. In conclusion, the inter-rater reliability and stability of the diagnoses of ASD established at 23 months in this population-based sample of very young children are good.
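
A weighted kappa, as reported in this abstract, gives partial credit when two raters choose neighbouring categories on an ordinal scale. The sketch below assumes linear disagreement weights and integer-coded categories; the abstract does not state the weighting scheme, and the function name and the category coding in the example are hypothetical.

```python
def weighted_kappa(a, b, n_categories):
    """Linear-weighted kappa for two raters using categories 0..n_categories-1."""
    n = len(a)
    k = n_categories
    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x][y] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    return 1 - weighted_observed / weighted_expected


# Hypothetical coding: 0 = no ASD, 1 = PDD-NOS, 2 = autistic disorder.
psychiatrist_1 = [0, 0, 1, 2, 2, 1, 0, 2, 1, 0]
psychiatrist_2 = [0, 1, 1, 2, 1, 1, 0, 2, 2, 0]
print(weighted_kappa(psychiatrist_1, psychiatrist_2, 3))
```

With only two categories the linear weights reduce to 0 and 1, so the result coincides with the unweighted Cohen's kappa.
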
Reliability on intra-laboratory and inter-laboratory data of hair mineral analysis comparing with blood analysis.

Science.gov (United States)

Namkoong, Sun; Hong, Seung Phil; Kim, Myung Hwa; Park, Byung Cheol

2013-02-01

Nowadays, although its clinical value remains controversial, institutions utilize hair mineral analysis. Arguments about the reliability of hair mineral analysis persist, and there have been evaluations of commercial laboratories performing hair mineral analysis. The objective of this study was to assess the reliability of intra-laboratory and inter-laboratory data at three commercial laboratories conducting hair mineral analysis, compared to serum mineral analysis. Two divided hair samples taken from near the scalp were submitted for analysis at the same time, to all laboratories, from one healthy volunteer. Each laboratory sent a report consisting of quantitative results and their interpretation of health implications. Differences among intra-laboratory and inter-laboratory data were analyzed using SPSS version 12.0 (SPSS Inc., USA). All the laboratories used identical methods for quantitative analysis, and they generated consistent numerical results according to Friedman analysis of variance. However, the normal reference ranges of each laboratory varied. As such, each laboratory interpreted the patient's health differently. On intra-laboratory data, Wilcoxon analysis suggested they generated relatively coherent data, but laboratory B could not in one element, so its reliability was doubtful. In comparison with the blood test, laboratory C generated identical results, but laboratories A and B did not. Hair mineral analysis has its limitations, considering the reliability of inter- and intra-laboratory analysis compared with blood analysis. As such, clinicians should be cautious when applying hair mineral analysis as an ancillary tool. Each laboratory included in this study requires continuous refinement from now on to establish standardized normal reference levels.
Definition of gross tumor volume in lung cancer: inter-observer variability

International Nuclear Information System (INIS)

Van de Steene, Jan; Linthout, Nadine; Mey, Johan de; Vinh-Hung, Vincent; Claassens, Cornelia; Noppen, Marc; Bel, Arjan; Storme, Guy

2002-01-01

Background and purpose: To determine the inter-observer variation in gross tumor volume (GTV) definition in lung cancer, and its clinical relevance. Material and methods: Five clinicians involved in lung cancer were asked to define GTV on the planning CT scan of eight patients. Resulting GTVs were compared on the base of geometric volume, dimensions and extensions. Judgement of invasion of lymph node (LN) regions was evaluated using the ATS/LCSG classification of LN. Clinical relevance of the variation was studied through 3D-dosimetry of standard conformal plans: volume of critical organs (heart, lungs, esophagus, spinal cord) irradiated at toxic doses, 95% isodose volumes of GTVs, normal tissue complication probabilities (NTCP) and tumor control probabilities (TCP) were compared for evaluation of observer variability. Results: Before evaluation of observer variability, critical review of planning CT scan led to up- (two cases) and downstaging (one case) of patients as compared to the respective diagnostic scans. The defined GTVs showed an inter-observer variation with a ratio up to more than 7 between maximum and minimum geometric content. The dimensions of the primary tumor had inter-observer ranges of 4.2 (transversal), 7.9 (cranio-caudal) and 5.4 (antero-posterior) cm. Extreme extensions of the GTVs (left, right, cranial, caudal, anterior and posterior) varied with ranges of 2.8-7.3 cm due to inter-observer variation. After common review, only 63% of involved lymph node regions were delineated by the clinicians (i.e. 37% are false negative). Twenty-two percent of drawn in lymph node regions were accepted to be false positive after review. In the conformal plans, inter-observer ranges of irradiated normal tissue volume were on average 12%, with a maximum of 66%. The probability (in the population of all conformal plans) of irradiating at least 95% of the GTV with at least 95% of the nominal treatment dose decreased from 96 to 88% when swapping the matched GTV
Inter-rater reliability of the South African Triage Scale: Assessing two different cadres of health care workers in a real time environment

Directory of Open Access Journals (Sweden)

Michèle Twomey

2011-09-01

Conclusion: The inter-rater reliability of SATS ratings is excellent within individual HCWs, but significantly lower between different HCWs. This confirms previous reliability studies of the SATS using vignettes and if validated by larger studies would support the feasibility of further implementation of the SATS in primary health care settings across the Western Cape.
Reliability of a four-column classification for tibial plateau fractures.

Science.gov (United States)

Martínez-Rondanelli, Alfredo; Escobar-González, Sara Sofía; Henao-Alzate, Alejandro; Martínez-Cano, Juan Pablo

2017-09-01

A four-column classification system offers a different way of evaluating tibial plateau fractures. The aim of this study is to compare the intra-observer and inter-observer reliability between four-column and classic classifications. This is a reliability study, which included patients presenting with tibial plateau fractures between January 2013 and September 2015 in a level-1 trauma centre. Four orthopaedic surgeons blindly classified each fracture according to four different classifications: AO, Schatzker, Duparc and four-column. Kappa, intra-observer and inter-observer concordance were calculated for the reliability analysis. Forty-nine patients were included. The mean age was 39 ± 14.2 years, with no gender predominance (men: 51%; women: 49%), and 67% of the fractures included at least one of the posterior columns. The intra-observer and inter-observer concordance were calculated for each classification: four-column (84%/79%), Schatzker (60%/71%), AO (50%/59%) and Duparc (48%/58%), with a statistically significant difference among them (p = 0.001/p = 0.003). Kappa coefficient for intra-observer and inter-observer evaluations: Schatzker 0.48/0.39, four-column 0.61/0.34, Duparc 0.37/0.23, and AO 0.34/0.11. The proposed four-column classification showed the highest intra and inter-observer agreement. When taking into account the agreement that occurs by chance, Schatzker classification showed the highest inter-observer kappa, but again the four-column had the highest intra-observer kappa value. The proposed classification is a more inclusive classification for the posteromedial and posterolateral fractures. We suggest, therefore, that it be used in addition to one of the classic classifications in order to better understand the fracture pattern, as it allows more attention to be paid to the posterior columns, it improves the surgical planning and allows the surgical approach to be chosen more accurately.
Qualitative soil moisture assessment in semi-arid Africa - the role of experience and training on inter-rater reliability

Science.gov (United States)

Rinderer, M.; Komakech, H. C.; Müller, D.; Wiesenberg, G. L. B.; Seibert, J.

2015-08-01

Soil and water management is particularly relevant in semi-arid regions to enhance agricultural productivity. During periods of water scarcity, soil moisture differences are important indicators of the soil water deficit and are traditionally used for allocating water resources among farmers of a village community. Here we present a simple, inexpensive soil wetness classification scheme based on qualitative indicators which one can see or touch on the soil surface. It incorporates the local farmers' knowledge on the best soil moisture conditions for seeding and brick making in the semi-arid environment of the study site near Arusha, Tanzania. The scheme was tested twice in 2014 with farmers, students and experts (April: 40 persons, June: 25 persons) for inter-rater reliability, bias of individuals and functional relation between qualitative and quantitative soil moisture values. During the test in April farmers assigned the same wetness class in 46 % of all cases, while students and experts agreed on about 60 % of all cases. Students who had been trained in how to apply the method gained higher inter-rater reliability than their colleagues with only a basic introduction. When repeating the test in June, participants were given improved instructions, organized in small subgroups, which resulted in a higher inter-rater reliability among farmers. In 66 % of all classifications, farmers assigned the same wetness class and the spread of class assignments was smaller. This study demonstrates that a wetness classification scheme based on qualitative indicators is a robust tool and can be applied successfully regardless of experience in crop growing and education level when an in-depth introduction and training is provided. The use of a simple and clear layout of the assessment form is important for reliable wetness class assignments.
Qualitative soil moisture assessment in semi-arid Africa: the role of experience and training on inter-rater reliability

Science.gov (United States)

Rinderer, M.; Komakech, H.; Müller, D.; Seibert, J.

2015-03-01

Soil and water management is particularly relevant in semi-arid regions to enhance agricultural productivity. During periods of water scarcity soil moisture differences are important indicators of the soil water deficit and are traditionally used for allocating water resources among farmers of a village community. Here we present a simple, inexpensive soil wetness classification scheme based on qualitative indicators which one can see or touch on the soil surface. It incorporates the local farmers' knowledge on the best soil moisture conditions for seeding and brick making in the semi-arid environment of the study site near Arusha, Tanzania. The scheme was tested twice in 2014 with farmers, students and experts (April: 40 persons, June: 25 persons) for inter-rater reliability, bias of individuals and functional relation between qualitative and quantitative soil moisture values. During the test in April farmers assigned the same wetness class in 46% of all cases while students and experts agreed in about 60% of all cases. Students who had been trained in how to apply the method gained higher inter-rater reliability than their colleagues with only a basic introduction. When repeating the test in June, participants were given improved instructions, organized in small sub-groups, which resulted in a higher inter-rater reliability among farmers. In 66% of all classifications farmers assigned the same wetness class and the spread of class assignments was smaller. This study demonstrates that a wetness classification scheme based on qualitative indicators is a robust tool and can be applied successfully regardless of experience in crop growing and education level when an in-depth introduction and training is provided. The use of a simple and clear layout of the assessment form is important for reliable wetness class assignments.
Six of one, half a dozen of the other: A measure of multidisciplinary inter/intra-rater reliability of the society for fetal urology and urinary tract dilation grading systems for hydronephrosis.

Science.gov (United States)

Rickard, Mandy; Easterbrook, Bethany; Kim, Soojin; Farrokhyar, Forough; Stein, Nina; Arora, Steven; Belostotsky, Vladamir; DeMaria, Jorge; Lorenzo, Armando J; Braga, Luis H

2017-02-01

likely explained by the subjective interpretation required to assign grades, which can be impacted by experience, image quality, and scanning technique. As shown in the figure, which demonstrates SFU II (a) and SFU III (b), as assigned by a radiologist, it is possible to make an argument that either of these images can be classified into both categories that were observed during the grading sessions of this study. Although both systems have acceptable reliability, the SFU grading system showed higher overall intra/inter-rater reliability regardless of rater specialty than the UTD classification. Inter-rater reliability for SFU grades II/III and UTD 2 was low, highlighting the limitations of both classifications in regards to properly segregating moderate HN grades. Copyright © 2016 Journal of Pediatric Urology Company. Published by Elsevier Ltd. All rights reserved.
Inter and intra-observer concordance for the diagnosis of portal hypertension gastropathy.

Science.gov (United States)

Casas, Meritxell; Vergara, Mercedes; Brullet, Enric; Junquera, Félix; Martínez-Bauer, Eva; Miquel, Mireia; Sánchez-Delgado, Jordi; Dalmau, Blai; Campo, Rafael; Calvet, Xavier

2018-03-01

At present there is no fully accepted endoscopic classification for the assessment of the severity of portal hypertensive gastropathy (PHG). Few studies have evaluated inter and intra-observer concordance or the degree of concordance between different endoscopic classifications. To evaluate inter and intra-observer agreement for the presence of portal hypertensive gastropathy and enteropathy using different endoscopic classifications. Patients with liver cirrhosis were included into the study. Enteroscopy was performed under sedation. The location of lesions and their severity was recorded. Images were videotaped and subsequently evaluated independently by three different endoscopists, one of whom was the initial endoscopist. The agreement between observations was assessed using the kappa index. Seventy-four patients (mean age 63.2 years, 53 males and 21 females) were included. The agreement between the three endoscopists regarding the presence or absence of PHG using the Tanoue and McCormack classifications was very low (kappa scores = 0.16 and 0.27, respectively). The current classifications of portal hypertensive gastropathy have a very low degree of intra and inter-observer agreement for the diagnosis and assessment of gastropathy severity.
Development of Creative Behavior Observation Form: A Study on Validity and Reliability

Science.gov (United States)

Dere, Zeynep; Ömeroglu, Esra

2018-01-01

In this study, the Creative Behavior Observation Form was developed to assess the creativity of children. For the reliability and validity study of the Creative Behavior Observation Form, a total of 257 children aged 5-6 were sampled using the stratified sampling method. Content Validity Index (CVI) and…
Reliability of movement control tests in the lumbar spine

Directory of Open Access Journals (Sweden)

de Bruin Eling D

2007-09-01

Abstract Background Movement control dysfunction [MCD] reduces active control of movements. Patients with MCD might form an important subgroup among patients with non-specific low back pain. The diagnosis is based on the observation of active movements. Although widely used clinically, only a few studies have been performed to determine the test reliability. The aim of this study was to determine the inter- and intra-observer reliability of movement control dysfunction tests of the lumbar spine. Methods We videoed patients performing a standardized test battery consisting of 10 active movement tests for motor control in 27 patients with non-specific low back pain and 13 patients with other diagnoses but without back pain. Four physiotherapists independently rated test performances as correct or incorrect per observation, blinded to all other patient information and to each other. The study was conducted in a private physiotherapy outpatient practice in Reinach, Switzerland. Kappa coefficients, percentage agreements and confidence intervals for inter- and intra-rater results were calculated. Results The kappa values for inter-tester reliability ranged between 0.24 – 0.71. Six tests out of ten showed a substantial reliability [k > 0.6]. Intra-tester reliability was between 0.51 – 0.96, all tests but one showed substantial reliability [k > 0.6]. Conclusion Physiotherapists were able to reliably rate most of the tests in this series of motor control tasks as being performed correctly or not, by viewing films of patients with and without back pain performing the task.
Intra- and inter-rater reliabilities of measurement of ultrasound imaging for muscle thickness and pennation angle of tibialis anterior muscle in stroke patients.

Science.gov (United States)

Cho, Ki Hun; Lee, Hwang Jae; Lee, Wan Hee

2017-07-01

Dysfunction of skeletal muscle has been commonly reported in stroke patients. The purpose of this study was to investigate the intra- and inter-rater reliabilities of measurement of ultrasound imaging (USI) for pennation angle (PA) and muscle thickness (MT) of tibialis anterior muscle in stroke patients. Thirty-four stroke patients (19 men) participated in this study. USI was used for measurement of PA and MT of the tibialis anterior muscles at rest and during maximum voluntary contraction (MVC). Two examiners acquired images from all participants during two separate testing sessions, seven days apart. Intra-class correlation coefficients (ICCs), confidence interval (CI), standard error of measurement, minimal detectable change, and Bland-Altman plots were used for estimation of reliability. In the intra-rater reliability between measures, for all variables (PA and MT of the paretic and non-paretic sides of tibialis anterior muscles at rest and during MVC), the ICCs ranged between 0.639 and 0.998 and the CI was within an acceptable range of 0.388-0.999. In inter-rater reliability between examiners for the two tests, for all variables, the ICCs ranged between 0.690 and 0.995 and the CI was within an acceptable range of 0.463-0.997. In addition, significant difference was observed between the paretic and non-paretic sides of the tibialis anterior muscle architecture (p stroke patients. In addition, objective and quantitative measurements of tibialis anterior muscle using USI may provide appropriate management for the walking recovery of stroke patients.
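
The standard error of measurement (SEM) and minimal detectable change (MDC) mentioned in this abstract are commonly derived from the ICC and the between-subject standard deviation. The sketch below uses the widely quoted forms SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM; the abstract does not give its formulas, so treat these as an assumption, and the input numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject standard deviation and the ICC."""
    return sd * math.sqrt(1 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 for about 95%)."""
    # sqrt(2) accounts for measurement error in both of the two sessions compared.
    return z * math.sqrt(2) * sem


# Hypothetical values: SD of muscle thickness 2.0 mm, ICC 0.91.
sem = standard_error_of_measurement(sd=2.0, icc=0.91)
mdc95 = minimal_detectable_change(sem)
print(round(sem, 3), round(mdc95, 3))
```

Under these assumptions, a change between sessions smaller than the MDC cannot be distinguished from measurement error.
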
Inter-observer reproducibility in reporting on renal drainage in children with hydronephrosis: a large collaborative study

International Nuclear Information System (INIS)

Tondeur, Marianne; Piepsz, Amy; De Palma, Diego; Roca, Isabel; Ham, Hamphrey

2008-01-01

The goal of this study was to evaluate the inter-observer reproducibility in reporting on renal drainage obtained during 99m Tc MAG3 renography in children, when already processed data are offered to the observers. Because web site facilities were used for communication, 57 observers from five continents participated in the study. Twenty-three renograms, including furosemide stimulation and posterect postmicturition views, covering various patterns of drainage, were submitted to the observers. Images, curves and quantitative parameters were provided. Good or almost good drainage, partial drainage and poor or no drainage were the three possible responses for each kidney. An important bias was observed among the observers, some of them more systematically reporting the drainage as being good, while others had a general tendency to consider the drainage as poor. This resulted in rather poor inter-observer reproducibility, as for more than half of the kidneys, less than 80% of the observers agreed on one of the three responses. Analysis of the individual cases identified some obvious causes of discrepancy: the absence of a clear limit between partial and good or almost good drainage, the fact of including or neglecting the effect of micturition and change of patient's position, the underestimation of drainage in the case of a flat renographic curve, and the difficulties of interpretation in the case of a small, not well functioning kidney. There is an urgent need for better standardisation in estimating the quality of drainage. (orig.)
Inter-observer reproducibility in reporting on renal drainage in children with hydronephrosis: a large collaborative study

Energy Technology Data Exchange (ETDEWEB)

Tondeur, Marianne; Piepsz, Amy [CHU Saint-Pierre, Departement des Radio-Isotopes, Brussels (Belgium); De Palma, Diego [Ospedale di Circolo, Nuclear Medicine, Varese (Italy); Roca, Isabel [Vall d' Hebron Hospital, Nuclear Medicine, Barcelona (Spain); Ham, Hamphrey [University Hospital, Department Nuclear Medicine, Ghent (Belgium)

2008-03-15

The goal of this study was to evaluate the inter-observer reproducibility in reporting on renal drainage obtained during 99mTc MAG3 renography in children, when already processed data are offered to the observers. Because web site facilities were used for communication, 57 observers from five continents participated in the study. Twenty-three renograms, including furosemide stimulation and posterect postmicturition views, covering various patterns of drainage, were submitted to the observers. Images, curves and quantitative parameters were provided. Good or almost good drainage, partial drainage and poor or no drainage were the three possible responses for each kidney. An important bias was observed among the observers, some of them more systematically reporting the drainage as being good, while others had a general tendency to consider the drainage as poor. This resulted in rather poor inter-observer reproducibility, as for more than half of the kidneys, less than 80% of the observers agreed on one of the three responses. Analysis of the individual cases identified some obvious causes of discrepancy: the absence of a clear limit between partial and good or almost good drainage, the fact of including or neglecting the effect of micturition and change of patient's position, the underestimation of drainage in the case of a flat renographic curve, and the difficulties of interpretation in the case of a small, not well functioning kidney. There is an urgent need for better standardisation in estimating the quality of drainage. (orig.)
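
The reproducibility criterion used in this abstract (the share of observers agreeing on one of the three responses, per kidney, compared against 80%) can be sketched as follows. The function names and the example responses are hypothetical; this illustrates the arithmetic and is not the authors' analysis.

```python
from collections import Counter


def modal_agreement(responses):
    """Share of observers who chose the most frequent response for one kidney."""
    counts = Counter(responses)
    return counts.most_common(1)[0][1] / len(responses)


def share_below_threshold(kidneys, threshold=0.8):
    """Fraction of kidneys whose modal agreement falls below the threshold."""
    low = [modal_agreement(responses) < threshold for responses in kidneys]
    return sum(low) / len(kidneys)


# Hypothetical responses from 5 observers for 3 kidneys (illustration only).
kidneys = [
    ["good", "good", "good", "good", "good"],
    ["good", "good", "partial", "partial", "poor"],
    ["poor", "poor", "poor", "poor", "partial"],
]
print([modal_agreement(k) for k in kidneys])
print(share_below_threshold(kidneys))
```

In this made-up example only the second kidney falls below the 80% mark, since the third sits exactly at the threshold.
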
Student Precision and Reliability of the Team Sport Assessment in ...

African Journals Online (AJOL)

TSAP) and formative assessment of invasion sport. The specific objectives were to determine the degree of agreement among expert observers, inter-observer reliability (internal consistency), and intra observer reliability (temporal reliability).
BurnCase 3D software validation study: Burn size measurement accuracy and inter-rater reliability.

Science.gov (United States)

Parvizi, Daryousch; Giretzlehner, Michael; Wurzer, Paul; Klein, Limor Dinur; Shoham, Yaron; Bohanon, Fredrick J; Haller, Herbert L; Tuca, Alexandru; Branski, Ludwik K; Lumenta, David B; Herndon, David N; Kamolz, Lars-P

2016-03-01

The aim of this study was to compare the accuracy of burn size estimation using the computer-assisted software BurnCase 3D (RISC Software GmbH, Hagenberg, Austria) with that using a 2D scan, considered to be the actual burn size. Thirty artificial burn areas were pre planned and prepared on three mannequins (one child, one female, and one male). Five trained physicians (raters) were asked to assess the size of all wound areas using BurnCase 3D software. The results were then compared with the real wound areas, as determined by 2D planimetry imaging. To examine inter-rater reliability, we performed an intraclass correlation analysis with a 95% confidence interval. The mean wound area estimations of the five raters using BurnCase 3D were in total 20.7±0.9% for the child, 27.2±1.5% for the female and 16.5±0.1% for the male mannequin. Our analysis showed relative overestimations of 0.4%, 2.8% and 1.5% for the child, female and male mannequins respectively, compared to the 2D scan. The intraclass correlation between the single raters for mean percentage of the artificial burn areas was 98.6%. There was also a high intraclass correlation between the single raters and the 2D Scan visible. BurnCase 3D is a valid and reliable tool for the determination of total body surface area burned in standard models. Further clinical studies including different pediatric and overweight adult mannequins are warranted. Copyright © 2016 Elsevier Ltd and ISBI. All rights reserved.
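
An intraclass correlation between raters, as used in this abstract, can be computed from a two-way analysis of variance of a subjects-by-raters table. The abstract does not state which ICC form was used; the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single rater), and the ratings are illustrative values, not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(ratings)      # number of subjects (here: burn areas)
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative ratings: 6 subjects scored by 4 raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(ratings), 2))
```

Because this form penalises systematic offsets between raters, a rater who consistently scores lower than the others pulls the coefficient down even when the rank order of subjects is preserved.
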
RELIABILITY AND VALIDITY OF SUBJECTIVE ASSESSMENT OF LUMBAR LORDOSIS IN CONVENTIONAL RADIOGRAPHY.

Science.gov (United States)

Ruhinda, E; Byanyima, R K; Mugerwa, H

2014-10-01

Reliability and validity studies of different lumbar curvature analysis and measurement techniques have been documented; however, there is limited literature on the reliability and validity of subjective visual analysis. Radiological assessment of lumbar lordotic curve aids in early diagnosis of conditions even before neurologic changes set in. To ascertain the level of reliability and validity of subjective assessment of lumbar lordosis in conventional radiography. A blinded, repeated-measures diagnostic test was carried out on lumbar spine x-ray radiographs. Radiology Department at Joint Clinical Research Centre (JCRC), Mengo-Kampala-Uganda. Seventy (70) lateral lumbar x-ray films were used for this study and were obtained from the archive of JCRC radiology department at Butikiro house, Mengo-Kampala. Poor observer agreement, both inter- and intra-observer, with kappa values of 0.16 was found. Inter-observer agreement was poorer than intra-observer agreement. Kappa values significantly rose when the lumbar lordosis was clustered into four categories without grading each abnormality. The results confirm that subjective assessment of lumbar lordosis has low reliability and validity. Film quality has limited influence on the observer reliability. This study further shows that fewer scale categories of lordosis abnormalities produce better observer reliability.
The Outdoor MEDIA DOT: The development and inter-rater reliability of a tool designed to measure food and beverage outlets and outdoor advertising.

Science.gov (United States)

Poulos, Natalie S; Pasch, Keryn E

2015-07-01

Few studies of the food environment have collected primary data, and even fewer have reported reliability of the tool used. This study focused on the development of an innovative electronic data collection tool used to document outdoor food and beverage (FB) advertising and establishments near 43 middle and high schools in the Outdoor MEDIA Study. Tool development used GIS-based mapping, an electronic data collection form on handheld devices, and an easily adaptable interface to efficiently collect primary data within the food environment. For the reliability study, two teams of data collectors documented all FB advertising and establishments within one half-mile of six middle schools. Inter-rater reliability was calculated overall and by advertisement or establishment category using percent agreement. A total of 824 items were documented: advertisements (n=233), establishment advertisements (n=499), and establishments (n=92) (range=8-229 per school). Overall inter-rater reliability of the developed tool ranged from 69-89% for advertisements and establishments. Results suggest that the developed tool is highly reliable and effective for documenting the outdoor FB environment. Copyright © 2015 Elsevier Ltd. All rights reserved.
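Percent agreement, the statistic named in the abstract above, is simply the share of items on which two raters or teams recorded the same code. The sketch below is a minimal illustration and assumes the two teams' records have already been matched item by item; the abstract does not describe how that matching was done. The function name and the codes are hypothetical.

```python
def percent_agreement(team_a, team_b):
    """Percentage of matched items on which two teams recorded the same code."""
    if len(team_a) != len(team_b) or not team_a:
        raise ValueError("both teams must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(team_a, team_b))
    return 100.0 * matches / len(team_a)


# Hypothetical codes for eight audited items (not data from the study).
team_a = ["ad", "ad", "outlet", "ad", "outlet", "ad", "ad", "outlet"]
team_b = ["ad", "ad", "outlet", "outlet", "outlet", "ad", "ad", "ad"]
print(percent_agreement(team_a, team_b))
```

Percent agreement does not correct for agreement expected by chance, which is one reason many of the other records in this listing report kappa or ICC instead.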

Lower limb spasticity assessment using an inertial sensor: a reliability study

International Nuclear Information System (INIS)

Sterpi, I; Colombo, R; Caroli, A; Meazza, E; Maggioni, G; Pistarini, C

2013-01-01

Spasticity is a common motor impairment in patients with neurological disorders that can prevent functional recovery after rehabilitation. In the clinical setting, its assessment is carried out using standardized clinical scales. The aim of this study was to verify the applicability of inertial sensors for an objective measurement of quadriceps spasticity and evaluate its test–retest and inter-rater reliability during the implementation of the Wartenberg pendulum test. Ten healthy subjects and 11 patients in vegetative state with severe brain damage were enrolled in this study. Subjects were evaluated three times on three consecutive days. The test–retest reliability of measurement was assessed in the first two days. The third day was devoted to inter-rater reliability assessment. In addition, the lower limb muscle tone was bilaterally evaluated at the knee joint by the modified Ashworth scale. The factorial ANOVA analysis showed that the implemented method allowed us to discriminate between healthy and pathological conditions. The fairly low SEM and high ICC values obtained for the pendulum parameters indicated a good test–retest and inter-rater reliability of measurement. This study shows that an inertial sensor can be reliably used to characterize leg kinematics during the Wartenberg pendulum test and provide quantitative evaluation of quadriceps spasticity. (paper)
Validity and inter-observer reliability of subjective hand-arm vibration assessments

NARCIS (Netherlands)

Coenen, P.; Formanoy, M.; Douwes, M.; Bosch, T.; Kraker, H. de

2014-01-01

Exposure to mechanical vibrations at work (e.g., due to handling powered tools) is a potential occupational risk as it may cause upper extremity complaints. However, reliable and valid assessment methods for vibration exposure at work are lacking. Measuring hand-arm vibration objectively is often
Study of validity and intra and inter-observer reliability of modified-modified Schöber test in subjects with low-back pain

Directory of Open Access Journals (Sweden)

Christiane de Souza Guerino Macedo

2009-09-01

Full Text Available In patients with low-back pain, the lumbar spine range of motion (ROM) is often measured by the modified version of the modified Schöber test (MMST), but its psychometric properties have not been ascertained for clinical use. The purpose here was to verify the validity and the intra- and inter-observer reliability of the MMST in subjects with low-back pain, and to compare the ROM measures obtained with those obtained by radiography, taken as the gold standard. The study involved 20 subjects with chronic low-back pain, of both sexes, employees at a university hospital. The MMST was applied twice by two examiners each. The measures obtained by the test and by radiography were compared using the Pearson correlation coefficient, which yielded r=0.14, i.e., a weak correlation. The intraclass correlation coefficient (ICC) of the MMST was 0.96 (95% CI 0.91-0.98) intra-observer and 0.93 (95% CI 0.84-0.97) inter-observer, indicating high reliability; the Bland-Altman test showed high intra- and inter-observer agreement, with values of -0.21 and -0.28, respectively. Although high intra- and inter-observer reliability was found when applying the modified version of the modified Schöber test, the test showed low validity for measuring lumbar spine ROM when compared with the gold standard.
A Topology Control Strategy with Reliability Assurance for Satellite Cluster Networks in Earth Observation.

Science.gov (United States)

Chen, Qing; Zhang, Jinxiu; Hu, Ze

2017-02-23

This article investigates the dynamic topology control problem of satellite cluster networks (SCNs) in Earth observation (EO) missions by applying a novel metric of stability for inter-satellite links (ISLs). The properties of the periodicity and predictability of satellites' relative position are involved in the link cost metric, which gives a selection criterion for choosing the most reliable data routing paths. Also, a cooperative work model with reliability is proposed for the situation of emergency EO missions. Based on the link cost metric and the proposed reliability model, a reliability assurance topology control algorithm and its corresponding dynamic topology control (RAT) strategy are established to maximize the stability of data transmission in the SCNs. The SCNs scenario is tested through some numeric simulations of the topology stability of average topology lifetime and average packet loss rate. Simulation results show that the proposed reliable strategy applied in SCNs significantly improves the data transmission performance and prolongs the average topology lifetime.
A Topology Control Strategy with Reliability Assurance for Satellite Cluster Networks in Earth Observation

Directory of Open Access Journals (Sweden)

Qing Chen

2017-02-01

Full Text Available This article investigates the dynamic topology control problem of satellite cluster networks (SCNs) in Earth observation (EO) missions by applying a novel metric of stability for inter-satellite links (ISLs). The properties of the periodicity and predictability of satellites’ relative position are involved in the link cost metric, which gives a selection criterion for choosing the most reliable data routing paths. Also, a cooperative work model with reliability is proposed for the situation of emergency EO missions. Based on the link cost metric and the proposed reliability model, a reliability assurance topology control algorithm and its corresponding dynamic topology control (RAT) strategy are established to maximize the stability of data transmission in the SCNs. The SCNs scenario is tested through some numeric simulations of the topology stability of average topology lifetime and average packet loss rate. Simulation results show that the proposed reliable strategy applied in SCNs significantly improves the data transmission performance and prolongs the average topology lifetime.
MRI assessment of knee osteoarthritis: Knee Osteoarthritis Scoring System (KOSS) - inter-observer and intra-observer reproducibility of a compartment-based scoring system

International Nuclear Information System (INIS)

Kornaat, Peter R.; Ceulemans, Ruth Y.T.; Kroon, Herman M.; Bloem, Johan L.; Riyazi, Naghmeh; Kloppenburg, Margreet; Carter, Wayne O.; Woodworth, Thasia G.

2005-01-01

To develop a scoring system for quantifying osteoarthritic changes of the knee as identified by magnetic resonance (MR) imaging, and to determine its inter- and intra-observer reproducibility, in order to monitor medical therapy in research studies. Two independent observers evaluated 25 consecutive MR examinations of the knee in patients with previously defined clinical symptoms and radiological signs of osteoarthritis. We acquired on a 1.5 T system: coronal and sagittal proton density- and T2-weighted dual spin echo (SE) images, sagittal three-dimensional T1-weighted gradient echo (GE) images with fat suppression, and axial dual turbo SE images with fat suppression. Images were scored for the presence of cartilaginous lesions, osteophytes, subchondral cysts, bone marrow edema, and for meniscal abnormalities. Presence and size of effusion, synovitis and Baker's cyst were recorded. All parameters were ranked on a previously defined, semiquantitative scale, reflecting increasing severity of findings. Kappa, weighted kappa and intraclass correlation coefficient (ICC) were used to determine inter- and intra-observer variability. Inter-observer reproducibility was good (ICC value 0.77). Inter- and intra-observer reproducibility for individual parameters was good to very good (inter-observer ICC value 0.63-0.91; intra-observer ICC value 0.76-0.96). The presented comprehensive MR scoring system for osteoarthritic changes of the knee has a good to very good inter-observer and intra-observer reproducibility. Thus the score form with its definitions can be used for standardized assessment of osteoarthritic changes to monitor medical therapy in research studies. (orig.)
Measuring the quality of life in mild to very severe dementia: testing the inter-rater and intra-rater reliability of the German version of the QUALIDEM.

Science.gov (United States)

Dichter, Martin Nikolaus; Schwab, Christian G G; Meyer, Gabriele; Bartholomeyczik, Sabine; Dortmann, Olga; Halek, Margareta

2014-05-01

Quality of life (Qol) is an increasingly used outcome measure in dementia research. The QUALIDEM is a dementia-specific and proxy-rated Qol instrument. We aimed to determine the inter-rater and intra-rater reliability in residents with dementia in German nursing homes. The QUALIDEM consists of nine subscales that were applied to a sample of 108 people with mild to severe dementia and six consecutive subscales that were applied to a sample of 53 people with very severe dementia. The proxy raters were 49 registered nurses and nursing assistants. Inter-rater and intra-rater reliability scores were calculated on the subscale and item level. None of the QUALIDEM subscales showed strong inter-rater reliability based on the single-measure Intra-Class Correlation Coefficient (ICC) for absolute agreement ≥ 0.70. Based on the average-measure ICC for four raters, eight subscales for people with mild to severe dementia (care relationship, positive affect, negative affect, restless tense behavior, social relations, social isolation, feeling at home and having something to do) and five subscales for very severe dementia (care relationship, negative affect, restless tense behavior, social relations and social isolation) yielded a strong inter-rater agreement (ICC: 0.72-0.86). All of the QUALIDEM subscales, regardless of dementia severity, showed strong intra-rater agreement. The ICC values ranged between 0.70 and 0.79 for people with mild to severe dementia and between 0.75 and 0.87 for people with very severe dementia. This study demonstrated insufficient inter-rater reliability and sufficient intra-rater reliability for all subscales of both versions of the German QUALIDEM. The degree of inter-rater reliability can be improved by collaborative Qol rating by more than one nurse. The development of a measurement manual with accurate item definitions and a standardized education program for proxy raters is recommended.
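The contrast above between single-measure and average-measure ICCs can be illustrated with the Spearman-Brown step-up formula, which relates the reliability of one rater's score to the reliability of the mean of k raters' scores. The sketch below is illustrative only: the function name and the input value 0.40 are hypothetical, and the study's own average-measure ICCs were presumably estimated directly from its data rather than derived this way.

```python
def average_measure_icc(single_icc, n_raters):
    """Spearman-Brown step-up: reliability of the mean of `n_raters` ratings,
    given the reliability of a single rating."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    return n_raters * single_icc / (1.0 + (n_raters - 1) * single_icc)


# Hypothetical single-measure ICC of 0.40, averaged over four raters.
print(round(average_measure_icc(0.40, 4), 2))
```

With these made-up numbers, a single-rater reliability well below 0.70 corresponds to a four-rater average above it, which is consistent with the abstract's suggestion that collaborative rating by more than one nurse improves reliability.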
Inter-observer agreement according to three methods of evaluating mammographic density and parenchymal pattern in a case control study

DEFF Research Database (Denmark)

Winkel, Rikke Rass; von Euler-Chelpin, My Catarina; Nielsen, Mads

2015-01-01

BACKGROUND: Mammographic breast density and parenchymal patterns are well-established risk factors for breast cancer. We aimed to report inter-observer agreement on three different subjective ways of assessing mammographic density and parenchymal pattern, and secondarily to examine what potential... , Tabár's PIV and PV and the upper two quartiles (within density range) of PMD. The relative risk of breast cancer was estimated using logistic regression to calculate odds ratios (ORs) adjusted for age, which were compared between the two readers. RESULTS: Substantial inter-observer agreement was seen... , respectively. Inter-reader variability showed different impact on the relative risk of breast cancer estimated by the two readers on a multiple-category scale, however, not on a high/low-risk scale. Tabár's pattern IV demonstrated the highest ORs of all density patterns investigated. CONCLUSIONS: Our study...
Magnetic resonance imaging for prostate bed radiotherapy planning: An inter- and intra-observer variability study

International Nuclear Information System (INIS)

Barkati, Maroie; Simard, Dany; Taussky, Daniel; Delouya, Guiula

2016-01-01

We assessed the inter- and intra-observer variability in contouring the prostate bed for radiation therapy planning using MRI compared with computed tomography (CT). We selected 15 patients with prior radical prostatectomy. All had CT and MRI simulation for planning purposes. Image fusions were done between CT and MRI. Three radiation oncologists with several years of experience in treating prostate cancer contoured the prostate bed first on CT and then on MRI. Before contouring, each radiation oncologist had to review the Radiation Therapy Oncology Group guidelines for postoperative external beam radiotherapy. The agreement between volumes was calculated using the Dice similarity coefficient (DSC). Analysis was done using the Matlab software. The DSC was compared using non-parametric statistical tests. Contouring on CT alone showed a statistically significant (P = 0.001) higher similarity between observers with a mean DSC of 0.76 (standard deviation ± 0.05) compared with contouring on MRI with a mean of 0.66 (standard deviation ± 0.05). Mean intra-observer variability between CT and MRI was 0.68, 0.75 and 0.78 for the three observers. The clinical target volume was 19 - 74% larger on CT than on MRI. The intra-observer difference in clinical target volume between CT and MRI was statistically significant in two observers and non-significant in the third one (P = 0.09). We found less inter-observer variability when contouring on CT than on MRI. Radiation Therapy Oncology Group contouring guidelines are based on anatomical landmarks readily visible on CT. These landmarks are more inter-observer dependent on MRI. Therefore, present contouring guidelines might not be applicable to MRI planning.
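The Dice similarity coefficient used above is twice the size of the overlap between two contoured volumes divided by the sum of their sizes, so 1 means identical contours and 0 means no overlap. The study performed its analysis in Matlab; the Python sketch below is only an illustration of the formula, representing each contour as a set of voxel coordinates. The function name and the example contours are hypothetical.

```python
def dice_coefficient(contour_a, contour_b):
    """Dice similarity coefficient between two collections of voxel coordinates."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty contours agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical 2D contours: two 10 x 10 voxel squares offset by 2 voxels.
observer_1 = {(x, y) for x in range(0, 10) for y in range(0, 10)}
observer_2 = {(x, y) for x in range(2, 12) for y in range(0, 10)}
print(dice_coefficient(observer_1, observer_2))
```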
A study of the reliability of the Nociception Coma Scale.

Science.gov (United States)

Riganello, F; Cortese, M D; Arcuri, F; Candelieri, A; Guglielmino, F; Dolce, G; Sannita, W G; Schnakers, C

2015-04-01

In this study, we investigated the reliability of the Nociception Coma Scale which has recently been developed to assess nociception in non-communicative, severely brain-injured patients. Prospective cross-sequential study. Semi-intensive care unit and long-term brain injury care. Forty-four patients diagnosed as being in a vegetative state (n=26) or in a minimally conscious state (n=18). Patients were assessed by two experts (rater A and rater B) on two consecutive weeks to measure inter-rater agreement and test-retest reliability. Total scores and subscores of the Nociception Coma Scale. We performed a total of 176 assessments. The inter-rater agreement was moderate for the total scores (k = 0.57) and fair to substantial for the subscores (0.33 ≤ k ≤ 0.62) on week 2. The test-retest reliability was substantial for the total scores (k = 0.66) and moderate to almost perfect for the subscores (0.53 ≤ k ≤ 0.96) for rater A. The inter-rater agreement was weaker on week 1, whereas the test-retest reliability was lower for the least experienced rater (rater B). This study provides further evidence of the psychometric qualities of the Nociception Coma Scale. Future studies should assess the impact of practical experience and background on administration and scoring of the scale. © The Author(s) 2014.
Peri-acetabular radiolucent lines: inter- and intra-observer agreement on post-operative radiographs

OpenAIRE

Kneif, D.; Downing, M.; Ashcroft, G. P.; Gibson, P.; Knight, D.; Ledingham, W.; Hutchison, J.

2005-01-01

Peri-acetabular radiolucent lines (RLLs) seen on “early” post-operative radiographs have been identified as a potential predictor of long-term implant performance. This study examines the inter- and intra-observer variation encountered when assessing such radiographs. Four consultant orthopaedic surgeons assessed the presence, extent and width of RLLs in 220 radiographs performed on 50 patients taken one to two weeks, six weeks, six months and one year following surgery. Inter-observer agreem...
The relative reliability of actively participating and passively observing raters in a simulation-based assessment for selection to specialty training in anaesthesia.

Science.gov (United States)

Roberts, M J; Gale, T C E; Sice, P J A; Anderson, I R

2013-06-01

Selection to specialty training is a high-stakes assessment demanding valuable consultant time. In one initial entry level and two higher level anaesthesia selection centres, we investigated the feasibility of using staff participating in simulation scenarios, rather than observing consultants, to rate candidate performance. We compared participant and observer scores using four different outcomes: inter-rater reliability; score distributions; correlation of candidate rankings; and percentage of candidates whose selection might be affected by substituting participants' for observers' ratings. Inter-rater reliability between observers was good (correlation coefficient 0.73-0.96) but lower between participants (correlation coefficient 0.39-0.92), particularly at higher level where participants also rated candidates more favourably than did observers. Station rank orderings were strongly correlated between the rater groups at entry level (rho 0.81, p training posts available. We conclude that using participating raters is feasible at initial entry level only. Anaesthesia © 2013 The Association of Anaesthetists of Great Britain and Ireland.
Inter- and intra-rater reliability of 3D kinematics during maximum mouth opening of asymptomatic subjects.

Science.gov (United States)

Calixtre, Leticia Bojikian; Nakagawa, Theresa Helissa; Alburquerque-Sendín, Francisco; da Silva Grüninger, Bruno Leonardo; de Sena Rosa, Lianna Ramalho; Oliveira, Ana Beatriz

2017-11-07

Previous studies evaluated 3D human jaw movements using kinematic analysis systems during mouth opening, but information on the reliability of such measurements is still scarce. The purpose of this study was to analyze within- and between-session reliabilities, inter-rater reliability, standard error of measurement (SEM), minimum detectable change (MDC) and consistency of agreement across raters and sessions of 3D kinematic variables during maximum mouth opening (MMO). Thirty-six asymptomatic subjects from both genders were evaluated on two different days, five to seven days apart. Subjects performed three MMO movements while kinematic data were collected. Intraclass correlation coefficient (ICC), SEM and MDC were calculated for all variables, and Bland-Altman plots were constructed. Jaw radius and width were the most reproducible variables (ICC>0.81) and demonstrated minor error. Incisor displacement during MMO and angular movements in the sagittal plane presented good reliability (ICC from 0.61 to 0.8) and small errors and, consequently, could be used in future studies with the same methodology and population. The variables with smaller amplitudes (condylar translations during mouth opening and closing and mandibular movements on the frontal and transversal planes) were less reliable (ICCmandibular movements in the frontal and transversal planes. Copyright © 2017 Elsevier Ltd. All rights reserved.
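The abstract above reports SEM and MDC alongside ICC without giving the formulas. One widely used convention derives the standard error of measurement from the between-subject standard deviation and the ICC, and the minimum detectable change at 95% confidence from the SEM. The sketch below assumes that convention; the study may have computed these quantities differently, and the input values are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if icc > 1.0:
        raise ValueError("ICC cannot exceed 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimum detectable change at 95% confidence, assuming
    MDC95 = 1.96 * sqrt(2) * SEM (two measurements, each with error SEM)."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical values: SD of 5.0 mm across subjects and an ICC of 0.84.
sem = sem_from_icc(5.0, 0.84)
print(round(sem, 2), round(mdc95(sem), 2))
```

Under this convention, a variable with a small amplitude and a moderate ICC can have an MDC that is large relative to the movement itself, which fits the abstract's remark that the smaller-amplitude variables were less reliable.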
The impact of revised DSM-5 criteria on the relative distribution and inter-rater reliability of eating disorder diagnoses in a residential treatment setting.

Science.gov (United States)

Thomas, Jennifer J; Eddy, Kamryn T; Murray, Helen B; Tromp, Marilou D P; Hartmann, Andrea S; Stone, Melissa T; Levendusky, Philip G; Becker, Anne E

2015-09-30

This study evaluated the relative distribution and inter-rater reliability of revised DSM-5 criteria for eating disorders in a residential treatment program. Consecutive adolescent and young adult females (N=150) admitted to a residential eating disorder treatment facility were assigned both DSM-IV and DSM-5 diagnoses by a clinician (n=14) via routine clinical interview and a research assessor (n=4) via structured interview. We compared the frequency of diagnostic assignments under each taxonomy and by type of assessor. We evaluated concordance between clinician and researcher assignment through inter-rater reliability kappa and percent agreement. Significantly fewer patients received either clinician or researcher diagnoses of a residual eating disorder under DSM-5 (clinician-12.0%; researcher-31.3%) versus DSM-IV (clinician-28.7%; researcher-59.3%), with the majority of reassigned DSM-IV residual cases reclassified as DSM-5 anorexia nervosa. Researcher and clinician diagnoses showed moderate inter-rater reliability under DSM-IV (κ=.48) and DSM-5 (κ=.57), though agreement for specific DSM-5 other specified feeding or eating disorder (OSFED) presentations was poor (κ=.05). DSM-5 revisions were associated with significantly less frequent residual eating disorder diagnoses, but not with reduced inter-rater reliability. Findings support specific dimensions of clinical utility for revised DSM-5 criteria for eating disorders. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
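Cohen's kappa, reported above next to percent agreement, rescales observed agreement by the agreement two raters would reach by chance given how often each uses each category. The sketch below is a minimal illustration for two raters and nominal categories; the function name and the diagnostic labels assigned to the ten cases are hypothetical and are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal categories to the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must rate the same non-empty set of cases")
    # Observed agreement: proportion of cases with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical diagnoses for ten cases (not data from the study).
clinician = ["AN"] * 5 + ["BN"] * 3 + ["OSFED"] * 2
researcher = ["AN", "AN", "AN", "AN", "BN", "BN", "BN", "OSFED", "OSFED", "AN"]
print(round(cohens_kappa(clinician, researcher), 2))
```

In this made-up example the raters agree on 7 of 10 cases, yet kappa is noticeably lower than 0.70 because part of that agreement is expected by chance.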
Validity and reliability of using photography for measuring knee range of motion: a methodological study

Directory of Open Access Journals (Sweden)

Adie Sam

2011-04-01

Full Text Available Abstract. Background: The clinimetric properties of knee goniometry are essential to appreciate in light of its extensive use in the orthopaedic and rehabilitative communities. Intra-observer reliability is thought to be satisfactory, but the validity and inter-rater reliability of knee goniometry often demonstrate unacceptable levels of variation. This study tests the validity and reliability of measuring knee range of motion using goniometry and photographic records. Methods: Design: Methodology study assessing the validity and reliability of one method ('Marker Method'), which uses a skin marker over the greater trochanter, and another method ('Line of Femur Method'), which requires estimation of the line of femur. Setting: Radiology and orthopaedic departments of two teaching hospitals. Participants: 31 volunteers (13 arthritic and 18 healthy subjects). Knee range of motion was measured radiographically and photographically using a goniometer. Three assessors were assessed for reliability and validity. Main outcomes: Agreement between methods and within raters was assessed using concordance correlation coefficients (CCCs). Agreement between raters was assessed using intra-class correlation coefficients (ICCs). 95% limits of agreement for the mean difference for all paired comparisons were computed. Results: Validity (referenced to radiographs): Each method for all 3 raters yielded very high CCCs for flexion (0.975 to 0.988), and moderate to substantial CCCs for extension angles (0.478 to 0.678). The mean differences and 95% limits of agreement were narrower for flexion than they were for extension. Intra-rater reliability: For flexion and extension, very high CCCs were attained for all 3 raters for both methods, with slightly greater CCCs seen for flexion (CCCs varied from 0.981 to 0.998). Inter-rater reliability: For both methods, very high ICCs (min to max: 0.891 to 0.995) were obtained for flexion and extension. Slightly higher coefficients were obtained
Computerized back postural assessment in physiotherapy practice: Intra-rater and inter-rater reliability of the MIDAS system.

Science.gov (United States)

McAlpine, R T; Bettany-Saltikov, J A; Warren, J G

2009-01-01

Assessment of spinal posture during physiotherapy practice is routine, yet few objective measures exist to this end. The Middlesbrough Integrated Digital Assessment System (MIDAS) is a low cost portable system able to record 3D information on posture. The purpose of this study was to assess both the intra-rater and inter-rater reliability of the MIDAS system. Twenty-five healthy subjects were recruited. A repeated measures design was used to record fifteen pre-palpated landmarks on the back of each subject. To limit the sources of variability, the principal researcher palpated the landmarks for each subject. Each of three raters took two measurements on each subject in a standardized upright posture. X (medio-lateral), Y (antero-posterior) and Z (height) landmark positions were recorded via a computer interface. Both intra-rater agreement (mean ICCs - rater 1 r=0.970, rater 2 r=0.965 and rater 3 r=0.965, pMIDAS demonstrated both high inter-rater and intra-rater reliability and provides an objective method for the assessment of posture in physiotherapy practice.
Intra-rater and inter-rater reliability of the standardized ultrasound protocol for assessing subacromial structures

DEFF Research Database (Denmark)

Hougs Kjær, Birgitte; Ellegaard, Karen; Wieland, Ina

2017-01-01

BACKGROUND: US-examinations related to shoulder impingement (SI) often vary due to methodological differences, examiner positions, transducers, and recording parameters. Reliable US protocols for examination of different structures related to shoulder impingement are therefore needed. OBJECTIVES...... of the supraspinatus tendon (SUPRA) and subacromial subdeltoid (SASD) bursa in two imaging positions, and the acromial humeral distance (AHD) in one position. Additionally, agreement on dynamic impingement (DI) examination was performed. The intra- and inter-rater reliability was carried out on the same day...
Inter comparison of REPAS and APSRA methodologies for passive system reliability analysis

International Nuclear Information System (INIS)

Solanki, R.B.; Krishnamurthy, P.R.; Singh, Suneet; Varde, P.V.; Verma, A.K.

2014-01-01

The increasing use of passive systems in the innovative nuclear reactors puts demand on the estimation of the reliability assessment of these passive systems. The passive systems operate on the driving forces such as natural circulation, gravity, internal stored energy etc. which are moderately weaker than that of active components. Hence, phenomenological failures (virtual components) are equally important as that of equipment failures (real components) in the evaluation of passive systems reliability. The contribution of the mechanical components to the passive system reliability can be evaluated in a classical way using the available component reliability database and well known methods. On the other hand, different methods are required to evaluate the reliability of processes like thermohydraulics due to lack of adequate failure data. The research is ongoing worldwide on the reliability assessment of the passive systems and their integration into PSA, however consensus is not reached. Two of the most widely used methods are Reliability Evaluation of Passive Systems (REPAS) and Assessment of Passive System Reliability (APSRA). Both these methods characterize the uncertainties involved in the design and process parameters governing the function of the passive system. However, these methods differ in the quantification of passive system reliability. Inter comparison among different available methods provides useful insights into the strength and weakness of different methods. This paper highlights the results of the thermal hydraulic analysis of a typical passive isolation condenser system carried out using RELAP mode 3.2 computer code applying REPAS and APSRA methodologies. The failure surface is established for the passive system under consideration and system reliability has also been evaluated using these methods. Challenges involved in passive system reliabilities are identified, which require further attention in order to overcome the shortcomings of these
Examining Design and Inter-Rater Reliability of a Rubric Measuring Research Quality across Multiple Disciplines

Directory of Open Access Journals (Sweden)

Marilee J. Bresciani

2009-05-01

Full Text Available The paper presents a rubric to help evaluate the quality of research projects. The rubric was applied in a competition across a variety of disciplines during a two-day research symposium at one institution in the southwest region of the United States of America. It was collaboratively designed by a faculty committee at the institution and was administered to 204 undergraduate, master, and doctoral oral presentations by approximately 167 different evaluators. No training or norming of the rubric was given to 147 of the evaluators prior to the competition. The findings of the inter-rater reliability analysis reveal substantial agreement among the judges, which contradicts literature describing the fact that formal norming must occur prior to seeing substantial levels of inter-rater reliability. By presenting the rubric along with the methodology used in its design and evaluation, it is hoped that others will find this to be a useful tool for evaluating documents and for teaching research methods.
Inter-assessor reliability of practice based biomechanical assessment of the foot and ankle

Directory of Open Access Journals (Sweden)

Jarvis Hannah L

2012-06-01

Full Text Available Abstract. Background: There is no consensus on which protocols should be used to assess foot and lower limb biomechanics in clinical practice. The reliability of many assessments has been questioned by previous research. The aim of this investigation was to (i) identify (through consensus) what biomechanical examinations are used in clinical practice and (ii) evaluate the inter-assessor reliability of some of these examinations. Methods: Part 1: Using a modified Delphi technique, 12 podiatrists derived consensus on the biomechanical examinations used in clinical practice. Part 2: Eleven podiatrists assessed 6 participants using a subset of the assessment protocol derived in Part 1. Examinations were compared between assessors. Results: Clinicians choose to estimate rather than quantitatively measure foot position and motion. Poor inter-assessor reliability was recorded for all examinations. Intra-class correlation coefficient (ICC) values for relaxed calcaneal stance position were less than 0.23 and were less than 0.14 for neutral calcaneal stance position. For the examination of ankle joint dorsiflexion, ICC values suggest moderate reliability (less than 0.61). The results of a random effects ANOVA highlight that participant (up to 5.7°), assessor (up to 5.8°) and random (up to 5.7°) error all contribute to the total error (up to 9.5° for relaxed calcaneal stance position, up to 10.7° for the examination of ankle joint dorsiflexion). Fleiss' kappa values for categorisation of first ray position and mobility were less than 0.05 and for limb length assessment less than 0.02, indicating slight agreement. Conclusion: Static biomechanical assessment of the foot, leg and lower limb is an important protocol in clinical practice, but the key examinations used to make inferences about dynamic foot function and to determine orthotic prescription are unreliable.

Effect of knee angle on neuromuscular assessment of plantar flexor muscles: A reliability study

Science.gov (United States)

Cornu, Christophe; Jubeau, Marc

2018-01-01

Introduction: This study aimed to determine the intra- and inter-session reliability of neuromuscular assessment of plantar flexor (PF) muscles at three knee angles. Methods: Twelve young adults were tested for three knee angles (90°, 30° and 0°) and at three time points separated by 1 hour (intra-session) and 7 days (inter-session). Electrical (H reflex, M wave) and mechanical (evoked and maximal voluntary torque, activation level) parameters were measured on the PF muscles. Intraclass correlation coefficients (ICC) and coefficients of variation were calculated to determine intra- and inter-session reliability. Results: The mechanical measurements presented excellent (ICC>0.75) intra- and inter-session reliabilities regardless of the knee angle considered. The reliability of electrical measurements was better for the 90° knee angle compared to the 0° and 30° angles. Conclusions: Changes in the knee angle may influence the reliability of neuromuscular assessments, which indicates the importance of considering the knee angle to collect consistent outcomes on the PF muscles. PMID: 29596480
Evaluating the reliability of an injury prevention screening tool: Test-retest study.

Science.gov (United States)

Gittelman, Michael A; Kincaid, Madeline; Denny, Sarah; Wervey Arnold, Melissa; FitzGerald, Michael; Carle, Adam C; Mara, Constance A

2016-10-01

A standardized injury prevention (IP) screening tool can identify family risks and allow pediatricians to address behaviors. To assess behavior changes on later screens, the tool must be reliable for an individual and ideally between household members. Little research has examined the reliability of safety screening tool questions. This study utilized test-retest reliability of parent responses on an existing IP questionnaire and also compared responses between household parents. Investigators recruited parents of children 0 to 1 year of age during admission to a tertiary care children's hospital. When both parents were present, one was chosen as the "primary" respondent. Primary respondents completed the 30-question IP screening tool after consent, and they were re-screened approximately 4 hours later to test individual reliability. The "second" parent, when present, only completed the tool once. All participants received a 10-dollar gift card. Cohen's Kappa was used to estimate test-retest reliability and inter-rater agreement. Standard test-retest criteria consider Kappa values: 0.0 to 0.40 poor to fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, and 0.81 to 1.00 as almost perfect reliability. One hundred five families participated, with five lost to follow-up. Thirty-two (30.5%) parent dyads completed the tool. Primary respondents were generally mothers (88%) and Caucasian (72%). Test-retest of the primary respondents showed their responses to be almost perfect; average 0.82 (SD = 0.13, range 0.49-1.00). Seventeen questions had almost perfect test-retest reliability and 11 had substantial reliability. However, inter-rater agreement between household members for 12 objective questions showed little agreement between responses; inter-rater agreement averaged 0.35 (SD = 0.34, range -0.19-1.00). One question had almost perfect inter-rater agreement and two had substantial inter-rater agreement. The IP screening tool used by a single individual had excellent
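The kappa bands quoted in this abstract can be made concrete with a small Python sketch. It computes an unweighted Cohen's kappa for two sets of answers to the same questions and maps the value to the bands listed above. The function names and the sample answers are hypothetical, and placing negative kappa values in the lowest band is an assumption, since the abstract only defines bands from 0.0 upward.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical answers."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty answer lists of equal length")
    n = len(rater_a)
    # Proportion of items on which the two sets of answers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single category throughout
    return (observed - expected) / (1.0 - expected)


def kappa_band(kappa):
    """Label a kappa value with the bands quoted in the abstract.

    Assumption: values below 0.0 are also reported as "poor to fair".
    """
    if kappa <= 0.40:
        return "poor to fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical yes/no answers from two parents in one household.
parent_1 = ["yes", "yes", "no", "no"]
parent_2 = ["yes", "no", "no", "yes"]
print(cohen_kappa(parent_1, parent_2), kappa_band(0.82), kappa_band(0.35))
```

The same function serves for test-retest (one respondent, two occasions) and inter-rater (two respondents, one occasion) comparisons; only the pairing of the answer lists changes.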
MRI assessment of bone marrow in children with juvenile idiopathic arthritis: intra- and inter-observer variability

Energy Technology Data Exchange (ETDEWEB)

Tanturri de Horatio, Laura; Barbuti, Domenico; Toma, Paolo [Ospedale Pediatrico Bambino Gesu, Department of Radiology, Rome (Italy); Damasio, Maria Beatrice [Ospedale G. Gaslini, Department of Radiology, Genoa (Italy); Bracaglia, Claudia [Ospedale Pediatrico Bambino Gesu, Department of Paediatrics, Rome (Italy); Lambot-Juhan, Karen [Hopital Necker-Enfants Malades, Department of Radiology, Paris (France); Boavida, Peter [Great Ormond Street Hospital, Department of Radiology, London (United Kingdom); Ording Mueller, Lil-Sofie [University Hospital North Norway, Department of Radiology, Tromsoe (Norway); Malattia, Clara [Ospedale G. Gaslini, Department of Pediatrics, Genoa (Italy); Rava, Lucilla [Ospedale Pediatrico Bambino Gesu, Department of Epidemiology, Rome (Italy); Rosendahl, Karen [Great Ormond Street Hospital, Department of Radiology, London (United Kingdom); Haukeland University Hospital, Department of Pediatric Radiology, Bergen (Norway)

2012-06-15

Bone marrow oedema (BMO) is included in MRI-based scoring systems of disease activity in adults with rheumatoid arthritis. Similar systems in juvenile idiopathic arthritis (JIA) are lacking. To assess the reproducibility in a multi-centre setting of an MRI BMO scoring system in children with JIA. Seventy-six wrist MRIs were read twice, independently, by two experienced paediatric radiologists. BMO was defined as ill-defined lesions within the trabecular bone, returning high and low signal on T2- and T1-weighted images respectively, with or without contrast enhancement. BMO extension was scored for each of 14 bones at the wrist from 0 (none) to 3 (extensive). The intra-observer agreement was moderate to excellent, with weighted kappa ranging from 0.85 to 1.0 and 0.49 to 1.0 (readers 1 and 2 respectively), while the inter-observer agreement ranged from 0.41 to 0.79. The intra- and inter-observer intraclass correlation coefficients were excellent and satisfactory, respectively. The scoring system was reliable and may be used for grading bone marrow abnormality in JIA. The relatively large variability in aggregate scores, particularly between readers, underscores the need for thorough standardisation. (orig.)
MRI assessment of bone marrow in children with juvenile idiopathic arthritis: intra- and inter-observer variability

International Nuclear Information System (INIS)

Tanturri de Horatio, Laura; Barbuti, Domenico; Toma, Paolo; Damasio, Maria Beatrice; Bracaglia, Claudia; Lambot-Juhan, Karen; Boavida, Peter; Ording Mueller, Lil-Sofie; Malattia, Clara; Rava, Lucilla; Rosendahl, Karen

2012-01-01

Bone marrow oedema (BMO) is included in MRI-based scoring systems of disease activity in adults with rheumatoid arthritis. Similar systems in juvenile idiopathic arthritis (JIA) are lacking. To assess the reproducibility in a multi-centre setting of an MRI BMO scoring system in children with JIA. Seventy-six wrist MRIs were read twice, independently, by two experienced paediatric radiologists. BMO was defined as ill-defined lesions within the trabecular bone, returning high and low signal on T2- and T1-weighted images respectively, with or without contrast enhancement. BMO extension was scored for each of 14 bones at the wrist from 0 (none) to 3 (extensive). The intra-observer agreement was moderate to excellent, with weighted kappa ranging from 0.85 to 1.0 and 0.49 to 1.0 (readers 1 and 2 respectively), while the inter-observer agreement ranged from 0.41 to 0.79. The intra- and inter-observer intraclass correlation coefficients were excellent and satisfactory, respectively. The scoring system was reliable and may be used for grading bone marrow abnormality in JIA. The relatively large variability in aggregate scores, particularly between readers, underscores the need for thorough standardisation. (orig.)
Inter-observer variation in masked and unmasked images for quality evaluation of clinical radiographs

International Nuclear Information System (INIS)

Tingberg, A.; Eriksson, F.; Medin, J.; Besjakov, J.; Baarth, M.; Haakansson, M.; Sandborg, M.; Almen, A.; Lanhede, B.; Alm-Carlsson, G.; Mattsson, S.; Maansson, L. G.

2005-01-01

Purpose: To investigate the influence of masking on the inter-observer variation in image quality evaluation of clinical radiographs of chest and lumbar spine. Background: Inter-observer variation is a big problem in image quality evaluation since this variation is often much bigger than the variation in image quality between, for example, two radiographic systems. In this study, we have evaluated the effect of masking on the inter-observer variation. The idea of the masking was to force every observer to view exactly the same part of the image and to avoid the effect of the overall 'first impression' of the image. A discussion with a group of European expert radiologists before the study indicated that masking might be a good way to reduce the inter-observer variation. Methods: Five chest and five lumbar spine radiographs were collected together with detailed information regarding exposure conditions. The radiographs were digitised with a high-performance scanner and five different manipulations were performed, simulating five different exposure conditions. The contrast, noise and spatial resolution were manipulated by this method. The images were printed onto the film and the individual masks were produced for each film, showing only the parts of the images that were necessary for the image quality evaluation. The quality of the images was evaluated on ordinary viewing boxes by a large group of experienced radiologists. The images were examined with and without the masks with a set of image criteria (if fulfilled, 1 point; and not fulfilled, 0 point), and the mean score was calculated for each simulated exposure condition. Results: The results of this study indicate that - contrary to what was supposed - the inter-observer variation increased when the images were masked. In some cases, especially for chest, this increase was statistically significant. Conclusions: Based on the results of this study, image masking in studies of fulfilment of image criteria cannot
Harmonization process and reliability assessment of anthropometric measurements in the elderly EXERNET multi-centre study.

Directory of Open Access Journals (Sweden)

Alba Gómez-Cabello

BACKGROUND: The elderly EXERNET multi-centre study aims to collect normative anthropometric data for older, functionally independent adults living in Spain. PURPOSE: To describe the standardization process and reliability of the anthropometric measurements carried out in the pilot study and during the final workshop, examining both intra- and inter-rater errors for measurements. MATERIALS AND METHODS: A total of 98 elderly people from five different regions participated in the intra-rater error assessment, and 10 different seniors living in the city of Toledo (Spain) participated in the inter-rater assessment. We examined both intra- and inter-rater errors for heights and circumferences. RESULTS: For height, intra-rater technical errors of measurement (TEMs) were smaller than 0.25 cm. For circumferences and knee height, TEMs were smaller than 1 cm, except for waist circumference in the city of Cáceres. Reliability for heights and circumferences was greater than 98% in all cases. Inter-rater TEMs were 0.61 cm for height, 0.75 cm for knee-height and ranged between 2.70 and 3.09 cm for the circumferences measured. Inter-rater reliabilities for anthropometric measurements were always higher than 90%. CONCLUSION: The harmonization process, including the workshop and pilot study, guarantees the quality of the anthropometric measurements in the elderly EXERNET multi-centre study. High reliability and low TEM may be expected when assessing anthropometry in the elderly population.
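For readers unfamiliar with the technical error of measurement (TEM), the following Python sketch shows one common way to compute it for two repeated measurements per person, together with a percentage reliability coefficient of the form 1 - TEM²/SD². The abstract does not state which formulas the study used, so both are assumptions here, and the function names and height values are hypothetical.

```python
import math
from statistics import variance


def tem(first, second):
    """TEM for two repeated measurements per subject: sqrt(sum(d^2) / (2n))."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty measurement lists of equal length")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))


def reliability_percent(first, second):
    """Reliability as 100 * (1 - TEM^2 / SD^2), SD^2 being the variance of all measurements."""
    total_variance = variance(list(first) + list(second))
    if total_variance == 0:
        raise ValueError("all measurements are identical; reliability is undefined")
    return 100.0 * (1.0 - tem(first, second) ** 2 / total_variance)


# Hypothetical heights (cm) measured twice on three people.
trial_1 = [160.0, 170.0, 180.0]
trial_2 = [160.2, 169.8, 180.2]
print(round(tem(trial_1, trial_2), 3), round(reliability_percent(trial_1, trial_2), 2))
```

For intra-rater error the two lists come from the same rater on two occasions; for inter-rater error they come from two raters measuring the same people.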
Atlas-based segmentation technique incorporating inter-observer delineation uncertainty for whole breast

International Nuclear Information System (INIS)

Bell, L R; Pogson, E M; Metcalfe, P; Holloway, L; Dowling, J A

2017-01-01

Accurate, efficient auto-segmentation methods are essential for the clinical efficacy of adaptive radiotherapy delivered with highly conformal techniques. Current atlas-based auto-segmentation techniques are adequate in this respect, but fail to account for inter-observer variation. An atlas-based segmentation method that incorporates inter-observer variation is proposed. This method is validated for a whole breast radiotherapy cohort containing 28 CT datasets with CTVs delineated by eight observers. To optimise atlas accuracy, the cohort was divided into categories by mean body mass index and laterality, with atlases generated for each in a leave-one-out approach. Observer CTVs were merged and thresholded to generate an auto-segmentation model representing both inter-observer and inter-patient differences. For each category, the atlas was registered to the left-out dataset to enable propagation of the auto-segmentation from atlas space. Auto-segmentation time was recorded. The segmentation was compared to the gold-standard contour using the Dice similarity coefficient (DSC) and mean absolute surface distance (MASD). Comparison with the smallest and largest CTV was also made. This atlas-based auto-segmentation method incorporating inter-observer variation was shown to be efficient (<4 min) and accurate for whole breast radiotherapy, with good agreement (DSC > 0.7, MASD < 9.3 mm) between the auto-segmented contours and CTV volumes. (paper)
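The Dice similarity coefficient used to compare contours in this abstract has a simple definition, twice the overlap divided by the sum of the two volumes. The Python sketch below applies it to segmentations represented as sets of voxel coordinates. That representation, the function name and the toy voxel sets are assumptions made for illustration and are not taken from the study.

```python
def dice_similarity(segmentation_a, segmentation_b):
    """Dice similarity coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    a, b = set(segmentation_a), set(segmentation_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty segmentations agree fully
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical 2D voxel sets for an auto-segmented and a reference contour.
auto_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference_contour = {(0, 1), (1, 1), (1, 2), (2, 1)}
print(dice_similarity(auto_contour, reference_contour))  # 0.5
```

A value of 1.0 means identical segmentations and 0.0 means no overlap, so the DSC > 0.7 reported above indicates that well over half of the combined volume is shared.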
Inter-rater reliability and agreement of the 6-minute walk test in females with hip fractures

DEFF Research Database (Denmark)

Overgaard, Jan; Larsen, Camilla Marie; Tange Kristensen, Morten

physiotherapy students independently examined (randomized order) a convenient sample of 20 participants; their assessments were separated by two days, and testing followed instructions from the American Thoracic Society. Hip pain was assessed with the Verbal Ranking Scale. Participants (all women) with a mean...... (SD) age of 78.1 ± 5.9 years performed the test within a mean of 31.5 ± 5.8 days post-surgery; 10 had a cervical and 10 a trochanteric fracture. Excellent inter-rater reliability; ICC2.1 = 0.92 (95% CI, 0.81 - 0.97) was found, and the standard error of measurement (SEM) and smallest real difference.......6 meters longer, at the second trial (P = 0.002). Participants with moderate hip fracture-related pain walked a shorter distance than those with no or light pain during the first test (P = 0.04), while this was not the case during the second (P = 0.25). Excellent inter-rater reliability was found...
Patients' and observers' perceptions of involvement differ. Validation study on inter-relating measures for shared decision making.

Directory of Open Access Journals (Sweden)

Jürgen Kasper

OBJECTIVE: Patient involvement into medical decisions as conceived in the shared decision making method (SDM) is essential in evidence based medicine. However, it is not conclusively evident how best to define, realize and evaluate involvement to enable patients making informed choices. We aimed at investigating the ability of four measures to indicate patient involvement. While use and reporting of these instruments might imply wide overlap regarding the addressed constructs this assumption seems questionable with respect to the diversity of the perspectives from which the assessments are administered. METHODS: The study investigated a nested cohort (N = 79) of a randomized trial evaluating a patient decision aid on immunotherapy for multiple sclerosis. Convergent validities were calculated between observer ratings of videotaped physician-patient consultations (OPTION) and patients' perceptions of the communication (Shared Decision Making Questionnaire, Control Preference Scale & Decisional Conflict Scale). RESULTS: OPTION reliability was high to excellent. Communication performance was low according to OPTION and high according to the three patient administered measures. No correlations were found between observer and patient judges, neither for means nor for single items. Patient report measures showed some moderate correlations. CONCLUSION: Existing SDM measures do not refer to a single construct. A gold standard is missing to decide whether any of these measures has the potential to indicate patient involvement. PRACTICE IMPLICATIONS: Pronounced heterogeneity of the underpinning constructs implies difficulties regarding the interpretation of existing evidence on the efficacy of SDM. Consideration of communication theory and basic definitions of SDM would recommend an inter-subjective focus of measurement. TRIAL REGISTRATION: Controlled-Trials.com ISRCTN25267500.
Patients' and observers' perceptions of involvement differ. Validation study on inter-relating measures for shared decision making.

Science.gov (United States)

Kasper, Jürgen; Heesen, Christoph; Köpke, Sascha; Fulcher, Gary; Geiger, Friedemann

2011-01-01

Patient involvement into medical decisions as conceived in the shared decision making method (SDM) is essential in evidence based medicine. However, it is not conclusively evident how best to define, realize and evaluate involvement to enable patients making informed choices. We aimed at investigating the ability of four measures to indicate patient involvement. While use and reporting of these instruments might imply wide overlap regarding the addressed constructs this assumption seems questionable with respect to the diversity of the perspectives from which the assessments are administered. The study investigated a nested cohort (N = 79) of a randomized trial evaluating a patient decision aid on immunotherapy for multiple sclerosis. Convergent validities were calculated between observer ratings of videotaped physician-patient consultations (OPTION) and patients' perceptions of the communication (Shared Decision Making Questionnaire, Control Preference Scale & Decisional Conflict Scale). OPTION reliability was high to excellent. Communication performance was low according to OPTION and high according to the three patient administered measures. No correlations were found between observer and patient judges, neither for means nor for single items. Patient report measures showed some moderate correlations. Existing SDM measures do not refer to a single construct. A gold standard is missing to decide whether any of these measures has the potential to indicate patient involvement. Pronounced heterogeneity of the underpinning constructs implies difficulties regarding the interpretation of existing evidence on the efficacy of SDM. Consideration of communication theory and basic definitions of SDM would recommend an inter-subjective focus of measurement. Controlled-Trials.com ISRCTN25267500.
Dysplastic naevus: histological criteria and their inter-observer reproducibility.

Science.gov (United States)

Hastrup, N; Clemmensen, O J; Spaun, E; Søndergaard, K

1994-06-01

Forty melanocytic lesions were examined in a pilot study, which was followed by a final series of 100 consecutive melanocytic lesions, in order to evaluate the inter-observer reproducibility of the histological criteria proposed for the dysplastic naevus. The specimens were examined in a blind fashion by four observers. Analysis by kappa statistics showed poor reproducibility of nuclear features, while reproducibility of architectural features was acceptable, improving in the final series. Consequently, we cannot apply the combined criteria of cytological and architectural features with any confidence in the diagnosis of dysplastic naevus, and, until further studies have documented that architectural criteria alone will suffice in the diagnosis of dysplastic naevus, we, as pathologists, shall avoid this term.
Inter-observer variability in contouring the penile bulb on CT images for prostate cancer treatment planning

International Nuclear Information System (INIS)

Perna, Lucia; Cozzarini, Cesare; Maggiulli, Eleonora; Fellin, Gianni; Rancati, Tiziana; Valdagni, Riccardo; Vavassori, Vittorio; Villa, Sergio; Fiorino, Claudio

2011-01-01

Several investigations have recently suggested the existence of a correlation between the dose received by the penile bulb (PB) and the risk of erectile dysfunction (ED) after radical radiotherapy for clinically localized prostate carcinoma. A prospective multi-Institute study (DUE-01) was implemented with the aim to assess the predictive parameters of ED. Previously, an evaluation of inter-observer variations of PB contouring was mandatory in order to quantify its impact on PB dose-volume parameters by means of a dummy run exercise. Fifteen observers, from different Institutes, drew the PB on the planning CT images of ten patients; inter-observer variations were analysed in terms of PB volume variation and cranial/caudal limits. 3DCRT treatment plans were simulated to evaluate the impact of PB contouring inter-variability on dose-volume statistics parameters. For DVH analysis the values of PB mean dose and the volume of PB receiving more than 50 Gy and 70 Gy (V50 and V70, respectively) were considered. Systematic differences from the average values were assessed by the Wilcoxon test. Seven observers systematically overestimated or underestimated the PB volume with deviations from the average volumes ranging between -48% and +34% (p < 0.05). The analysis of the cranial and caudal borders showed a prevalence of random over systematic deviations. Inter-observer contouring variability strongly impacts on DVH parameters, although standard deviations of inter-patient differences were larger than inter-observer variations: 14.5 Gy versus 6.8 Gy for mean PB dose, 23.0% versus 11.0% and 16.8% versus 9.3% for V50 and V70 respectively. In conclusion, despite the large inter-observer variation in contouring PB, a large multi-centric study may have the possibility to detect a possible correlation between PB % dose-volume parameters and ED. The impact of contouring uncertainty could be reduced by 'a posteriori' contouring from a single observer or by introducing
The development of a reliable amateur boxing performance analysis template.

Science.gov (United States)

Thomson, Edward; Lamb, Kevin; Nicholas, Ceri

2013-01-01

The aim of this study was to devise a valid performance analysis system for the assessment of the movement characteristics associated with competitive amateur boxing and assess its reliability using analysts of varying experience of the sport and performance analysis. Key performance indicators to characterise the demands of an amateur contest (offensive, defensive and feinting) were developed and notated using a computerised notational analysis system. Data were subjected to intra- and inter-observer reliability assessment using median sign tests and calculating the proportion of agreement within predetermined limits of error. For all performance indicators, intra-observer reliability revealed non-significant differences between observations (P > 0.05) and high agreement was established (80-100%) regardless of whether exact or the reference value of ±1 was applied. Inter-observer reliability was less impressive for both analysts (amateur boxer and experienced analyst), with the proportion of agreement ranging from 33-100%. Nonetheless, there was no systematic bias between observations for any indicator (P > 0.05), and the proportion of agreement within the reference range (±1) was 100%. A reliable performance analysis template has been developed for the assessment of amateur boxing performance and is available for use by researchers, coaches and athletes to classify and quantify the movement characteristics of amateur boxing.
Training induces scapular dyskinesis in pain-free competitive swimmers: a reliability and observational study.

Science.gov (United States)

Madsen, Pernille H; Bak, Klaus; Jensen, Susanne; Welter, Ulrik

2011-03-01

Scapular dyskinesis is a major etiological factor in overhead athletes' shoulder problems. Our hypotheses were to evaluate if (1) visual observation of scapular dyskinesis during scaption has substantial interobserver reliability, and (2) scapular dyskinesis may be induced by swim training in pain-free swimmers. A reliability and observational study. Bachelor project at a college institution and at a private sports orthopedic hospital. Seventy-eight competitive swimmers with no history of shoulder pain were included in the study. Fourteen swimmers were evaluated regarding reliability. Inclusion criteria were competitive swimmers with high training volume who previously had no shoulder pain. Observations of scapular dyskinesis (yes/no) during simple scaption. The interobserver reliability of scaption and wall push-up was evaluated in 14 swimmers using kappa analysis. Prevalence of scapular dyskinesis at 4 time intervals during a swim training session. The scaption test resulted in a weighted kappa value of 0.75. Scapular dyskinesis was seen in 29 shoulders (37%) after the first time interval, in another 24 (cumulated prevalence 68%) after one-half of the training session, and in an additional 4 swimmers (cumulated prevalence 73%) after three-quarters of the training session. During the last quarter of the training session, another 7 swimmers had dyskinesis, resulting in a cumulated prevalence of 82%. The prevalence of abnormal scapular kinesis during a normal training session is high in previously pain-free swimmers. The prevalence increases with more training and occurs early during the training session.
Reliability of the Cooking Task in adults with acquired brain injury.

Science.gov (United States)

Poncet, Frédérique; Swaine, Bonnie; Taillefer, Chantal; Lamoureux, Julie; Pradat-Diehl, Pascale; Chevignard, Mathilde

2015-01-01

Acquired brain injury (ABI) often leads to deficits in executive functioning (EF) responsible for severe and long-standing disabilities in daily life activities. The Cooking Task is an ecological and valid test of EF involving multi-tasking in a real environment. Given its complex scoring system, it is important to establish the tool's reliability. The objective of the study was to examine the reliability of the Cooking Task (internal consistency, inter-rater and test-retest reliability). A total of 160 patients with ABI (113 men, mean age 37 years, SD = 14.3) were tested using the Cooking Task. For test-retest reliability, patients were assessed by the same rater on two occasions (mean interval 11 days) while two raters independently and simultaneously observed and scored patients' performances to estimate inter-rater reliability. Internal consistency was high for the global scale (Cronbach α = .74). Inter-rater reliability (n = 66) for total errors was also high (ICC = .93), however the test-retest reliability (n = 11) was poor (ICC = .36). In general the Cooking Task appears to be a reliable tool. The low test-retest results were expected given the importance of EF in the performance of novel tasks.
Intra-Rater, Inter-Rater and Test-Retest Reliability of an Instrumented Timed Up and Go (iTUG) Test in Patients with Parkinson's Disease.

Directory of Open Access Journals (Sweden)

Rob C van Lummel

The "Timed Up and Go" (TUG) is a widely used measure of physical functioning in older people and in neurological populations, including Parkinson's Disease. When using an inertial sensor measurement system (instrumented TUG [iTUG]), the individual components of the iTUG and the trunk kinematics can be measured separately, which may provide relevant additional information. The aim of this study was to determine intra-rater, inter-rater and test-retest reliability of the iTUG in patients with Parkinson's Disease. Twenty-eight PD patients, aged 50 years or older, were included. For the iTUG the DynaPort Hybrid (McRoberts, The Hague, The Netherlands) was worn at the lower back. The device measured acceleration and angular velocity in three directions at a rate of 100 samples/s. Patients performed the iTUG five times on two consecutive days. Repeated measurements by the same rater on the same day were used to calculate intra-rater reliability. Repeated measurements by different raters on the same day were used to calculate intra-rater and inter-rater reliability. Repeated measurements by the same rater on different days were used to calculate test-retest reliability. Nineteen ICC values (15%) were ≥ 0.9, which is considered as excellent reliability. Sixty-four ICC values (49%) were ≥ 0.70 and < 0.90, which is considered as good reliability. Thirty-one ICC values (24%) were ≥ 0.50 and < 0.70, indicating moderate reliability. Sixteen ICC values (12%) were ≥ 0.30 and < 0.50, indicating poor reliability. Two ICC values (2%) were < 0.30, indicating very poor reliability. In conclusion, in patients with Parkinson's disease the intra-rater, inter-rater, and test-retest reliability of the individual components of the instrumented TUG (iTUG) was excellent to good for total duration and for turning durations, and good to low for the sub durations and for the kinematics of the SiSt and StSi. The results of this fully automated analysis of instrumented TUG movements
Can Physicians Identify Inappropriate Nuclear Stress Tests? An Examination of Inter-rater Reliability for the 2009 Appropriate Use Criteria for Radionuclide Imaging

Science.gov (United States)

Ye, Siqin; Rabbani, LeRoy E.; Kelly, Christopher R.; Kelly, Maureen R.; Lewis, Matthew; Paz, Yehuda; Peck, Clara L.; Rao, Shaline; Bokhari, Sabahat; Weiner, Shepard D.; Einstein, Andrew J.

2014-01-01

Background: We sought to determine inter-rater reliability of the 2009 Appropriate Use Criteria (AUC) for radionuclide imaging (RNI) and whether physicians at various levels of training can effectively identify nuclear stress tests with inappropriate indications. Methods and Results: Four hundred patients were randomly selected from a consecutive cohort of patients undergoing nuclear stress testing at an academic medical center. Raters with different levels of training (including cardiology attending physicians, cardiology fellows, internal medicine hospitalists, and internal medicine interns) classified individual nuclear stress tests using the 2009 AUC. Consensus classification by two cardiologists was considered the operational gold standard, and sensitivity and specificity of individual raters for identifying inappropriate tests was calculated. Inter-rater reliability of the AUC was assessed using Cohen’s kappa statistics for pairs of different raters. The mean age of patients was 61.5 years; 214 (54%) were female. The cardiologists rated 256 (64%) of 400 NSTs as appropriate, 68 (18%) as uncertain, 55 (14%) as inappropriate; 21 (5%) tests were unable to be classified. Inter-rater reliability for non-cardiologist raters was modest (unweighted Cohen’s kappa, 0.51, 95% confidence interval, 0.45 to 0.55). Sensitivity of individual raters for identifying inappropriate tests ranged from 47% to 82%, while specificity ranged from 85% to 97%. Conclusions: Inter-rater reliability for the 2009 AUC for RNI is modest, and there is considerable variation in the ability of raters at different levels of training to identify inappropriate tests. PMID: 25563660
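The sensitivity and specificity figures in this abstract treat "inappropriate" as the positive class and the cardiologists' consensus as the gold standard. A minimal Python sketch of that calculation follows. The function name, the label strings and the sample classifications are hypothetical, and collapsing "appropriate" and "uncertain" into a single negative class is an assumption about how the comparison was set up.

```python
def sensitivity_specificity(rater, gold, positive="inappropriate"):
    """Return (sensitivity, specificity) of one rater against gold-standard labels."""
    if len(rater) != len(gold):
        raise ValueError("rater and gold-standard lists must have equal length")
    tp = fn = tn = fp = 0
    for rated, truth in zip(rater, gold):
        if truth == positive:
            if rated == positive:
                tp += 1  # inappropriate test correctly flagged
            else:
                fn += 1  # inappropriate test missed
        elif rated == positive:
            fp += 1      # acceptable test wrongly flagged
        else:
            tn += 1      # acceptable test correctly left unflagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classifications of five stress tests.
consensus = ["inappropriate", "inappropriate", "appropriate", "uncertain", "appropriate"]
intern = ["inappropriate", "appropriate", "appropriate", "uncertain", "inappropriate"]
print(sensitivity_specificity(intern, consensus))
```

In this toy example the rater catches one of two inappropriate tests (sensitivity 0.5) and correctly leaves two of three acceptable tests unflagged (specificity about 0.67).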
Validation and inter-rater reliability of a three item falls risk screening tool

Directory of Open Access Journals (Sweden)

Catherine Maree Said

2017-11-01

Background: Falls screening tools are routinely used in hospital settings and the psychometric properties of tools should be examined in the setting in which they are used. The aim of this study was to explore the concurrent and predictive validity of the Austin Health Falls Risk Screening Tool (AHFRST), compared with The Northern Hospital Modified St Thomas’s Risk Assessment Tool (TNH-STRATIFY), and the inter-rater reliability of the AHFRST. Methods: A research physiotherapist used the AHFRST and TNH-STRATIFY to classify 130 participants admitted to Austin Health (five acute wards, n = 115; two subacute wards, n = 15; median length of stay 6 days, IQR 3–12) as ‘High’ or ‘Low’ falls risk. The AHFRST was also completed by nursing staff on patient admission. Falls data was collected from the hospital incident reporting system. Results: Six falls occurred during the study period (fall rate of 4.6 falls per 1000 bed days). There was substantial agreement between the AHFRST and the TNH-STRATIFY (Kappa = 0.68, 95% CI 0.52–0.78). Both tools had poor predictive validity, with low specificity (AHFRST 46.0%, 95% CI 37.0–55.1; TNH-STRATIFY 34.7%, 95% CI 26.4–43.7) and positive predictive values (AHFRST 5.6%, 95% CI 1.6–13.8; TNH-STRATIFY 6.9%, 95% CI 2.6–14.4). The AHFRST showed moderate inter-rater reliability (Kappa = 0.54, 95% CI = 0.36–0.67, p < 0.001), although 18 patients did not have the AHFRST completed by nursing staff. Conclusions: There was an acceptable level of agreement between the 3 item AHFRST classification of falls risk and the longer, 9 item TNH-STRATIFY classification. However, both tools demonstrated limited predictive validity in the Austin Health population. The results highlight the importance of evaluating the validity of falls screening tools, and the clinical utility of these tools should be reconsidered.
Inter- and intra-observer variability of time-lapse annotations

DEFF Research Database (Denmark)

Sundvall, Linda; Ingerslev, Hans Jakob; Breth Knudsen, Ulla

2013-01-01

Study question: How consistent is the time-lapse annotation of dynamic and static morphologic parameters of embryo development, within and between observers? Summary answer: The assessment of dynamic parameters is characterized by almost perfect agreement within and between observers. What is known already: The commonly employed method used to assess embryos in IVF treatments is based on static evaluation of morphology in a microscope, but this is limited by substantial intra- and inter-observer variation. Time-lapse imaging has been proposed as a method to refine embryo selection by adding new... This provides the basis for further investigation of embryo assessment and selection by time-lapse imaging in prospective trials. Study funding/competing interest(s): Research at the Fertility Clinic was funded by an unrestricted grant from Ferring and MSD. The authors have no competing interests to declare.
The Pfirrmann classification of lumbar intervertebral disc degeneration: an independent inter- and intra-observer agreement assessment.

Science.gov (United States)

Urrutia, Julio; Besa, Pablo; Campos, Mauricio; Cikutovic, Pablo; Cabezon, Mario; Molina, Marcelo; Cruz, Juan Pablo

2016-09-01

Grading inter-vertebral disc degeneration (IDD) is important in the evaluation of many degenerative conditions, including patients with low back pain. Magnetic resonance imaging (MRI) is considered the best imaging instrument to evaluate IDD. The Pfirrmann classification is commonly used to grade IDD; the authors describing this classification showed an adequate agreement using it; however, there has been a paucity of independent agreement studies using this grading system. The aim of this study was to perform an independent inter- and intra-observer agreement study using the Pfirrmann classification. T2-weighted sagittal images of 79 patients consecutively studied with lumbar spine MRI were classified using the Pfirrmann grading system by six evaluators (three spine surgeons and three radiologists). After a 6-week interval, the 79 cases were presented to the same evaluators in a random sequence for repeat evaluation. The intra-class correlation coefficient (ICC) and the weighted kappa (wκ) were used to determine the inter- and intra-observer agreement. The inter-observer agreement was excellent, with an ICC = 0.94 (0.93-0.95) and wκ = 0.83 (0.74-0.91). There were no differences between spine surgeons and radiologists. Likewise, there were no differences in agreement evaluating the different lumbar discs. Most differences among observers were only of one grade. Intra-observer agreement was also excellent with ICC = 0.86 (0.83-0.89) and wκ = 0.89 (0.85-0.93). In this independent study, the Pfirrmann classification demonstrated an adequate agreement among different observers and by the same observer on separate occasions. Furthermore, it allows communication between radiologists and spine surgeons.

Inter-rater reliability of assessment of levator ani muscle strength and attachment to the pubic bone in nulliparous women.

Science.gov (United States)

van Delft, K; Schwertner-Tiepelmann, N; Thakar, R; Sultan, A H

2013-09-01

The modified Oxford scale (MOS) has been found previously to have poor inter-rater reliability, whereas digital assessment of levator ani muscle (LAM) attachment to the pubic bone has been shown to have acceptable reliability. Our aim was to evaluate inter-rater reliability of the validated MOS and to develop a reliable classification system for digital assessment of LAM attachment, correlating this to findings on transperineal ultrasound (TPUS) examination. Evaluation of the MOS by palpation was performed in nulliparous women by two investigators. LAM attachment was evaluated using digital palpation, for which a novel classification system was developed with four grades based on the position of the attachment and presence of discernible muscle. Findings were compared with those on TPUS examination. Inter-rater reliability was assessed using Cohen's kappa statistic. Twenty-five nulliparous women were examined. There was agreement in MOS scores between the investigators in 64% of women (n = 16), with a kappa of 0.66 (indicating substantial agreement). There was agreement in palpation of LAM attachment using the new grading system in 96% of women (n = 24), with a kappa of 0.90 (indicating almost perfect agreement). TPUS examination did not show LAM avulsion in any woman, with the exception of one with a partial avulsion. In this group of nulliparous patients, there was substantial agreement between the two investigators in evaluation of the MOS and there was good agreement between grades of LAM attachment using the new classification system, which correlated with findings on TPUS examination. It therefore appears that these results are reproducible in nulliparous women and the techniques can be readily learned and reliably incorporated into clinical practice and research after appropriate training. Further research is required to establish clinical utility of the grading system for LAM attachment in postpartum women and in women with symptomatic pelvic organ
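The record above reports both raw percent agreement and Cohen's kappa; the two differ because kappa discounts the agreement expected by chance. A minimal Python sketch of unweighted kappa follows, assuming a hypothetical 2x2 cross-tabulation of 25 subjects (the study used multi-grade scales, and its raw table is not given in the abstract).

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa from a square count table where
    table[i][j] = subjects rated category i by rater A and j by rater B.
    Returns (observed agreement, kappa)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of the raters' marginal proportions.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    return observed, (observed - expected) / (1 - expected)


# Hypothetical table (assumption): the two raters disagree on 1 of 25 subjects.
table = [[12, 1],
         [0, 12]]
observed, kappa = cohens_kappa(table)
print(f"observed agreement={observed:.2f} kappa={kappa:.2f}")
```

The same observed agreement can give a much lower kappa when one category dominates the marginals, so percent agreement alone is not comparable across studies.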
Does experience in hysteroscopy improve accuracy and inter-observer agreement in the management of abnormal uterine bleeding?

Science.gov (United States)

Bourdel, Nicolas; Modaffari, Paola; Tognazza, Enrica; Pertile, Riccardo; Chauvet, Pauline; Botchorishivili, Revaz; Savary, Dennis; Pouly, Jean Luc; Rabischong, Benoit; Canis, Michel

2016-12-01

Hysteroscopic reliability may be influenced by the experience of the operator and by a lack of morphological diagnostic criteria for endometrial malignant pathologies. The aim of this study was to evaluate the diagnostic accuracy and the inter-observer agreement (IOA) in the management of abnormal uterine bleeding (AUB) among gynecologists with different levels of experience. Each gynecologist, without any other clinical information, was asked to evaluate the anonymous video recordings of 51 consecutive patients who underwent hysteroscopy and endometrial resection for AUB. Expert (>500 hysteroscopies), senior (20-499 procedures) and junior (≤19 procedures) gynecologists were asked to judge endometrial macroscopic appearance (benign, suspicious or frankly malignant). They also had to propose the histological diagnosis (atrophic or proliferative endometrium; simple, glandulocystic or atypical endometrial hyperplasia and endometrial carcinoma). Observers were free to indicate whether the quality of the recordings was not good enough for adequate assessment. IOA (k coefficient), sensitivity, specificity, predictive value and the likelihood ratio were calculated. Five expert, five senior and six junior gynecologists were involved in the study. Considering endometrial cancer and endometrial atypical hyperplasia, sensitivity and specificity were respectively 55.5 % and 84.5 % for juniors, 66.6 % and 81.2 % for seniors and 86.6 % and 87.3 % for experts. Concerning endometrial macroscopic appearance, IOA was poor for juniors (k = 0.10) and fair for seniors and experts (k = 0.23 and 0.22, respectively). IOA was poor for juniors and experts (k = 0.18 and 0.20, respectively) and fair for seniors (k = 0.30) in predicting the histological diagnosis. Sensitivity improves with the observer's experience, but inter-observer agreement and reproducibility of hysteroscopy for endometrial malignancies are unsatisfactory regardless of the level of expertise. Therefore, an accurate and
Accuracy, intra- and inter-unit reliability, and comparison between GPS and UWB-based position-tracking systems used for time-motion analyses in soccer.

Science.gov (United States)

Bastida Castillo, Alejandro; Gómez Carmona, Carlos D; De la Cruz Sánchez, Ernesto; Pino Ortega, José

2018-05-01

There is interest in the accuracy and inter-unit reliability of position-tracking systems to monitor players. Research into this technology, although relatively recent, has grown exponentially in recent years, and it is difficult to find a professional team sport that does not use at least Global Positioning System (GPS) technology. The aim of this study is to determine the accuracy of both GPS-based and Ultra Wide Band (UWB)-based systems on a soccer field and their inter- and intra-unit reliability. A secondary aim is to compare them for practical applications in sport science. Following institutional ethical approval and familiarization, 10 healthy and well-trained former soccer players (20 ± 1.6 years, 1.76 ± 0.08 m, and 69.5 ± 9.8 kg) performed three course tests: (i) a linear course, (ii) a circular course, and (iii) a zig-zag course, all using UWB and GPS technologies. The average speed and distance covered were compared with timing gates and the real distance as references. The UWB technology showed better accuracy (bias: 0.57-5.85%), test-retest reliability (%TEM: 1.19), and inter-unit reliability (bias: 0.18) in determining distance covered than the GPS technology (bias: 0.69-6.05%; %TEM: 1.47; bias: 0.25) overall. Also, UWB showed better results (bias: 0.09; ICC: 0.979; bias: 0.01) for mean velocity measurement than GPS (bias: 0.18; ICC: 0.951; bias: 0.03).
Improving QST Reliability – More Raters, Tests or Occasions? A Multivariate Generalizability Study

DEFF Research Database (Denmark)

O'Neill, Søren; O'Neill, Lotte

2015-01-01

The reliability of quantitative sensory testing (QST) is affected by the error attributable to both test occasion and rater (examiner) as well as interactions between them. Most reliability studies only account for one source of error. The present study employed a fully-crossed, multivariate...... threshold, intensity, tolerance and modulation with mechanical, thermal and chemical stimuli. The classical test-retest and inter-rater reliability (0.19... procedures. Reliability was improved more by repeated testing on separate occasions opposed to repeated testing by different raters....
How reliable are Functional Movement Screening scores? A systematic review of rater reliability.

Science.gov (United States)

Moran, Robert W; Schneiders, Anthony G; Major, Katherine M; Sullivan, S John

2016-05-01

Several physical assessment protocols to identify intrinsic risk factors for injury aetiology related to movement quality have been described. The Functional Movement Screen (FMS) is a standardised, field-expedient test battery intended to assess movement quality and has been used clinically in preparticipation screening and in sports injury research. To critically appraise and summarise research investigating the reliability of scores obtained using the FMS battery. Systematic literature review. Systematic search of Google Scholar, Scopus (including ScienceDirect and PubMed), EBSCO (including Academic Search Complete, AMED, CINAHL, Health Source: Nursing/Academic Edition), MEDLINE and SPORTDiscus. Studies meeting eligibility criteria were assessed by 2 reviewers for risk of bias using the Quality Appraisal of Reliability Studies checklist. Overall quality of evidence was determined using van Tulder's levels of evidence approach. 12 studies were appraised. Overall, there was a 'moderate' level of evidence in favour of 'acceptable' (intraclass correlation coefficient ≥0.6) inter-rater and intra-rater reliability for composite scores derived from live scoring. For inter-rater reliability of composite scores derived from video recordings there was 'conflicting' evidence, and 'limited' evidence for intra-rater reliability. For inter-rater reliability based on live scoring of individual subtests there was 'moderate' evidence of 'acceptable' reliability (κ≥0.4) for 4 subtests (Deep Squat, Shoulder Mobility, Active Straight-leg Raise, Trunk Stability Push-up) and 'conflicting' evidence for the remaining 3 (Hurdle Step, In-line Lunge, Rotary Stability). This review found 'moderate' evidence that raters can achieve acceptable levels of inter-rater and intra-rater reliability of composite FMS scores when using live ratings. Overall, there were few high-quality studies, and the quality of several studies was impacted by poor study reporting particularly in relation to
Inter-observer agreement for the evaluation of bone involvement on Whole Body Low Dose Computed Tomography (WBLDCT) in Multiple Myeloma (MM)

Energy Technology Data Exchange (ETDEWEB)

Zacchino, M.; Minetti, V.; Dore, R.; Calliada, F. [University of Pavia, Fondazione IRCCS Policlinico San Matteo, Institute of Radiology, Pavia (Italy); Bonaffini, P.A.; Nasatti, A.; Sironi, S. [University of Milano Bicocca, San Gerardo Hospital, Department of Diagnostic Radiology, Monza (Italy); Corso, A. [University of Pavia, Fondazione IRCCS Policlinico San Matteo, Division of Hematology, Pavia (Italy); Tinelli, C. [University of Pavia, Fondazione IRCCS Policlinico San Matteo, Service of Biometry and Statistics, Pavia (Italy)

2015-11-15

We aimed to assess inter-observer agreement in bone involvement evaluation and define accuracy and reproducibility of MDCT images analysis in Multiple Myeloma (MM), by comparing two acquisition protocols at two different institutions. A total of 100 MM patients underwent whole body low-dose computed tomography (WB-LDCT), with two protocols: Group I (50 patients), 80 kV and 200-230 mAs; Group II, 120 kV-40 mAs. Four readers (two experts) retrospectively reviewed 22 anatomical districts, reporting the following for each patient: 1) osteolytic lesions; 2) cortical bone integrity; 3) fractures; 4) risk of vertebral collapse; 5) hyperattenuating bone lesions; and 6) extraosseous extension. Inter-observer agreement (by all readers, expert and young observers and comparison of the two protocols) was then statistically analyzed. According to Cohen's criteria, inter-observer agreement among the four readers and between experts and residents was good for the detection of bone lesions and extra-medullary extension, and for the evaluation of risk of collapse and cortical integrity. There was good agreement when comparing the two protocols. A greater variability was found for the evaluation of hyperattenuating lesions and the presence of fractures. WB-LDCT represents a reproducible and reliable technique that is helpful for defining bone disease in MM patients, with partial influence of readers' experience. (orig.)
Reliability of histologic assessment in patients with eosinophilic oesophagitis.

Science.gov (United States)

Warners, M J; Ambarus, C A; Bredenoord, A J; Verheij, J; Lauwers, G Y; Walsh, J C; Katzka, D A; Nelson, S; van Viegen, T; Furuta, G T; Gupta, S K; Stitt, L; Zou, G; Parker, C E; Shackelton, L M; D Haens, G R; Sandborn, W J; Dellon, E S; Feagan, B G; Collins, M H; Jairath, V; Pai, R K

2018-04-01

The validity of the eosinophilic oesophagitis (EoE) histologic scoring system (EoEHSS) has been demonstrated, but only preliminary reliability data exist. Formally assess the reliability of the EoEHSS and additional histologic features. Four expert gastrointestinal pathologists independently reviewed slides from adult patients with EoE (N = 45) twice, in random order, using standardised training materials and scoring conventions for the EoEHSS and additional histologic features agreed upon during a modified Delphi process. Intra- and inter-rater reliability for scoring the EoEHSS, a visual analogue scale (VAS) of overall histopathologic disease severity, and additional histologic features were assessed using intra-class correlation coefficients (ICCs). Almost perfect intra-rater reliability was observed for the composite EoEHSS scores and the VAS. Inter-rater reliability was also almost perfect for the composite EoEHSS scores and substantial for the VAS. Of the EoEHSS items, eosinophilic inflammation was associated with the highest ICC estimates and consistent with almost perfect intra- and inter-rater reliability. With the exception of dyskeratotic epithelial cells and surface epithelial alteration, ICC estimates for the remaining EoEHSS items were above the benchmarks for substantial intra-rater, and moderate inter-rater reliability. Estimation of peak eosinophil count and number of lamina propria eosinophils were associated with the highest ICC estimates among the exploratory items. The composite EoEHSS and most component items are associated with substantial reliability when assessed by central pathologists. Future studies should assess responsiveness of the score to change after a therapeutic intervention to facilitate its use in clinical trials. © 2018 John Wiley & Sons Ltd.
Reliability of two social cognition tests: The combined stories test and the social knowledge test.

Science.gov (United States)

Thibaudeau, Élisabeth; Cellard, Caroline; Legendre, Maxime; Villeneuve, Karèle; Achim, Amélie M

2018-04-01

Deficits in social cognition are common in psychiatric disorders. Validated social cognition measures with good psychometric properties are necessary to assess and target social cognitive deficits. Two recent social cognition tests, the Combined Stories Test (COST) and the Social Knowledge Test (SKT), respectively assess theory of mind and social knowledge. Previous studies have shown good psychometric properties for these tests, but the test-retest reliability has never been documented. The aim of this study was to evaluate the test-retest reliability and the inter-rater reliability of the COST and the SKT. The COST and the SKT were administered twice to a group of forty-two healthy adults, with a delay of approximately four weeks between the assessments. Excellent test-retest reliability was observed for the COST, and good test-retest reliability was observed for the SKT. There was no evidence of a practice effect. Furthermore, excellent inter-rater reliability was observed for both tests. This study shows good reliability of the COST and the SKT, which adds to the good validity previously reported for these two tests. These good psychometric properties thus support the COST and the SKT as adequate measures for the assessment of social cognition. Copyright © 2018. Published by Elsevier B.V.
Assessment of the nursing care product (APROCENF): a reliability and construct validity study

Directory of Open Access Journals (Sweden)

Danielle Fabiana Cucolo

Objectives: to verify the reliability and construct validity estimates of the "Assessment of nursing care product" scale (APROCENF) and its applicability. Methods: this validation study included a sample of 40 (inter-rater reliability) and 172 (construct validity) assessments performed by nurses at the end of the work shift at nine inpatient services of a teaching hospital in the Brazilian Southeast. The data were collected between February and September/2014 with interruptions. Cronbach's alpha and Spearman's correlation coefficients were calculated, as well as the intraclass correlation and the weighted kappa index (inter-rater reliability). Exploratory factor analysis was used with principal component extraction and varimax rotation (construct validity). Results: the internal consistency revealed an alpha coefficient of 0.85, item-item correlation ranging between 0.13 and 0.61 and item-total correlation between 0.43 and 0.69. Inter-rater equivalence was obtained and all items evidenced significant factor loadings. Conclusion: this research evidenced the reliability and construct validity of the scale to assess the nursing care product. Its application in nursing practice permits identifying improvements needed in the production process, contributing to management and care decisions.
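The internal-consistency figure in the record above is a Cronbach's alpha. As a minimal Python sketch of how that coefficient is computed, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), the example below uses a small invented score matrix; it is illustrative only and has no connection to the APROCENF data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.
    Uses sample variances (n - 1 denominator) throughout."""
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented data (assumption): 4 respondents scored on 3 items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 5, 4],
]
alpha = cronbach_alpha(scores)
print(f"alpha={alpha:.3f}")
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read alongside the item-item and item-total correlations the abstract also reports.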
Inter-examiner reliability of passive assessment of intervertebral motion in the cervical and lumbar spine: A systematic review

NARCIS (Netherlands)

van Trijffel, E.; Anderegg, Q.; Bossuyt, P. M. M.; Lucas, C.

2005-01-01

A systematic review was conducted to determine inter-examiner reliability of passive assessment of segmental intervertebral motion in the cervical and lumbar spine as well as to explore sources of heterogeneity. Passive assessment of motion is used to decide on treatments for neck and low-back pain
Transcultural Adaptation of GRID Hamilton Rating Scale For Depression (GRID-HAMD) to Brazilian Portuguese and Evaluation of the Impact of Training Upon Inter-Rater Reliability.

Science.gov (United States)

Henrique-Araújo, Ricardo; Osório, Flávia L; Gonçalves Ribeiro, Mônica; Soares Monteiro, Ivandro; Williams, Janet B W; Kalali, Amir; Alexandre Crippa, José; Oliveira, Irismar Reis De

2014-07-01

GRID-HAMD is a semi-structured interview guide developed to overcome flaws in HAM-D, and has been incorporated into an increasing number of studies. Carry out the transcultural adaptation of GRID-HAMD into the Brazilian Portuguese language, evaluate the inter-rater reliability of this instrument and the training impact upon this measure, and verify the raters' opinions of said instrument. The transcultural adaptation was conducted by appropriate methodology. The measurement of inter-rater reliability was done by way of videos that were evaluated by 85 professionals before and after training for the use of this instrument. The intraclass correlation coefficient (ICC) remained between 0.76 and 0.90 for GRID-HAMD-21 and between 0.72 and 0.91 for GRID-HAMD-17. The training did not have an impact on the ICC, except for a few groups of participants with a lower level of experience. Most of the participants showed high acceptance of GRID-HAMD, when compared to other versions of HAM-D. The scale presented adequate inter-rater reliability even before training began. Training did not have an impact on this measure, except for a few groups with less experience. GRID-HAMD received favorable opinions from most of the participants.
Is One Trial Sufficient to Obtain Excellent Pressure Pain Threshold Reliability in the Low Back of Asymptomatic Individuals? A Test-Retest Study.

Science.gov (United States)

Balaguier, Romain; Madeleine, Pascal; Vuillerme, Nicolas

2016-01-01

The assessment of pressure pain threshold (PPT) provides a quantitative value related to the mechanical sensitivity to pain of deep structures. Although excellent reliability of PPT has been reported in numerous anatomical locations, its absolute and relative reliability in the lower back region remains to be determined. Because of the high prevalence of low back pain in the general population and because low back pain is one of the leading causes of disability in industrialized countries, assessing pressure pain thresholds over the low back is of particular interest. The purpose of this study was (1) to evaluate the intra- and inter-session absolute and relative reliability of PPT within 14 locations covering the low back region of asymptomatic individuals and (2) to determine the number of trials required to ensure reliable PPT measurements. Fifteen asymptomatic subjects were included in this study. PPTs were assessed among 14 anatomical locations in the low back region over two sessions separated by a one-hour interval. For the two sessions, three PPT assessments were performed on each location. Reliability was assessed by computing intraclass correlation coefficients (ICC), standard error of measurement (SEM) and minimum detectable change (MDC) for all possible combinations between trials and sessions. Bland-Altman plots were also generated to assess potential bias in the dataset. Relative reliability for both intra- and inter-session was almost perfect, with ICCs ranging from 0.85 to 0.99. With respect to the intra-session, no statistical difference was reported for ICCs and SEM regardless of the conducted comparisons between trials. Conversely, for inter-session, ICCs and SEM values were significantly larger when two consecutive PPT measurements were used for data analysis. No significant difference was observed for the comparison between two consecutive measurements and three measurements. Excellent relative and absolute reliabilities were reported for both intra
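The record above distinguishes relative reliability (ICC) from absolute reliability (SEM, MDC). A minimal Python sketch of the commonly used conversions follows, assuming SEM = SD * sqrt(1 - ICC) and MDC95 = 1.96 * sqrt(2) * SEM. The abstract does not state which formulas the authors used, and the SD and ICC values below are invented for illustration.

```python
import math


def sem_and_mdc(sd, icc, z=1.96):
    """Standard error of measurement and minimum detectable change.
    sd  : between-subject standard deviation of the measurements
    icc : intraclass correlation coefficient (0..1)
    z   : normal quantile for the confidence level (1.96 for 95%)"""
    sem = sd * math.sqrt(1.0 - icc)
    mdc = z * math.sqrt(2.0) * sem   # sqrt(2): difference of two measurements
    return sem, mdc


# Invented inputs (assumption): SD of 100 kPa and an ICC of 0.91.
sem, mdc = sem_and_mdc(sd=100.0, icc=0.91)
print(f"SEM={sem:.1f} kPa, MDC95={mdc:.1f} kPa")
```

Because SEM is expressed in the units of the measurement, it can remain sizeable even when the ICC looks almost perfect, which is why both are usually reported.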
The use and reliability of SymNose for quantitative measurement of the nose and lip in unilateral cleft lip and palate patients.

Science.gov (United States)

Mosmuller, David; Tan, Robin; Mulder, Frans; Bachour, Yara; de Vet, Henrica; Don Griot, Peter

2016-10-01

It is essential to have a reliable assessment method in order to compare the results of cleft lip and palate surgery. In this study the computer-based program SymNose, a method for quantitative assessment of the nose and lip, is assessed for usability and reliability. The symmetry of the nose and lip was measured twice in 50 six-year-old complete and incomplete unilateral cleft lip and palate patients by four observers. For the frontal view the asymmetry level of the nose and upper lip was evaluated, and for the basal view the asymmetry level of the nose and nostrils was evaluated. The mean inter-observer reliability when tracing each image once or twice was 0.70 and 0.75, respectively. Tracing the photographs with 2 observers and 4 observers gave a mean inter-observer score of 0.86 and 0.92, respectively. The mean intra-observer reliability varied between 0.80 and 0.84. SymNose is a practical and reliable tool for the retrospective assessment of large caseloads of 2D photographs of cleft patients for research purposes. Moderate to high single inter-observer reliability was found. For future research with SymNose, reliable outcomes can be achieved by using the average outcomes of single tracings of two observers. Copyright © 2016 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.
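The gain in reliability from averaging several observers, as in the record above, is conventionally projected with the Spearman-Brown formula. A minimal Python sketch follows. The abstract does not say how the multi-observer scores were obtained, so applying this formula, and taking 0.75 as the single-observer starting value, are both assumptions made here for illustration.

```python
def spearman_brown(single_reliability, k):
    """Projected reliability of the mean of k parallel ratings,
    given the reliability of a single rating."""
    return k * single_reliability / (1 + (k - 1) * single_reliability)


# Assumed single-observer reliability of 0.75 (illustrative starting value).
two_observers = spearman_brown(0.75, 2)
four_observers = spearman_brown(0.75, 4)
print(f"2 observers: {two_observers:.2f}, 4 observers: {four_observers:.2f}")
```

The projection shows diminishing returns: most of the improvement comes from adding the second observer, which fits the abstract's recommendation to average two observers.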
Reliability of ultrasound grading traditional score and new global OMERACT-EULAR score system (GLOESS): results from an inter- and intra-reading exercise by rheumatologists.

Science.gov (United States)

Ventura-Ríos, Lucio; Hernández-Díaz, Cristina; Ferrusquia-Toríz, Diana; Cruz-Arenas, Esteban; Rodríguez-Henríquez, Pedro; Alvarez Del Castillo, Ana Laura; Campaña-Parra, Alfredo; Canul, Efrén; Guerrero Yeo, Gerardo; Mendoza-Ruiz, Juan Jorge; Pérez Cristóbal, Mario; Sicsik, Sandra; Silva Luna, Karina

2017-12-01

This study aims to test the reliability of ultrasound to grade synovitis in static and video images, evaluating grayscale and power Doppler (PD) separately and combined. Thirteen trained rheumatologist ultrasonographers participated in two separate rounds reading 42 images, 15 static and 27 videos, of the 7-joint count [wrist, 2nd and 3rd metacarpophalangeal (MCP), 2nd and 3rd interphalangeal (IPP), 2nd and 5th metatarsophalangeal (MTP) joints]. The images were from six patients with rheumatoid arthritis, performed by one ultrasonographer. Synovitis definition was according to OMERACT. Scoring systems in grayscale and PD separately, and combined (GLOESS-Global OMERACT-EULAR Score System), were reviewed before the exercise. Intra- and inter-reading reliability was calculated with weighted Cohen's kappa, interpreted according to Landis and Koch. Kappa values for inter-reading were good to excellent. The lowest kappa was for GLOESS in static images, and the highest was for the same scoring in videos (k 0.59 and 0.85, respectively). Excellent values were obtained for static PD in 5th MTP joint and for PD video in 2nd MTP joint. Results for GLOESS in general were good to moderate. Poor agreement was observed in 3rd MCP and 3rd IPP in all kinds of images. Intra-reading agreement was greater in grayscale and GLOESS in static images than in videos (k 0.86 vs. 0.77 and k 0.86 vs. 0.71, respectively), but PD agreement was greater in videos than in static images (k 1.0 vs. 0.79). The reliability of the synovitis scoring through static images and videos is in general good to moderate when using grayscale and PD separately or combined.
Reliability of the Crowe und Hartofilakidis classifications used in the assessment of the adult dysplastic hip

Energy Technology Data Exchange (ETDEWEB)

Decking, Ralf; Brunner, Alexander; Puhl, Wolfhart [University of Ulm, Orthopaedic Department, RKU, Ulm (Germany); Decking, Jens [Johannes Gutenberg University School of Medicine, Department of Orthopaedic Surgery, Mainz (Germany); Guenther, Klaus-Peter [University of Ulm, Orthopaedic Department, RKU, Ulm (Germany); University-Hospital Carl Gustav Carus, Department of Orthopaedics, Dresden (Germany)

2006-05-15

To assess the inter-observer and intra-observer reliability of two commonly used radiographic classification systems in the evaluation of hip dysplasia in skeletally mature adults. Three observers with different levels of training independently classified 62 dysplastic hips on 51 standard anteroposterior pelvis radiographs according to the criteria defined by Crowe and by Hartofilakidis. To assess intra-observer reliability, the same radiographs were reviewed 3 months later by the same observers. At the time of the radiographic examination, the mean age of the 51 patients had been 54 years (range 18-82 years). A high correlation concerning the inter- and intra-observer reliability of both systems was demonstrated. Inter-observer reliability displayed a weighted kappa coefficient of 0.82 for the Crowe and 0.75 for the Hartofilakidis classification. Intra-observer reliability showed a kappa coefficient of 0.86 and 0.79, respectively. Both classification systems can be recommended to compare collectives of adult patients with congenital dysplasia of the hip. However, for future clinical practice, it would be advisable to agree on one universally accepted system as a standard in the literature. (orig.)
Reliability and validity of the Turkish version of the Berg Balance Scale.

Science.gov (United States)

Sahin, Fusun; Yilmaz, Figen; Ozmaden, Asli; Kotevolu, Nurdan; Sahin, Tulay; Kuran, Banu

2008-01-01

The purpose of this study was to develop a Turkish version of the Berg Balance Scale (BBS) and assess its reliability and validity. Sixty healthy volunteers older than 65 years were included in the study. Subjects who had a lower extremity amputation, or who were armchair-bound or bedridden, were excluded. After the translation process, the Turkish version of the scale was administered to each participant twice with an interval of 2 weeks. The intraclass correlation coefficient (ICC) was calculated to assess intra- and inter-observer reliability. Cronbach's alpha was calculated to evaluate internal consistency of the total BBS score. Interclass correlation coefficient was calculated to examine test-retest reliability. Convergent validity was assessed by correlating the scale with the Modified Barthel Index (MBI) and the Timed Up and Go Test (TUG). Construct validity was assessed with factor analysis. The mean age of the participants was 77.00±5.67 years (range: 67-92 years). The ICC for intra- and inter-observer reliability was 0.98 (pr=0.67 pr=-0.75 p<0.0001, respectively). The Turkish version of the BBS is a reliable and valid scale to be used in balance assessment of Turkish older adults.
Inter- and Intrarater Reliability Using Different Software Versions of E4D Compare in Dental Education.

Science.gov (United States)

Callan, Richard S; Cooper, Jeril R; Young, Nancy B; Mollica, Anthony G; Furness, Alan R; Looney, Stephen W

2015-06-01

The problems associated with intra- and interexaminer reliability when assessing preclinical performance continue to hinder dental educators' ability to provide accurate and meaningful feedback to students. Many studies have been conducted to evaluate the validity of utilizing various technologies to assist educators in achieving that goal. The purpose of this study was to compare two different versions of E4D Compare software to determine if either could be expected to deliver consistent and reliable comparative results, independent of the individual utilizing the technology. Five faculty members obtained E4D digital images of students' attempts (sample model) at ideal gold crown preparations for tooth #30 performed on typodont teeth. These images were compared to an ideal (master model) preparation utilizing two versions of E4D Compare software. The percent correlations between and within these faculty members were recorded and averaged. The intraclass correlation coefficient was used to measure both inter- and intrarater agreement among the examiners. The study found that using the older version of E4D Compare did not result in acceptable intra- or interrater agreement among the examiners. However, the newer version of E4D Compare, when combined with the Nevo scanner, resulted in a remarkable degree of agreement both between and within the examiners. These results suggest that consistent and reliable results can be expected when utilizing this technology under the protocol described in this study.
Musculoskeletal ultrasound imaging of the plantar forefoot in patients with rheumatoid arthritis: inter-observer agreement between a podiatrist and a radiologist

Directory of Open Access Journals (Sweden)

Bowen Catherine J

2008-07-01

Background: The use of musculoskeletal ultrasound (MSUS) in the diagnosis and management of foot and ankle musculoskeletal pathology is increasing. Due to the wide use of MSUS and the depth and breadth of training required, new proposals advocate tailored learning of the technique to discrete fields of practice. The aim of the study was to evaluate the inter-observer agreement between a MSUS radiologist and a podiatrist, who had completed basic skills training in MSUS, in the MSUS assessment of the forefoot of patients with rheumatoid arthritis. Methods: A consecutive sample of thirty-two patients with rheumatoid arthritis was assessed for presence of synovitis, erosions and bursitis within the forefoot using MSUS. All MSUS assessments were performed independently on the same day by a podiatrist and one of two Consultant Radiologists experienced in MSUS. Results: Moderate agreement on image acquisition and interpretation was achieved for bursitis (kappa 0.522; p [...] Conclusion: This study demonstrated good inter-observer agreement between a podiatrist and radiologist on MSUS assessment of the forefoot, particularly for bursitis and erosions, in patients with rheumatoid arthritis. There is scope to further evaluate and consider the role of podiatrists in the MSUS imaging of the foot following appropriate training and also in the development of reliable protocols for MSUS assessment of the foot.

Assessing the environmental characteristics of cycling routes to school: a study on the reliability and validity of a Google Street View-based audit.

Science.gov (United States)

Vanwolleghem, Griet; Van Dyck, Delfien; Ducheyne, Fabian; De Bourdeaudhuij, Ilse; Cardon, Greet

2014-06-10

Google Street View provides a valuable and efficient alternative to on-site fieldwork for observing the physical environment. However, studies on the use, reliability and validity of Google Street View in a cycling-to-school context are lacking. We aimed to study the intra- and inter-rater reliability and criterion validity of EGA-Cycling (Environmental Google Street View Based Audit - Cycling to school), a newly developed audit using Google Street View to assess the physical environment along cycling routes to school. Parents (n = 52) of 11-to-12-year-old Flemish children, who mostly cycled to school, completed a questionnaire and identified their child's cycling route to school on a street map. Fifty cycling routes of 11-to-12-year-olds were identified and physical environmental characteristics along the identified routes were rated with EGA-Cycling (5 subscales; 37 items), based on Google Street View. To assess reliability, two researchers performed the audit. Criterion validity of the audit was examined by comparing the ratings based on Google Street View with ratings through on-site assessments. Intra-rater reliability was high (kappa range 0.47 to 1.00). Large variations in the inter-rater reliability (kappa range -0.03 to 1.00) and criterion validity scores (kappa range -0.06 to 1.00) were reported, with acceptable inter-rater reliability values for 43% of all items and acceptable criterion validity for 54% of all items. EGA-Cycling can be used to assess physical environmental characteristics along cycling routes to school. However, to assess the micro-environment specifically related to cycling, on-site assessments have to be added.
Feasibility of CBCT-based target and normal structure delineation in prostate cancer radiotherapy: Multi-observer and image multi-modality study

International Nuclear Information System (INIS)

Luetgendorf-Caucig, Carola; Fotina, Irina; Stock, Markus; Poetter, Richard; Goldner, Gregor; Georg, Dietmar

2011-01-01

Background and purpose: In-room cone-beam CT (CBCT) imaging and adaptive treatment strategies are promising methods to decrease target volumes and to spare organs at risk. The aim of this work was to analyze the inter-observer contouring uncertainties of target volumes and organs at risk (OARs) in localized prostate cancer radiotherapy using CBCT images. Furthermore, CBCT contouring was benchmarked against other image modalities (CT, MR) and the influence of subjective image quality perception on inter-observer variability was assessed. Methods and materials: Eight prostate cancer patients were selected. Seven radiation oncologists contoured target volumes and OARs on CT, MRI and CBCT. Volumes, coefficient of variation (COV), conformity index (CIgen), and coordinates of center-of-mass (COM) were calculated for each patient and image modality. Reliability analysis was performed to support the reported findings. Subjective perception of image quality was assessed via a ten-point visual analog scale (VAS). Results: The median volume for prostate was larger on CT compared to MRI and CBCT images. The inter-observer variation for prostate was larger on CBCT (CIgen = 0.57 ± 0.09, 0.61 reliability) compared to CT (CIgen = 0.72 ± 0.07, 0.83 reliability) and MRI (CIgen = 0.66 ± 0.12, 0.87 reliability). On all image modalities values of the intra-observer reliability coefficient (0.97 for CT, 0.99 for MR and 0.94 for CBCT) indicated high reproducibility of results. For all patients the root mean square (RMS) of the inter-observer standard deviation (σ) of the COM was largest on CBCT with σ(x) = 0.4 mm, σ(y) = 1.1 mm, and σ(z) = 1.7 mm. The concordance in delineating OARs was much stronger than for target volumes, with average CIgen > 0.70 for rectum and CIgen > 0.80 for bladder. Positive correlations between CIgen and VAS score of the image quality were observed for the prostate, seminal vesicles and rectum. Conclusions: Inter-observer variability for target
Reliability and Validity of the Dyadic Observed Communication Scale (DOCS).

Science.gov (United States)

Hadley, Wendy; Stewart, Angela; Hunter, Heather L; Affleck, Katelyn; Donenberg, Geri; Diclemente, Ralph; Brown, Larry K

2013-02-01

We evaluated the reliability and validity of the Dyadic Observed Communication Scale (DOCS) coding scheme, which was developed to capture a range of communication components between parents and adolescents. Adolescents and their caregivers were recruited from mental health facilities for participation in a large, multi-site family-based HIV prevention intervention study. Seventy-one dyads were randomly selected from the larger study sample and coded using the DOCS at baseline. Preliminary validity and reliability of the DOCS were examined using various methods, such as comparing results to self-report measures and examining interrater reliability. Results suggest that the DOCS is a reliable and valid measure of observed communication among parent-adolescent dyads that captures both verbal and nonverbal communication behaviors that are typical intervention targets. The DOCS is a viable coding scheme for use by researchers and clinicians examining parent-adolescent communication. Coders can be trained to reliably capture individual and dyadic components of communication for parents and adolescents, and this complex information can be obtained relatively quickly.
The Inter-Reflexive Possibilities of Dual Observations: An Account from and through Experience

Science.gov (United States)

Barrett, Margaret S.; Mills, Janet

2009-01-01

In this article, we explore the methodological possibilities of dual observation and "inter-reflexive" interpretation as we have experienced this in a longitudinal ethnographic case study of music teaching and learning in an English cathedral choir school. Our intent here is to understand the ways in which our particular historical,…
The reliability of knee joint position testing using electrogoniometry

Directory of Open Access Journals (Sweden)

Winter Adele

2008-01-01

Background: The current investigation examined the inter- and intra-tester reliability of knee joint angle measurements using a flexible Penny and Giles Biometric® electrogoniometer. The clinical utility of electrogoniometry was also addressed. Methods: The first study examined the inter- and intra-tester reliability of measurements of knee joint angles in supine, sitting and standing in 35 healthy adults. The second study evaluated inter-tester and intra-tester reliability of knee joint angle measurements in standing and after walking 10 metres in 20 healthy adults, using an enhanced measurement protocol with a more detailed electrogoniometer attachment procedure. Both inter-tester reliability studies involved two testers. Results: In the first study, inter-tester reliability (ICC[2,10]) ranged from 0.58–0.71 in supine, 0.68–0.79 in sitting and 0.57–0.80 in standing. The standard error of measurement between testers was less than 3.55° and the limits of agreement ranged from -12.51° to 12.21°. Reliability coefficients for intra-tester reliability (ICC[3,10]) ranged from 0.75–0.76 in supine, 0.86–0.87 in sitting and 0.87–0.88 in standing. The standard error of measurement for repeated measures by the same tester was less than 1.7° and the limits of agreement ranged from -8.13° to 7.90°. The second study showed that using a more detailed electrogoniometer attachment protocol reduced the error of measurement between testers to 0.5°. Conclusion: Using a standardised protocol, reliable measures of knee joint angles can be gained in standing, supine and sitting by using a flexible goniometer.
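The limits of agreement quoted in this record are conventionally computed with the Bland-Altman approach: the mean difference between two testers plus or minus 1.96 standard deviations of the differences. The sketch below assumes that convention, which the abstract does not spell out; the function name `limits_of_agreement` and the angle values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(tester_a, tester_b, z=1.96):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    if len(tester_a) != len(tester_b) or len(tester_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(tester_a, tester_b)]
    bias = mean(differences)        # systematic offset between testers
    spread = stdev(differences)     # sample SD of the paired differences
    return bias - z * spread, bias, bias + z * spread


# Hypothetical knee angles in degrees from two testers, for illustration only.
angles_tester_1 = [10.0, 12.0, 14.0, 16.0]
angles_tester_2 = [9.0, 13.0, 13.0, 17.0]
lower, bias, upper = limits_of_agreement(angles_tester_1, angles_tester_2)
print(f"bias={bias:.2f} deg, limits of agreement=({lower:.2f}, {upper:.2f}) deg")
```

Unlike an ICC, the limits are expressed in the measurement's own units, so they show directly how far two testers' readings on the same knee can be expected to differ.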
Comparing the Meggitt-Wagner and the University of Texas wound classification systems for diabetic foot ulcers: inter-observer analyses

NARCIS (Netherlands)

Santema, Trientje B.; Lenselink, Ellie A.; Balm, Ron; Ubbink, Dirk T.

2016-01-01

Accurate classification of diabetic foot ulcers is essential for inter-clinician communication, assessment of healing tendency and determination of treatment options. The aim of this study was to assess the inter-observer agreement (IOA) of the most commonly used classification systems for diabetic
Repeated stimulation, inter-stimulus interval and inter-electrode distance alters muscle contractile properties as measured by Tensiomyography.

Directory of Open Access Journals (Sweden)

Hannah V Wilson

The influence of methodological parameters on the measurement of muscle contractile properties using Tensiomyography (TMG) has not been published. To investigate: (1) the reliability of stimulus amplitude needed to elicit maximum muscle displacement (Dm), (2) the effect of changing inter-stimulus interval on Dm (using a fixed stimulus amplitude) and contraction time (Tc), (3) the effect of changing inter-electrode distance on Dm and Tc. Within subject, repeated measures. 10 participants for each objective. Dm and Tc of the rectus femoris, measured using TMG. The coefficient of variance (CV) and the intra-class correlation (ICC) of stimulus amplitude needed to elicit maximum Dm were 5.7% and 0.92, respectively. Dm was higher when using an inter-electrode distance of 7 cm compared to 5 cm [P = 0.03] and when using an inter-stimulus interval of 10 s compared to 30 s [P = 0.017]. Further analysis of inter-stimulus interval data found that during 10 repeated stimuli Tc became faster after the 5th measure when compared to the second measure [P<0.05]. The 30 s inter-stimulus interval produced the most stable Tc over 10 measures compared to 10 s and 5 s, respectively. Our data suggest that the stimulus amplitude producing maximum Dm of the rectus femoris is reliable. Inter-electrode distance and inter-stimulus interval can significantly influence Dm and/or Tc. Our results support the use of a 30 s inter-stimulus interval over 10 s or 5 s. Future studies should determine the influence of methodological parameters on muscle contractile properties in a range of muscles.
Repeated stimulation, inter-stimulus interval and inter-electrode distance alters muscle contractile properties as measured by Tensiomyography

Science.gov (United States)

Johnson, Mark I.; Francis, Peter

2018-01-01

Context: The influence of methodological parameters on the measurement of muscle contractile properties using Tensiomyography (TMG) has not been published. Objective: To investigate: (1) the reliability of stimulus amplitude needed to elicit maximum muscle displacement (Dm), (2) the effect of changing inter-stimulus interval on Dm (using a fixed stimulus amplitude) and contraction time (Tc), (3) the effect of changing inter-electrode distance on Dm and Tc. Design: Within subject, repeated measures. Participants: 10 participants for each objective. Main outcome measures: Dm and Tc of the rectus femoris, measured using TMG. Results: The coefficient of variance (CV) and the intra-class correlation (ICC) of stimulus amplitude needed to elicit maximum Dm were 5.7% and 0.92, respectively. Dm was higher when using an inter-electrode distance of 7 cm compared to 5 cm [P = 0.03] and when using an inter-stimulus interval of 10 s compared to 30 s [P = 0.017]. Further analysis of inter-stimulus interval data found that during 10 repeated stimuli Tc became faster after the 5th measure when compared to the second measure [P<0.05]. The 30 s inter-stimulus interval produced the most stable Tc over 10 measures compared to 10 s and 5 s, respectively. Conclusion: Our data suggest that the stimulus amplitude producing maximum Dm of the rectus femoris is reliable. Inter-electrode distance and inter-stimulus interval can significantly influence Dm and/or Tc. Our results support the use of a 30 s inter-stimulus interval over 10 s or 5 s. Future studies should determine the influence of methodological parameters on muscle contractile properties in a range of muscles. PMID: 29451885
Intra- and inter-observer agreement in MRI assessment of rotator cuff healing using the Sugaya classification 10 years after surgery.

Science.gov (United States)

Niglis, L; Collin, P; Dosch, J-C; Meyer, N; Kempf, J-F

2017-10-01

The long-term outcomes of rotator cuff repair are unclear. Recurrent tears are common, although their reported frequency varies depending on the type and interpretation challenges of the imaging method used. The primary objective of this study was to assess the intra- and inter-observer reproducibility of the MRI assessment of rotator cuff repair using the Sugaya classification 10 years after surgery. The secondary objective was to determine whether poor reproducibility, if found, could be improved by using a simplified yet clinically relevant classification. Our hypothesis was that reproducibility was limited but could be improved by simplifying the classification. In a retrospective study, we assessed intra- and inter-observer agreement in interpreting 49 magnetic resonance imaging (MRI) scans performed 10 years after rotator cuff repair. These 49 scans were taken at random among 609 cases that underwent re-evaluation, with imaging, for the 2015 SoFCOT symposium on 10-year and 20-year clinical and anatomical outcomes of rotator cuff repair for full-thickness tears. Each of three observers read each of the 49 scans on two separate occasions. At each reading, they assessed the supraspinatus tendon according to the five types of the Sugaya classification. Intra-observer agreement for the Sugaya type was substantial (κ=0.64) but inter-observer agreement was only fair (κ=0.39). Agreement improved when the five Sugaya types were collapsed into two categories (1-2-3 and 4-5) (intra-observer κ=0.74 and inter-observer κ=0.68). Using the Sugaya classification to assess post-operative rotator cuff healing was associated with substantial intra-observer and fair inter-observer agreement. A simpler classification into two categories improved agreement while remaining clinically relevant. Level of evidence: II, prospective randomised low-power study. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
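The effect of collapsing the five Sugaya types into two categories can be sketched with an unweighted Cohen's kappa for one pair of readers. This is an illustration under assumptions: the abstract does not say which kappa variant was used or how the three observers were pooled, and the function names and readings below are hypothetical rather than study data.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers classifying the same scans."""
    n = len(reader_a)
    if n == 0 or n != len(reader_b):
        raise ValueError("both readers must classify the same non-empty set of scans")
    p_observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    # Agreement expected by chance from each reader's marginal frequencies.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (p_observed - p_expected) / (1.0 - p_expected)


def collapse_sugaya(types):
    """Map Sugaya types 1-5 onto the two categories named in the abstract."""
    return ["1-2-3" if t <= 3 else "4-5" for t in types]


# Hypothetical readings of eight scans, for illustration only.
reader_1 = [1, 2, 3, 4, 5, 2, 3, 4]
reader_2 = [1, 3, 2, 4, 5, 2, 3, 5]
print(round(cohen_kappa(reader_1, reader_2), 3))
print(round(cohen_kappa(collapse_sugaya(reader_1), collapse_sugaya(reader_2)), 3))
```

In this toy data every disagreement falls between neighbouring types inside the same collapsed category, so merging removes them all; real data would usually improve less sharply.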
Validity and Reliability of the Clinical Competency Evaluation Instrument for Use among Physiotherapy Students: Pilot study.

Science.gov (United States)

Muhamad, Zailani; Ramli, Ayiesah; Amat, Salleh

2015-05-01

The aim of this study was to determine the content validity, internal consistency, test-retest reliability and inter-rater reliability of the Clinical Competency Evaluation Instrument (CCEVI) in assessing the clinical performance of physiotherapy students. This study was carried out between June and September 2013 at Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur, Malaysia. A panel of 10 experts was identified to establish content validity by evaluating and rating each of the items used in the CCEVI with regard to their relevance in measuring students' clinical competency. A total of 50 UKM undergraduate physiotherapy students were assessed throughout their clinical placement to determine the construct validity of these items. The instrument's reliability was determined through a cross-sectional study involving a clinical performance assessment of 14 final-year undergraduate physiotherapy students. The content validity index of the entire CCEVI was 0.91, while the proportion of agreement on the content validity indices ranged from 0.83-1.00. The CCEVI construct validity was established with factor loading of ≥0.6, while internal consistency (Cronbach's alpha) overall was 0.97. Test-retest reliability of the CCEVI was confirmed with a Pearson's correlation range of 0.91-0.97 and an intraclass correlation coefficient range of 0.95-0.98. Inter-rater reliability of the CCEVI domains ranged from 0.59 to 0.97 on initial and subsequent assessments. This pilot study confirmed the content validity of the CCEVI. It showed high internal consistency and provided evidence that the CCEVI has moderate to excellent inter-rater reliability. However, additional refinement in the wording of the CCEVI items, particularly in the domains of safety and documentation, is recommended to further improve the validity and reliability of the instrument.
Reproducibility of tender point examination in chronic low back pain patients as measured by intrarater and inter-rater reliability and agreement

DEFF Research Database (Denmark)

Jensen, Ole Kudsk; Callesen, Jacob; Nielsen, Merete Graakjaer

2013-01-01

back examination and return-to-work intervention, 43 and 39 patients, respectively (18 women, 46%) entered and completed the study. MAIN OUTCOME MEASURES: The reliability was estimated by the intraclass correlation coefficient (ICC), and agreement was calculated for up to ±3 TPs. Furthermore, the smallest detectable difference was calculated. RESULTS: TP examination was performed twice by two consultants in rheumatology and rehabilitation at 20 min intervals and repeated 1 week later. Intrarater reliability in the more and less experienced rater was ICC 0.84 (95% CI 0.69 to 0.98) and 0.72 (95% CI 0.49 to 0.95), respectively. The figures for inter-rater reliability were intermediate between these figures. In more than 70% of the cases, the raters agreed within ±3 TPs in both men and women and between test days. The smallest detectable difference between raters was 5, and for the more and less...
Reliability analysis for manual radiographic measures of rotatory subluxation or lateral listhesis in adult scoliosis.

Science.gov (United States)

Freedman, Brett A; Horton, William C; Rhee, John M; Edwards, Charles C; Kuklo, Timothy R

2009-03-15

Retrospective observational study. To define the inter- and intraobserver reliability of 3 measures of rotatory subluxation (RS) in adult scoliosis (AS). RS is a hallmark of AS. To accurately track this measure, one must know its reliability. Reliability testing has not been performed. PA 36" films of 29 AS patients were collected from one surgeon's practice. Three observers on 2 separate occasions measured all levels with ≥3-mm RS (60 levels, 360 measurements) on the convexity of the involved segment using 3 different techniques: midbody (MB), endplate (EP), and centroid (C). These data were then analyzed to determine the intraclass correlation coefficient (ICC) for inter- and intraobserver reliability. The thoracolumbar/lumbar curve (average 58 degrees) was the major curve for the majority (62%) of patients. RS at L3/4 was most common (35%). The overall inter- and intraobserver reliability was good-excellent for all methods, but the centroid method consistently had the highest ICC. ICC correlated with observer experience. Moderate-severe arthritic change (present in 55%) and poor image quality (52%) decreased ICC, but it still remained good-excellent for each measure. The reproducibility coefficient for each measure was 4 mm for MB and 2.8 mm for C and EP. MB, EP, and C are reliable techniques to measure RS even in elderly arthritic spines, but the methods inherently produce different values for a given level. The centroid method is most reliable and least influenced by experience. The EP method is easy to perform and very reliable. Spine surgeons should pick their preferred method and apply it consistently. Changes >3 mm suggest RS progression. RS may be a useful measure in addition to Cobb angle in AS. Having defined measurement reliability, the role of RS progression in surgical indications and patient outcomes can be evaluated.
Testing inter-observer reliability of the Transition Analysis aging method on the William M. Bass forensic skeletal collection.

Science.gov (United States)

Fojas, Christina L; Kim, Jieun; Minsky-Rowland, Jocelyn D; Algee-Hewitt, Bridget F B

2018-01-01

Skeletal age estimation is an integral part of the biological profile. Recent work shows how multiple-trait approaches better capture senescence as it occurs at different rates among individuals. Furthermore, a Bayesian statistical framework of analysis provides more useful age estimates. The component-scoring method of Transition Analysis (TA) may resolve many of the functional and statistical limitations of traditional phase-aging methods and is applicable to both paleodemography and forensic casework. The present study contributes to TA-research by validating TA for multiple, differently experienced observers using a collection of modern forensic skeletal cases. Five researchers independently applied TA to a random sample of 58 documented individuals from the William M. Bass Forensic Skeletal Collection, for whom knowledge of chronological age was withheld. Resulting scores were input into the ADBOU software and maximum likelihood estimates (MLEs) and 95% confidence intervals (CIs) were produced using the forensic prior. Krippendorff's alpha was used to evaluate interrater reliability and agreement. Inaccuracy and bias were measured to gauge the magnitude and direction of difference between estimated ages and chronological ages among the five observers. The majority of traits had moderate to excellent agreement among observers (≥0.6). The superior surface morphology had the least congruence (0.4), while the ventral symphyseal margin had the most (0.9) among scores. Inaccuracy was the lowest for individuals younger than 30 and the greatest for individuals over 60. Consistent over-estimation of individuals younger than 30 and under-estimation of individuals over 40 years old occurred. Individuals in their 30s showed a mixed pattern of under- and over-estimation among observers. These results support the use of the TA method by researchers of varying experience levels. Further, they validate its use on forensic cases, given the low error overall. © 2017 Wiley
The Pooling-score (P-score): inter- and intra-rater reliability in endoscopic assessment of the severity of dysphagia.

Science.gov (United States)

Farneti, D; Fattori, B; Nacci, A; Mancini, V; Simonelli, M; Ruoppolo, G; Genovese, E

2014-04-01

This study evaluated the intra- and inter-rater reliability of the Pooling score (P-score) in clinical endoscopic evaluation of severity of swallowing disorder, considering excess residue in the pharynx and larynx. The score (minimum 4 - maximum 11) is obtained by the sum of the scores given to the site of the bolus, the amount and ability to control residue/bolus pooling, the latter assessed on the basis of cough, raclage, number of dry voluntary or reflex swallowing acts ( 5). Four judges evaluated 30 short films of pharyngeal transit of 10 solid (1/4 of a cracker), 11 creamy (1 tablespoon of jam) and 9 liquid (1 tablespoon of 5 cc of water coloured with methylene blue, 1 ml in 100 ml) boluses in 23 subjects (10 M/13 F, age from 31 to 76 years, mean age 58.56±11.76 years) with different pathologies. The films were randomly distributed on two CDs, which differed in terms of the sequence of the films, and were given to judges (after an explanatory session) at time 0, 24 hours later (time 1) and after 7 days (time 2). The inter- and intra-rater reliability of the P-score was calculated using the intra-class correlation coefficient, ICC(3,k). The possibility that consistency of boluses could affect the scoring of the films was considered. The ICC for site, amount, management and the P-score total was found to be, respectively, 0.999, 0.997, 1.00 and 0.999. Clinical evaluation of a criterion of severity of a swallowing disorder remains a crucial point in the management of patients with pathologies that predispose to complications. The P-score, derived from static and dynamic parameters, yielded a very high correlation among the scores attributed by the four judges during observations carried out at different times. Bolus consistencies did not affect the outcome of the test: the analysis of variance, performed to verify whether the scores attributed by the four judges to the selected parameters might be influenced by the different consistencies of the boluses, was not
Correction of gene expression data: Performance-dependency on inter-replicate and inter-treatment biases.

Science.gov (United States)

Darbani, Behrooz; Stewart, C Neal; Noeparvar, Shahin; Borg, Søren

2014-10-20

This report investigates for the first time the potential inter-treatment bias source of cell number for gene expression studies. Cell-number bias can affect gene expression analysis when comparing samples with unequal total cellular RNA content or with different RNA extraction efficiencies. For maximal reliability of analysis, therefore, comparisons should be performed at the cellular level. This could be accomplished using an appropriate correction method that can detect and remove the inter-treatment bias for cell-number. Based on inter-treatment variations of reference genes, we introduce an analytical approach to examine the suitability of correction methods by considering the inter-treatment bias as well as the inter-replicate variance, which allows use of the best correction method with minimum residual bias. Analyses of RNA sequencing and microarray data showed that the efficiencies of correction methods are influenced by the inter-treatment bias as well as the inter-replicate variance. Therefore, we recommend inspecting both of the bias sources in order to apply the most efficient correction method. As an alternative correction strategy, sequential application of different correction approaches is also advised. Copyright © 2014 Elsevier B.V. All rights reserved.
Reliability of maximal isometric knee strength testing with modified hand-held dynamometry in patients awaiting total knee arthroplasty: useful in research and individual patient settings? A reliability study

Directory of Open Access Journals (Sweden)

Koblbauer Ian FH

2011-10-01

Background: Patients undergoing total knee arthroplasty (TKA) often experience strength deficits both pre- and post-operatively. As these deficits may have a direct impact on functional recovery, strength assessment should be performed in this patient population. For these assessments, reliable measurements should be used. This study aimed to determine the inter- and intrarater reliability of hand-held dynamometry (HHD) in measuring isometric knee strength in patients awaiting TKA. Methods: To determine interrater reliability, 32 patients (81.3% female) were assessed by two examiners. Patients were assessed consecutively by both examiners on the same individual test dates. To determine intrarater reliability, a subgroup (n = 13) was again assessed by the examiners within four weeks of the initial testing procedure. Maximal isometric knee flexor and extensor strength were tested using a modified Citec hand-held dynamometer. Both the affected and unaffected knee were tested. Reliability was assessed using the Intraclass Correlation Coefficient (ICC). In addition, the Standard Error of Measurement (SEM) and the Smallest Detectable Difference (SDD) were used to determine reliability. Results: In both the affected and unaffected knee, the inter- and intrarater reliability were good for knee flexors (ICC range 0.76-0.94) and excellent for knee extensors (ICC range 0.92-0.97). However, measurement error was high, displaying SDD ranges between 21.7% and 36.2% for interrater reliability and between 19.0% and 57.5% for intrarater reliability. Overall, measurement error was higher for the knee flexors than for the knee extensors. Conclusions: Modified HHD appears to be a reliable strength measure, producing good to excellent ICC values for both inter- and intrarater reliability in a group of TKA patients. High SEM and SDD values, however, indicate high measurement error for individual measures. This study demonstrates that a modified HHD is appropriate to
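The contrast in this record between high ICC values and large SDD percentages follows from how the two error measures are commonly derived. The sketch below assumes the widely used definitions SEM = SD × √(1 − ICC) and SDD = 1.96 × √2 × SEM, with SDD% expressed relative to the mean score; the abstract does not state the exact formulas, and the function names and numbers are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and the ICC (assumed formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_difference(sem, z=1.96):
    """Smallest change exceeding measurement error at roughly the 95% level."""
    return z * math.sqrt(2.0) * sem


def as_percent_of_mean(value, grand_mean):
    """Express an absolute error as a percentage of the mean score."""
    return 100.0 * value / grand_mean


# Hypothetical strength data in newtons, for illustration only.
sem = standard_error_of_measurement(sd=40.0, icc=0.90)
sdd = smallest_detectable_difference(sem)
print(f"SEM={sem:.1f} N, SDD={sdd:.1f} N, SDD%={as_percent_of_mean(sdd, 150.0):.1f}%")
```

Because the SDD scales with the between-subject SD, a heterogeneous sample can produce a high ICC and still leave individual measurements too noisy to detect modest changes.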
How to assess intra- and inter-observer agreement with quantitative PET using variance component analysis

DEFF Research Database (Denmark)

Gerke, Oke; Vilstrup, Mie Holm; Segtnan, Eivind Antonsen

2016-01-01

(THG) in study 2. RESULTS: In study 1, we found a RC of 2.46 equalling half the width of the Bland-Altman limits of agreement. In study 2, the RC for identical conditions (same scanner, patient, time point, and observer) was 2392; allowing for different scanners increased the RC to 2543. Inter...
Relationship and inter observer agreement of tooth and face forms in a Saudi subpopulation.

Science.gov (United States)

Habib, Syed Rashid; Shiddi, Ibraheem Al; Al-Sufyani, Mohammed D; Althobaiti, Fahad A

2015-04-01

To determine the relationship of tooth form with face form as judged by different observers, and to investigate the inter-observer agreement on tooth forms, face forms, and their relationship among male Saudis. A comparative cross-sectional study was conducted at the Department of Prosthodontics, College of Dentistry, King Saud University, Riyadh, KSA, from February to August 2013. Ninety-four male participants aged 18 - 35 years were randomly recruited for the study. Full-face and anterior teeth (intraoral) digital photographs in the frontal plane were recorded. The outline tracings of the face and the tooth were obtained using AutoCAD (version 2010) software. The outline of the tooth was enlarged proportionately, without altering the length to width ratio, to fit the face outline. The outlines were then evaluated visually by 6 prosthodontists and the results were tabulated. The most common type of face form (49.65%) and tooth form (56.38%) was square tapering. Using the visual method, the observers found a good relationship (31.41%), moderate relationship (35.31%), weak relationship (19.68%) and no relationship (13.65%) between the tooth form and face form. Overall kappa for inter-observer agreement on face form, tooth form and their relationship was 0.24, 0.17 and 0.26 respectively. The kappa values showed a fair agreement between the observers. The study results indicated that there was no highly defined relationship between the tooth form and face form in the studied Saudi subpopulation. A fair agreement was found between the observers for classifying the tooth forms, face forms and their relationship.
WhatsApp Messenger is useful and reproducible in the assessment of tibial plateau fractures: inter- and intra-observer agreement study.

Science.gov (United States)

Giordano, Vincenzo; Koch, Hilton Augusto; Mendes, Carlos Henrique; Bergamin, André; de Souza, Felipe Serrão; do Amaral, Ney Pecegueiro

2015-02-01

The aim of this study was to evaluate the inter- and intra-observer agreement in the initial diagnosis and classification by means of plain radiographs and CT scans of tibial plateau fractures photographed and sent via WhatsApp Messenger. The increasing popularity of smartphones has driven the development of technology for data transmission and imaging and generated a growing interest in the use of these devices as diagnostic tools. The emergence of WhatsApp Messenger technology, which is available for various platforms used by smartphones, has led to an improvement in the quality and resolution of images sent and received. The images (plain radiographs and CT scans) were obtained from 13 cases of tibial plateau fractures using the iPhone 5 (Apple Inc., Cupertino, CA, USA) and were sent to six observers via the WhatsApp Messenger application. The observers were asked to determine the standard deviation and type of injury, the classification according to the Schatzker and the Luo classifications schemes, and whether the CT scan changed the classification. The six observers independently assessed the images on two separate occasions, 15 days apart. The inter- and intra-observer agreement for both periods of the study ranged from excellent to perfect (0.75WhatsApp Messenger. The authors now propose the systematic use of the application to facilitate faster documentation and obtaining the opinion of an experienced consultant when not on call. Finally, we think the use of the WhatsApp Messenger as an adjuvant tool could be broadened to other clinical centres to assess its viability in other skeletal and non-skeletal trauma situations. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
A national drug related problems database: evaluation of use in practice, reliability and reproducibility

DEFF Research Database (Denmark)

Kjeldsen, Lene Juel; Birkholm, Trine; Fischer, Hanne Lis

2014-01-01

Background A drug related problems database (DRP-database) was developed on request by clinical pharmacists. The information from the DRP-database has only been used locally e.g. to identify focus areas and to communicate identified DRPs to the hospital wards. Hence the quality of the data...... by clinical pharmacists with categorization performed by the project group. Reproducibility was explored by re-categorization of a sample of existing records in the DRP-database by two project group members individually. Main outcome measures Observed proportion of agreement and Fleiss' kappa as measures...... reliability study of 34 clinical pharmacists showed high inter-rater reliability with the project group (Fleiss' kappa = 0.79 with 95 % CI (0.70; 0.88)), and the reproducibility study also documented high inter-rater reliability of a sample of 379 records from the DRP-database re-categorized by two project...

MRI assessment of tenosynovitis in children with juvenile idiopathic arthritis: inter- and intra-observer variability

International Nuclear Information System (INIS)

Lambot, Karen; Brunelle, Francis; Boavida, Peter; Damasio, Maria Beatrice; Tanturri de Horatio, Laura; Barbuti, Domenico; Desgranges, Marie; Bader-Meunier, Brigitte; Quartier, Pierre; Malattia, Clara; Bracaglia, Claudia; Ording Mueller, Lil-Sofie; Elie, Caroline; Rosendahl, Karen

2013-01-01

There is sparse knowledge about grading tenosynovitis using MRI. The purpose of this study was to assess the reliability of a tenosynovitis MRI scoring system in juvenile idiopathic arthritis. Children with juvenile idiopathic arthritis and wrist involvement were enrolled in two paediatric centres, from October 2006 to January 2010. The extensor (compartments II, IV and VI) and flexor tendons were assessed for the presence of tenosynovitis on T1-weighted postcontrast fat-saturated MR images and were scored from 0 (normal) to 2 (moderate to severe) by two observers independently. Intra- and interobserver agreement was assessed. Ninety children (age range: 5-18.5 years) were included, of whom 34 had tenosynovitis involving extensors and 28 had tenosynovitis involving flexors. A total of 360 tendon areas were analysed, of which 114 had tenosynovitis (86/270 extensors and 28/90 flexors). Intra-reader 1 agreement was excellent for the extensors (k = 0.82-0.91) and for the flexors (k = 0.85); intra-reader 2 agreement was moderate to good for the extensors (k = 0.51-0.72) and good for the flexors (k = 0.64). Inter-reader agreement was good for the extensors (k = 0.69-0.73) and moderate for the flexors (k = 0.49). The proposed MRI scoring system for the assessment of wrist tenosynovitis in juvenile idiopathic arthritis appears feasible with an observer agreement sufficient for clinical use. (orig.)
MRI assessment of tenosynovitis in children with juvenile idiopathic arthritis: inter- and intra-observer variability

Energy Technology Data Exchange (ETDEWEB)

Lambot, Karen; Brunelle, Francis [Hopital Necker-Enfants Malades, Department of Paediatric Radiology, Paris (France); Boavida, Peter [Great Ormond Street Hospital, Department of Radiology, London (United Kingdom); Damasio, Maria Beatrice [Ospedale Pediatrico Gaslini, Department of Radiology, Genoa (Italy); Tanturri de Horatio, Laura; Barbuti, Domenico [Ospedale Pediatrico Bambino Gesu, Department of Radiology, Rome (Italy); Desgranges, Marie; Bader-Meunier, Brigitte; Quartier, Pierre [Hopital Necker-Enfants Malades, Department of Paediatric Immunology, Hematology and Rheumatology, APHP French Reference Center "Arthrites juveniles", Paris (France); Malattia, Clara [University of Genoa, Department of Paediatrics, Genoa (Italy); Bracaglia, Claudia [Ospedale Pediatrico Bambino Gesu, Department of Paediatrics, Rome (Italy); Ording Mueller, Lil-Sofie [Great Ormond Street Hospital, Department of Radiology, London (United Kingdom); University Hospital of North Norway, Department of Radiology, Tromsoe (Norway); Elie, Caroline [Paris Descartes University, Department of Biostatistics, Hopital Necker-Enfants Malades, Paris (France); Rosendahl, Karen [Great Ormond Street Hospital, Department of Radiology, London (United Kingdom); Haukeland University Hospital, Department of Radiology, Bergen (Norway)

2013-07-15

There is sparse knowledge about grading tenosynovitis using MRI. The purpose of this study was to assess the reliability of a tenosynovitis MRI scoring system in juvenile idiopathic arthritis. Children with juvenile idiopathic arthritis and wrist involvement were enrolled in two paediatric centres, from October 2006 to January 2010. The extensor (compartments II, IV and VI) and flexor tendons were assessed for the presence of tenosynovitis on T1-weighted postcontrast fat-saturated MR images and were scored from 0 (normal) to 2 (moderate to severe) by two observers independently. Intra- and interobserver agreement was assessed. Ninety children (age range: 5-18.5 years) were included, of whom 34 had tenosynovitis involving extensors and 28 had tenosynovitis involving flexors. A total of 360 tendon areas were analysed, of which 114 had tenosynovitis (86/270 extensors and 28/90 flexors). Intra-reader 1 agreement was excellent for the extensors (k = 0.82-0.91) and for the flexors (k = 0.85); intra-reader 2 agreement was moderate to good for the extensors (k = 0.51-0.72) and good for the flexors (k = 0.64). Inter-reader agreement was good for the extensors (k = 0.69-0.73) and moderate for the flexors (k = 0.49). The proposed MRI scoring system for the assessment of wrist tenosynovitis in juvenile idiopathic arthritis appears feasible with an observer agreement sufficient for clinical use. (orig.)
Evaluating team-based inter-professional advanced life support training in intensive care-a prospective observational study.

Science.gov (United States)

Brewster, D J; Barrett, J A; Gherardin, E; O'Neill, J A; Sage, D; Hanlon, G

2017-01-01

Recent focus on national standards within Australian hospitals has prompted a focus on the training of our staff in advanced life support (ALS). Research in critical care nursing has questioned the traditional annual certification of ALS competence as the best method of delivering this training. Simulation and team-based training may provide better ALS education to intensive care unit (ICU) staff. Our new inter-professional team-based advanced life support program involved ICU staff in a large private metropolitan ICU. A prospective observational study using three standardised questionnaires and two multiple choice questionnaire assessments was conducted. Ninety-nine staff demonstrated a 17.8% (95% confidence interval 4.2-31, P =0.01) increase in overall ICU nursing attendance at training sessions. Questionnaire response rates were 93 (94%), 99 (100%) and 60 (61%) respectively; 51 (52%) staff returned all three. Criteria were assessed by scores from 0 to 10. Nurses reported improved satisfaction with the education program (9.4 to 7.1, P versus 7.9 and 8.2, P versus 7.4 and 7.8, P versus 8.1, P =0.04). The new program cost approximately an extra $16,500 in nursing salaries. We concluded that team-based, inter-professional ALS training produced statistically significant improvements in nursing attendance, satisfaction with ALS education, confidence and role understanding compared to traditional ALS training.
Inter-Reader Reliability of Early FDG-PET/CT Response Assessment Using the Deauville Scale after 2 Cycles of Intensive Chemotherapy (OEPA) in Hodgkin's Lymphoma.

LENUS (Irish Health Repository)

Kluge, Regine

2016-01-01

The five point Deauville (D) scale is widely used to assess interim PET metabolic response to chemotherapy in Hodgkin lymphoma (HL) patients. An International Validation Study reported good concordance among reviewers in ABVD treated advanced stage HL patients for the binary discrimination between score D1,2,3 and score D4,5. Inter-reader reliability of the whole scale is not well characterised.
Prediction of Osteoporosis through Radiographic Assessment of Proximal Femoral Morphology and Texture in Elderly; is it Valid and Reliable

Directory of Open Access Journals (Sweden)

Özkan Köse

2015-08-01

Objective: The purpose of this study was to determine the best predictive radiographic measurement method to identify the presence of osteoporosis and test the inter-observer and intra-observer reliability and validity of these methods in postmenopausal women. Materials and Methods: Ninety-two elderly female patients who presented with hip pain were included. Hip radiographs were used to determine the values of Singh index (SI), canal-to-calcar ratio (CCR), and cortical thickness index (CTI). All measurements were performed by two independent observers on two separate occasions, at least 4 weeks apart. Bone mineral density (BMD) was assessed by DEXA. In the first part of the analysis, reliability of all the measurement methods was tested. In the second part, the correlation coefficient (Pearson r) was used to determine the relationship between the measurement methods and BMD. Finally, ROC curve analysis was performed to determine the sensitivity, specificity, and threshold values for each radiographic measurement method. Results: Intra-observer reliability analysis of SI revealed kappa coefficient of 0.359 for observer A, and 0.224 for observer B. Inter-observer reliability analysis of SI revealed kappa coefficient of 0.070 for observer A and 0.051 for observer B. The intra-observer and inter-observer reliability was good and excellent for CTI and CCR for both observers (ICC: 0.920 and ICC: 0.936). There was no correlation between SI and BMD (p=0.818). On the other hand, there was a significant correlation between CTI and CCR and BMD (p=0.001). All measured indices were significantly different (p<0.05) between osteoporotic and non-osteoporotic patients. CTI value less than 0.3 or CCR value less than 0.47 reflects the presence of osteoporosis with 100% sensitivity and 98% specificity. Conclusion: SI is not reliable and does not correlate with BMD. However, both CTI and CCR showed good and excellent reliability, and each index correlated well with the real BMD
Inter-rater reliability of h-index scores calculated by Web of Science and Scopus for clinical epidemiology scientists.

Science.gov (United States)

Walker, Benjamin; Alavifard, Sepand; Roberts, Surain; Lanes, Andrea; Ramsay, Tim; Boet, Sylvain

2016-06-01

We investigated the inter-rater reliability of Web of Science (WoS) and Scopus when calculating the h-index of 25 senior scientists in the Clinical Epidemiology Program of the Ottawa Hospital Research Institute. Bibliometric information and the h-indices for the subjects were computed by four raters using the automatic calculators in WoS and Scopus. Correlation and agreement between ratings were assessed using Spearman's correlation coefficient and a Bland-Altman plot, respectively. Data could not be gathered from Google Scholar due to feasibility constraints. The Spearman's rank correlation between the h-index of scientists calculated with WoS was 0.81 (95% CI 0.72-0.92) and with Scopus was 0.95 (95% CI 0.92-0.99). The Bland-Altman plot showed no significant rater bias in WoS and Scopus; however, the agreement between ratings was higher in Scopus than in WoS. Our results showed a stronger relationship and increased agreement between raters when calculating the h-index of a scientist using Scopus compared to WoS. The higher inter-rater reliability and simple user interface used in Scopus may render it the more effective database when calculating the h-index of senior scientists in epidemiology. © 2016 Health Libraries Group.
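As an illustration of the Bland-Altman approach named in the abstract above, the following is a minimal sketch in Python. The function name `bland_altman` and the h-index values are hypothetical and are not taken from the study; the sketch assumes the conventional definition of the limits of agreement as the mean difference plus or minus 1.96 standard deviations of the paired differences.

```python
from statistics import mean, stdev


def bland_altman(ratings_a, ratings_b):
    """Return (bias, lower_limit, upper_limit) for two sets of paired ratings.

    bias is the mean of the paired differences; the limits of agreement are
    bias -/+ 1.96 * SD of the differences (sample SD, n - 1 denominator).
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical h-index values recorded by two raters for five scientists.
rater_1 = [12, 20, 35, 8, 15]
rater_2 = [11, 21, 35, 9, 13]

bias, lower, upper = bland_altman(rater_1, rater_2)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A bias near zero with narrow limits would suggest that the two raters can be used interchangeably; how narrow is narrow enough is a judgment that depends on the application.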
How to assess and compare inter-rater reliability, agreement and correlation of ratings: an exemplary analysis of mother-father and parent-teacher expressive vocabulary rating pairs

Science.gov (United States)

Stolarova, Margarita; Wolf, Corinna; Rinker, Tanja; Brielmann, Aenne

2014-01-01

This report has two main purposes. First, we combine well-known analytical approaches to conduct a comprehensive assessment of agreement and correlation of rating-pairs and to dis-entangle these often confused concepts, providing a best-practice example on concrete data and a tutorial for future reference. Second, we explore whether a screening questionnaire developed for use with parents can be reliably employed with daycare teachers when assessing early expressive vocabulary. A total of 53 vocabulary rating pairs (34 parent–teacher and 19 mother–father pairs) collected for two-year-old children (12 bilingual) are evaluated. First, inter-rater reliability both within and across subgroups is assessed using the intra-class correlation coefficient (ICC). Next, based on this analysis of reliability and on the test-retest reliability of the employed tool, inter-rater agreement is analyzed, magnitude and direction of rating differences are considered. Finally, Pearson correlation coefficients of standardized vocabulary scores are calculated and compared across subgroups. The results underline the necessity to distinguish between reliability measures, agreement and correlation. They also demonstrate the impact of the employed reliability on agreement evaluations. This study provides evidence that parent–teacher ratings of children's early vocabulary can achieve agreement and correlation comparable to those of mother–father ratings on the assessed vocabulary scale. Bilingualism of the evaluated child decreased the likelihood of raters' agreement. We conclude that future reports of agreement, correlation and reliability of ratings will benefit from better definition of terms and stricter methodological approaches. The methodological tutorial provided here holds the potential to increase comparability across empirical reports and can help improve research practices and knowledge transfer to educational and therapeutic settings. PMID:24994985
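The distinction between correlation and agreement drawn in the abstract above can be shown with a small sketch. All names and numbers here are hypothetical: the scores are invented, and the tolerance of 5 points is an arbitrary assumption rather than the criterion used by the authors, who derived theirs from the reliability of the instrument.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def proportion_within(x, y, tolerance):
    """Share of rating pairs whose absolute difference is <= tolerance."""
    return sum(abs(a - b) <= tolerance for a, b in zip(x, y)) / len(x)


# Hypothetical standardized vocabulary scores: the second rater scores every
# child exactly 10 points higher than the first.
parent = [40, 45, 50, 55, 60]
teacher = [p + 10 for p in parent]

r = pearson_r(parent, teacher)
agree = proportion_within(parent, teacher, tolerance=5)
print(f"Pearson r = {r:.2f}, pairs within 5 points = {agree:.0%}")
```

In this constructed case the ratings are perfectly correlated, yet no pair agrees within the chosen tolerance, which is why a high correlation alone does not establish agreement.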
Definition of gross tumor volume in lung cancer: inter-observer variability

NARCIS (Netherlands)

van de Steene, Jan; Linthout, Nadine; de Mey, Johan; Vinh-Hung, Vincent; Claassens, Cornelia; Noppen, Marc; Bel, Arjan; Storme, Guy

2002-01-01

BACKGROUND AND PURPOSE: To determine the inter-observer variation in gross tumor volume (GTV) definition in lung cancer, and its clinical relevance. MATERIALS AND METHODS: Five clinicians involved in lung cancer were asked to define GTV on the planning CT scan of eight patients. Resulting GTVs were
An objective spinal motion imaging assessment (OSMIA): reliability, accuracy and exposure data.

Science.gov (United States)

Breen, Alan C; Muggleton, Jennifer M; Mellor, Fiona E

2006-01-04

Minimally-invasive measurement of continuous inter-vertebral motion in clinical settings is difficult to achieve. This paper describes the reliability, validity and radiation exposure levels in a new Objective Spinal Motion Imaging Assessment system (OSMIA) based on low-dose fluoroscopy and image processing. Fluoroscopic sequences in coronal and sagittal planes were obtained from 2 calibration models using dry lumbar vertebrae, plus the lumbar spines of 30 asymptomatic volunteers. Calibration model 1 (mobile) was screened upright, in 7 inter-vertebral positions. The volunteers and calibration model 2 (fixed) were screened on a motorized table comprising 2 horizontal sections, one of which moved through 80 degrees. Model 2 was screened during motion 5 times and the L2-S1 levels of the volunteers twice. Images were digitised at 5fps. Inter-vertebral motion from model 1 was compared to its pre-settings to investigate accuracy. For volunteers and model 2, the first digitised image in each sequence was marked with templates. Vertebrae were tracked throughout the motion using automated frame-to-frame registration. For each frame, vertebral angles were subtracted giving inter-vertebral motion graphs. Volunteer data were acquired twice on the same day and analysed by two blinded observers. The root-mean-square (RMS) differences between paired data were used as the measure of reliability. RMS difference between reference and computed inter-vertebral angles in model 1 was 0.32 degrees for side-bending and 0.52 degrees for flexion-extension. For model 2, X-ray positioning contributed more to the variance of range measurement than did automated registration. For volunteer image sequences, RMS inter-observer variation in intervertebral motion range in the coronal plane was 1.86 degrees and intra-subject biological variation was between 2.75 degrees and 2.91 degrees. RMS inter-observer variation in the sagittal plane was 1.94 degrees. Radiation dosages in each view were below
The reliability and validity of radiological assessment for patellar instability. A systematic review and meta-analysis

Energy Technology Data Exchange (ETDEWEB)

Smith, Toby O. [University of East Anglia, Faculty of Health, Norwich (United Kingdom); Davies, Leigh [Norfolk and Norwich University Hospital, Norwich (United Kingdom); Toms, Andoni P.; Donell, Simon T. [University of East Anglia, Faculty of Health, Norwich (United Kingdom); Norfolk and Norwich University Hospital, Norwich (United Kingdom); Hing, Caroline B. [St George's Hospital, London (United Kingdom)

2011-04-15

To determine the discriminative validity and reliability of the evidence base using meta-analysis. A review of published sources using the databases AMED, CINAHL, EMBASE, MEDLINE, Scopus and the Cochrane Library, and of unpublished material, was conducted. All studies assessing the reliability, validity, sensitivity or specificity of magnetic resonance imaging (MRI), computed tomography (CT) or ultrasound (US) of the patellofemoral joint of patients following patellar dislocation, subluxation or instability were included. A meta-analysis was performed to assess the difference in radiological measurements between healthy controls and subjects with patellar instability in order to assess discrimination validity. A narrative assessment was used to evaluate the inter- and intra-observer reliability as well as the sensitivity and specificity of specific radiological measurements. A total of 27 studies were reviewed. The findings indicated that there was acceptable inter-observer and intra-observer reliability and validity for different methods of assessing patellar height and the sulcus angle with X-ray, MRI and CT methods, and the tibial tubercle-trochlear groove (TT-TG) assessed using CT. There was poor reliability or validity for the assessment of severity of trochlear dysplasia and the sulcus angle using US. There is insufficient evidence to determine the reliability, validity, sensitivity or specificity of tests such as the congruence angle, lateral patellar displacement, lateral patellar tilt, trochlear depth, boss height, the crossing sign or Wiberg patellar classification. A critical appraisal of the literature identified a number of recurrent methodological limitations. Further study is recommended to evaluate the reliability and validity of these radiological outcomes using well-designed radiological trials. (orig.)
The reliability and validity of radiological assessment for patellar instability. A systematic review and meta-analysis

International Nuclear Information System (INIS)

Smith, Toby O.; Davies, Leigh; Toms, Andoni P.; Donell, Simon T.; Hing, Caroline B.

2011-01-01

To determine the discriminative validity and reliability of the evidence base using meta-analysis. A review of published sources using the databases AMED, CINAHL, EMBASE, MEDLINE, Scopus and the Cochrane Library, and of unpublished material, was conducted. All studies assessing the reliability, validity, sensitivity or specificity of magnetic resonance imaging (MRI), computed tomography (CT) or ultrasound (US) of the patellofemoral joint of patients following patellar dislocation, subluxation or instability were included. A meta-analysis was performed to assess the difference in radiological measurements between healthy controls and subjects with patellar instability in order to assess discrimination validity. A narrative assessment was used to evaluate the inter- and intra-observer reliability as well as the sensitivity and specificity of specific radiological measurements. A total of 27 studies were reviewed. The findings indicated that there was acceptable inter-observer and intra-observer reliability and validity for different methods of assessing patellar height and the sulcus angle with X-ray, MRI and CT methods, and the tibial tubercle-trochlear groove (TT-TG) assessed using CT. There was poor reliability or validity for the assessment of severity of trochlear dysplasia and the sulcus angle using US. There is insufficient evidence to determine the reliability, validity, sensitivity or specificity of tests such as the congruence angle, lateral patellar displacement, lateral patellar tilt, trochlear depth, boss height, the crossing sign or Wiberg patellar classification. A critical appraisal of the literature identified a number of recurrent methodological limitations. Further study is recommended to evaluate the reliability and validity of these radiological outcomes using well-designed radiological trials. (orig.)
Development and Reliability Testing of a Fast-Food Restaurant Observation Form.

Science.gov (United States)

Rimkus, Leah; Ohri-Vachaspati, Punam; Powell, Lisa M; Zenk, Shannon N; Quinn, Christopher M; Barker, Dianne C; Pugach, Oksana; Resnick, Elissa A; Chaloupka, Frank J

2015-01-01

To develop a reliable observational data collection instrument to measure characteristics of the fast-food restaurant environment likely to influence consumer behaviors, including product availability, pricing, and promotion. The study used observational data collection. Restaurants were in the Chicago Metropolitan Statistical Area. A total of 131 chain fast-food restaurant outlets were included. Interrater reliability was measured for product availability, pricing, and promotion measures on a fast-food restaurant observational data collection instrument. Analysis was done with Cohen's κ coefficient and proportion of overall agreement for categorical variables and intraclass correlation coefficient (ICC) for continuous variables. Interrater reliability, as measured by average κ coefficient, was .79 for menu characteristics, .84 for kids' menu characteristics, .92 for food availability and sizes, .85 for beverage availability and sizes, .78 for measures on the availability of nutrition information, .75 for characteristics of exterior advertisements, and .62 and .90 for exterior and interior characteristics measures, respectively. For continuous measures, average ICC was .88 for food pricing measures, .83 for beverage prices, and .65 for counts of exterior advertisements. Over 85% of measures demonstrated substantial or almost perfect agreement. Although some measures required revision or protocol clarification, results from this study suggest that the instrument may be used to reliably measure the fast-food restaurant environment.
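The two categorical statistics named in the abstract above, proportion of overall agreement and Cohen's κ, can be sketched as follows. This is a minimal illustration using the standard two-rater definition of κ; the function name `cohen_kappa` and the yes/no codes are hypothetical and do not come from the study's instrument.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Return (observed_agreement, kappa) for two raters' categorical codes.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of
    agreement and p_e is the agreement expected by chance from each rater's
    marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equally long lists")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical codes: whether an item was recorded as present at ten outlets.
coder_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
coder_2 = ["yes", "yes", "no", "yes", "yes", "no", "no", "yes", "no", "no"]

p_o, kappa = cohen_kappa(coder_1, coder_2)
print(f"overall agreement = {p_o:.2f}, kappa = {kappa:.2f}")
```

Reporting both values is informative because κ discounts the agreement that two raters would reach by chance, so it is always at or below the raw proportion of agreement when chance agreement is positive.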
SU-E-J-266: Cone Beam Computed Tomography (CBCT) Inter-Scan and Inter-Observer Tumor Volume Variability Assessment in Patients Treated with Stereotactic Body Radiation Therapy (SBRT) for Early Stage Non-Small Cell Lung Cancer (NSCLC)

Energy Technology Data Exchange (ETDEWEB)

Hou, Y; Aileen, C; Kozono, D; Killoran, J; Wagar, M; Lee, S; Hacker, F; Aerts, H; Lewis, J; Mak, R [Brigham and Women’s Hospital, Boston, MA (United States)

2015-06-15

Purpose: Quantification of volume changes on CBCT during SBRT for NSCLC may provide a useful radiological marker for radiation response and adaptive treatment planning, but the reproducibility of CBCT volume delineation is a concern. This study aimed to quantify inter-scan/inter-observer variability in tumor volume delineation on CBCT. Methods: Twenty early-stage (stage I and II) NSCLC patients were included in this analysis. All patients were treated with SBRT with a median dose of 54 Gy in 3 to 5 fractions. Two physicians independently manually contoured the primary gross tumor volume on CBCTs taken immediately before SBRT treatment (Pre) and after the same SBRT treatment (Post). Absolute volume differences (AVD) were calculated between the Pre and Post CBCTs for a given treatment to quantify inter-scan variability, and then between the two observers for a given CBCT to quantify inter-observer variability. AVD was also normalized with respect to average volume to obtain relative volume differences (RVD). The Bland-Altman approach was used to evaluate variability. All statistics were calculated with SAS version 9.4. Results: The 95% limits of agreement (mean ± 2SD) on AVD and RVD measurements between Pre and Post scans were −0.32 cc to 0.32 cc and −0.5% to 0.5% versus −1.9 cc to 1.8 cc and −15.9% to 15.3% for the two observers respectively. The 95% limits of agreement of AVD and RVD between the two observers were −3.3 cc to 2.3 cc and −42.4% to 28.2% respectively. The greatest variability in inter-scan RVD was observed with very small tumors (< 5 cc). Conclusion: Inter-scan variability in RVD is greatest with small tumors. Inter-observer variability was larger than inter-scan variability. The 95% limits of agreement for inter-observer and inter-scan variability (∼15–30%) help define a threshold for clinically meaningful change in tumor volume to assess SBRT response, with larger thresholds needed for very small tumors. Part of the work was funded by a Kaye
Strength and Pain Threshold Handheld Dynamometry Test Reliability in Patellofemoral Pain.

Science.gov (United States)

van der Heijden, R A; Vollebregt, T; Bierma-Zeinstra, S M A; van Middelkoop, M

2015-12-01

Patellofemoral pain syndrome (PFPS), characterized by peri- and retropatellar pain, is a common disorder in young, active people. The etiology is unclear; however, quadriceps strength seems to be a contributing factor, and sensitization might play a role. The purpose of this study was to determine the inter-rater reliability of handheld dynamometry for testing both quadriceps strength and pressure pain threshold (PPT), a measure of sensitization, in patients with PFPS. This cross-sectional case-control study comprised three quadriceps strength measurements and one PPT measurement performed by 2 independent investigators in 22 PFPS patients and 16 matched controls. Inter-rater reliability was analyzed using intraclass correlation coefficients (ICC) and Bland-Altman plots. Inter-rater reliability of quadriceps strength testing was fair to good in PFPS patients (ICC=0.72) and controls (ICC=0.63). Bland-Altman plots showed an increased difference between assessors when average quadriceps strength values exceeded 250 N. Inter-rater reliability of PPT was excellent in patients (ICC=0.79) and fair to good in controls (ICC=0.52). Handheld dynamometry seems to be a reliable method to test both quadriceps strength and PPT in PFPS patients. Inter-rater reliability was higher in PFPS patients than in control subjects. With regard to quadriceps testing, a higher variance between assessors occurs when quadriceps strength increases. © Georg Thieme Verlag KG Stuttgart · New York.
Observer Variability in Evaluating Pisotriquetral Osteoarthritis using Pisotriquetral View

NARCIS (Netherlands)

Heeg, Erik; ten Berg, Paul W. L.; Maas, Mario; Strackee, Simon D.

2017-01-01

A pisotriquetral (semilateral) view of the wrist may improve the assessment of pisotriquetral osteoarthritis (OA), but its reliability and reproducibility are unclear. The purpose of this cross-sectional observer study was to investigate (1) the inter- and intraobserver agreement of evaluating
Reliability of externally fixed dynamometry hamstring strength testing in elite youth football players.

Science.gov (United States)

Wollin, Martin; Purdam, Craig; Drew, Michael K

2016-01-01

To investigate inter and intra-tester reliability of an externally fixed dynamometry unilateral hamstring strength test, in the elite sports setting. Reliability study. Sixteen, injury-free, elite male youth football players (age=16.81±0.54 years, height=180.22±5.29cm, weight 73.88±6.54kg, BMI=22.57±1.42) gave written informed consent. Unilateral maximum isometric peak hamstring force was evaluated by externally fixed dynamometry for inter-tester, intra-day and intra-tester, inter-week reliability. The test position was standardised to correlate with the terminal swing phase of the gait running cycle. Inter and intra-tester values demonstrated good to high levels of reliability. The intra-class coefficient (ICC) for inter-tester, intra-day reliability was 0.87 (95% CI=0.75-0.93) with standard error of measure percentage (SEM%) 4.7 and minimal detectable change percentage (MDC%) 12.9. Intra-tester, inter-week reliability results were ICC 0.86 (95% CI, 0.74-0.93), SEM% 5.0 and MDC% 14.0. This study demonstrates good to high inter and intra-tester reliability of isometric externally fixed dynamometry unilateral hamstring strength testing in the regular elite sport setting involving elite male youth football players. The intra-class coefficient in association with the low standard error of measure and minimal detectable change percentages suggest that this procedure is appropriate for clinical and academic use as well as monitoring hamstring strength in the elite sport setting. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
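The abstract reports SEM% and MDC% alongside the ICC without stating how they were derived. A minimal sketch follows, assuming the conventional definitions SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; whether the authors used exactly these formulas is an assumption, and the force values in the example are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the between-subject SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% level for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem


def as_percent(value, grand_mean):
    """Express an absolute SEM or MDC as a percentage of the grand mean."""
    return 100.0 * value / grand_mean


# Hypothetical peak hamstring force data (newtons); not taken from the study.
sd_force = 40.0
mean_force = 300.0
sem = sem_from_icc(sd_force, icc=0.87)
sem_pct = as_percent(sem, mean_force)
mdc_pct = as_percent(mdc95(sem), mean_force)
```

Because MDC95 is a fixed multiple of the SEM (about 2.77), applying the same conversion to the rounded SEM% of 4.7 gives roughly 13.0, close to the reported MDC% of 12.9.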
Inter- and intra-observer variation in soft-tissue sarcoma target definition.

Science.gov (United States)

Roberge, D; Skamene, T; Turcotte, R E; Powell, T; Saran, N; Freeman, C

2011-08-01

To evaluate inter- and intra-observer variability in gross tumor volume definition for adult limb/trunk soft tissue sarcomas. Imaging studies of 15 patients previously treated with preoperative radiation were used in this study. Five physicians (radiation oncologists, orthopedic surgeons and a musculoskeletal radiologist) were asked to contour each of the 15 tumors on T1-weighted, gadolinium-enhanced magnetic resonance images. These contours were drawn twice by each physician. The volume and center of mass coordinates for each gross tumor volume were extracted and a Boolean analysis was performed to measure the degree of volume overlap. The median standard deviation in gross tumor volumes across observers was 6.1% of the average volume (range: 1.8%-24.9%). There was remarkably little variation in the 3D position of the gross tumor volume center of mass. For the 15 patients, the standard deviation of the 3D distance between centers of mass ranged from 0.06 mm to 1.7 mm (median 0.1mm). Boolean analysis demonstrated that 53% to 90% of the gross tumor volume was common to all observers (median overlap: 79%). The standard deviation in gross tumor volumes on repeat contouring was 4.8% (range: 0.1-14.4%) with a standard deviation change in the position of the center of mass of 0.4mm (range: 0mm-2.6mm) and a median overlap of 93% (range: 73%-98%). Although significant inter-observer differences were seen in gross tumor volume definition of adult soft-tissue sarcoma, the center of mass of these volumes was remarkably consistent. Variations in volume definition did not correlate with tumor size. Radiation oncologists should not hesitate to review their contours with a colleague (surgeon, radiologist or fellow radiation oncologist) to ensure that they are not outliers in sarcoma gross tumor volume definition. Protocols should take into account variations in volume definition when considering tighter clinical target volumes. Copyright © 2011 Société française de radioth
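The Boolean overlap and center-of-mass comparison can be illustrated with voxel sets. This is a rough sketch under stated assumptions: contours are represented as sets of integer (x, y, z) voxel indices, and the common volume is expressed relative to the mean observer volume (the abstract does not say which denominator was used). The helper names and the box-shaped contours are hypothetical.

```python
def common_fraction(contours):
    """Share of the mean contour volume that is included by every observer."""
    common = set.intersection(*contours)
    mean_volume = sum(len(c) for c in contours) / len(contours)
    return len(common) / mean_volume


def center_of_mass(contour):
    """Unweighted centroid of a voxel set, in voxel units."""
    n = len(contour)
    return tuple(sum(voxel[axis] for voxel in contour) / n for axis in range(3))


def box(x0, x1, y0, y1, z0, z1):
    """Hypothetical box-shaped contour used only to build example data."""
    return {(x, y, z)
            for x in range(x0, x1)
            for y in range(y0, y1)
            for z in range(z0, z1)}


# Three observers whose contours are shifted by one voxel from each other.
observer_a = box(0, 10, 0, 10, 0, 10)
observer_b = box(1, 11, 0, 10, 0, 10)
observer_c = box(0, 10, 1, 11, 0, 10)

overlap = common_fraction([observer_a, observer_b, observer_c])
centroids = [center_of_mass(c) for c in (observer_a, observer_b, observer_c)]
```

A one-voxel shift in two directions already removes about a fifth of the common volume while moving each centroid by only one voxel, which mirrors the abstract's observation that volumes differed more than centers of mass.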
How to assess and compare inter-rater reliability, agreement and correlation of ratings: an exemplary analysis of mother-father and parent-teacher expressive vocabulary rating pairs

Directory of Open Access Journals (Sweden)

Margarita Stolarova

2014-06-01

This report has two main purposes. First, we combine well-known analytical approaches to conduct a comprehensive assessment of agreement and correlation of rating-pairs and to disentangle these often confused concepts, providing a best-practice example on concrete data and a tutorial for future reference. Second, we explore whether a screening questionnaire developed for use with parents can be reliably employed with daycare teachers when assessing early expressive vocabulary. A total of 53 vocabulary rating pairs (34 parent-teacher and 19 mother-father pairs) collected for two-year-old children (12 bilingual) are evaluated. First, inter-rater reliability both within and across subgroups is assessed using the intra-class correlation coefficient (ICC). Next, based on this analysis of reliability and on the test-retest reliability of the employed tool, inter-rater agreement is analyzed, and the magnitude and direction of rating differences are considered. Finally, Pearson correlation coefficients of standardized vocabulary scores are calculated and compared across subgroups. The results underline the necessity to distinguish between reliability measures, agreement and correlation. They also demonstrate the impact of the employed reliability on agreement evaluations. This study provides evidence that parent-teacher ratings of children’s early vocabulary can achieve agreement and correlation comparable to those of mother-father ratings on the assessed vocabulary scale. Bilingualism of the evaluated child decreased the likelihood of raters’ agreement. We conclude that future reports of agreement, correlation and reliability of ratings will benefit from better definition of terms and stricter methodological approaches. The methodological tutorial provided here holds the potential to increase comparability across empirical reports and can help improve research practices and knowledge transfer to educational and therapeutic settings.
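The distinction between correlation and agreement that this report stresses is easy to show with a toy example. The sketch below is illustrative only: the scores are invented, and the fixed `tolerance` is a hypothetical stand-in for the reliability-based criterion the authors derive from the ICC and test-retest reliability.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def share_within(x, y, tolerance):
    """Proportion of rating pairs whose absolute difference is <= tolerance."""
    return sum(abs(a - b) <= tolerance for a, b in zip(x, y)) / len(x)


# Hypothetical standardized vocabulary scores: the second rater is
# systematically 8 points higher than the first.
rater_1 = [40, 45, 50, 55, 60]
rater_2 = [score + 8 for score in rater_1]

r = pearson(rater_1, rater_2)
agreement = share_within(rater_1, rater_2, tolerance=5)
```

Here the two raters rank the children identically, so the correlation is perfect, yet no pair falls within the tolerance, so agreement is nil. A constant offset between raters is invisible to Pearson's r but decisive for agreement.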

Inter- and intra-observer variability associated with the use of the Mirels' scoring system for metastatic bone lesions.

LENUS (Irish Health Repository)

Mac Niocaill, Ruairi F

2011-01-01

Metastatic bone disease is increasing in association with ever-improving medical management of osteophilic malignant conditions. The precise timing of surgical intervention for secondary lesions in long bones can be difficult to determine. This paper aims to evaluate a classic scoring system. All radiographs were examined twice by three orthopaedic oncologists and scored according to the Mirels' scoring system. The kappa statistic was used for statistical analysis. The results show agreement between observers (κ = 0.35-0.61) for overall scores at the two time intervals. Inter-observer agreement was also seen with subset analysis of size (κ = 0.27-0.60), site (κ = 0.77-1.0) and nature of the lesion (κ = 0.55-0.81). Similarly, low levels of intra-observer variability were noted for each of the three surgeons (κ = 0.34, 0.39, and 0.78, respectively). These results indicate a reliable, repeatable assessment of bony metastases. We continue to advocate its use in the management of patients with long bone metastases.
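For readers unfamiliar with the kappa statistic used here, a minimal sketch of unweighted Cohen's kappa for two raters follows. The function name and the "fix"/"observe" labels are hypothetical and only loosely echo a treatment decision; they are not Mirels' score categories or data from the study.

```python
from collections import Counter


def cohen_kappa(rater_1, rater_2):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_1)
    if n == 0 or n != len(rater_2):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    # Chance agreement: product of the two raters' marginal proportions.
    expected = sum(counts_1[c] * counts_2[c]
                   for c in counts_1.keys() | counts_2.keys()) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical decisions on ten radiographs.
surgeon_a = ["fix"] * 4 + ["observe"] * 6
surgeon_b = ["fix"] * 3 + ["observe"] * 6 + ["fix"]

kappa = cohen_kappa(surgeon_a, surgeon_b)
```

In this example the raters agree on 8 of 10 radiographs, but about half of that agreement would be expected by chance given their marginal rates, so kappa comes out well below the raw 80%.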
Observations of radiographer communication: An exploratory study using Transactional Analysis

International Nuclear Information System (INIS)

Booth, Lisa A.; Manning, David J.

2006-01-01

Purpose: Communication in medical imaging is a neglected area of research, despite the necessity for good communication if optimum diagnostic images are to be achieved. Methods: The present study has investigated the styles of communication used in medical imaging, using an approach known as Transactional Analysis. This approach has been demonstrated previously as having reliability and validity, using observations and supporting interviews with medical imaging staff, along with inter-rater observations of radiographer-patient interactions. Results: The results indicate that Transactional Analysis can be used effectively for identifying and naming interaction events in diagnostic radiography, with diagnostic radiographers using five styles of communication. Conclusion: Radiographers tend to use Parental styles of communicating; these styles are commonly associated with a practitioner-centred approach to dealing with patients which often result in non-adherence
Study for Reliability of Interpretation of the Three Phase Bone Scintigraphy in Patients with Post-traumatic Complex Regional Pain Syndrome

Energy Technology Data Exchange (ETDEWEB)

Park, Jung Mi [Bucheon Hospital Soonchunhyang University College of Medicine, Bucheon (Korea, Republic of)]; Kim, Seon Jung [National Health Insurance Corporation Ilsan Hospital, Koyang (Korea, Republic of)]; Chung, Seung Hyun [National Cancer Center, Koyang (Korea, Republic of)]; Lee, Yong Taek [Kangbuk Samsung Hospital, Sungkyunkwan University School of Medicine, Seoul (Korea, Republic of)]

2008-02-15

We performed this study to evaluate the reliability of interpretation of three phase bone scintigraphy (TPBS) in patients with post-traumatic complex regional pain syndrome (PT-CRPS). Based on the 1994 International Association for the Study of Pain guideline, 34 patients with PT-CRPS were selected for this study. Two nuclear medicine physicians evaluated identical TPBS according to the uptake pattern, extent and intensity of the lesion, and their agreements (kappa values) were analysed. The final diagnoses based on the arbitrary criteria of each physician were compared with those obtained by the criteria for PT-CRPS established in this study, which are hyperactivity on all phases (criteria 1), hyperactivity of whole joints on delayed phase (criteria 2), and hyperactivity of either whole or focal joints on delayed phase (criteria 3). Intra-observer agreements were good for uptake pattern, intensity, and extent on TPBS. Inter-observer agreements were also good, except for extent on blood pool phase (0.55). The inter-observer agreements on final diagnosis improved when criteria 1-3 were applied (0.77-0.88), compared to when the physicians' own criteria were used (0.63). They also improved from 0.29 to 0.47-0.82 for acute stage, and from 0.37 to 1.0 for chronic stage. The sensitivities for chronic stage were relatively lower than those for acute stage. Inter-observer variations in the diagnosis of patients with PT-CRPS using TPBS were observed. These results were attributed to different criteria set by observers. In order to improve agreement on interpretation of TPBS, common positive criteria should be established, especially considering uptake pattern and clinical stages.
Feasibility and reliability of a newly developed antenatal risk score card in routine care

NARCIS (Netherlands)

E. Birnie; E.A.P. Steegers; Drs. H.W. Torij; M.J. Veen; J. Poeran; G.J. Bonsel

2015-01-01

A population-based cross-sectional study (feasibility) and a cohort study (inter-rater reliability) to study in routine care the feasibility and inter-rater reliability of the Rotterdam Reproductive Risk Reduction risk score card (R4U), a new semi-quantitative score card for use during the antenatal
Reliability of one-repetition maximum performance in people with chronic heart failure.

Science.gov (United States)

Ellis, Rachel; Holland, Anne E; Dodd, Karen; Shields, Nora

2018-02-24

Evaluate intra-rater and inter-rater reliability of the one-repetition maximum strength test in people with chronic heart failure. Intra-rater and inter-rater reliability study. A public tertiary hospital in northern metropolitan Melbourne. Twenty-four participants (nine female, mean age 71.8 ± 13.1 years) with mild to moderate heart failure of any aetiology. Lower limb strength was assessed by determining the maximum weight that could be lifted using a leg press. Intra-rater reliability was tested by one assessor on two separate occasions. Inter-rater reliability was tested by two assessors in random order. Intra-class correlation coefficients (ICC) and 95% confidence intervals were calculated. Bland and Altman analyses were also conducted, including calculation of mean differences between measures and limits of agreement. Ten intra-rater and 21 inter-rater assessments were completed. Excellent intra-rater (ICC(2,1) = 0.96) and inter-rater (ICC(2,1) = 0.93) reliability was found. Intra-rater assessment showed less variability (mean difference 4.5 kg, limits of agreement -8.11 to 17.11 kg) than inter-rater assessment (mean difference -3.81 kg, limits of agreement -23.39 to 15.77 kg). One-repetition maximum determined using a leg press is a reliable measure in people with heart failure. Given its smaller limits of agreement, intra-rater testing is recommended. Implications for Rehabilitation: Using a leg press to determine a one-repetition maximum, we were able to demonstrate excellent inter-rater and intra-rater reliability using an intra-class correlation coefficient. The Bland and Altman limits of agreement were wide for inter-rater reliability, and so we recommend using one assessor if measuring change in strength within an individual over time.
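The "2,1" form of the intra-class correlation coefficient reported here is usually taken to mean the two-way random-effects, absolute-agreement, single-measure ICC. A minimal sketch of that calculation from the two-way ANOVA mean squares follows; the function name and the leg-press values are hypothetical, and the study's own software and settings are not stated in the abstract.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one row per participant, one column per rater
            (or per occasion for an intra-rater design).
    """
    n = len(scores)      # participants
    k = len(scores[0])   # raters or occasions
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 rows and >= 2 columns")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-participant mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1)) # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical one-repetition maximum values (kg) from two assessors.
one_rm = [
    [80.0, 84.0],
    [120.0, 118.0],
    [95.0, 101.0],
    [150.0, 149.0],
    [60.0, 66.0],
]
reliability = icc_2_1(one_rm)
```

Because the between-rater mean square sits in the denominator, a systematic offset between assessors lowers ICC(2,1) even when the ranking of participants is unchanged. A high ICC can also coexist with wide limits of agreement when participants differ greatly from one another, as the abstract's recommendation to use a single assessor suggests.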
Inter-observer variation in the interpretation of chest radiographs for pneumonia in community-acquired lower respiratory tract infections

Energy Technology Data Exchange (ETDEWEB)

Hopstaken, R.M.; Witbraad, T.; Engelshoven, J.M.A. van; Dinant, G.J.

2004-08-01

AIM: To assess inter-observer variation in the interpretation of chest radiographs of individuals with pneumonia versus those without pneumonia. MATERIALS AND METHODS: Chest radiographs of out-patients with a lower respiratory tract infection (LRTI) were assessed for the presence of infiltrates by radiologists from three local hospitals and were reassessed by one university hospital radiologist. Various measures of inter-observer agreement were calculated. RESULTS: The observed proportional agreement was 218 in 243 patients (89.7%). Kappa was 0.53 (moderate agreement) with a 95% confidence interval of 0.37 to 0.69. The observed positive agreement (59%) was much lower than the negative agreement (94%). Kappa was considerably lower if chronic obstructive pulmonary disease was present (κ=0.20) or Streptococcus pneumoniae (κ=-0.29) was the infective agent. CONCLUSION: The overall inter-observer agreement adjusted for chance was moderate. Inter-observer agreement in cases with pneumonia was much worse than the agreement in negative (i.e. non-pneumonia) cases. A general practitioner's selection of patients with a higher chance of having pneumonia for chest radiography would thus not improve the observer agreement.
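The gap between overall, positive and negative agreement in this abstract follows from how each measure treats a 2×2 table. The sketch below shows the usual formulas. The cell counts are hypothetical: the abstract does not give the table, and these numbers were chosen only so that the summaries land near the reported 89.7%, 59%, 94% and κ = 0.53.

```python
def agreement_summary(both_pos, only_first, only_second, both_neg):
    """Overall, positive and negative agreement plus Cohen's kappa for a 2x2 table.

    both_pos    : both readers report an infiltrate
    only_first  : only the first reader reports an infiltrate
    only_second : only the second reader reports an infiltrate
    both_neg    : neither reader reports an infiltrate
    """
    n = both_pos + only_first + only_second + both_neg
    discordant = only_first + only_second
    overall = (both_pos + both_neg) / n
    positive = 2 * both_pos / (2 * both_pos + discordant)
    negative = 2 * both_neg / (2 * both_neg + discordant)
    # Chance agreement from the marginal totals of each reader.
    expected = (
        (both_pos + only_first) * (both_pos + only_second)
        + (both_neg + only_second) * (both_neg + only_first)
    ) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return {"overall": overall, "positive": positive,
            "negative": negative, "kappa": kappa}


# Hypothetical counts for 243 radiographs (not the study's actual table).
summary = agreement_summary(both_pos=18, only_first=12, only_second=13, both_neg=200)
```

When positive findings are rare, the same handful of discordant readings weighs heavily against the few concordant positives but barely dents the many concordant negatives, which is why positive agreement can be far lower than negative agreement at a high overall agreement.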
Inter- and intraobserver reliability of the MTM-classification for proximal humeral fractures

DEFF Research Database (Denmark)

Bahrs, Christian; Schmal, Hagen; Lingenfelter, Erich

2008-01-01

tool. METHODS: Three observers classified plain radiographs of 22 fractures using both a simple version (fracture displacement, number of parts) and an extensive version (individual topographic fracture type and morphology) of the MTM classification. Kappa-statistics were used to determine reliability....... RESULTS: An acceptable reliability was found for the simple version classifying fracture displacement and fractured main parts. Fair interobserver agreement was found for the extensive version with individual topographic fracture type and morphology. CONCLUSION: Although the MTM-classification covers...
Inter-device reliability of an automatic-scoring actigraph for measuring sleep in healthy adults

Directory of Open Access Journals (Sweden)

Matthew Driller

2016-07-01

Actigraphy has become a common method of measuring sleep due to its non-invasive, cost-effective nature. An actigraph (Readiband™) that utilizes automatic scoring algorithms has been used in the research, but is yet to be evaluated for its inter-device reliability. A total of 77 nights of sleep data from 11 healthy adult participants was collected while participants were concomitantly wearing two Readiband™ actigraphs attached together (ACT1 and ACT2). Sleep indices including total sleep time (TST), sleep latency (SL), sleep efficiency (SE%), wake after sleep onset (WASO), total time in bed (TTB), wake episodes per night (WE), sleep onset variance (SOV) and wake variance (WV) were assessed between the two devices using mean differences, 95% levels of agreement, intraclass correlation coefficients (ICC), typical error of measurement (TEM) and coefficient of variation (CV%) analysis. There were no significant differences between devices for any of the measured sleep variables (p>0.05). TST, SE, SL, TTB, SOV and WV all resulted in very high ICCs (>0.90), with WASO and WE resulting in high ICCs between devices (0.85 and 0.80, respectively). Mean differences of −2.1 and 0.2 min for TST and SL were associated with a low TEM between devices (9.5 and 3.8 min, respectively). SE resulted in a 0.3% mean difference between devices. The Readiband™ is a reliable tool for researchers using multiple devices of this brand in sleep studies to assess basic measures of sleep quality and quantity in healthy adult populations.
The Orientation of Gastric Biopsy Samples Improves the Inter-observer Agreement of the OLGA Staging System.

Science.gov (United States)

Cotruta, Bogdan; Gheorghe, Cristian; Iacob, Razvan; Dumbrava, Mona; Radu, Cristina; Bancila, Ion; Becheanu, Gabriel

2017-12-01

Evaluation of severity and extension of gastric atrophy and intestinal metaplasia is recommended to identify subjects with a high risk for gastric cancer. The inter-observer agreement for the assessment of gastric atrophy is reported to be low. The aim of the study was to evaluate the inter-observer agreement for the assessment of severity and extension of gastric atrophy using oriented and unoriented gastric biopsy samples. Furthermore, the quality of biopsy specimens in oriented and unoriented samples was analyzed. A total of 35 subjects with dyspeptic symptoms referred for gastrointestinal endoscopy who agreed to enter the study were prospectively enrolled. The OLGA/OLGIM gastric biopsies protocol was used. From each subject two sets of biopsies were obtained (four from the antrum, two oriented and two unoriented, two from the gastric incisure, one oriented and one unoriented, four from the gastric body, two oriented and two unoriented). The orientation of the biopsy samples was completed using nitrocellulose filters (Endokit®, BioOptica, Milan, Italy). The samples were blindly examined by two experienced pathologists. Inter-observer agreement was evaluated using the kappa statistic for inter-rater agreement. The quality of histopathology specimens, taking into account the identification of lamina propria, was analyzed in oriented vs. unoriented samples. The samples with detectable lamina propria mucosae were defined as good quality specimens. Categorical data were analyzed using the chi-square test and a two-sided p value <0.05 was considered statistically significant. A total of 350 biopsy samples were analyzed (175 oriented / 175 unoriented). The kappa index values for oriented/unoriented OLGA 0/I/II/III and IV stages were 0.62/0.13, 0.70/0.20, 0.61/0.06, 0.62/0.46, and 0.77/0.50, respectively. For OLGIM 0/I/II/III stages the kappa index values for oriented/unoriented samples were 0.83/0.83, 0.88/0.89, 0.70/0.88 and 0.83/1, respectively. No case of OLGIM IV
Reliability of radiographic measurements for acute distal radius fractures

International Nuclear Information System (INIS)

Watson, Narelle J.; Asadollahi, Saeed; Parrish, Frank; Ridgway, Jacqueline; Tran, Phong; Keating, Jennifer L.

2016-01-01

The management of distal radial fractures is guided by the interpretation of radiographic findings. The aim of this investigation was to determine the intra- and inter-observer reliability of eight traditionally reported anatomic radiographic parameters in adults with an acute distal radius fracture. Five observers participated. All were routinely involved in making treatment decisions based on distal radius fracture radiographs. Observers performed independent repeated measurements on 30 radiographs for eight anatomical parameters: dorsal shift (mm), intra-articular gap (mm), intra-articular step (mm), palmar tilt (degrees), radial angle (degrees), radial height (mm), radial shift (mm), ulnar variance (mm). Intraclass correlation coefficients (ICCs) and the magnitude of retest errors were calculated. Measurement reliability was summarised as high (ICC > 0.80), moderate (0.60–0.80) or low (<0.60). Intra-observer reliability was high for dorsal shift and palmar tilt; moderate for radial angle, radial height, ulnar variance and radial shift; and low for intra-articular gap and step. Inter-observer reliability was high for palmar tilt; moderate for dorsal shift, ulnar variance, radial angle and radial height; and low for radial shift, intra-articular gap and step. Error magnitude (95 % confidence interval) was within 1–2 mm for intra-articular gap and step, 2–4 mm for ulnar variance, 4–6 mm for radial shift, dorsal shift and radial height, and 6–8° for radial angle and palmar tilt. Based on previous reports of critical values for palmar tilt, ulnar variance and radial angle, error margins appear small enough for measurements to be useful in guiding treatment decisions. Our findings indicate that clinicians cannot reliably measure values ≤1 mm for intra-articular gap and step when interpreting radiographic parameters using the standardised methods investigated in this study. As a guide for treatment selection, palmar tilt, ulnar variance and radial angle
The inter-rater reliability of the incontinence-associated dermatitis intervention tool-D (IADIT-D) between two independent registered nurses of nursing home residents in long-term care facilities.

Science.gov (United States)

Braunschmidt, Brigitte; Müller, Gerhard; Jukic-Puntigam, Margareta; Steininger, Alfred

2013-01-01

Incontinence-associated dermatitis (IAD) is the clinical manifestation of moisture-related skin damage (Beeckman, Woodward, & Gray, 2011). Valid assessment instruments are needed for risk assessment and classification of IAD. The aim of this quantitative-descriptive cross-sectional study was to determine the inter-rater reliability of the item scores of the German Incontinence Associated Dermatitis Intervention Tool (IADIT-D) between two independent assessors of nursing home residents (n = 381) in long-term care facilities. The 19 pairs of assessors consisted of registered nurses. The data analysis began with the calculation of the total percentage of agreement. Because this value is not adjusted for chance, kappa coefficients and the AC1 statistic were calculated as well. The total percentage of inter-rater agreement was 84% (n = 319). In a second step, the analysis across all items yielded high (kappa = .70) and very high (AC1 = .83) agreement levels, respectively. For the risk assessment (kappa = .82; AC1 = .94), the values amounted to very high agreement levels and for the classification (kappa(w) = .70; AC1 = .76) to high agreement levels. The high to very high agreement values of the IADIT-D demonstrate that the items can be regarded as stable with regard to inter-rater reliability for use in long-term care facilities. In addition, further validation studies are needed.
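The abstract reports percentage agreement, kappa and AC1 side by side. The sketch below shows why they can diverge; it assumes Gwet's AC1 for two raters, which replaces Cohen's product-of-marginals chance term with one based on the average marginal proportion. The function name, category labels and counts are hypothetical and are not taken from the IADIT-D data.

```python
from collections import Counter


def agreement_coefficients(rater_1, rater_2, categories):
    """Percent agreement, Cohen's kappa and Gwet's AC1 for two raters."""
    n = len(rater_1)
    q = len(categories)
    if n == 0 or n != len(rater_2) or q < 2:
        raise ValueError("need equal-length ratings and at least two categories")
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    c1, c2 = Counter(rater_1), Counter(rater_2)

    # Cohen: chance agreement from the product of each rater's marginals.
    chance_kappa = sum((c1[c] / n) * (c2[c] / n) for c in categories)
    # Gwet: chance agreement from the average marginal proportion per category.
    avg = [(c1[c] + c2[c]) / (2 * n) for c in categories]
    chance_ac1 = sum(p * (1 - p) for p in avg) / (q - 1)

    return {
        "percent": observed,
        "kappa": (observed - chance_kappa) / (1 - chance_kappa),
        "ac1": (observed - chance_ac1) / (1 - chance_ac1),
    }


# Hypothetical ratings for 100 residents with a rare positive finding.
nurse_1 = ["none"] * 90 + ["iad"] * 5 + ["iad"] * 3 + ["none"] * 2
nurse_2 = ["none"] * 90 + ["iad"] * 5 + ["none"] * 3 + ["iad"] * 2

result = agreement_coefficients(nurse_1, nurse_2, categories=["none", "iad"])
```

With a skewed distribution like this one, kappa's chance term is large and pulls the coefficient well below the raw agreement, whereas AC1 stays close to it. That is consistent with the abstract's AC1 values being higher than the corresponding kappa values.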
The Reliability of Quality of Upper Extremity Skills Test in Children with Cerebral Palsy

Directory of Open Access Journals (Sweden)

Nazila Akbar-Fahimi

2012-01-01

Objective: The aim of this study was to examine the reliability of intra-rater and inter-rater assessment, with and without video camera, in children with spastic cerebral palsy. Materials & Methods: In this cross-sectional study, we evaluated the Quality of Upper Extremity Skills Test. Fifty children with hemiplegia aged 19 to 95 months (mean age 61.31 ± 25.7 months) were enrolled in our study using non-random convenience sampling. After obtaining the parents’ consent, the intra-rater assessment was performed in one session and the inter-rater assessment with camera after 10 days. Then, a third examiner reassessed 46 of the 50 children by observing the video recordings. Spearman correlation was used to examine intra-rater and inter-rater reliability with and without video recording, and the gross motor function classification system 66 was used to determine each child's level of function. Results: Intra-rater correlation was 0.774-0.996, inter-rater correlation was 0.663-0.998, and correlation for video camera assessment was 0.710-0.974 for the first and third evaluations and 0.652-0.938 for the second and third evaluations. The P value for subscales and total score was <0.01. Conclusion: There is a high correlation in intra-rater and inter-rater assessment with and without video recording in the Quality of Upper Extremity Skills Test in children with cerebral palsy. It can therefore be used as a reliable test to evaluate the quality of upper extremity skills in these children.
How to assess intra- and inter-observer agreement with quantitative PET using variance component analysis: a proposal for standardisation

International Nuclear Information System (INIS)

Gerke, Oke; Vilstrup, Mie Holm; Segtnan, Eivind Antonsen; Halekoh, Ulrich; Høilund-Carlsen, Poul Flemming

2016-01-01

Quantitative measurement procedures need to be accurate and precise to justify their clinical use. Precision reflects deviation of groups of measurements from one another, often expressed as proportions of agreement, standard errors of measurement, coefficients of variation, or the Bland-Altman plot. We suggest variance component analysis (VCA) to estimate the influence of errors due to single elements of a PET scan (scanner, time point, observer, etc.) to express the composite uncertainty of repeated measurements and obtain relevant repeatability coefficients (RCs) which have a unique relation to Bland-Altman plots. Here, we present this approach for assessment of intra- and inter-observer variation with PET/CT exemplified with data from two clinical studies. In study 1, 30 patients were scanned pre-operatively for the assessment of ovarian cancer, and their scans were assessed twice by the same observer to study intra-observer agreement. In study 2, 14 patients with glioma were scanned up to five times. The resulting 49 scans were assessed by three observers to examine inter-observer agreement. Outcome variables were SUVmax in study 1 and cerebral total hemispheric glycolysis (THG) in study 2. In study 1, we found an RC of 2.46, equalling half the width of the Bland-Altman limits of agreement. In study 2, the RC for identical conditions (same scanner, patient, time point, and observer) was 2392; allowing for different scanners increased the RC to 2543. Inter-observer differences were negligible compared to differences owing to other factors; between observer 1 and 2: −10 (95 % CI: −352 to 332) and between observer 1 vs 3: 28 (95 % CI: −313 to 370). VCA is an appealing approach for weighing different sources of variation against each other, summarised as RCs. The involved linear mixed effects models require carefully considered sample sizes to account for the challenge of sufficiently accurately estimating variance components.
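The relation between a repeatability coefficient and the Bland-Altman limits of agreement can be sketched for the simplest case of two readings per scan. This is a minimal illustration, not the variance component model used in the article: the function names and the readings are hypothetical, and the RC is computed from the within-subject standard deviation of paired readings as 1.96 × √2 × s_w.

```python
import math
from statistics import mean, stdev


def bland_altman_limits(first, second):
    """Return (bias, lower limit, upper limit) for paired readings."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def repeatability_coefficient(first, second):
    """RC = 1.96 * sqrt(2) * within-subject SD, for two readings per subject."""
    diffs = [a - b for a, b in zip(first, second)]
    # With two readings per subject, the within-subject variance is
    # the mean of d^2 / 2 over subjects.
    s_w = math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))
    return 1.96 * math.sqrt(2) * s_w


# Made-up repeated readings by one observer (arbitrary units)
reading_1 = [10.0, 12.0, 14.0, 16.0]
reading_2 = [11.0, 11.0, 15.0, 15.0]

bias, lower, upper = bland_altman_limits(reading_1, reading_2)
rc = repeatability_coefficient(reading_1, reading_2)
print(f"bias={bias:.2f} limits=({lower:.2f}, {upper:.2f}) RC={rc:.2f}")
```

When the bias is close to zero and the sample is not tiny, the RC computed this way is close to half the width of the limits of agreement; the two differ slightly here because the sample standard deviation divides by n − 1.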
An objective spinal motion imaging assessment (OSMIA): reliability, accuracy and exposure data

Directory of Open Access Journals (Sweden)

Mellor Fiona E

2006-01-01

Background: Minimally-invasive measurement of continuous inter-vertebral motion in clinical settings is difficult to achieve. This paper describes the reliability, validity and radiation exposure levels in a new Objective Spinal Motion Imaging Assessment system (OSMIA) based on low-dose fluoroscopy and image processing. Methods: Fluoroscopic sequences in coronal and sagittal planes were obtained from 2 calibration models using dry lumbar vertebrae, plus the lumbar spines of 30 asymptomatic volunteers. Calibration model 1 (mobile) was screened upright, in 7 inter-vertebral positions. The volunteers and calibration model 2 (fixed) were screened on a motorised table comprising 2 horizontal sections, one of which moved through 80 degrees. Model 2 was screened during motion 5 times and the L2-S1 levels of the volunteers twice. Images were digitised at 5fps. Inter-vertebral motion from model 1 was compared to its pre-settings to investigate accuracy. For volunteers and model 2, the first digitised image in each sequence was marked with templates. Vertebrae were tracked throughout the motion using automated frame-to-frame registration. For each frame, vertebral angles were subtracted giving inter-vertebral motion graphs. Volunteer data were acquired twice on the same day and analysed by two blinded observers. The root-mean-square (RMS) differences between paired data were used as the measure of reliability. Results: RMS difference between reference and computed inter-vertebral angles in model 1 was 0.32 degrees for side-bending and 0.52 degrees for flexion-extension. For model 2, X-ray positioning contributed more to the variance of range measurement than did automated registration. For volunteer image sequences, RMS inter-observer variation in intervertebral motion range in the coronal plane was 1.86 degrees and intra-subject biological variation was between 2.75 degrees and 2.91 degrees. RMS inter-observer variation in the sagittal plane was 1
Action research in inter-organisational networks: impartial studies or the Trojan horse?

DEFF Research Database (Denmark)

Goduscheit, René Chester; Rasmussen, Erik Stavnsager; Jørgensen, Jacob Høj

2007-01-01

Traditionally, the literature on action research has been aimed at intra-organisational issues. These studies have distinguished between two researcher roles: The problem-solver and the observer. This article addresses the distinct challenges of action research in inter-organisational projects. In addition to the problem-solver and observer roles, the researcher in an inter-organisational setting can serve as a legitimiser of the project and manage to involve partners that in an ordinary business-to-business setting would not have participated. Based on an action research project in a Danish inter-organisational network, this article discusses potential pitfalls in the legitimiser role. Lack of clarity in defining the researcher role and project ownership in relation to the funding organisation and the rest of the network can jeopardise the project and potentially the credibility of the researchers. The article...
An initial reliability and validity study of the Interaction, Communication, and Literacy Skills Audit.

Science.gov (United States)

El-Choueifati, Nisrine; Purcell, Alison; McCabe, Patricia; Heard, Robert; Munro, Natalie

2014-06-01

Early childhood educators (ECEs) have an important role in promoting positive outcomes for children's language and literacy development. This paper reports the development of a new tool, The Interaction Communication and Literacy (ICL) Skills Audit, and pilots its reliability and validity. Intra- and inter-rater reliability was examined by three speech-language pathologists (SLPs). Five skill areas relating to ECE language and literacy practice were rated. The face and content validity of the ICL Skills Audit was examined by expert SLPs (n = 8) and expert ECEs (n = 4) via questionnaire. The overall intra-rater reliability for the ICL Skills Audit was excellent with percentage close agreement (PCA) of 91-94. Inter-rater agreement was PCA 68-80. Expert SLPs and ECEs agreed that the content was comprehensive and practical. Based on this preliminary study, the ICL Skills Audit appears to be a promising tool that can be used by SLPs and ECEs in collaboration to measure the skills of ECEs in the areas of language and literacy support. Future psychometric and outcome research on the revised ICL Skills Audit is warranted.
Inter-Rater Reliability of Neck Reflex Points in Women with Chronic Neck Pain.

Science.gov (United States)

Weinschenk, Stefan; Göllner, Richard; Hollmann, Markus W; Hotz, Lorenz; Picardi, Susanne; Hubbert, Katharina; Strowitzki, Thomas; Meuser, Thomas

2016-01-01

Neck reflex points (NRP) are tender soft tissue areas of the cervical region that display reflectory changes in response to chronic inflammations of correlated regions in the visceral cranium. Six bilateral areas, NRP C0, C1, C2, C3, C4 and C7, are detectable by palpating the lateral neck. We investigated the inter-rater reliability of NRP to assess their potential clinical relevance. 32 consecutive patients with chronic neck pain were examined for NRP tenderness by an experienced physician and an inexperienced medical student in a blinded design. A detailed description of the palpation technique is included in this section. Absence of pain was defined as pain index (PI) = 0, slight tenderness = 1, and marked pain = 2. Findings were evaluated either by pair-wise Cohen's kappa (κ) or by percentage of agreement (PA). Examiners identified 40% and 41% of positive NRP, respectively (PI > 0, physician: 155, student: 157) with a slight preference for the left side (1.2:1). The number of patients identified with >6 positive NRP by the examiners was similar (13 vs. 12 patients). κ values ranged from 0.52 to 0.95. The overall kappa was κ = 0.80 for the left and κ = 0.74 for the right side. PA varied from 78.1% to 96.9% with strongest agreement at NRP C0, NRP C2, and NRP C7. Inter-rater agreement was independent of patients' age, gender, body mass index and examiner's experience. The high reproducibility suggests the clinical relevance of NRP in women. © 2016 S. Karger GmbH, Freiburg.
Computer-assisted radiographic calculation of spinal curvature in brachycephalic "screw-tailed" dog breeds with congenital thoracic vertebral malformations: reliability and clinical evaluation.

Directory of Open Access Journals (Sweden)

Julien Guevar

The objectives of this study were to investigate computer-assisted digital radiographic measurement of Cobb angles in dogs with congenital thoracic vertebral malformations, and to determine its intra- and inter-observer reliability and its association with the presence of neurological deficits. Medical records were reviewed (2009-2013) to identify brachycephalic screw-tailed dog breeds with radiographic studies of the thoracic vertebral column and with at least one vertebral malformation present. Twenty-eight dogs were included in the study. The end vertebrae were defined as the cranial end plate of the vertebra cranial to the malformed vertebra and the caudal end plate of the vertebra caudal to the malformed vertebra. Three observers performed the measurements twice. Intraclass correlation coefficients were used to calculate the intra- and inter-observer reliabilities. The intraclass correlation coefficient was excellent for all intra- and inter-observer measurements using this method. There was a significant difference in the kyphotic Cobb angle between dogs with and without associated neurological deficits. The majority of dogs with neurological deficits had a kyphotic Cobb angle higher than 35°. No significant difference in the scoliotic Cobb angle was observed. We concluded that the computer-assisted digital radiographic measurement of the Cobb angle for kyphosis and scoliosis is a valid, reproducible and reliable method to quantify the degree of spinal curvature in brachycephalic screw-tailed dog breeds with congenital thoracic vertebral malformations.

Diagnosing paratonia in the demented elderly: reliability and validity of the Paratonia Assessment Instrument (PAI).

Science.gov (United States)

Hobbelen, Johannes S M; Koopmans, Raymond T C M; Verhey, Frans R J; Habraken, Kitty M; de Bie, Rob A

2008-08-01

Paratonia is one of the associated movement disorders characteristic of dementia. The aim of this study was to develop an assessment tool (the Paratonia Assessment Instrument, PAI), based on the new consensus definition of paratonia. An additional aim was to investigate the reliability and validity of the PAI. A three-phase cross-sectional survey was conducted. In the first two phases, the PAI was developed and validated. In the third phase, the inter-observer reliability and feasibility of the instrument was tested. The original PAI consisted of five criteria that all needed to be met in order to make the diagnosis. On the basis of a qualitative analysis, one criterion was reformulated and another was removed. Following this, inter-observer reliability between the two assessors resulted in an improvement of Cohen's kappa from 0.532 in the initial phase to 0.677 in the second phase. This improvement was substantiated in the third phase by two independent assessors with Cohen's kappa ranging from 0.625 to 1. The PAI is a reliable and valid assessment tool for diagnosing paratonia in elderly people with dementia that can be applied easily in daily practice.
Diagnostic reliability of MMPI-2 computer-based test interpretations.

Science.gov (United States)

Pant, Hina; McCabe, Brian J; Deskovitz, Mark A; Weed, Nathan C; Williams, John E

2014-09-01

Reflecting the common use of the MMPI-2 to provide diagnostic considerations, computer-based test interpretations (CBTIs) also typically offer diagnostic suggestions. However, these diagnostic suggestions can sometimes be shown to vary widely across different CBTI programs even for identical MMPI-2 profiles. The present study evaluated the diagnostic reliability of 6 commercially available CBTIs using a 20-item Q-sort task developed for this study. Four raters each sorted diagnostic classifications based on these 6 CBTI reports for 20 MMPI-2 profiles. Two questions were addressed. First, do users of CBTIs understand the diagnostic information contained within the reports similarly? Overall, diagnostic sorts of the CBTIs showed moderate inter-interpreter diagnostic reliability (mean r = .56), with sorts for the 1/2/3 profile showing the highest inter-interpreter diagnostic reliability (mean r = .67). Second, do different CBTI programs vary with respect to diagnostic suggestions? It was found that diagnostic sorts of the CBTIs had a mean inter-CBTI diagnostic reliability of r = .56, indicating moderate but not strong agreement across CBTIs in terms of diagnostic suggestions. The strongest inter-CBTI diagnostic agreement was found for sorts of the 1/2/3 profile CBTIs (mean r = .71). Limitations and future directions are discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Rater reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS).

Science.gov (United States)

Baker, Nancy A; Cook, James R; Redfern, Mark S

2009-01-01

This paper describes the inter-rater and intra-rater reliability, and the concurrent validity of an observational instrument, the Keyboard Personal Computer Style instrument (K-PeCS), which assesses stereotypical postures and movements associated with computer keyboard use. Three trained raters independently rated the video clips of 45 computer keyboard users to ascertain inter-rater reliability, and then re-rated a sub-sample of 15 video clips to ascertain intra-rater reliability. Concurrent validity was assessed by comparing the ratings obtained using the K-PeCS to scores developed from a 3D motion analysis system. The overall K-PeCS had excellent reliability [inter-rater: intra-class correlation coefficients (ICC)=.90; intra-rater: ICC=.92]. Most individual items on the K-PeCS had from good to excellent reliability, although six items fell below ICC=.75. Those K-PeCS items that were assessed for concurrent validity compared favorably to the motion analysis data for all but two items. These results suggest that most items on the K-PeCS can be used to reliably document computer keyboarding style.
Inter-rater agreement on PIVC-associated phlebitis signs, symptoms and scales.

Science.gov (United States)

Marsh, Nicole; Mihala, Gabor; Ray-Barruel, Gillian; Webster, Joan; Wallis, Marianne C; Rickard, Claire M

2015-10-01

Many peripheral intravenous catheter (PIVC) infusion phlebitis scales and definitions are used internationally, although no existing scale has demonstrated comprehensive reliability and validity. We examined inter-rater agreement between registered nurses on signs, symptoms and scales commonly used in phlebitis assessment. Seven PIVC-associated phlebitis signs/symptoms (pain, tenderness, swelling, erythema, palpable venous cord, purulent discharge and warmth) were observed daily by two raters (a research nurse and registered nurse). These data were modelled into phlebitis scores using 10 different tools. Proportions of agreement (e.g. positive, negative), observed and expected agreements, Cohen's kappa, the maximum achievable kappa, prevalence- and bias-adjusted kappa were calculated. Two hundred ten patients were recruited across three hospitals, with 247 sets of paired observations undertaken. The second rater was blinded to the first's findings. The Catney and Rittenberg scales were the most sensitive (phlebitis in >20% of observations), whereas the Curran, Lanbeck and Rickard scales were the most restrictive (≤2% phlebitis). Only tenderness and the Catney (one of pain, tenderness, erythema or palpable cord) and Rittenberg scales (one of erythema, swelling, tenderness or pain) had acceptable (more than two-thirds, 66.7%) levels of inter-rater agreement. Inter-rater agreement for phlebitis assessment signs/symptoms and scales is low. This likely contributes to the high degree of variability in phlebitis rates in literature. We recommend further research into assessment of infrequent signs/symptoms and the Catney or Rittenberg scales. New approaches to evaluating vein irritation that are valid, reliable and based on their ability to predict complications need exploration. © 2015 John Wiley & Sons, Ltd.
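Several of the statistics listed in this abstract (observed and expected agreement, Cohen's kappa, the maximum achievable kappa, and the prevalence- and bias-adjusted kappa) can be derived from the same pair of rating vectors. The following is a rough sketch under stated assumptions: the function name `kappa_family` and the paired observations are hypothetical, the maximum kappa is taken as the value reached when agreement is as high as the two raters' marginal totals allow, and PABAK uses the common form (k·p_obs − 1)/(k − 1).

```python
from collections import Counter


def kappa_family(rater_a, rater_b):
    """Return observed/expected agreement, kappa, maximum kappa and PABAK."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    categories = sorted(set(rater_a) | set(rater_b))
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories must be observed")

    count_a, count_b = Counter(rater_a), Counter(rater_b)
    prop_a = {c: count_a[c] / n for c in categories}
    prop_b = {c: count_b[c] / n for c in categories}

    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_exp = sum(prop_a[c] * prop_b[c] for c in categories)
    # Highest agreement attainable without changing either rater's marginals
    p_max = sum(min(prop_a[c], prop_b[c]) for c in categories)

    return {
        "observed": p_obs,
        "expected": p_exp,
        "kappa": (p_obs - p_exp) / (1 - p_exp),
        "kappa_max": (p_max - p_exp) / (1 - p_exp),
        "pabak": (k * p_obs - 1) / (k - 1),
    }


# Made-up paired observations of a rare sign (1 = present, 0 = absent)
rater_a = [1] * 2 + [1] * 4 + [0] * 2 + [0] * 92
rater_b = [1] * 2 + [0] * 4 + [1] * 2 + [0] * 92

for name, value in kappa_family(rater_a, rater_b).items():
    print(f"{name}: {value:.3f}")
```

For a sign this infrequent, the made-up raters agree on 94% of observations but kappa is only about 0.37, whereas PABAK is 0.88, which illustrates why low-prevalence signs are hard to evaluate with kappa alone.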
Interrater reliability of videotaped observational gait-analysis assessments.

Science.gov (United States)

Eastlack, M E; Arvidson, J; Snyder-Mackler, L; Danoff, J V; McGarvey, C L

1991-06-01

The purpose of this study was to determine the interrater reliability of videotaped observational gait-analysis (VOGA) assessments. Fifty-four licensed physical therapists with varying amounts of clinical experience served as raters. Three patients with rheumatoid arthritis who demonstrated an abnormal gait pattern served as subjects for the videotape. The raters analyzed each patient's most severely involved knee during the four subphases of stance for the kinematic variables of knee flexion and genu valgum. Raters were asked to determine whether these variables were inadequate, normal, or excessive. The temporospatial variables analyzed throughout the entire gait cycle were cadence, step length, stride length, stance time, and step width. Generalized kappa coefficients ranged from .11 to .52. Intraclass correlation coefficients (2,1) and (3,1) were slightly higher. Our results indicate that physical therapists' VOGA assessments are only slightly to moderately reliable and that improved interrater reliability of the assessments of physical therapists utilizing this technique is needed. Our data suggest that there is a need for greater standardization of gait-analysis training.
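The two intraclass correlation forms named here, ICC(2,1) and ICC(3,1), both come from a two-way analysis of variance on a subjects-by-raters table and differ only in whether rater variance enters the denominator. The sketch below is a minimal illustration with a hypothetical function name (`icc_two_way`) and a small table of illustrative ratings that is not taken from the study.

```python
def icc_two_way(ratings):
    """Return (ICC(2,1), ICC(3,1)) for a table with one row per subject
    and one column per rater (single-rating forms)."""
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)              # between subjects
    ms_cols = ss_cols / (k - 1)              # between raters
    ms_error = ss_error / ((n - 1) * (k - 1))

    # ICC(2,1): raters treated as a random sample, absolute agreement
    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    # ICC(3,1): raters treated as fixed, consistency only
    icc_3_1 = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return icc_2_1, icc_3_1


# Illustrative ratings: 6 subjects (rows) scored by 4 raters (columns)
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

icc_2_1, icc_3_1 = icc_two_way(ratings)
print(f"ICC(2,1)={icc_2_1:.2f} ICC(3,1)={icc_3_1:.2f}")
```

In this table the raters rank subjects similarly but use very different score levels, so ICC(3,1) is about 0.71 while ICC(2,1) is about 0.29. Systematic differences between raters lower ICC(2,1) but not ICC(3,1).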
New definitions of 6 clinical signs of perceptual disorder in children with cerebral palsy: an observational study through reliability measures.

Science.gov (United States)

Ferrari, A; Sghedoni, A; Alboresi, S; Pedroni, E; Lombardi, F

2014-12-01

Recently, authors have begun to emphasize the non-motor aspects of Cerebral Palsy and their influence on motor control and recovery prognosis. Much has been written about single clinical signs (i.e., startle reaction) but so far no definitions of the six perceptual signs presented in this study have appeared in the literature. This study defines 6 signs (startle reaction, upper limbs in startle position, frequent eye blinking, posture freezing, averted eye gaze, grimacing) suggestive of perceptual disorders in children with cerebral palsy and measures agreement on sign recognition among independent observers and consistency of opinions over time. Observational study with both cross-sectional and prospective components. Fifty-six videos presented to observers in random order. Videos were taken from 19 children with a bilateral form of cerebral palsy referred to the Children Rehabilitation Unit in Reggio Emilia. Thirty-five rehabilitation professionals from all over Italy: 9 doctors and 26 physiotherapists. Measure of agreement among 35 independent observers was compiled from a sample of 56 videos. Interobserver reliability was determined using the K index of Fleiss and intra-observer reliability was calculated by the Spearman correlation index between ranks (rho - ρ). Percentage of agreement between observers and Gold Standard was used as criterion validity. Interobserver reliability was moderate for startle reaction, upper limb in startle position, averted eye gaze and eye-blinking and fair for posture freezing and grimacing. Intraobserver reliability remained consistent over time. Criterion validity revealed very high agreement between independent observer evaluation and gold standard. Semiotics of perceptual disorders can be used as a specific and sensitive instrument in order to identify a new class of patients within existing heterogeneous clinical types of bilateral cerebral palsy forms and could help clinicians in identifying functional prognosis. To provide
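The K index of Fleiss mentioned in this record extends kappa to more than two observers rating the same items. A minimal sketch follows, assuming the standard Fleiss formulation with the same number of raters for every item; the function name `fleiss_kappa` and the count tables are hypothetical and are not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table with one row per item and one column per
    category; each cell is the number of raters choosing that category."""
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("need at least one item")
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    if n_raters < 2 or any(
        len(row) != n_categories or sum(row) != n_raters for row in counts
    ):
        raise ValueError("every item needs the same number (>= 2) of raters")

    # Overall share of ratings falling in each category
    share = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: proportion of rater pairs that agree
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_item) / n_items
    p_exp = sum(p * p for p in share)
    if p_exp == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_exp) / (1 - p_exp)


# Made-up example: 4 raters judging a sign as present/absent on 3 videos
unanimous = [[4, 0], [0, 4], [4, 0]]
split = [[2, 2], [2, 2]]

print(fleiss_kappa(unanimous))
print(fleiss_kappa(split))
```

Unanimous ratings give a kappa of 1, while evenly split ratings give a negative value, meaning agreement below what chance alone would produce.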
[Constraints and opportunities for inter-sector health promotion initiatives: a case study].

Science.gov (United States)

Magalhães, Rosana

2015-07-01

This article analyzes the implementation of inter-sector initiatives linked to the Family Grant, Family Health, and School Health Programs in the Manguinhos neighborhood in the North Zone of Rio de Janeiro, Brazil. The study was conducted in 2010 and 2011 and included document review, local observation, and 25 interviews with program managers, professionals, and staff. This was an exploratory case study using a qualitative approach that identified constraints and opportunities for inter-sector health experiences, contributing to the debate on the effectiveness of health promotion and poverty relief programs.
Assessment of Lower Limb Muscle Strength and Power Using Hand-Held and Fixed Dynamometry: A Reliability and Validity Study

Science.gov (United States)

Perraton, Luke G.; Bower, Kelly J.; Adair, Brooke; Pua, Yong-Hao; Williams, Gavin P.; McGaw, Rebekah

2015-01-01

Introduction: Hand-held dynamometry (HHD) has never previously been used to examine isometric muscle power. Rate of force development (RFD) is often used for muscle power assessment; however, no consensus currently exists on the most appropriate method of calculation. The aim of this study was to examine the reliability of different algorithms for RFD calculation and to examine the intra-rater, inter-rater, and inter-device reliability of HHD as well as the concurrent validity of HHD for the assessment of isometric lower limb muscle strength and power. Methods: 30 healthy young adults (age: 23±5yrs, male: 15) were assessed on two sessions. Isometric muscle strength and power were measured using peak force and RFD respectively using two HHDs (Lafayette Model-01165 and Hoggan microFET2) and a criterion-reference KinCom dynamometer. Statistical analysis of reliability and validity comprised intraclass correlation coefficients (ICC), Pearson correlations, concordance correlations, standard error of measurement, and minimal detectable change. Results: Comparison of RFD methods revealed that a peak 200ms moving window algorithm provided optimal reliability results. Intra-rater, inter-rater, and inter-device reliability analysis of peak force and RFD revealed mostly good to excellent reliability (coefficients ≥ 0.70) for all muscle groups. Concurrent validity analysis showed moderate to excellent relationships between HHD and fixed dynamometry for the hip and knee (ICCs ≥ 0.70) for both peak force and RFD, with mostly poor to good results shown for the ankle muscles (ICCs = 0.31–0.79). Conclusions: Hand-held dynamometry has good to excellent reliability and validity for most measures of isometric lower limb strength and power in a healthy population, particularly for proximal muscle groups. To aid implementation we have created freely available software to extract these variables from data stored on the Lafayette device. Future research should examine the reliability
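A peak moving-window RFD calculation of the kind named in the results can be sketched as follows. This is not the authors' software: the function name `peak_rfd`, the sampling rate and the force trace are hypothetical, and the sketch assumes that RFD within each 200 ms window is the change in force across the window divided by the window duration, with the peak taken over all window positions.

```python
def peak_rfd(force_newtons, sample_rate_hz, window_s=0.2):
    """Return the largest rate of force development (N/s) found in any
    window of `window_s` seconds along an evenly sampled force trace."""
    window_samples = round(window_s * sample_rate_hz)
    if window_samples < 1:
        raise ValueError("window is shorter than one sample")
    if len(force_newtons) <= window_samples:
        raise ValueError("force trace is shorter than the window")

    duration_s = window_samples / sample_rate_hz
    return max(
        (force_newtons[i + window_samples] - force_newtons[i]) / duration_s
        for i in range(len(force_newtons) - window_samples)
    )


# Made-up trace sampled at 100 Hz: rest, a 0.5 s linear ramp to 500 N, hold
trace = [0.0] * 50 + [10.0 * i for i in range(1, 51)] + [500.0] * 50

print(f"peak RFD: {peak_rfd(trace, sample_rate_hz=100):.0f} N/s")
```

On the made-up ramp, force rises by 10 N per sample at 100 Hz, so any window lying fully inside the ramp yields 1000 N/s. Shorter windows are more sensitive to noise, which is one plausible reason a 200 ms window might prove more reliable, although the abstract does not state the reason.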
Accuracy and reliability of observational gait analysis data: judgments of push-off in gait after stroke.

Science.gov (United States)

McGinley, Jennifer L; Goldie, Patricia A; Greenwood, Kenneth M; Olney, Sandra J

2003-02-01

Physical therapists routinely observe gait in clinical practice. The purpose of this study was to determine the accuracy and reliability of observational assessments of push-off in gait after stroke. Eighteen physical therapists and 11 subjects with hemiplegia following a stroke participated in the study. Measurements of ankle power generation were obtained from subjects following stroke using a gait analysis system. Concurrent videotaped gait performances were observed by the physical therapists on 2 occasions. Ankle power generation at push-off was scored as either normal or abnormal using two 11-point rating scales. These observational ratings were correlated with the measurements of peak ankle power generation. A high correlation was obtained between the observational ratings and the measurements of ankle power generation (mean Pearson r=.84). Interobserver reliability was moderately high (mean intraclass correlation coefficient [ICC (2,1)]=.76). Intraobserver reliability also was high, with a mean ICC (2,1) of .89 obtained. Physical therapists were able to make accurate and reliable judgments of push-off in videotaped gait of subjects following stroke using observational assessment. Further research is indicated to explore the accuracy and reliability of data obtained with observational gait analysis as it occurs in clinical practice.
Intra-observer and inter-observer agreements for the measurement of dual-input whole tumor computed tomography perfusion in patients with lung cancer: Influences of the size and inner-air density of tumors.

Science.gov (United States)

Wang, Qingle; Zhang, Zhiyong; Shan, Fei; Shi, Yuxin; Xing, Wei; Shi, Liangrong; Zhang, Xingwei

2017-09-01

This study was conducted to assess intra-observer and inter-observer agreements for the measurement of dual-input whole tumor computed tomography perfusion (DCTP) in patients with lung cancer. A total of 88 patients with a proven diagnosis of primary lung cancer who had undergone DCTP were grouped in two ways: (i) nodules (diameter ≤3 cm) and masses (diameter >3 cm) by size, and (ii) tumors with and without air density. Pulmonary flow, bronchial flow, and pulmonary index were measured in each group. Intra-observer and inter-observer agreements for measurement were assessed using intraclass correlation coefficient, within-subject coefficient of variation, and Bland-Altman analysis. In all lung cancers, the reproducibility coefficient for intra-observer agreement (range 26.1-38.3%) was superior to that for inter-observer agreement (range 38.1-81.2%). Further analysis revealed lower agreements for nodules compared to masses. Additionally, inner-air density reduced both agreements for lung cancer. The intra-observer agreement for measuring lung cancer DCTP was satisfactory, while the inter-observer agreement was limited. The effects of tumor size and inner-air density on agreement, especially between two observers, should be emphasized. In future, automatic computer-aided segmentation for tumor perfusion values should be developed. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Reliability of widefield capillary microscopy to measure nailfold capillary density in systemic sclerosis.

Science.gov (United States)

Hudson, M; Masetto, A; Steele, R; Arthurs, E; Baron, M

2010-01-01

To determine intra- and inter-observer reliability of widefield microscopy to measure nailfold capillary density in patients with systemic sclerosis (SSc). Five SSc patients were examined with a STEMV-8 Zeiss biomicroscope with 50x magnification. The nailfold of the second, third, fourth and fifth fingers of both hands of each patient were photographed twice by each of two observers, once in the morning and again in the afternoon (total of 32 pictures). Two raters reviewed the photographs to produce capillary density readings. Intra- and inter-rater reliability of the readings were computed using intra-class correlations (ICC). Additional analyses were undertaken to determine the impact of other sources of variability in the data, namely patient, finger, technician and time. Intra- and inter-rater reliability were substantial (ICC 0.72-0.84) when raters were reading the same photographs or photographs taken at the same time of day. Agreement was only fair between morning and afternoon density readings (ICC 0.30-0.37). Patients, individual fingers and technician accounted for a large part of the variability in the data (combined variance component of 7.69 out of the total 12.23). The coefficient of variation of widefield microscopy was 24%. Although intra- and inter-rater reliability of nailfold capillary density measurements using widefield microscopy are good, proper standardisation of the conditions under which capillaroscopy is done and better imaging of nailfold capillary abnormalities should be considered if nailfold capillary density is to be used as an outcome measure in multi-centre clinical trials in SSc.
Inter-rater reliability of postnatal ultrasound interpretation in infants with congenital hydronephrosis.

Science.gov (United States)

Vemulakonda, V M; Wilcox, D T; Torok, M R; Hou, A; Campbell, J B; Kempe, A

2015-09-01

The most common measurements of hydronephrosis are the anterior-posterior (AP) diameter and the Society for Fetal Urology (SFU) grading systems. To date, the inter-rater reliability (IRR) of these measures has not been compared in the postnatal period. The objectives of this study were to compare the IRR of the AP diameter and the SFU grading system in infants and to determine whether ultrasound findings other than pelvicalyceal dilation are associated with higher SFU grades. Initial postnatal ultrasounds of infants seen from February 1, 2011, to January 31, 2012, with a primary diagnosis of congenital hydronephrosis were included for review. Ultrasound images were de-identified and reviewed by four pediatric urologists. IRR was calculated using the intraclass correlation (ICC) measure. A paired t test was used to compare ICCs. Associations between SFU grade and other ultrasound findings were tested using Chi-square or Fisher's exact tests. A total of 112 kidneys in 56 patients were reviewed. IRR of the SFU grading system was high (right kidney ICC = 0.83, left kidney ICC = 0.85); however, IRR of AP diameter measurement was higher (right kidney ICC = 0.97, left kidney ICC = 0.98; p hydronephrosis on bivariable and multivariable analysis. The SFU grading system is associated with excellent IRR, although the AP diameter appears to have higher IRR. Physicians may consider ultrasound findings that are not explicitly included in the SFU system when assigning hydronephrosis grade, which may lead to variability in use of this classification system.
Thyroid Ultrasound: Change of Inter-observer Variability and Diagnostic Performance after Training

International Nuclear Information System (INIS)

Moon, Hee Jung; Kim, Eun Kyung; Kwak, Jin Young; Park, Jeong Seon

2011-01-01

To investigate and compare inter-observer variability and diagnostic performance of thyroid ultrasound (US) between a faculty member and observing residents. From October 2007 to June 2009, 18 residents underwent training in the thyroid US section. Group 1 included 8 residents who were trained for the first time and group 2 included 10 residents who were trained for the second time. US features of nodules were recorded according to the composition, echogenicity, margin, calcifications, shape, and final assessment by a faculty member and residents, respectively. Following a discussion, a faculty member performed fine needle aspiration. Then, the inter-observer variability and diagnostic performance between a faculty member and residents were investigated and compared for US. In group 1, agreement for composition in resident 1, calcification for residents 5 and 6, and shape for resident 4 were slight, moderate, moderate, and moderate, respectively. In group 2, agreement for composition in residents 1 and 10 was moderate. Substantial or greater agreement was observed more frequently in group 2 than in group 1. The diagnostic performances for both the faculty member and residents were high and not statistically different. Agreement for US features between the faculty member and residents, as well as diagnostic performance, was high. Moreover, the diagnostic performance of residents who underwent training a second time was higher than that of residents who underwent training only once.
A dedicated BI-RADS training programme: Effect on the inter-observer variation among screening radiologists

International Nuclear Information System (INIS)

Timmers, J.M.H.; Doorne-Nagtegaal, H.J. van; Verbeek, A.L.M.; Heeten, G.J. den; Broeders, M.J.M.

2012-01-01

Introduction: The Breast Imaging Reporting and Data System (BI-RADS) was introduced in the Dutch breast cancer screening programme to improve communication between medical specialists. Following introduction, a substantial variation in the use of the BI-RADS lexicon for final assessment categories was noted among screening radiologists. We set up a dedicated training programme to reduce this variation. This study evaluates whether this programme was effective. Materials and methods: Two comparable test sets were read before and after completion of the training programme. Each set contained 30 screening mammograms of referred women selected from screening practice. The sets were read by 25 experienced and 30 new screening radiologists. Cohen's kappa (κ) was used to calculate the inter-observer agreement. The BI-RADS 2003 version was implemented in the screening programme as the BI-RADS 2008 version requires the availability of diagnostic work-up, and this is unavailable. Results: The inter-observer agreement of all participating radiologists (n = 55) with the expert panel increased from a pre-training κ-value of 0.44 to a post-training κ-value of 0.48 (p = 0.14). The inter-observer agreement of the new screening radiologists (n = 30) with the expert panel increased from κ = 0.41 to κ = 0.50 (p = 0.01), whereas there was no difference in agreement among the 25 experienced radiologists (from κ = 0.48 to κ = 0.46, p = 0.60). Conclusion: Our training programme in the BI-RADS lexicon resulted in a significant improvement of agreement among new screening radiologists. Overall, the agreement among radiologists was moderate (according to the Landis and Koch guidelines). This is in line with results found in the literature.
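The agreement statistic named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' analysis: it computes unweighted Cohen's kappa for one reader against a reference reading and maps the value onto the Landis and Koch bands. The function names and the toy category labels are assumptions made for this example.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical category codes for ten cases (illustrative only).
reader = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0]
panel = [0, 0, 1, 2, 2, 2, 0, 1, 1, 1]
kappa = cohen_kappa(reader, panel)
print(round(kappa, 3), landis_koch(kappa))
```

Under these bands, the pre- and post-training values reported above (0.44 and 0.48) both fall in the "moderate" range, which matches the abstract's conclusion.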
Feasibility and reliability of frailty assessment in the critically ill: a systematic review.

Science.gov (United States)

Pugh, Richard J; Ellison, Amy; Pye, Kate; Subbe, Christian P; Thorpe, Chris M; Lone, Nazir I; Clegg, Andrew

2018-02-26

For healthcare systems, an ageing population poses challenges in the delivery of equitable and effective care. Frailty assessment has the potential to improve care in the intensive care setting, but applying assessment tools in critical illness may be problematic. The aim of this systematic review was to evaluate evidence for the feasibility and reliability of frailty assessment in critical care. Our primary search was conducted in Medline, Medline In-process, EMBASE, CINAHL, PsycINFO, AMED, Cochrane Database of Systematic Reviews, and Web of Science (January 2001 to October 2017). We included observational studies reporting data on feasibility and reliability of frailty assessment in the critical care setting in patients 16 years and older. Feasibility was assessed in terms of timing of evaluation, the background, training and expertise required for assessors, and reliance upon proxy input. Reliability was assessed in terms of inter-rater reliability. Data from 11 study publications are included, representing 8 study cohorts and 7761 patients. Proxy involvement in frailty assessment ranged from 58 to 100%. Feasibility data were not well-reported overall, but the exclusion rate due to lack of proxy availability ranged from 0 to 45%, the highest rate observed where family involvement was mandatory and the assessment tool relatively complex (frailty index, FI). Conventional elements of frailty phenotype (FP) assessment required modification prior to use in two studies. Clinical staff tended to use a simple judgement-based tool, the clinical frailty scale (CFS). Inter-rater reliability was reported in one study using the CFS and although a good level of agreement was observed between clinician assessments, this was a small and single-centre study. Though of unproven reliability in the critically ill, CFS was the tool used most widely by critical care clinical staff. Conventional FP assessment required modification for general application in critical care, and an FI
Effects of questionnaire-based diagnosis and training on inter-rater reliability among practitioners of traditional Chinese medicine.

Science.gov (United States)

Mist, Scott; Ritenbaugh, Cheryl; Aickin, Mikel

2009-07-01

To investigate whether a training process that focused on a questionnaire-based diagnosis in Traditional Chinese Medicine (TCM), and developing diagnostic consensus, would improve the agreement of TCM diagnoses among 10 TCM practitioners evaluating patients with temporomandibular joint disorder (TMJD). Evaluation of a diagnostic training program at the Department of Family and Community Medicine, University of Arizona, Tucson, Arizona, and the Oregon College of Oriental Medicine, Portland, Oregon. Screened participants for a study of TCM for TMJD. PRACTITIONERS: Ten (10) licensed acupuncturists with a minimum of 5 years licensure and education in Chinese herbs. A training session using a questionnaire-based diagnostic form was conducted, followed by waves of diagnostic sessions. Between sessions, practitioners discussed the results of the previous round of participants with a focus on reducing variability in primary diagnosis and severity rating of each diagnosis: 3 waves of 5 patients were assessed by 4 practitioner pairs for a total of 120 diagnoses. At 18 months, practitioners completed a recalibration exercise with a similar format with a total of 32 diagnoses. These diagnoses were then examined with respect to the rate of agreement among the 10 practitioners using inter-rater correlations and kappas. The inter-rater correlation with respect to the TCM diagnoses among the 10 practitioners increased from 0.112 to 0.618 with training. Statistically significant improvements were found between the baseline and 18 month exercises (p … reliability of TCM diagnosis may be improved through a training process and a questionnaire-based diagnosis process. The improvements varied by diagnosis, with the greatest congruence among primary and more severe diagnoses. Future TCM studies should consider including calibration training to improve the validity of results.
Five times sit-to-stand test in subjects with total knee replacement: Reliability and relationship with functional mobility tests.

Science.gov (United States)

Medina-Mirapeix, Francesc; Vivo-Fernández, Iván; López-Cañizares, Juan; García-Vidal, José A; Benítez-Martínez, Josep Carles; Del Baño-Aledo, María Elena

2018-01-01

The objective was to determine the inter-observer and test-retest reliability of the "Five-repetition sit-to-stand" (5STS) test in patients with total knee replacement (TKR) and to explore the correlation between the 5STS and two mobility tests. A reliability study was conducted among 24 (mean age 72.13, S.D. 10.67; 50% were women) outpatients with TKR. They were recruited from a traumatology unit of a public hospital via convenience sampling. A physiotherapist and a trauma physician assessed each patient at the same time. The same physiotherapist performed a second 5STS measurement 45-60 min after the first one. Reliability was assessed with intraclass correlation coefficients (ICCs) and Bland-Altman plots. The Pearson coefficient was calculated to assess the correlation between the 5STS, the timed up and go test (TUG) and four-meter gait speed (4MGS). ICCs for inter-observer and test-retest reliability of the 5STS were 0.998 (95% confidence interval [CI], 0.995-0.999) and 0.982 (95% CI, 0.959-0.992). The inter-observer Bland-Altman plot showed limits between -0.82 and 1.06 with a mean of 0.11 and no heteroscedasticity within the data. The test-retest Bland-Altman plot showed limits between -1.76 and 4.16, a mean of 1.20 and heteroscedasticity within the data. The Pearson correlation coefficient revealed significant correlation between 5STS and TUG (r=0.7, p … test-retest reliability when it is used in people with TKR, and also significant correlation with other functional mobility tests. These findings support the use of 5STS as outcome measure in TKR population. Copyright © 2017 Elsevier B.V. All rights reserved.
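A Bland-Altman summary of the kind reported in this abstract can be sketched as follows. This is a minimal illustration that assumes the conventional limits of agreement (mean difference ± 1.96 standard deviations of the paired differences); the abstract does not state which multiplier was used, and the function name and timing values below are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits are assumed to be bias +/- 1.96 * SD of the differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)              # systematic difference between sessions
    spread = 1.96 * stdev(diffs)    # half-width of the limits of agreement
    return bias, bias - spread, bias + spread


# Hypothetical sit-to-stand times in seconds from two raters.
rater_1 = [10.2, 11.5, 9.8, 12.1, 10.9]
rater_2 = [10.0, 11.9, 9.5, 12.4, 10.6]
bias, lower, upper = bland_altman(rater_1, rater_2)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

Because the limits are built symmetrically around the bias, the reported mean should sit at their midpoint. The inter-observer figures above (limits -0.82 and 1.06, mean 0.11) are consistent with this up to rounding.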
Laterality judgments in people with low back pain--A cross-sectional observational and test-retest reliability study.

Science.gov (United States)

Linder, Martin; Michaelson, Peter; Röijezon, Ulrik

2016-02-01

Disruption of cortical representation, or body schema, has been indicated as a factor in the persistence and recurrence of low back pain (LBP). This has been observed through impaired laterality judgment ability and it has been suggested that this ability is affected in a spatial rather than anatomical manner. We compared laterality judgment performance of foot and trunk movements between people with LBP with or without leg pain and healthy controls, and investigated associations between test performance and pain. We also assessed the test-retest reliability of the Recognise Online™ software when used in a clinical and a home setting. Cross-sectional observational and test-retest study. Thirty individuals with LBP and 30 healthy controls performed judgment tests of foot and trunk laterality once supervised in a clinic and twice at home. No statistically significant group differences were found. LBP intensity was negatively related to trunk laterality accuracy (p = 0.019). Intraclass correlation values ranged from 0.51 to 0.91. Reaction time improved significantly between test occasions while accuracy did not. Laterality judgments were not impaired in subjects with LBP compared to controls. Further research may clarify the relationship between pain mechanisms in LBP and laterality judgment ability. Reliability values were mostly acceptable, with wide and low confidence intervals, suggesting test-retest reliability for Recognise Online™ could be questioned in this trial. A significant learning effect was observed which should be considered in clinical and research application of the test. Copyright © 2015 Elsevier Ltd. All rights reserved.
Reliability and validity of videotaped functional performance tests in ACL-injured subjects

DEFF Research Database (Denmark)

von Porat, Anette; Holmström, Eva; Roos, Ewa

2008-01-01

BACKGROUND AND PURPOSE: In clinical practice, visual observation is often used to determine functional impairment and to evaluate treatment following a knee injury. The aim of this study was to evaluate the reliability and validity of observational assessments of knee movement pattern quality......, crossover hop on one leg and one-leg hop. The videos were observed by four physiotherapists, and the knee movement pattern quality, a feature of the loading strategy of the lower extremity, was scored on an 11-point rating scale. To assess the criterion validity, the observational rating was correlated...... obtained between the observers' assessment and knee flexion angle, r = 0.37-0.61. The crossover hop test or one-leg hop test was ranked as the most useful test in 172 of 192 occasions (90%) when assessing knee function. CONCLUSION: The moderate to good inter-observer reliability and the moderate criterion...
Inter- and Intra-Observer Variability in Prostate Definition With Tissue Harmonic and Brightness Mode Imaging

International Nuclear Information System (INIS)

Sandhu, Gurpreet Kaur; Dunscombe, Peter; Meyer, Tyler; Pavamani, Simon; Khan, Rao

2012-01-01

Purpose: The objective of this study was to compare the relative utility of tissue harmonic (H) and brightness (B) transrectal ultrasound (TRUS) images of the prostate by studying interobserver and intraobserver variation in prostate delineation. Methods and Materials: Ten patients with early-stage disease were randomly selected. TRUS images of prostates were acquired using B and H modes. The prostates on all images were contoured by an experienced radiation oncologist (RO) and five equally trained observers. The observers were blinded to information regarding patient and imaging mode. The volumes of prostate glands and areas of midgland slices were calculated. Volumes contoured were compared among the observers and between observer group and RO. Contours on one patient were repeated five times by four observers to evaluate the intraobserver variability. Results: A one-sample Student t-test showed the volumes outlined by five observers are in agreement (p > 0.05) with the RO. Paired Student t-test showed prostate volumes (p = 0.008) and midgland areas (p = 0.006) with H mode were significantly smaller than those with B mode. Two-factor analysis of variance showed significant interobserver variability (p < 0.001) in prostate volumes and areas. Inter- and intraobserver consistency was quantified as the standard deviation of mean volumes and areas, and concordance indices. It was found that for small glands (≤35 cc) H mode provided greater interobserver consistency; however, for large glands (≥35 cc), B mode provided more consistent estimates. Conclusions: H mode provided superior inter- and intraobserver agreement in prostate volume definition for small to medium prostates. In large glands, H mode does not exhibit any additional advantage. Although harmonic imaging has not proven advantageous for all cases, its utilization seems to be judicious for small prostates.

Reliability of the Bulb Dynamometer for Assessing Grip Strength

Directory of Open Access Journals (Sweden)

Colleen Maher

2018-04-01

Background: Hand function is an overall indicator of health and is often measured using grip strength. Handheld dynamometry is the most common method of measuring grip strength. The purpose of this study was to determine the inter-rater and test-retest reliability, the reliability of one trial versus three trials, and the preliminary norms for a young adult population using the Baseline® Pneumatic Squeeze Bulb Dynamometer (30 psi). Methods: This study used a one-group methodological design. One hundred and three healthy adults (30 males and 73 females) were recruited. Six measurements were collected for each hand per participant. The data were analyzed using Intraclass Correlation Coefficients (ICC) two-way effects model (2,2) and paired-samples t-tests. Results: The ICC for inter-rater reliability ranged from 0.955 to 0.977. Conclusion: The results of this study suggest that the bulb dynamometer is a reliable tool to measure grip strength and should be further explored for reliable and valid use in diverse populations and as an alternative to the Jamar dynamometer.
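To make the "two-way effects model (2,2)" notation more concrete, the sketch below computes two-way random-effects ICCs from a subjects-by-raters table using the mean squares of a two-way ANOVA, in the ICC(2,1) and ICC(2,k) convention of Shrout and Fleiss. It is an illustration under that assumption rather than the study's own analysis; the function name and the ratings are invented for the example.

```python
def icc_two_way_random(scores):
    """Return (ICC(2,1), ICC(2,k)) for a table of subjects x raters.

    scores: list of rows, one row per subject, one column per rater.
    Uses the two-way random-effects, absolute-agreement formulas.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)                 # between-subjects mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    single = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    average = (bms - ems) / (bms + (jms - ems) / n)
    return single, average


# Illustrative ratings: six subjects, each scored by four raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_single, icc_average = icc_two_way_random(ratings)
print(round(icc_single, 2), round(icc_average, 2))
```

The second value is the reliability of the mean of the k ratings, which is why an average-measures ICC such as ICC(2,2) is typically higher than the single-rating ICC computed from the same data.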
Palliative sedation: reliability and validity of sedation scales.

Science.gov (United States)

Arevalo, Jimmy J; Brinkkemper, Tijn; van der Heide, Agnes; Rietjens, Judith A; Ribbe, Miel; Deliens, Luc; Loer, Stephan A; Zuurmond, Wouter W A; Perez, Roberto S G M

2012-11-01

Observer-based sedation scales have been used to provide a measurable estimate of the comfort of nonalert patients in palliative sedation. However, their usefulness and appropriateness in this setting has not been demonstrated. To study the reliability and validity of observer-based sedation scales in palliative sedation. A prospective evaluation of 54 patients under intermittent or continuous sedation with four sedation scales was performed by 52 nurses. Included scales were the Minnesota Sedation Assessment Tool (MSAT), Richmond Agitation-Sedation Scale (RASS), Vancouver Interaction and Calmness Scale (VICS), and a sedation score proposed in the Guideline for Palliative Sedation of the Royal Dutch Medical Association (KNMG). Inter-rater reliability was tested with the intraclass correlation coefficient (ICC) and Cohen's kappa coefficient. Correlations between the scales using Spearman's rho tested concurrent validity. We also examined construct, discriminative, and evaluative validity. In addition, nurses completed a user-friendliness survey. Overall moderate to high inter-rater reliability was found for the VICS interaction subscale (ICC = 0.85), RASS (ICC = 0.73), and KNMG (ICC = 0.71). The largest correlation between scales was found for the RASS and KNMG (rho = 0.836). All scales showed discriminative and evaluative validity, except for the MSAT motor subscale and VICS calmness subscale. Finally, the RASS was less time consuming, clearer, and easier to use than the MSAT and VICS. The RASS and KNMG scales stand as the most reliable and valid among the evaluated scales. In addition, the RASS was less time consuming, clearer, and easier to use than the MSAT and VICS. Further research is needed to evaluate the impact of the scales on better symptom control and patient comfort. Copyright © 2012 U.S. Cancer Pain Relief Committee. Published by Elsevier Inc. All rights reserved.
Reliability of digital ulcer definitions as proposed by the UK Scleroderma Study Group: A challenge for clinical trial design.

Science.gov (United States)

Hughes, Michael; Tracey, Andrew; Bhushan, Monica; Chakravarty, Kuntal; Denton, Christopher P; Dubey, Shirish; Guiducci, Serena; Muir, Lindsay; Ong, Voon; Parker, Louise; Pauling, John D; Prabu, Athiveeraramapandian; Rogers, Christine; Roberts, Christopher; Herrick, Ariane L

2018-06-01

The reliability of clinician grading of systemic sclerosis-related digital ulcers has been reported to be poor to moderate at best, which has important implications for clinical trial design. The aim of this study was to examine the reliability of new proposed UK Scleroderma Study Group digital ulcer definitions among UK clinicians with an interest in systemic sclerosis. Raters graded (through a custom-built interface) 90 images (80 unique and 10 repeat) of a range of digital lesions collected from patients with systemic sclerosis. Lesions were graded on an ordinal scale of severity: 'no ulcer', 'healed ulcer' or 'digital ulcer'. A total of 23 clinicians - 18 rheumatologists, 3 dermatologists, 1 hand surgeon and 1 specialist rheumatology nurse - completed the study. A total of 2070 (1840 unique + 230 repeat) image gradings were obtained. For intra-rater reliability, across all images, the overall weighted kappa coefficient was high (0.71) and was moderate (0.55) when averaged across individual raters. Overall inter-rater reliability was poor (0.15). Although our proposed digital ulcer definitions had high intra-rater reliability, the overall inter-rater reliability was poor. Our study highlights the challenges of digital ulcer assessment by clinicians with an interest in systemic sclerosis and provides a number of useful insights for future clinical trial design. Further research is warranted to improve the reliability of digital ulcer definition/rating as an outcome measure in clinical trials, including examining the role for objective measurement techniques, and the development of digital ulcer patient-reported outcome measures.
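The weighted kappa reported for this three-level ordinal scale can be illustrated with a short sketch. The abstract does not say which weighting scheme was used, so linear disagreement weights are assumed here; the function name and the example gradings are hypothetical. The last lines also check the grading counts quoted in the abstract.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted Cohen's kappa for ordered categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    index = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(rater_a)
    # Cross-tabulate the two sets of gradings.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        table[index[a]][index[b]] += 1
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)        # 0 on the diagonal, 1 at extremes
            observed += weight * table[i][j]
            expected += weight * row[i] * col[j] / n
    if expected == 0:
        return 1.0  # every grading fell in one category
    return 1 - observed / expected


SCALE = ["no ulcer", "healed ulcer", "digital ulcer"]  # ordered by severity
first = ["no ulcer", "healed ulcer", "digital ulcer",
         "no ulcer", "digital ulcer", "healed ulcer"]
second = ["no ulcer", "digital ulcer", "digital ulcer",
          "no ulcer", "healed ulcer", "healed ulcer"]
print(weighted_kappa(first, second, SCALE))

# Grading counts from the abstract: 23 raters x (80 unique + 10 repeat) images.
raters, unique_images, repeat_images = 23, 80, 10
total_gradings = raters * (unique_images + repeat_images)
```

With linear weights, a one-step disagreement ("healed ulcer" versus "digital ulcer") is penalised half as much as a two-step disagreement ("no ulcer" versus "digital ulcer"), which is the motivation for weighting on an ordinal scale.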
Assessment of apraxia: inter-rater reliability of a new apraxia test, association between apraxia and other cognitive deficits and prevalence of apraxia in a rehabilitation setting.

Science.gov (United States)

Zwinkels, Angeliek; Geusgens, Chantal; van de Sande, Peter; Van Heugten, Caroline

2004-11-01

To investigate the inter-rater reliability of a new apraxia test. Furthermore to examine the association of apraxia with other neuropsychological impairments and the prevalence of apraxia in a rehabilitation setting on the basis of the new test. Cross-sectional cohort study, involving 100 patients with a first stroke admitted to a rehabilitation centre in the Netherlands. General patient characteristics and stroke-related aspects. Cognitive screening involving apraxia, visuospatial scanning, abstract thinking and reasoning, memory, attention, planning and aphasia. The indices for inter-rater agreement range from excellent to poor. Significant correlations are found between apraxia and visuospatial scanning, memory, attention, planning and aphasia. The patients with apraxia perform significantly worse than the patients without apraxia on memory, the time needed to complete the tests for scanning and attention, and aphasia. The prevalence of apraxia is 25.3% in the total group, 51.3% in the left hemisphere stroke patients and 6.0% in the right hemisphere stroke patients. Patients with and without apraxia do not differ significantly concerning age, gender and type of stroke. The apraxia test has been shown to be a reliable instrument. Apraxia is often associated with aphasia, memory problems and mental slowness. This study shows that on the basis of the apraxia test, the prevalence of apraxia among patients in the rehabilitation centre is high, especially among patients with left hemisphere lesions.
Action Research in Inter-organisational Networks - Impartial studies or the Trojan Horse?

DEFF Research Database (Denmark)

Goduscheit, René Chester; Bergenholtz, Carsten; Jørgensen, Jacob Høj

2008-01-01

Traditionally, the literature on action research has been aimed at intra-organisational issues. These studies have distinguished between two researcher roles: The problem-solver and the observer. This article addresses the distinct challenges of action research in inter-organisational projects. I...
The "F" in SAFE: Reliability of assessing clean faces for trachoma control in the field.

Directory of Open Access Journals (Sweden)

Sheila K West

2017-11-01

Although facial cleanliness is part of the SAFE strategy for trachoma, there is controversy over the reliability of measuring a clean face. A child's face with no ocular and nasal discharge is clean and the endpoint of interest, regardless of the number of times it must be washed to achieve that endpoint. The issue of reliability rests on the reproducibility of graders to assess a clean face. We report the reproducibility of assessing a clean face in a field trial in Kongwa, Tanzania. Seven graders were trained to assess the presence and absence of nasal and ocular discharge on children's faces. Sixty children ages 1-7 years were recruited from a community and evaluated independently by seven graders, once and again about 50 minutes later. Intra- and inter-observer variation was calculated using unweighted kappa statistics. The average intra-observer agreement was kappa = 0.72, and the average inter-observer agreement was kappa = 0.78. Intra-observer and inter-observer agreement was substantial for the assessment of clean faces using trained Tanzania staff who represent a variety of educational backgrounds. As long as training is provided, the estimate of clean faces in children should be reliable, and reflect the effort of families to keep ocular and nasal discharge off the faces. These data suggest assessment of clean faces could be added to trachoma surveys, which already measure environmental improvements, in districts.
Inter-Rater Reliability of Historical Data Collected by Non-Medical Research Assistants and Physicians in Patients with Acute Abdominal Pain

Directory of Open Access Journals (Sweden)

Mills, Angela M

2009-02-01

OBJECTIVES: In many academic emergency departments (ED), physicians are asked to record clinical data for research that may be time consuming and distracting from patient care. We hypothesized that non-medical research assistants (RAs) could obtain historical information from patients with acute abdominal pain as accurately as physicians. METHODS: Prospective comparative study conducted in an academic ED comparing 29 RAs with 32 resident physicians (RPs) to assess inter-rater reliability in obtaining historical information in abdominal pain patients. Historical features were independently recorded on standardized data forms by an RA and an RP blinded to each other's answers. Discrepancies were resolved by a third person (RA) who asked the patient to state the correct answer on a third questionnaire, constituting the "criterion standard." Inter-rater reliability was assessed using kappa statistics (kappa) and percent crude agreement (CrA). RESULTS: Sixty-five patients were enrolled (mean age 43). Of 43 historical variables assessed, the median agreement was moderate (kappa 0.59 [Interquartile range 0.37-0.69]; CrA 85.9%) and varied across data categories: initial pain location (kappa 0.61 [0.59-0.73]; CrA 87.7%), current pain location (kappa 0.60 [0.47-0.67]; CrA 82.8%), past medical history (kappa 0.60 [0.48-0.74]; CrA 93.8%), associated symptoms (kappa 0.38 [0.37-0.74]; CrA 87.7%), and aggravating/alleviating factors (kappa 0.09 [-0.01-0.21]; CrA 61.5%). When there was disagreement between the RP and the RA, the RA more often agreed with the criterion standard (64% [55-71%]) than the RP (36% [29-45%]). CONCLUSION: Non-medical research assistants who focus on clinical research are often more accurate than physicians, who may be distracted by patient care responsibilities, at obtaining historical information from ED patients with abdominal pain.
A study of lip prints and its reliability as a forensic tool

Science.gov (United States)

Verma, Yogendra; Einstein, Arouquiaswamy; Gondhalekar, Rajesh; Verma, Anoop K.; George, Jiji; Chandra, Shaleen; Gupta, Shalini; Samadi, Fahad M.

2015-01-01

Introduction: Lip prints, like fingerprints, are unique to an individual and can be easily recorded. Therefore, we compared direct and indirect lip print patterns in males and females of different age groups, studied the inter- and intraobserver bias in recording the data, and observed any changes in the lip print patterns over a period of time, thereby, assessing the reliability of lip prints as a forensic tool. Materials and Methods: Fifty females and 50 males in the age group of 15 to 35 years were selected for the study. Lips with any deformity or scars were not included. Lip prints were registered by direct and indirect methods and transferred to a preformed registration sheet. Direct method of lip print registration was repeated after a six-month interval. All the recorded data were analyzed statistically. Results: The predominant patterns were vertical and branched. More females showed the branched pattern and males revealed an equal prevalence of vertical and reticular patterns. There was an interobserver agreement, which was 95%, and there was no change in the lip prints over time. Indirect registration of lip prints correlated with direct method prints. Conclusion: Lip prints can be used as a reliable forensic tool, considering the consistency of lip prints over time and the accurate correlation of indirect prints to direct prints. PMID:26668449
A multicentre observational study to evaluate a new tool to assess emergency physicians' non-technical skills.

Science.gov (United States)

Flowerdew, Lynsey; Gaunt, Arran; Spedding, Jessica; Bhargava, Ajay; Brown, Ruth; Vincent, Charles; Woloshynowych, Maria

2013-06-01

To evaluate a new tool to assess emergency physicians' non-technical skills. This was a multicentre observational study using data collected at four emergency departments in England. A proportion of observations used paired observers to obtain data for inter-rater reliability. Data were also collected for test-retest reliability, observability of skills, mean ratings and dispersion of ratings for each skill, as well as a comparison of skill level between hospitals. Qualitative data described the range of non-technical skills exhibited by trainees and identified sources of rater error. 96 assessments of 43 senior trainees were completed. At a scale level, intra-class coefficients were 0.575, 0.532 and 0.419 and using mean scores were 0.824, 0.702 and 0.519. Spearman's ρ for calculating test-retest reliability was 0.70 using mean scores. All skills were observed more than 60% of the time. The skill Maintenance of Standards received the lowest mean rating (4.8 on a nine-point scale) and the highest mean was calculated for Team Building (6.0). Two skills, Supervision & Feedback and Situational Awareness-Gathering Information, had significantly different distributions of ratings across the four hospitals (p … technical skills, especially in relation to leadership. The framework of skills may be used to identify areas for development in individual trainees, as well as guide other patient safety interventions.
Food and beverage environment analysis and monitoring system: a reliability study in the school food and beverage environment.

Science.gov (United States)

Bullock, Sally Lawrence; Craypo, Lisa; Clark, Sarah E; Barry, Jason; Samuels, Sarah E

2010-07-01

States and school districts around the country are developing policies that set nutrition standards for competitive foods and beverages sold outside of the US Department of Agriculture's reimbursable school lunch program. However, few tools exist for monitoring the implementation of these new policies. The objective of this research was to develop a computerized assessment tool, the Food and Beverage Environment Analysis and Monitoring System (FoodBEAMS), to collect data on the competitive school food environment and to test the inter-rater reliability of the tool among research and nonresearch professionals. FoodBEAMS was used to collect data in spring 2007 on the competitive foods and beverages sold in 21 California high schools. Adherence of the foods and beverages to California's competitive food and beverage nutrition policies for schools (Senate Bills 12 and 965) was determined using the data collected by both research and nonresearch professionals. The inter-rater reliability between the data collectors was assessed using the intraclass correlation coefficient. Researcher vs researcher and researcher vs nonresearcher inter-rater reliability was high for both foods and beverages, with intraclass correlation coefficients ranging from .972 to .987. Results of this study provide evidence that FoodBEAMS is a promising tool for assessing and monitoring adherence to nutrition standards for competitive foods sold on school campuses and can be used reliably by both research and nonresearch professionals. Copyright 2010 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Reliability and validity of the Pragmatics Observational Measure (POM): a new observational measure of pragmatic language for children.

Science.gov (United States)

Cordier, Reinie; Munro, Natalie; Wilkes-Gillan, Sarah; Speyer, Renée; Pearce, Wendy M

2014-07-01

There is a need for a reliable and valid assessment of childhood pragmatic language skills during peer-peer interactions. This study aimed to evaluate the psychometric properties of a newly developed pragmatic assessment, the Pragmatics Observational Measure (POM). The psychometric properties of the POM were investigated from observational data of two studies - study 1 involved 342 children aged 5-11 years (108 children with ADHD; 108 typically developing playmates; 126 children in the control group), and study 2 involved 9 children with ADHD who attended a 7-week play-based intervention. The psychometric properties of the POM were determined based on the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) taxonomy of psychometric properties and definitions for health-related outcomes; the Pragmatic Protocol was used as the reference tool against which the POM was evaluated. The POM demonstrated sound psychometric properties in all the reliability, validity and interpretability criteria against which it was assessed. The findings showed that the POM is a reliable and valid measure of pragmatic language skills of children with ADHD between the age of 5 and 11 years and has clinical utility in identifying children with pragmatic language difficulty. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
Reliability of Health-Related Physical Fitness Tests among Colombian Children and Adolescents: The FUPRECOL Study.

Directory of Open Access Journals (Sweden)

Robinson Ramírez-Vélez

Substantial evidence indicates that youth physical fitness levels are an important marker of lifestyle and cardio-metabolic health profiles and predict future risk of chronic diseases. The reliability of physical fitness tests has not been explored in the Latino-American youth population. This study's aim was to examine the reliability of health-related physical fitness tests that were used in the Colombian health promotion "Fuprecol study". Participants were 229 Colombian youth (boys n = 124 and girls n = 105) aged 9 to 17.9 years old. Five components of health-related physical fitness were measured: (1) morphological component: height, weight, body mass index (BMI), waist circumference, triceps skinfold, subscapular skinfold, and body fat (%) via impedance; (2) musculoskeletal component: handgrip and standing long jump test; (3) motor component: speed/agility test (4x10 m shuttle run); (4) flexibility component (hamstring and lumbar extensibility, sit-and-reach test); (5) cardiorespiratory component: 20-meter shuttle-run test (SRT) to estimate maximal oxygen consumption. The tests were performed two times, 1 week apart on the same day of the week, except for the SRT which was performed only once. Intra-observer technical errors of measurement (TEMs) and inter-rater reliability were assessed in the morphological component. Reliability for the musculoskeletal, motor and cardiorespiratory fitness components was examined using Bland-Altman tests. For the morphological component, TEMs were small and reliability was greater than 95% in all cases. For the musculoskeletal, motor, flexibility and cardiorespiratory components, we found adequate reliability patterns in terms of systematic errors (bias) and random error (95% limits of agreement). When the fitness assessments were performed twice, the systematic error was nearly 0 for all tests, except for the sit and reach (mean difference: -1.03% [95% CI = -4.35% to -2.28%]). The results from this study indicate that the
Reliability and Validity of the Dutch Physical Activity Questionnaires for Children (PAQ-C) and Adolescents (PAQ-A).

Science.gov (United States)

Bervoets, Liene; Van Noten, Caroline; Van Roosbroeck, Sofie; Hansen, Dominique; Van Hoorenbeeck, Kim; Verheyen, Els; Van Hal, Guido; Vankerckhoven, Vanessa

2014-01-01

This study was designed to validate the Dutch Physical Activity Questionnaires for Children (PAQ-C) and Adolescents (PAQ-A). After adjustment of the original Canadian PAQ-C and PAQ-A (i.e. translation/back-translation and evaluation by expert committee), content validity of both PAQs was assessed and calculated using item-level (I-CVI) and scale-level (S-CVI) content validity indexes. Inter-item and inter-rater reliability of 196 PAQ-C and 95 PAQ-A, each filled in by both the child or adolescent and a parent, were evaluated. Inter-item reliability was calculated by Cronbach's alpha (α) and inter-rater reliability was examined by percent observed agreement and weighted kappa (κ). Concurrent validity of PAQ-A was examined in a subsample of 28 obese and 16 normal-weight children by comparing it with concurrently measured physical activity using a maximal cardiopulmonary exercise test for the assessment of peak oxygen uptake (VO2 peak). For both PAQs, I-CVI ranged 0.67-1.00. S-CVI was 0.89 for PAQ-C and 0.90 for PAQ-A. A total of 192 PAQ-C and 94 PAQ-A were fully completed by both child and parent. Cronbach's α was 0.777 for PAQ-C and 0.758 for PAQ-A. Percent agreement ranged 59.9-74.0% for PAQ-C and 51.1-77.7% for PAQ-A, and weighted κ ranged 0.48-0.69 for PAQ-C and 0.51-0.68 for PAQ-A. The correlation between total PAQ-A score and VO2 peak - corrected for age, gender, height and weight - was 0.516 (p = 0.001). Both PAQs have an excellent content validity, an acceptable inter-item reliability and a moderate to good strength of inter-rater agreement. In addition, total PAQ-A score showed a moderate positive correlation with VO2 peak. Both PAQs have an acceptable to good reliability and validity; however, further validity testing is recommended to provide a more complete assessment of both PAQs.
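The inter-item statistic reported above, Cronbach's alpha, can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `cronbach_alpha` and the response matrix are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents x 4 items on a 1-5 scale (illustrative only).
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 3, 4, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read as a measure of internal consistency rather than of agreement between raters.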
Coronary artery disease reporting and data system (CAD-RADS™): Inter-observer agreement for assessment categories and modifiers.

Science.gov (United States)

Maroules, Christopher D; Hamilton-Craig, Christian; Branch, Kelley; Lee, James; Cury, Roberto C; Maurovich-Horvat, Pál; Rubinshtein, Ronen; Thomas, Dustin; Williams, Michelle; Guo, Yanshu; Cury, Ricardo C

The Coronary Artery Disease Reporting and Data System (CAD-RADS) provides a lexicon and standardized reporting system for coronary CT angiography. To evaluate inter-observer agreement of the CAD-RADS among a panel of early career and expert readers. Four early career and four expert cardiac imaging readers prospectively and independently evaluated 50 coronary CT angiography cases using the CAD-RADS lexicon. All readers assessed image quality using a five-point Likert scale, with mean Likert score ≥4 designating high image quality, and CAD-RADS assessment categories and modifiers were assessed using intra-class correlation (ICC) and Fleiss' Kappa (κ). The impact of reader experience and image quality on inter-observer agreement was also examined. Inter-observer agreement for CAD-RADS assessment categories was excellent (ICC 0.958, 95% CI 0.938-0.974, p CAD-RADS assessment categories and modifiers is excellent, except for high-risk plaque (modifier V) which demonstrates fair agreement. These results suggest CAD-RADS is feasible for clinical implementation. Copyright © 2017. Published by Elsevier Inc.
Validating the InterVA model to estimate the burden of mortality from verbal autopsy data: a population-based cross-sectional study.

Directory of Open Access Journals (Sweden)

Sebsibe Tadesse

BACKGROUND: In countries with incomplete or no vital registration systems, verbal autopsy data are often reviewed by physicians in order to assign the probable cause of death. But in addition to being time and energy consuming, the method is liable to produce inconsistent results. The aim of this study is to validate the InterVA model for estimating the burden of mortality from verbal autopsy data by using physician review as a reference standard. METHODS AND FINDINGS: A population-based cross-sectional study was conducted from March to April, 2012. All adults aged ≥ 14 years who died between 01 January, 2010 and 15 February, 2012 were included in the study. The verbal autopsy interviews were reviewed by the InterVA model and physicians to estimate cause-specific mortality fractions. Cohen's kappa statistic, sensitivity, specificity, positive predictive value, and negative predictive value were applied to compare the agreement between the InterVA model and the physician review. A total of 408 adult deaths were studied. There was a general similarity and just slight differences between the InterVA model and the physicians in assigning cause-specific mortality. Both approaches showed an overall agreement in 298 (73%) cases [kappa = 0.49, 95% CI: 0.37-0.60]. The observed sensitivities and specificities across causes of death categories varied from 13.3% to 81.9% and 77.7% to 99.5%, respectively. CONCLUSIONS: In understanding the burden of disease and setting health intervention priorities in areas that lack reliable vital registration systems, an accurate analysis of verbal autopsies is essential. Therefore, users should be aware of the suboptimal performance of the InterVA model. Similar validation studies need to be undertaken considering the limitation of the physician review as gold standard since physicians may misinterpret some of the verbal autopsy data and finally reach a wrong conclusion of the cause of death.
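The comparison statistics named in this abstract can all be derived from a 2x2 table for one cause of death versus all others, with physician review as the reference. The sketch below is illustrative only: the function name `diagnostic_summary` and the cell counts are hypothetical, since the abstract does not report per-cause tables. Only the overall agreement figure (298 of 408) comes from the text.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Accuracy and agreement measures from a 2x2 table (model vs reference)."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of model and reference.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (p_observed - p_expected) / (1 - p_expected),
    }

# Hypothetical counts for a single cause of death (not from the study).
print(diagnostic_summary(tp=40, fp=10, fn=10, tn=40))

# Overall observed agreement reported in the abstract: 298 of 408 deaths.
print(round(100 * 298 / 408))  # percentage of cases with matching cause
```

Kappa is lower than raw agreement because it discounts the matches expected by chance given how often each method assigns the cause.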
Inter-observer agreement for abdominal CT in unselected patients with acute abdominal pain

International Nuclear Information System (INIS)

Randen, Adrienne van; Lameris, Wytze; Nio, C.Y.; Spijkerboer, Anje M.; Meier, Mark A.; Tutein Nolthenius, Charlotte; Smithuis, Frank; Stoker, Jaap; Bossuyt, Patrick M.; Boermeester, Marja A.

2009-01-01

The level of inter-observer agreement of abdominal computed tomography (CT) in unselected patients presenting with acute abdominal pain at the Emergency Department (ED) was evaluated. Two hundred consecutive patients with acute abdominal pain were prospectively included. Multi-slice CT was performed in all patients with intravenous contrast medium only. Three radiologists independently read all CT examinations. They recorded specific radiological features and a final diagnosis on a case record form. We calculated the proportion of agreement and kappa values, for overall, urgent and frequently occurring diagnoses. The mean age of the evaluated patients was 46 years (range 19-94), of whom 54% were women. Overall agreement on diagnoses was good, with a median kappa of 0.66. Kappa values for specific urgent diagnoses were excellent, with median kappa values of 0.84, 0.90 and 0.81, for appendicitis, diverticulitis and bowel obstruction, respectively. Abdominal CT has good inter-observer agreement in unselected patients with acute abdominal pain at the ED, with excellent agreement for specific urgent diagnoses such as diverticulitis and appendicitis. (orig.)
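With three readers, one common way to arrive at a single "median kappa" is to compute Cohen's kappa for each pair of readers and take the median. Whether the authors did exactly this is an assumption; the sketch below only illustrates that approach. The reader labels, diagnoses, and function names are hypothetical.

```python
from itertools import combinations
from statistics import median

def cohen_kappa(a, b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    n = len(a)
    categories = set(a) | set(b)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    p_expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

def median_pairwise_kappa(readings):
    """Median of Cohen's kappa over all reader pairs; readings maps reader -> ratings."""
    kappas = [
        cohen_kappa(readings[r1], readings[r2])
        for r1, r2 in combinations(readings, 2)
    ]
    return median(kappas)

# Hypothetical final diagnoses from three readers for six patients.
readings = {
    "reader_1": ["appendicitis", "normal", "diverticulitis", "normal", "obstruction", "normal"],
    "reader_2": ["appendicitis", "normal", "diverticulitis", "appendicitis", "obstruction", "normal"],
    "reader_3": ["appendicitis", "normal", "normal", "normal", "obstruction", "normal"],
}
print(round(median_pairwise_kappa(readings), 2))
```

For per-diagnosis values, the same functions can be applied after recoding each rating to "that diagnosis" versus "anything else".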
Test-retest reliability of myofascial trigger point detection in hip and thigh areas.

Science.gov (United States)

Rozenfeld, E; Finestone, A S; Moran, U; Damri, E; Kalichman, L

2017-10-01

Myofascial trigger points (MTrPs) are a primary source of pain in patients with musculoskeletal disorders. Nevertheless, they are frequently underdiagnosed. Reliable MTrP palpation is necessary for their diagnosis and treatment. The few studies that have looked for intra-tester reliability of MTrP detection in the upper body provide preliminary evidence that MTrP palpation is reliable. Reliability tests for MTrP palpation on the lower limb have not yet been performed. To evaluate inter- and intra-tester reliability of MTrP recognition in hip and thigh muscles. Reliability study. 21 patients (15 males and 6 females, mean age 21.1 years) referred to the physical therapy clinic, 10 with knee or hip pain and 11 with pain in an upper limb, low back, shin or ankle. Two experienced physical therapists performed the examinations, blinded to the subjects' identity, medical condition and results of the previous MTrP evaluation. Each subject was evaluated four times, twice by each examiner in a random order. Dichotomous findings included a palpable taut band, tenderness, referred pain, and relevance of referred pain to patient's complaint. Based on these, diagnosis of latent MTrPs or active MTrPs was established. The evaluation was performed on both legs and included a total of 16 locations in the following muscles: rectus femoris (proximal), vastus medialis (middle and distal), vastus lateralis (middle and distal) and gluteus medius (anterior, posterior and distal). Inter- and intra-tester reliability (Cohen's kappa (κ)) values for single sites ranged from -0.25 to 0.77. Median intra-tester reliability was 0.45 and 0.46 for latent and active MTrPs, and median inter-tester reliability was 0.51 and 0.64 for latent and active MTrPs, respectively. The examination of the distal vastus medialis was most reliable for latent and active MTrPs (intra-tester k = 0.27-0.77, inter-tester k = 0.77 and intra-tester k = 0.53-0.72, inter-tester k = 0.72, correspondingly
Inter- and intra-observer variability in the assessment of atelectasis and consolidation in neonatal chest radiographs

International Nuclear Information System (INIS)

Bloomfield, F.H.; Teele, R.L.; Voss, M.; Knight, D.B.; Harding, J.E.

1999-01-01

Background. Radiology is an essential part of neonatal intensive care. Interpretation of chest radiographs frequently contributes to respiratory management of neonates, but there has been little assessment of the consistency of this interpretation. Objective. To assess the inter- and intra-observer variability for the reporting of atelectasis and/or consolidation in neonatal chest radiographs. Materials and methods. A total of 585 chest radiographs from the 220 babies ventilated in our nursery over a 2-year period were coded by two radiologists for generalised, lobar and segmental atelectasis and/or consolidation. Two months later one of the radiologists re-coded a random sample of these films (n = 117, 20 %). Agreement was assessed by the kappa statistic and by proportions of agreement for normality and abnormality. Results. The reported incidence of focal atelectasis was low (5-6 %). Focal changes of any nature were found in 21-26 % of films. Inter-observer agreement was fair to moderate (kappa = 0.25-0.44). Intra-observer agreement was mostly moderate to good (kappa = 0.38-0.66). Conclusion. The poor inter-observer agreement for the diagnosis of pulmonary parenchymal abnormalities on chest radiographs of neonates receiving intensive care suggests that abnormalities should be described rather than diagnoses given or that a list of differential diagnoses be offered. When research involves radiographic interpretation, the potential lack of consistency in reporting abnormalities must be borne in mind. (orig.)
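When a finding is rare, as focal atelectasis is here (5-6 %), raw agreement can be high while kappa stays modest. Reporting agreement separately for abnormal and normal readings makes this visible. The sketch below uses the standard positive and negative specific-agreement formulas. That these correspond to the authors' "proportions of agreement for normality and abnormality" is an assumption, and the counts are hypothetical.

```python
def agreement_summary(both_abnormal, only_a, only_b, both_normal):
    """Overall, positive and negative agreement plus kappa for two observers."""
    n = both_abnormal + only_a + only_b + both_normal
    disagreements = only_a + only_b
    p_observed = (both_abnormal + both_normal) / n
    # Specific agreement: shared calls relative to all calls of that type.
    positive = 2 * both_abnormal / (2 * both_abnormal + disagreements)
    negative = 2 * both_normal / (2 * both_normal + disagreements)
    a_abnormal = both_abnormal + only_a
    b_abnormal = both_abnormal + only_b
    p_expected = (a_abnormal * b_abnormal + (n - a_abnormal) * (n - b_abnormal)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, positive, negative, kappa

# Hypothetical counts per 100 films with a rare abnormality (not from the study).
overall, positive, negative, kappa = agreement_summary(3, 4, 4, 89)
print(f"overall={overall:.2f} positive={positive:.2f} negative={negative:.2f} kappa={kappa:.2f}")
```

In this made-up table the observers agree on 92 of 100 films, yet agreement on abnormal films is well under half, which is the pattern that motivates describing abnormalities rather than naming diagnoses.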
Modified personal interviews: resurrecting reliable personal interviews for admissions?

Science.gov (United States)

Hanson, Mark D; Kulasegaram, Kulamakan Mahan; Woods, Nicole N; Fechtig, Lindsey; Anderson, Geoff

2012-10-01

Traditional admissions personal interviews provide flexible faculty-student interactions but are plagued by low inter-interview reliability. Axelson and Kreiter (2009) retrospectively showed that multiple independent sampling (MIS) may improve reliability of personal interviews; thus, the authors incorporated MIS into the admissions process for medical students applying to the University of Toronto's Leadership Education and Development Program (LEAD). They examined the reliability and resource demands of this modified personal interview (MPI) format. In 2010-2011, LEAD candidates submitted written applications, which were used to screen for participation in the MPI process. Selected candidates completed four brief (10-12 minutes) independent MPIs each with a different interviewer. The authors blueprinted MPI questions to (i.e., aligned them with) leadership attributes, and interviewers assessed candidates' eligibility on a five-point Likert-type scale. The authors analyzed inter-interview reliability using generalizability theory. Sixteen candidates submitted applications; 10 proceeded to the MPI stage. Reliability of the written application components was 0.75. The MPI process had overall inter-interview reliability of 0.79. Correlation between the written application and MPI scores was 0.49. A decision study showed acceptable reliability of 0.74 with only three MPIs scored using one global rating. Furthermore, a traditional admissions interview format would take 66% more time than the MPI format. The MPI format, used during the LEAD admissions process, achieved high reliability with minimal faculty resources. The MPI format's reliability and effective resource use were possible through MIS and employment of expert interviewers. MPIs may be useful for other admissions tasks.
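A decision study asks how reliability would change with a different number of interviews. In the simplest one-facet case this reduces to the Spearman-Brown projection, sketched below with the two figures from the abstract (0.79 for four MPIs, projected to three). This is an assumption-laden simplification: the authors' decision study also changed the scoring to one global rating, so the close match is illustrative and not a reproduction of their analysis. The function names are hypothetical.

```python
def single_unit_reliability(composite_reliability, n_units):
    """Back out the reliability of one interview from the reliability of n averaged."""
    r = composite_reliability
    return r / (n_units - (n_units - 1) * r)

def projected_reliability(single_reliability, n_units):
    """Spearman-Brown projection for the average of n parallel interviews."""
    r = single_reliability
    return n_units * r / (1 + (n_units - 1) * r)

one_mpi = single_unit_reliability(0.79, 4)   # reliability of a single MPI
three_mpis = projected_reliability(one_mpi, 3)
print(round(one_mpi, 2), round(three_mpis, 2))
```

The projection shows why dropping from four to three short interviews costs only a few hundredths of reliability, whereas a single interview would fall well below conventional thresholds.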

Inter- and intra-observer agreement of BI-RADS-based subjective visual estimation of amount of fibroglandular breast tissue with magnetic resonance imaging: comparison to automated quantitative assessment

International Nuclear Information System (INIS)

Wengert, G.J.; Helbich, T.H.; Woitek, R.; Kapetas, P.; Clauser, P.; Baltzer, P.A.; Vogl, W.D.; Weber, M.; Meyer-Baese, A.; Pinker, Katja

2016-01-01

To evaluate the inter-/intra-observer agreement of BI-RADS-based subjective visual estimation of the amount of fibroglandular tissue (FGT) with magnetic resonance imaging (MRI), and to investigate whether FGT assessment benefits from an automated, observer-independent, quantitative MRI measurement by comparing both approaches. Eighty women with no imaging abnormalities (BI-RADS 1 and 2) were included in this institutional review board (IRB)-approved prospective study. All women underwent un-enhanced breast MRI. Four radiologists independently assessed FGT with MRI by subjective visual estimation according to BI-RADS. Automated observer-independent quantitative measurement of FGT with MRI was performed using a previously described measurement system. Inter-/intra-observer agreements of qualitative and quantitative FGT measurements were assessed using Cohen's kappa (k). Inexperienced readers achieved moderate inter-/intra-observer agreement and experienced readers a substantial inter- and perfect intra-observer agreement for subjective visual estimation of FGT. Practice and experience reduced observer-dependency. Automated observer-independent quantitative measurement of FGT was successfully performed and revealed only fair to moderate agreement (k = 0.209-0.497) with subjective visual estimations of FGT. Subjective visual estimation of FGT with MRI shows moderate intra-/inter-observer agreement, which can be improved by practice and experience. Automated observer-independent quantitative measurements of FGT are necessary to allow a standardized risk evaluation. (orig.)
Performance of intraclass correlation coefficient (ICC) as a reliability index under various distributions in scale reliability studies.

Science.gov (United States)

Mehta, Shraddha; Bastero-Caballero, Rowena F; Sun, Yijun; Zhu, Ray; Murphy, Diane K; Hardas, Bhushan; Koch, Gary

2018-04-29

Many published scale validation studies determine inter-rater reliability using the intra-class correlation coefficient (ICC). However, the use of this statistic must consider its advantages, limitations, and applicability. This paper evaluates how interaction of subject distribution, sample size, and levels of rater disagreement affects ICC and provides an approach for obtaining relevant ICC estimates under suboptimal conditions. Simulation results suggest that for a fixed number of subjects, ICC from the convex distribution is smaller than ICC for the uniform distribution, which in turn is smaller than ICC for the concave distribution. The variance component estimates also show that the dissimilarity of ICC among distributions is attributed to the study design (ie, distribution of subjects) component of subject variability and not the scale quality component of rater error variability. The dependency of ICC on the distribution of subjects makes it difficult to compare results across reliability studies. Hence, it is proposed that reliability studies should be designed using a uniform distribution of subjects because of the standardization it provides for representing objective disagreement. In the absence of uniform distribution, a sampling method is proposed to reduce the non-uniformity. In addition, as expected, high levels of disagreement result in low ICC, and when the type of distribution is fixed, any increase in the number of subjects beyond a moderately large specification such as n = 80 does not have a major impact on ICC. Copyright © 2018 John Wiley & Sons, Ltd.
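The dependence of ICC on how subjects are spread along the scale can be demonstrated with a small simulation. The sketch below is not the authors' simulation design: it uses ICC(2,1) (two-way random effects, single rater), Gaussian rater error, and two uniform ranges of true scores chosen for illustration. The function names, the seed, and all parameter values are assumptions; only n = 80 echoes the abstract.

```python
import random

def icc2_1(ratings):
    """ICC(2,1) from an n-subjects x k-raters table (two-way random, single rater)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)

def simulate(low, high, n_subjects=80, n_raters=3, error_sd=1.0, seed=0):
    """ICC for subjects with true scores uniform on [low, high] and fixed rater error."""
    rng = random.Random(seed)
    ratings = []
    for _ in range(n_subjects):
        true_score = rng.uniform(low, high)
        ratings.append([true_score + rng.gauss(0, error_sd) for _ in range(n_raters)])
    return icc2_1(ratings)

wide = simulate(0, 10)    # subjects spread across the whole scale
narrow = simulate(4, 6)   # subjects bunched in the middle of the scale
print(f"wide spread ICC: {wide:.2f}, narrow spread ICC: {narrow:.2f}")
```

Rater error is identical in both runs, so any gap between the two ICC values comes from subject variability alone, which is the design effect the abstract warns about when comparing reliability studies.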
Reliability of physical examination tests for the diagnosis of knee disorders: Evidence from a systematic review.

Science.gov (United States)

Décary, Simon; Ouellet, Philippe; Vendittoli, Pascal-André; Desmeules, François

2016-12-01

Clinicians often rely on physical examination tests to guide them in the diagnostic process of knee disorders. However, reliability of these tests is often overlooked and may influence the consistency of results and overall diagnostic validity. Therefore, the objective of this study was to systematically review evidence on the reliability of physical examination tests for the diagnosis of knee disorders. A structured literature search was conducted in databases up to January 2016. Included studies needed to report reliability measures of at least one physical test for any knee disorder. Methodological quality was evaluated using the QAREL checklist. A qualitative synthesis of the evidence was performed. Thirty-three studies were included with a mean QAREL score of 5.5 ± 0.5. Based on low to moderate quality evidence, the Thessaly test for meniscal injuries reached moderate inter-rater reliability (k = 0.54). Based on moderate to excellent quality evidence, the Lachman for anterior cruciate ligament injuries reached moderate to excellent inter-rater reliability (k = 0.42 to 0.81). Based on low to moderate quality evidence, the Tibiofemoral Crepitus, Joint Line and Patellofemoral Pain/Tenderness, Bony Enlargement and Joint Pain on Movement tests for knee osteoarthritis reached fair to excellent inter-rater reliability (k = 0.29 to 0.93). Based on low to moderate quality evidence, the Lateral Glide, Lateral Tilt, Lateral Pull and Quality of Movement tests for patellofemoral pain reached moderate to good inter-rater reliability (k = 0.49 to 0.73). Many physical tests appear to reach good inter-rater reliability, but this is based on low-quality and conflicting evidence. High-quality research is required to evaluate the reliability of knee physical examination tests. Copyright © 2016 Elsevier Ltd. All rights reserved.
Pneumothorax size measurements on digital chest radiographs: Intra- and inter- rater reliability.

Science.gov (United States)

Thelle, Andreas; Gjerdevik, Miriam; Grydeland, Thomas; Skorge, Trude D; Wentzel-Larsen, Tore; Bakke, Per S

2015-10-01

Detailed and reliable methods may be important for discussions on the importance of pneumothorax size in clinical decision-making. Rhea's method is widely used to estimate pneumothorax size in percent based on chest X-rays (CXRs) from three measure points. Choi's addendum is used for anteroposterior projections. The aim of this study was to examine the intrarater and interrater reliability of the Rhea and Choi method using digital CXR in the ward-based PACS monitors. Three physicians examined a retrospective series of 80 digital CXRs showing pneumothorax, using Rhea and Choi's method, then repeated in a random order two weeks later. We used the analysis of variance technique by Eliasziw et al. to assess the intrarater and interrater reliability in altogether 480 estimations of pneumothorax size. Estimated pneumothorax sizes ranged between 5% and 100%. The intrarater reliability coefficient was 0.98 (95% one-sided lower-limit confidence interval 0.96), and the interrater reliability coefficient was 0.95 (95% one-sided lower-limit confidence interval 0.93). This study has shown that the Rhea and Choi method for calculating pneumothorax size has high intrarater and interrater reliability. These results are valid across gender, side of pneumothorax and whether the patient is diagnosed with primary or secondary pneumothorax. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
RELIABILITY OF ANKLE-FOOT MORPHOLOGY, MOBILITY, STRENGTH, AND MOTOR PERFORMANCE MEASURES.

Science.gov (United States)

Fraser, John J; Koldenhoven, Rachel M; Saliba, Susan A; Hertel, Jay

2017-12-01

Assessment of foot posture, morphology, intersegmental mobility, strength and motor control of the ankle-foot complex are commonly used clinically, but measurement properties of many assessments are unclear. To determine test-retest and inter-rater reliability, standard error of measurement, and minimal detectable change of morphology, joint excursion and play, strength, and motor control of the ankle-foot complex. Reliability study. 24 healthy, recreationally-active young adults without history of ankle-foot injury were assessed by two clinicians on two occasions, three to ten days apart. Measurement properties were assessed for foot morphology (foot posture index, total and truncated length, width, arch height), joint excursion (weight-bearing dorsiflexion, rearfoot and hallux goniometry, forefoot inclinometry, 1st metatarsal displacement) and joint play, strength (handheld dynamometry), and motor control rating during intrinsic foot muscle (IFM) exercises. Clinician order was randomized using a Latin Square. The clinicians performed independent examinations and did not confer on the findings for the duration of the study. Test-retest and inter-tester reliability and agreement were assessed using intraclass correlation coefficients (ICC2,k) and weighted kappa (Kw). Test-retest reliability ICC were as follows: morphology: .80-1.00, joint excursion: .58-.97, joint play: -.67-.84, strength: .67-.92, IFM motor rating: Kw -.01-.71. Inter-rater reliability ICC were as follows: morphology: .81-1.00, joint excursion: .32-.97, joint play: -1.06-1.00, strength: .53-.90, and IFM motor rating: Kw .02-.56. Measures of ankle-foot posture, morphology, joint excursion, and strength demonstrated fair to excellent test-retest and inter-rater reliability. Test-retest reliability for rating of perceived difficulty and motor performance was good to excellent for short-foot, toe-spread-out, and hallux exercises and poor to fair for lesser toe extension. Joint play measures had
The reliability of WorkWell Systems Functional Capacity Evaluation: a systematic review

Science.gov (United States)

2014-01-01

Background: Functional capacity evaluation (FCE) determines a person’s ability to perform work-related tasks and is a major component of the rehabilitation process. The WorkWell Systems (WWS) FCE (formerly known as Isernhagen Work Systems FCE) is currently the most commonly used FCE tool in German rehabilitation centres. Our systematic review investigated the inter-rater, intra-rater and test-retest reliability of the WWS FCE. Methods: We performed a systematic literature search of studies on the reliability of the WWS FCE and extracted item-specific measures of inter-rater, intra-rater and test-retest reliability from the identified studies. Intraclass correlation coefficients ≥ 0.75, percentages of agreement ≥ 80%, and kappa coefficients ≥ 0.60 were categorised as acceptable, otherwise they were considered non-acceptable. The extracted values were summarised for the five performance categories of the WWS FCE, and the results were classified as either consistent or inconsistent. Results: From 11 identified studies, 150 item-specific reliability measures were extracted. 89% of the extracted inter-rater reliability measures, all of the intra-rater reliability measures and 96% of the test-retest reliability measures of the weight handling and strength tests had an acceptable level of reliability, compared to only 67% of the test-retest reliability measures of the posture/mobility tests and 56% of the test-retest reliability measures of the locomotion tests. Both of the extracted test-retest reliability measures of the balance test were acceptable. Conclusions: Weight handling and strength tests were found to have consistently acceptable reliability. Further research is needed to explore the reliability of the other tests as inconsistent findings or a lack of data prevented definitive conclusions. PMID:24674029
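The acceptability rule stated in the Methods can be written down directly. The sketch below encodes the three cut-offs given in the abstract (ICC ≥ 0.75, agreement ≥ 80%, kappa ≥ 0.60). The names `THRESHOLDS`, `is_acceptable` and `share_acceptable`, and the example measures, are hypothetical and are not taken from the review.

```python
# Cut-offs as stated in the abstract; percent agreement is on a 0-100 scale.
THRESHOLDS = {"icc": 0.75, "percent_agreement": 80.0, "kappa": 0.60}

def is_acceptable(statistic, value):
    """True if a reliability measure meets the cut-off for its statistic type."""
    return value >= THRESHOLDS[statistic]

def share_acceptable(measures):
    """Fraction of (statistic, value) pairs that meet their cut-off."""
    flags = [is_acceptable(statistic, value) for statistic, value in measures]
    return sum(flags) / len(flags)

# Hypothetical extracted measures for one performance category.
extracted = [("icc", 0.81), ("icc", 0.70), ("kappa", 0.62), ("percent_agreement", 78.0)]
print(share_acceptable(extracted))
```

Summaries such as "89% of the inter-rater measures were acceptable" are shares of this kind, computed within each performance category.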
A reliability analysis of the revised competitiveness index.

Science.gov (United States)

Harris, Paul B; Houston, John M

2010-06-01

This study examined the reliability of the Revised Competitiveness Index by investigating the test-retest reliability, interitem reliability, and factor structure of the measure based on a sample of 280 undergraduates (200 women, 80 men) ranging in age from 18 to 28 years (M = 20.1, SD = 2.1). The findings indicate that the Revised Competitiveness Index has high test-retest reliability, high inter-item reliability, and a stable factor structure. The results support the assertion that the Revised Competitiveness Index assesses competitiveness as a stable trait rather than a dynamic state.
The reliability of three psoriasis assessment tools: Psoriasis area and severity index, body surface area and physician global assessment.

Science.gov (United States)

Bożek, Agnieszka; Reich, Adam

2017-08-01

A wide variety of psoriasis assessment tools have been proposed to evaluate the severity of psoriasis in clinical trials and daily practice. The most frequently used clinical instrument is the psoriasis area and severity index (PASI); however, none of the currently published severity scores used for psoriasis meets all the validation criteria required for an ideal score. The aim of this study was to compare and assess the reliability of 3 commonly used assessment instruments for psoriasis severity: the psoriasis area and severity index (PASI), body surface area (BSA) and physician global assessment (PGA). On the scoring day, 10 trained dermatologists evaluated 9 adult patients with plaque-type psoriasis using the PASI, BSA and PGA. All the subjects were assessed twice by each physician. Correlations between the assessments were analyzed using the Pearson correlation coefficient. Intra-class correlation coefficient (ICC) was calculated to analyze intra-rater reliability, and the coefficient of variation (CV) was used to assess inter-rater variability. Significant correlations were observed among the 3 scales in both assessments. In all 3 scales the ICCs were > 0.75, indicating high intra-rater reliability. The highest ICC was for the BSA (0.96) and the lowest one for the PGA (0.87). The CV for the PGA and PASI were 29.3 and 36.9, respectively, indicating moderate inter-rater variability. The CV for the BSA was 57.1, indicating high inter-rater variability. Comparing the PASI, PGA and BSA, it was shown that the PGA had the highest inter-rater reliability, whereas the BSA had the highest intra-rater reliability. The PASI showed intermediate values in terms of inter- and intra-rater reliability. None of the 3 assessment instruments showed a significant advantage over the other. A reliable assessment of psoriasis severity requires the use of several independent evaluations simultaneously.
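The coefficient of variation mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the common definition CV = 100 × SD / mean, applied to the scores several raters give the same patient; the abstract does not say how the CV was aggregated across patients, and the function name and sample scores are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Return the CV in percent: 100 * sample SD / mean of the scores."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    avg = mean(scores)
    if avg == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(scores) / avg


# Hypothetical scores from four raters for one patient.
cv_percent = coefficient_of_variation([10, 12, 8, 10])
print(round(cv_percent, 1))
```

A larger CV means the raters disagree more relative to the size of the score, which is why the abstract reads a higher CV as higher inter-rater variability.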
Only Moderate Intra- and Inter-observer Agreement between Radiologists and Surgeons when Grading Blunt Paediatric Hepatic Injury on CT Scan

NARCIS (Netherlands)

Nellensteijn, D. R.; ten Duis, H. J.; Oldenziel, J.; Polak, W. G.; Hulscher, J. B. F.

2009-01-01

Introduction: The American Pediatric Surgical Association developed guidelines for the management of haemodynamically stable children with hepatic or splenic injury, based on grade of injury on CT scan. This study investigated the intra- and inter-observer agreement of radiologists, paediatric
Inter-observer agreement for Crohn's disease sub-phenotypes using the Montreal Classification: How good are we? A multi-centre Australasian study.

Science.gov (United States)

Krishnaprasad, Krupa; Andrews, Jane M; Lawrance, Ian C; Florin, Timothy; Gearry, Richard B; Leong, Rupert W L; Mahy, Gillian; Bampton, Peter; Prosser, Ruth; Leach, Peta; Chitti, Laurie; Cock, Charles; Grafton, Rachel; Croft, Anthony R; Cooke, Sharon; Doecke, James D; Radford-Smith, Graham L

2012-04-01

Crohn's disease (CD) exhibits significant clinical heterogeneity. Classification systems attempt to describe this; however, their utility and reliability depends on inter-observer agreement (IOA). We therefore sought to evaluate IOA using the Montreal Classification (MC). De-identified clinical records of 35 CD patients from 6 Australian IBD centres were presented to 13 expert practitioners from 8 Australia and New Zealand Inflammatory Bowel Disease Consortium (ANZIBDC) centres. Practitioners classified the cases using MC and forwarded data for central blinded analysis. IOA on smoking and medications was also tested. Kappa statistics, with pre-specified outcomes of κ>0.8 excellent; 0.61-0.8 good; 0.41-0.6 moderate and ≤0.4 poor, were used. 97% of study cases had colonoscopy reports, however, only 31% had undergone a complete set of diagnostic investigations (colonoscopy, histology, SB imaging). At diagnosis, IOA was excellent for age, κ=0.84; good for disease location, κ=0.73; only moderate for upper GI disease (κ=0.57) and disease behaviour, κ=0.54; and good for the presence of perianal disease, κ=0.6. At last follow-up, IOA was good for location, κ=0.68; only moderate for upper GI disease (κ=0.43) and disease behaviour, κ=0.46; but excellent for the presence/absence of perianal disease, κ=0.88. IOA for immunosuppressant use ever and presence of stricture were both good (κ=0.79 and 0.64 respectively). IOA using MC is generally good; however some areas are less consistent than others. Omissions and inaccuracies reduce the value of clinical data when comparing cohorts across different centres, and may impair the ability to translate genetic discoveries into clinical practice. Crown Copyright © 2011. Published by Elsevier B.V. All rights reserved.
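The kappa bands quoted in the abstract above can be made concrete with a short Python sketch. It is an illustration only: it computes Cohen's kappa for two raters, whereas the study pooled 13 practitioners and the abstract does not state which kappa variant was used. The function names are hypothetical, and values falling between the printed bands (for example 0.605) are assigned to the higher band as an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map kappa to the bands listed in the abstract (gap handling is assumed)."""
    if kappa > 0.8:
        return "excellent"
    if kappa > 0.6:
        return "good"
    if kappa > 0.4:
        return "moderate"
    return "poor"


# Hypothetical classifications of four cases by two raters.
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
print(kappa, agreement_band(kappa))
```

Kappa corrects raw percent agreement for the agreement two raters would reach by chance, so it can be much lower than percent agreement when one category dominates.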
Assessment of dental anomalies on panoramic radiographs: inter- and intraexaminer agreement

NARCIS (Netherlands)

van Parys, K.; Aartman, I.H.A.; Kuitert, R.; Zentner, A.

2011-01-01

The presence of dental anomalies has been rated radiographically in a number of studies. However, since the reliability of the assessment of these anomalies has rarely been investigated, the aim of this study was to examine inter- and intraexaminer agreement in identifying morphological dental
Study for increasing micro-drill reliability by vibrating drilling

International Nuclear Information System (INIS)

Yang Zhaojun; Li Wei; Chen Yanhong; Wang Lijiang

1998-01-01

A study for increasing micro-drill reliability by vibrating drilling is described. Under the experimental conditions of this study it is observed, from reliability testing and the fitting of a life-distribution function, that the lives of micro-drills under ordinary drilling follow the log-normal distribution and the lives of micro-drills under vibrating drilling follow the Weibull distribution. Calculations for reliability analysis show that vibrating drilling can increase the lives of micro-drills and correspondingly reduce the scatter of drill lives. Therefore, vibrating drilling increases the reliability of micro-drills
The reliability of a modified Kalamazoo Consensus Statement Checklist for assessing the communication skills of multidisciplinary clinicians in the simulated environment.

Science.gov (United States)

Peterson, Eleanor B; Calhoun, Aaron W; Rider, Elizabeth A

2014-09-01

With increased recognition of the importance of sound communication skills and communication skills education, reliable assessment tools are essential. This study reports on the psychometric properties of an assessment tool based on the Kalamazoo Consensus Statement Essential Elements Communication Checklist. The Gap-Kalamazoo Communication Skills Assessment Form (GKCSAF), a modified version of an existing communication skills assessment tool, the Kalamazoo Essential Elements Communication Checklist-Adapted, was used to assess learners in a multidisciplinary, simulation-based communication skills educational program using multiple raters. 118 simulated conversations were available for analysis. Internal consistency and inter-rater reliability were determined by calculating a Cronbach's alpha score and intra-class correlation coefficients (ICC), respectively. The GKCSAF demonstrated high internal consistency with a Cronbach's alpha score of 0.844 (faculty raters) and 0.880 (peer observer raters), and high inter-rater reliability with an ICC of 0.830 (faculty raters) and 0.89 (peer observer raters). The Gap-Kalamazoo Communication Skills Assessment Form is a reliable method of assessing the communication skills of multidisciplinary learners using multi-rater methods within the learning environment. The Gap-Kalamazoo Communication Skills Assessment Form can be used by educational programs that wish to implement a reliable assessment and feedback system for a variety of learners. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Quantitative outcome measures for systemic sclerosis-related Microangiopathy - Reliability of image acquisition in Nailfold Capillaroscopy.

Science.gov (United States)

Dinsdale, Graham; Moore, Tonia; O'Leary, Neil; Berks, Michael; Roberts, Christopher; Manning, Joanne; Allen, John; Anderson, Marina; Cutolo, Maurizio; Hesselstrand, Roger; Howell, Kevin; Pizzorni, Carmen; Smith, Vanessa; Sulli, Alberto; Wildt, Marie; Taylor, Christopher; Murray, Andrea; Herrick, Ariane L

2017-09-01

Nailfold capillaroscopic parameters hold increasing promise as outcome measures for clinical trials in systemic sclerosis (SSc). Their inclusion as outcomes would often naturally require capillaroscopy images to be captured at several time points during any one study. Our objective was to assess repeatability of image acquisition (which has been little studied), as well as of measurement. 41 patients (26 with SSc, 15 with primary Raynaud's phenomenon) and 10 healthy controls returned for repeat high-magnification (300×) videocapillaroscopy mosaic imaging of 10 digits one week after initial imaging (as part of a larger study of reliability). Images were assessed in a random order by an expert blinded observer and 4 outcome measures extracted: (1) overall image grade and then (where possible) distal vessel locations were marked, allowing (2) vessel density (across the whole nailfold) to be calculated (3) apex width measurement and (4) giant vessel count. Intra-rater, intra-visit and intra-rater inter-visit (baseline vs. 1 week) reliability were examined in 475 and 392 images respectively. A linear, mixed-effects model was used to estimate variance components, from which intra-class correlation coefficients (ICCs) were determined. Intra-visit and inter-visit reliability estimates (ICCs) were (respectively): overall image grade, 0.97 and 0.90; vessel density, 0.92 and 0.65; mean vessel width, 0.91 and 0.79; presence of giant capillary, 0.68 and 0.56. These estimates were conditional on each parameter being measurable. Within-operator image analysis and acquisition are reproducible. Quantitative nailfold capillaroscopy, at least with a single observer, provides reliable outcome measures for clinical studies including randomised controlled trials. Copyright © 2017 Elsevier Inc. All rights reserved.
Reliability of capturing foot parameters using digital scanning and the neutral suspension casting technique

Science.gov (United States)

2011-01-01

Background A clinical study was conducted to determine the intra- and inter-rater reliability of digital scanning and the neutral suspension casting technique to measure six foot parameters. The neutral suspension casting technique is a commonly utilised method for obtaining a negative impression of the foot prior to orthotic fabrication. Digital scanning offers an alternative to the traditional plaster of Paris techniques. Methods Twenty-one healthy participants volunteered to take part in the study. Six casts and six digital scans were obtained from each participant by two raters of differing clinical experience. The foot parameters chosen for investigation were cast length (mm), forefoot width (mm), rearfoot width (mm), medial arch height (mm), lateral arch height (mm) and forefoot to rearfoot alignment (degrees). Intraclass correlation coefficients (ICC) with 95% confidence intervals (CI) were calculated to determine the intra- and inter-rater reliability. Measurement error was assessed through the calculation of the standard error of the measurement (SEM) and smallest real difference (SRD). Results ICC values for all foot parameters using digital scanning ranged between 0.81-0.99 for both intra- and inter-rater reliability. For neutral suspension casting technique inter-rater reliability values ranged from 0.57-0.99 and intra-rater reliability values ranged from 0.36-0.99 for rater 1 and 0.49-0.99 for rater 2. Conclusions The findings of this study indicate that digital scanning is a reliable technique, irrespective of clinical experience, with reduced measurement variability in all foot parameters investigated when compared to neutral suspension casting. PMID:21375757
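The measurement-error statistics named in the abstract above are often derived from the ICC. The Python sketch below uses one widely used formulation, SEM = SD × sqrt(1 - ICC) and SRD = 1.96 × sqrt(2) × SEM; the abstract does not say which formulas the authors applied, so this is an assumption, and the function names and input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this formulation expects an ICC between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_real_difference(sem, z=1.96):
    """Smallest change that exceeds measurement error at roughly 95% confidence."""
    return z * math.sqrt(2.0) * sem


# Hypothetical arch-height data: SD of 2.0 mm and an ICC of 0.75.
sem = standard_error_of_measurement(2.0, 0.75)
srd = smallest_real_difference(sem)
print(sem, round(srd, 2))
```

Because SEM and SRD are expressed in the unit of the measurement (mm or degrees here), they indicate how large a change must be before it can be distinguished from rater or instrument noise.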
Inter-observer agreement for diagnostic classification of esophageal motility disorders defined in high-resolution manometry

NARCIS (Netherlands)

Fox, M. R.; Pandolfino, J. E.; Sweis, R.; Sauter, M.; Abreu Y Abreu, A. T.; Anggiansah, A.; Bogte, A.; Bredenoord, A. J.; Dengler, W.; Elvevi, A.; Fruehauf, H.; Gellersen, S.; Ghosh, S.; Gyawali, C. P.; Heinrich, H.; Hemmink, M.; Jafari, J.; Kaufman, E.; Kessing, K.; Kwiatek, M.; Lubomyr, B.; Banasiuk, M.; Mion, F.; Pérez-de-la-Serna, J.; Remes-Troche, J. M.; Rohof, W.; Roman, S.; Ruiz-de-León, A.; Tutuian, R.; Uscinowicz, M.; Valdovinos, M. A.; Vardar, R.; Velosa, M.; Waśko-Czopnik, D.; Weijenborg, P.; Wilshire, C.; Wright, J.; Zerbib, F.; Menne, D.

2015-01-01

High-resolution esophageal manometry (HRM) is a recent development used in the evaluation of esophageal function. Our aim was to assess the inter-observer agreement for diagnosis of esophageal motility disorders using this technology. Practitioners registered on the HRM Working Group website were
Reliability and validity of a nutrition and physical activity environmental self-assessment for child care

Directory of Open Access Journals (Sweden)

Ammerman Alice S

2007-07-01

Abstract Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. We therefore recommend the Nutrition and Physical Activity Self-Assessment for
Clinical assessment of scapular positioning in musicians: an intertester reliability study.

Science.gov (United States)

Struyf, Filip; Nijs, Jo; De Coninck, Kris; Giunta, Marco; Mottram, Sarah; Meeusen, Romain

2009-01-01

The reliability of the measurement of the distance between the posterior border of the acromion and the wall and the reliability of the modified lateral scapular slide test have not been studied. Overall, the reliability of the clinical tools used to assess scapular positioning has not been studied in musicians. To examine the intertester reliability of scapular observation and 2 clinical tests for the assessment of scapular positioning in musicians. Intertester reliability study. University research laboratory. Thirty healthy student musicians at a single university. Two assessors performed a standardized observation protocol, the measurement of the distance between the posterior border of the acromion and the wall, and the modified lateral scapular slide test. Each assessor was blinded to the other's findings. The intertester reliability coefficients (kappa) for the observation in relaxed position, during unloaded movement, and during loaded movement were 0.41, 0.63, and 0.36, respectively. The kappa values for the observation of tilting and winging at rest were 0.48 and 0.42, respectively; during unloaded movement, the kappa values were 0.52 and 0.78, respectively; and with a 1-kg load, the kappa values were 0.24 and 0.50, respectively. The intraclass correlation coefficient (ICC) of the measurement of the acromial distance was 0.72 in relaxed position and 0.75 with the participant actively retracting both shoulders. The ICCs for the modified lateral scapular slide test varied between 0.63 and 0.58. Our results demonstrated that the modified lateral scapular slide test was not a reliable tool to assess scapular positioning in these participants. Our data indicated that scapular observation in the relaxed position and during unloaded abduction in the frontal plane was a reliable assessment tool. The reliability of the measurement of the distance between the posterior border of the acromion and the wall in healthy musicians was moderate.
Content Validity Index and Intra- and Inter-Rater Reliability of a New Muscle Strength/Endurance Test Battery for Swedish Soldiers.

Directory of Open Access Journals (Sweden)

Helena Larsson

The objective of this study was to examine the content validity of commonly used muscle performance tests in military personnel and to investigate the reliability of a proposed test battery. For the content validity investigation, thirty selected tests were those described in the literature and/or commonly used in the Nordic and North Atlantic Treaty Organization (NATO) countries. Nine selected experts rated, on a four-point Likert scale, the relevance of these tests in relation to five different work tasks: lifting, carrying equipment on the body or in the hands, climbing, and digging. Thereafter, a content validity index (CVI) was calculated for each work task. The result showed excellent CVI (≥0.78) for sixteen tests, each for one or more of the military work tasks. Three of the tests, the functional lower-limb loading test (the Ranger test), dead-lift with kettlebells, and back extension, showed excellent content validity for four of the work tasks. For the development of a new muscle strength/endurance test battery, these three tests were further supplemented with two other tests, namely, the chins and side-bridge test. The inter-rater reliability was high (intraclass correlation coefficient, ICC2,1 0.99) for all five tests. The intra-rater reliability was good to high (ICC3,1 0.82-0.96) with an acceptable standard error of mean (SEM), except for the side-bridge test (SEM% > 15). Thus, the final suggested test battery for a valid and reliable evaluation of soldiers' muscle performance comprised the following four tests: the Ranger test, dead-lift with kettlebells, chins, and back extension test. The criterion-related validity of the test battery should be further evaluated for soldiers exposed to varying physical workload.
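A content validity index of the kind described above is commonly computed as the proportion of experts who rate an item as relevant. The Python sketch below assumes the usual item-level convention that ratings of 3 or 4 on the four-point scale count as relevant; the abstract does not spell out the rule, so the cut-off, the function name and the ratings are hypothetical. The 0.78 threshold is the one quoted in the abstract.

```python
def content_validity_index(ratings, relevant_from=3):
    """Proportion of expert ratings at or above the relevance cut-off."""
    if not ratings:
        raise ValueError("need at least one expert rating")
    return sum(r >= relevant_from for r in ratings) / len(ratings)


# Hypothetical ratings from nine experts for one test and one work task.
cvi = content_validity_index([4, 4, 3, 3, 4, 3, 4, 3, 1])
print(round(cvi, 2), "excellent" if cvi >= 0.78 else "below threshold")
```

With nine experts, eight must rate the test as relevant to pass 0.78, since seven of nine gives about 0.778.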

Intra-observer reliability and agreement of manual and digital orthodontic model analysis.

Science.gov (United States)

Koretsi, Vasiliki; Tingelhoff, Linda; Proff, Peter; Kirschneck, Christian

2018-01-23

Digital orthodontic model analysis is gaining acceptance in orthodontics, but its reliability is dependent on the digitalisation hardware and software used. We thus investigated intra-observer reliability and agreement / conformity of a particular digital model analysis work-flow in relation to traditional manual plaster model analysis. Forty-eight plaster casts of the upper/lower dentition were collected. Virtual models were obtained with orthoX®scan (Dentaurum) and analysed with ivoris®analyze3D (Computer konkret). Manual model analyses were done with a dial caliper (0.1 mm). Common parameters were measured on each plaster cast and its virtual counterpart five times each by an experienced observer. We assessed intra-observer reliability within method (ICC), agreement/conformity between methods (Bland-Altman analyses and Lin's concordance correlation), and changing bias (regression analyses). Intra-observer reliability was substantial within each method (ICC ≥ 0.7), except for five manual outcomes (12.8 per cent). Bias between methods was statistically significant, but less than 0.5 mm for 87.2 per cent of the outcomes. In general, larger tooth sizes were measured digitally. Total difference maxilla and mandible had wide limits of agreement (-3.25/6.15 and -2.31/4.57 mm), but bias between methods was mostly smaller than intra-observer variation within each method with substantial conformity of manual and digital measurements in general. No changing bias was detected. Although both work-flows were reliable, the investigated digital work-flow proved to be more reliable and yielded on average larger tooth sizes. Averaged differences between methods were within 0.5 mm for directly measured outcomes but wide ranges are expected for some computed space parameters due to cumulative error. © The Author 2017. Published by Oxford University Press on behalf of the European Orthodontic Society. All rights reserved. For permissions, please email: journals.permissions@oup.com
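The Bland-Altman analysis mentioned in the abstract above summarises agreement between two methods by the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. The Python sketch below shows that calculation for paired measurements; the function name and the sample values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements,
    with differences taken as method_b minus method_a."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [b - a for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical tooth widths in mm: manual caliper vs. digital model.
manual = [10.0, 12.0, 11.0, 13.0]
digital = [10.5, 12.3, 11.4, 13.2]
bias, lower, upper = bland_altman_limits(manual, digital)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

A positive bias in this orientation means the digital method reads larger on average, which matches the direction of the finding reported in the abstract.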
Inter-rater Reliability of the Dysphagia Outcome and Severity Scale (DOSS): Effects of Clinical Experience, Audio-Recording and Training.

Science.gov (United States)

Zarkada, Angeliki; Regan, Julie

2017-10-19

The Dysphagia Outcome and Severity Scale (DOSS) is widely used to measure dysphagia severity based on videofluoroscopy (VFSS). This study investigated inter-rater reliability (IRR) of the DOSS. It also determined the effect of clinical experience, VFSS audio-recording and training on DOSS IRR. A quantitative prospective research design was used. Seventeen speech and language pathologists (SLPs) were recruited from an acute teaching hospital, Dublin (> 3 years' VFSS experience, n = 10) and from a postgraduate dysphagia programme in a university setting (training session on DOSS rating after which DOSS IRR was re-tested. Cohen's kappa co-efficient was used to establish IRR. IRR of the DOSS presented only fair agreement (κ = 0.36, p training (κ = 0.328) was significantly better comparing to post-training (κ = 0.218) (p < 0.05). Findings raise concerns as the DOSS is frequently used in clinical practice to capture dysphagia severity and to monitor changes.
A longitudinal study of childhood social behaviour : Inter-informant agreement, inter-context agreement, and social preference linkages

NARCIS (Netherlands)

Kuppens, Sofie; Grietens, Hans; Onghena, Patrick; Michiels, Daisy

2009-01-01

This study examined inter-informant agreement, inter-context agreement, and social preference linkages for social behaviour subtypes. On two occasions, data was collected on 600 children (8-10 years old) via mother, father, teacher, and peer reports. Informant reports converged within each context
Link and route availability for Inter-working multi-hop wireless networks

CSIR Research Space (South Africa)

Salami, O

2009-09-01

pairs in inter-working multi-hop wireless networks can be evaluated based on the availability and reliability of radio links that form the communication path linking the nodes. This paper presents an analytical study of the link and route availability...
Scoring sacroiliac joints by magnetic resonance imaging. A Multiple-reader reliability experiment

DEFF Research Database (Denmark)

Landewé, RB; Hermann, KG; van der Heijde, DM

2005-01-01

Magnetic resonance imaging (MRI) of the sacroiliac (SI) joints and the spine is increasingly important in the assessment of inflammatory activity and structural damage in clinical trials with patients with ankylosing spondylitis (AS). We investigated inter-reader reliability and sensitivity...... for 'depth' and 'intensity,' and the fifth method included the SPARCC slice with the maximum score. Inter-reader reliability was investigated by calculating intraclass correlation coefficients (ICC) for all readers together and for all possible reader pairs. Sensitivity to change was investigated...... values close to zero (no agreement) and highest observed values over 0.80 (excellent agreement). In general, agreement of status scores was somewhat better than agreement of change scores, and agreement of the comprehensive SPARCC scoring system was somewhat better than agreement of the more condensed...
Intra- and inter-observer variability and accuracy in the determination of linear and angular measurements in computed tomography

International Nuclear Information System (INIS)

Christiansen, E.L.; Thompson, J.R.; Kopp, S.

1986-01-01

The observer variability and accuracy of linear and angular computed tomography (CT) software measurements in the transaxial plane were investigated for the temporomandibular joint with the General Electric 8800 CT/N Scanner. A dried and measured human mandible was embedded in plastic and scanned in vitro. Sixteen observers participated in the study. The following measurements were tested: inter- and extra-condylar distances, transverse condylar dimension, condylar angulation, and the plastic base of the specimen. Three frozen cadaveric heads were similarly scanned and measured in situ. Intra- and inter-observer variabilities were lowest for the specimen base and highest for condylar angulation. Neuroradiologists had the lowest variability as a group, and the radiology residents and paramedical personnel had the highest, but the differences were small. No significant difference was found between CT and macroscopic measurement of the mandible. In situ measurement by CT of condyles with structural changes in the transaxial plane was, however, subject to substantial error. It was concluded that transaxial linear measurements of the condylar processes free of significant structural changes had an error and an accuracy well within acceptable limits. The error for angular measurements was significantly greater than the error for linear measurements
How reliable are case formulations? A systematic literature review.

Science.gov (United States)

Flinn, Lucinda; Braham, Louise; das Nair, Roshan

2015-09-01

This systematic literature review investigated the inter-rater and test-retest reliability of case formulations. We considered the reliability of case formulations across a range of theoretical modalities and the general quality of the primary research studies. A systematic search of five electronic databases was conducted in addition to reference list trawling to find studies that assessed the reliability of case formulation. This yielded 18 studies for review. A methodological quality assessment tool was developed to assess the quality of studies, which informed interpretation of the findings. Results indicated inter-rater reliability mainly ranging from slight (.1-.4) to substantial (.81-1.0). Some studies highlighted that training and increased experience led to higher levels of agreement. In general, psychodynamic formulations appeared to generate somewhat increased levels of reliability than cognitive or behavioural formulations; however, these studies also included methods that may have served to inflate reliability, for example, pooling the scores of judges. Only one study investigated the test-retest reliability of case formulations yielding support for the stability of formulations over a 3-month period. Reliability of case formulations is varied across a range of theoretical modalities, but can be improved; however, further research is required to strengthen our conclusions. Clinical implications: The findings from the review evidence some support for case formulation being congruent with the scientist-practitioner approach. The reliability of case formulation is likely to be improved through training and clinical experience. Limitations: The broad inclusion criteria may have introduced heterogeneity into the sample, which may have affected the results. Studies reviewed were limited to peer-reviewed journal articles written in the English language, which may represent a source of publication and selection bias. © 2014 The British Psychological Society.
Improving the Reliability of Case-Based Reasoning Systems

Directory of Open Access Journals (Sweden)

Xu Xu

2010-09-01

also discussed in this paper, especially whether redundancy exists among the features of a case. After that, the reliability of an individual suggested solution is studied. To illustrate these ideas, some experiments and their results are discussed. The experimental results point to a new route for improving the reliability of a CBR system at an overall level.
Growing Region Segmentation Software (GRES) for quantitative magnetic resonance imaging of multiple sclerosis: intra- and inter-observer agreement variability: a comparison with manual contouring method

International Nuclear Information System (INIS)

Parodi, Roberto C.; Sardanelli, Francesco; Renzetti, Paolo; Rosso, Elisabetta; Losacco, Caterina; Ferrari, Alessandra; Levrero, Fabrizio; Pilot, Alberto; Inglese, Matilde; Mancardi, Giovanni L.

2002-01-01

Lesion area measurement in multiple sclerosis (MS) is one of the key points in evaluating the natural history and in monitoring the efficacy of treatments. This study was performed to check the intra- and inter-observer agreement variability of a locally developed Growing Region Segmentation Software (GRES), comparing them to those obtained using manual contouring (MC). From routine 1.5-T MRI study of clinically definite multiple sclerosis patients, 36 lesions seen on proton-density-weighted images (PDWI) and 36 enhancing lesions on Gd-DTPA-BMA-enhanced T1-weighted images (Gd-T1WI) were randomly chosen and were evaluated by three observers. The mean range of lesion size was 9.9-536.0 mm² on PDWI and 3.6-57.2 mm² on Gd-T1WI. The median intra- and inter-observer agreement were, respectively, 97.1 and 90.0% using GRES on PDWI, 81.0 and 70.0% using MC on PDWI, 88.8 and 80.0% using GRES on Gd-T1WI, and 85.8 and 70.0% using MC on Gd-T1WI. The intra- and inter-observer agreements were significantly greater for GRES compared with MC (P<0.0001 and P=0.0023, respectively) for PDWI, while no difference was found between GRES and MC for Gd-T1WI. The intra-observer variability for GRES was significantly lower on both PDWI (P=0.0001) and Gd-T1WI (P=0.0067), whereas for MC the same result was found only for PDWI (P=0.0147). These data indicate that GRES reduces both the intra- and the inter-observer variability in assessing the area of MS lesions on PDWI and may prove useful in multicentre studies. (orig.)
Reliability of Multi-Category Rating Scales

Science.gov (United States)

Parker, Richard I.; Vannest, Kimberly J.; Davis, John L.

2013-01-01

The use of multi-category scales is increasing for the monitoring of IEP goals, classroom and school rules, and Behavior Improvement Plans (BIPs). Although they require greater inference than traditional data counting, little is known about the inter-rater reliability of these scales. This simulation study examined the performance of nine…
Reliability of computed tomography measurements in assessment of thigh muscle cross-sectional area and attenuation

International Nuclear Information System (INIS)

Strandberg, Sören; Wretling, Marie-Louise; Wredmark, Torsten; Shalabi, Adel

2010-01-01

Advances in computed tomography (CT) technology and the introduction of new medical imaging software enable easy and rapid assessment of muscle cross-sectional area (CSA) and attenuation. Before using these techniques in clinical studies there is a need for evaluation of the reliability of the measurements. The purpose of the study was to evaluate the inter- and intra-observer reliability of ImageJ in measuring thigh muscle CSA and attenuation in patients with anterior cruciate ligament (ACL) injury by computed tomography. 31 patients from an ongoing study of rehabilitation and muscle atrophy after ACL reconstruction were included in the study. Axial CT images with slice thickness of 10 mm at the level of 150 mm above the knee joint were analyzed by two investigators independently at two times with a minimum of 3 weeks between the two readings using NIH ImageJ. CSA and the mean attenuation of individual thigh muscles were analyzed for both legs. Mean CSA and mean attenuation values were in good agreement both when comparing the two observers and the two replicates. The inter- and intraclass correlation (ICC) was generally very high with values from 0.98 to 1.00 for all comparisons except for the area of semimembranosus. All the ICC values were significant (p < 0.001). Pearson correlation coefficients were also generally very high with values from 0.98 to 1.00 for all comparisons except for the area of semimembranosus (0.95 for intraobserver and 0.92 for interobserver). This study has presented ImageJ as a method to monitor and evaluate CSA and attenuation of different muscles in the thigh using CT-imaging. The method shows an overall excellent reliability with respect to both observer and replicate.
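The abstract above reports ICCs without stating which form was used. As a minimal sketch, assuming the two-way random-effects, absolute-agreement, single-rater form often written ICC(2,1), the coefficient can be computed from a subjects-by-raters table of scores as follows. The function name `icc_2_1` and the example ratings are hypothetical and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per subject, one column per rater.

    Assumes a complete table (every rater scored every subject).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subjects mean square
    msc = ss_cols / (k - 1)              # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings only: 6 subjects scored by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
```

Because this form penalises systematic differences between raters, a table in which one rater scores consistently higher than another yields a lower value than a consistency-type ICC would.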
Inter-Rater Reliability and Agreement of the 6-Minute Walk Test in Women With Hip Fracture

DEFF Research Database (Denmark)

Larsen, Camilla Marie; Overgaard, Jan; Tange Kristensen, Morten

MWT in individuals with hip fractures. Methods: Two senior physiotherapy students independently examined (randomized order) a convenient sample of 20 participants; their assessments were separated by two days, and testing followed instructions from the American Thoracic Society(1). Hip pain...... was assessed with the Verbal Ranking Scale. Results: Participants (all women) with a mean (SD) age of 78.1 ± 5.9 years performed the test within a mean of 31.5 ± 5.8 days post-surgery; 10 had a cervical and 10 a trochanteric fracture. Excellent inter-rater reliability; ICC2.1 =0.92 (95% CI, 0.81 - 0...... = -0.196, P = 0.41). On the contrary, participants walked a mean of 21.7 ± 22.6 meters longer, at the second trial (P = 0.002). Participants with moderate hip fracture- related pain walked a shorter distance than those with no or light pain during the first test (P = 0.04), while this was not the case...
Reliability of cone beam computed tomography as a biopsy-independent tool in differential diagnosis of periapical cysts and granulomas: An In vivo Study.

Science.gov (United States)

Chanani, Ankit; Adhikari, Haridas Das

2017-01-01

Differential diagnosis of periapical cysts and granulomas is required as their treatment modalities are different. The aim of this study was to evaluate the efficacy of cone beam computed tomography (CBCT) in the differential diagnosis of periapical cysts from granulomas. A single-centered observational study was carried out in the Department of Conservative Dentistry and Endodontics, Dr. R. Ahmed Dental College and Hospital, using CBCT and dental operating microscope. Forty-five lesions were analyzed using CBCT scans. One evaluator analyzed each CBCT scan for the presence of the following six characteristic radiological features: cyst-like location, shape, periphery, internal structure, effect on the surrounding structures, and cortical plate perforation. Another independent evaluator analyzed the CBCT scans. This process was repeated after 6 months, and inter- and intrarater reliability of CBCT diagnoses was evaluated. Periapical surgeries were performed and tissue samples were obtained for histopathological analysis. To evaluate the efficacy, CBCT diagnoses were compared with histopathological diagnoses, and six receiver operating characteristic (ROC) curve analyses were conducted. ROC curve, Cronbach's alpha (α) test, and Cohen Kappa (κ) test were used for statistical analysis. Both inter- and intrarater reliability were excellent (α = 0.94, κ = 0.75 and 0.77, respectively). ROC curve with regard to ≥4 positive findings revealed the highest area under curve (0.66). CBCT is moderately accurate in the differential diagnosis of periapical cysts and granulomas.
Reliability and validity of the de Morton Mobility Index in individuals with sub-acute stroke.

Science.gov (United States)

Braun, Tobias; Marks, Detlef; Thiel, Christian; Grüneberg, Christian

2018-02-04

To establish the validity and reliability of the de Morton Mobility Index (DEMMI) in patients with sub-acute stroke. This cross-sectional study was performed in a neurological rehabilitation hospital. We assessed unidimensionality, construct validity, internal consistency reliability, inter-rater reliability, minimal detectable change and possible floor and ceiling effects of the DEMMI in adult patients with sub-acute stroke. The study included a total sample of 121 patients with sub-acute stroke. We analysed validity (n = 109) and reliability (n = 51) in two sub-samples. Rasch analysis indicated unidimensionality with an overall fit to the model (chi-square = 12.37, p = 0.577). All hypotheses on construct validity were confirmed. Internal consistency reliability (Cronbach's alpha = 0.94) and inter-rater reliability (intraclass correlation coefficient = 0.95; 95% confidence interval: 0.92-0.97) were excellent. The minimal detectable change with 90% confidence was 13 points. No floor or ceiling effects were evident. These results indicate unidimensionality, sufficient internal consistency reliability, inter-rater reliability, and construct validity of the DEMMI in patients with a sub-acute stroke. Advantages of the DEMMI in clinical application are the short administration time, no need for special equipment and interval level data. The de Morton Mobility Index, therefore, may be a useful performance-based bedside test to measure mobility in individuals with a sub-acute stroke across the whole mobility spectrum. Implications for Rehabilitation The de Morton Mobility Index (DEMMI) is an unidimensional measurement instrument of mobility in individuals with sub-acute stroke. The DEMMI has excellent internal consistency and inter-rater reliability, and sufficient construct validity. The minimal detectable change of the DEMMI with 90% confidence in stroke rehabilitation is 13 points. The lack of any floor or ceiling effects on hospital admission indicates
The modified gait abnormality rating scale in patients with a conversion disorder: a reliability and responsiveness study.

Science.gov (United States)

Vandenberg, Justin M; George, Deanna R; O'Leary, Andrea J; Olson, Lindsay C; Strassburg, Kaitlyn R; Hollman, John H

2015-01-01

Individuals with conversion disorder have neurologic symptoms that are not identified by an underlying organic cause. Often the symptoms manifest as gait disturbances. The modified gait abnormality rating scale (GARS-M) may be useful for quantifying gait abnormalities in these individuals. The purpose of this study was to examine the reliability, responsiveness and concurrent validity of GARS-M scores in individuals with conversion disorder. Data from 27 individuals who completed a rehabilitation program were included in this study. Pre- and post-intervention videos were obtained and walking speed was measured. Five examiners independently evaluated gait performance according to the GARS-M criteria. Inter- and intrarater reliability of GARS-M scores were estimated with intraclass correlation coefficients (ICCs). Responsiveness was estimated with the minimum detectable change (MDC). Pre- to post-treatment changes in GARS-M scores were analyzed with a dependent t-test. The correlation between GARS-M scores and walking speed was analyzed to assess concurrent validity. GARS-M scores were quantified with good-to-excellent inter- (ICC = 0.878) and intrarater reliability (ICC = 0.989). The MDC was 2 points. Mean GARS-M scores decreased from 7 ± 5 at baseline to 1 ± 2 at discharge (t26 = 7.411, p conversion disorder. GARS-M scores provide objective measures upon which treatment effects can be assessed. Copyright © 2014 Elsevier B.V. All rights reserved.
Intra- and interrater reliability of the 'lumbar-locked thoracic rotation test' in competitive swimmers ages 10 through 18 years.

Science.gov (United States)

Feijen, Stef; Kuppens, Kevin; Tate, Angela; Baert, Isabel; Struyf, Thomas; Struyf, Filip

2018-04-17

Measuring thoracic spine mobility can be of interest to competitive swimmers as it has been associated with shoulder girdle function and scapular position in subjects with and without shoulder pain. At present, no reliability data of thoracic spine mobility measurements are available in the swimming population. This study aims to evaluate the within-session intra- and interrater reliability of the "lumbar-locked rotation test" for thoracic spine rotation in competitive swimmers aged 10 to 18 years. This reliability study is part of a larger prospective cohort study investigating potential risk factors for the development of shoulder pain in competitive swimmers. Within-session, intra- and inter-rater reliability. Competitive swimming clubs in Belgium. 21 competitive swimmers. Intra- and inter-rater reliability of the lumbar-locked thoracic rotation test. Intraclass correlation coefficients (ICCs) ranged from 0.91 (95% CI 0.78 to 0.96) to 0.96 (0.89-0.98) for intra-rater reliability. Results for inter-rater reliability ranged from 0.89 (0.72-0.95) to 0.86 (0.65-0.94) respectively for right and left thoracic rotation. Results suggest good to excellent reliability of the lumbar-locked thoracic rotation test, indicating this test can be used reliably in clinical practice. Copyright © 2018 Elsevier Ltd. All rights reserved.
Medial tibial stress syndrome can be diagnosed reliably using history and physical examination.

Science.gov (United States)

Winters, M; Bakker, E W P; Moen, M H; Barten, C C; Teeuwen, R; Weir, A

2017-02-08

The majority of sporting injuries are clinically diagnosed using history and physical examination as the cornerstone. There are no studies supporting the reliability of making a clinical diagnosis of medial tibial stress syndrome (MTSS). Our aim was to assess if MTSS can be diagnosed reliably, using history and physical examination. We also investigated if clinicians were able to reliably identify concurrent lower leg injuries. A clinical reliability study was performed at multiple sports medicine sites in The Netherlands. Athletes with non-traumatic lower leg pain were assessed for having MTSS by two clinicians, who were blinded to each others' diagnoses. We calculated the prevalence, percentage of agreement, observed percentage of positive agreement (Ppos), observed percentage of negative agreement (Pneg) and Kappa-statistic with 95%CI. Forty-nine athletes participated in this study, of whom 46 completed both assessments. The prevalence of MTSS was 74%. The percentage of agreement was 96%, with Ppos and Pneg of 97% and 92%, respectively. The inter-rater reliability was almost perfect; k=0.89 (95% CI 0.74 to 1.00), phistory and physical examination, in clinical practice and research settings. We also found that concurrent lower leg injuries are common in athletes with MTSS. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
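The agreement statistics named in the abstract above (percentage of agreement, Ppos, Pneg and kappa) all follow from a 2x2 table of two raters' yes/no diagnoses. The sketch below shows the standard formulas. The function name and the cell counts are hypothetical: the counts were chosen only so that the output lands near the reported summary values, and they are not the study's data.

```python
def two_rater_agreement(a, b, c, d):
    """Agreement statistics for two raters making a yes/no diagnosis.

    a: both raters say yes      b: rater 1 yes, rater 2 no
    c: rater 1 no, rater 2 yes  d: both raters say no
    """
    n = a + b + c + d
    p_obs = (a + d) / n                   # overall proportion of agreement
    p_pos = 2 * a / (2 * a + b + c)       # positive agreement
    p_neg = 2 * d / (2 * d + b + c)       # negative agreement
    # Agreement expected by chance from the two raters' marginal totals.
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)  # Cohen's kappa
    return {"agreement": p_obs, "p_pos": p_pos, "p_neg": p_neg, "kappa": kappa}


# Hypothetical counts for 46 athletes (illustration only).
stats = two_rater_agreement(a=33, b=1, c=1, d=11)
```

Reporting Ppos and Pneg alongside kappa is useful when prevalence is high, because kappa alone can look low even when raters rarely disagree.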
Intra- and interobserver reliability of glenoid fracture classifications by Ideberg, Euler and AO.

Science.gov (United States)

Gilbert, F; Eden, L; Meffert, R; Konietschke, F; Lotz, J; Bauer, L; Staab, W

2018-03-27

Representing 3%-5% of shoulder girdle injuries, scapula fractures are rare. Furthermore, approximately 1% of scapula fractures are intra-articular fractures of the glenoid fossa. Because of uncertain fracture morphology and limited experience, the treatment of glenoid fossa fractures is difficult. The glenoid fracture classification by Ideberg (1984) and Euler (1996) is still commonly used in literature. In 2013 a new glenoid fracture classification was introduced by the AO. The purpose of this study was to examine the new AO classification in clinical practice in comparison with the classifications by Ideberg and Euler. In total CT images of 84 patients with glenoid fossa fractures from 2005 to 2018 were included. Parasagittal, paracoronary and axial reconstructions were examined according to the classifications of Ideberg, Euler and the AO by 3 investigators (orthopedic surgeon, radiologist, student of medicine) at three individual time settings. Inter- and intraobserver reliability of the three classification systems were ascertained by computing inter- and intraclass correlation coefficients (ICCs) using Spearman's rank correlation coefficient, 95%-confidence intervals as well as F-tests for correlation coefficients. Inter- and intraobserver reliability for the AO classification showed a perspicuous coherence (R = 0.74 and R = 0.79). Low to moderate intraobserver reliability for Ideberg (R = 0.46) and Euler classification (R = 0.41) was found. Furthermore, data show a low interobserver reliability for both Ideberg and Euler classification (R reliability using AO is significantly higher than those using Ideberg and Euler (p reliable grading of glenoid fossa fractures with high inter- and intraobserver reliability in 84 patients using CT images. It should possibly be applied in order to enable a valid, reliable and consistent academic description of glenoid fossa fractures. The established classifications by Euler and Ideberg are not capable of
Reliability in endoscopic diagnosis of portal hypertensive gastropathy

Science.gov (United States)

de Macedo, George Fred Soares; Ferreira, Fabio Gonçalves; Ribeiro, Maurício Alves; Szutan, Luiz Arnaldo; Assef, Mauricio Saab; Rossini, Lucio Giovanni Battista

2013-01-01

AIM: To analyze reliability among endoscopists in diagnosing portal hypertensive gastropathy (PHG) and to determine which criteria from the most utilized classifications are the most suitable. METHODS: From January to July 2009, in an academic quaternary referral center at Santa Casa of São Paulo Endoscopy Service, Brazil, we performed this single-center prospective study. In this period, we included 100 patients, including 50 sequential patients who had portal hypertension of various etiologies; who were previously diagnosed based on clinical, laboratory and imaging exams; and who presented with esophageal varices. In addition, our study included 50 sequential patients who had dyspeptic symptoms and were referred for upper digestive endoscopy without portal hypertension. All subjects underwent upper digestive endoscopy, and the images of the exam were digitally recorded. Five endoscopists with more than 15 years of experience answered an electronic questionnaire, which included endoscopic criteria from the 3 most commonly used Portal Hypertensive Gastropathy classifications (McCormack, NIEC and Baveno) and the presence of elevated or flat antral erosive gastritis. All five endoscopists were blinded to the patients’ clinical information, and all images of varices were deliberately excluded for the analysis. RESULTS: The three most common etiologies of portal hypertension were schistosomiasis (36%), alcoholic cirrhosis (20%) and viral cirrhosis (14%). Of the 50 patients with portal hypertension, 84% were Child A, 12% were Child B, 4% were Child C, 64% exhibited previous variceal bleeding and 66% were previously endoscopic treated. The endoscopic parameters, presence or absence of mosaic-like pattern, red point lesions and cherry-red spots were associated with high inter-observer reliability and high specificity for diagnosing Portal Hypertensive Gastropathy. Sensitivity, specificity and reliability for the diagnosis of PHG (%) were as follows: mosaic-like pattern
Construct Validity and Reliability of the SARA Gait and Posture Sub-scale in Early Onset Ataxia

Directory of Open Access Journals (Sweden)

Tjitske F. Lawerman

2017-12-01

Aim: In children, gait and posture assessment provides a crucial marker for the early characterization, surveillance and treatment evaluation of early onset ataxia (EOA). For reliable data entry of studies targeting gait and posture improvement, uniform quantitative biomarkers are necessary. Until now, the pediatric test construct of gait and posture scores of the Scale for Assessment and Rating of Ataxia sub-scale (SARA) is still unclear. In the present study, we aimed to validate the construct validity and reliability of the pediatric SARAGAIT/POSTURE sub-scale. Methods: We included 28 EOA patients [15.5 (6–34) years; median (range)]. For inter-observer reliability, we determined the ICC on EOA SARAGAIT/POSTURE sub-scores by three independent pediatric neurologists. For convergent validity, we associated SARAGAIT/POSTURE sub-scores with: (1) Ataxic gait Severity Measurement by Klockgether (ASMK; dynamic balance), (2) Pediatric Balance Scale (PBS; static balance), (3) Gross Motor Function Classification Scale - extended and revised version (GMFCS-E&R), (4) SARA-kinetic scores (SARAKINETIC; kinetic function of the upper and lower limbs), (5) Archimedes Spiral (AS; kinetic function of the upper limbs), and (6) total SARA scores (SARATOTAL; i.e., summed SARAGAIT/POSTURE, SARAKINETIC, and SARASPEECH sub-scores). For discriminant validity, we investigated whether EOA co-morbidity factors (myopathy and myoclonus) could influence SARAGAIT/POSTURE sub-scores. Results: The inter-observer agreement (ICC) on EOA SARAGAIT/POSTURE sub-scores was high (0.97). SARAGAIT/POSTURE was strongly correlated with the other ataxia and functional scales [ASMK (rs = -0.819; p < 0.001); PBS (rs = -0.943; p < 0.001); GMFCS-E&R (rs = -0.862; p < 0.001); SARAKINETIC (rs = 0.726; p < 0.001); AS (rs = 0.609; p = 0.002); and SARATOTAL (rs = 0.935; p < 0.001)]. Comorbid myopathy influenced SARAGAIT/POSTURE scores by concurrent muscle weakness, whereas comorbid myoclonus predominantly influenced

Translation, reliability, and clinical utility of the Melbourne Assessment 2.

Science.gov (United States)

Gerber, Corinna N; Plebani, Anael; Labruyère, Rob

2017-10-12

The aims were to (i) provide a German translation of the Melbourne Assessment 2 (MA2), a quantitative test to measure unilateral upper limb function in children with neurological disabilities and (ii) to evaluate its reliability and aspects of clinical utility. After its translation into German and approval of the back translation by the original authors, the MA2 was performed and videotaped twice with 30 children with neuromotor disorders. For each participant, two raters scored the video of the first test for inter-rater reliability. To determine test-retest reliability, one rater additionally scored the video of the second test while the other rater repeated the scoring of the first video to evaluate intra-rater reliability. Time needed for rater training, test administration, and scoring was recorded. The four subscale scores showed excellent intra-, inter-rater, and test-retest reliability with intraclass correlation coefficients of 0.90-1.00 (95%-confidence intervals 0.78-1.00). Score items revealed substantial to almost perfect intra-rater reliability (weighted kappa κw = 0.66-1.00) for the more affected side. Score item inter-rater and test-retest reliability of the same extremity were, with one exception, moderate to almost perfect (κw = 0.42-0.97; κw = 0.40-0.89). Furthermore, the MA2 was feasible and acceptable for patients and clinicians. The MA2 showed excellent subscale and moderate to almost perfect score item reliability. Implications for Rehabilitation There is a lack of high-quality studies about psychometric properties of upper limb measurement tools in the neuropediatric population. The Melbourne Assessment 2 is a promising tool for reliable measurement of unilateral upper limb movement quality in the neuropediatric population. The Melbourne Assessment 2 is acceptable and practicable to therapists and patients for routine use in clinical care.
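Weighted kappa, used above for the ordinal score items, gives partial credit when two ratings differ by only a category or two. The abstract does not say which weighting scheme was applied, so the following is only a sketch that assumes linear weights. The function name and the example ratings are hypothetical.

```python
def linear_weighted_kappa(rater1, rater2, n_categories):
    """Linearly weighted kappa for two raters.

    Ratings are integers 0 .. n_categories-1 on an ordinal scale.
    Assumes at least two categories and that chance disagreement is non-zero.
    """
    k = n_categories
    n = len(rater1)

    # Joint proportions: observed[i][j] = share of items rated i by rater 1, j by rater 2.
    observed = [[0.0] * k for _ in range(k)]
    for i, j in zip(rater1, rater2):
        observed[i][j] += 1.0 / n

    p = [sum(observed[i]) for i in range(k)]                       # rater 1 marginals
    q = [sum(observed[i][j] for i in range(k)) for j in range(k)]  # rater 2 marginals

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the most distant categories.
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    d_exp = sum(weight(i, j) * p[i] * q[j] for i in range(k) for j in range(k))
    return 1.0 - d_obs / d_exp


# Illustrative ratings on a 3-point scale: the raters differ on one item by one category.
kw = linear_weighted_kappa([0, 1, 2, 1], [0, 1, 2, 2], n_categories=3)
```

With quadratic weights, `weight` would square the scaled distance instead, which penalises large disagreements more heavily relative to small ones.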
Reliability of patient-reported outcomes in rheumatoid arthritis patients: an observational prospective study.

Science.gov (United States)

Studenic, Paul; Stamm, Tanja; Smolen, Josef S; Aletaha, Daniel

2016-01-01

Patient-reported outcomes (PROs) such as pain, patient global assessment (PGA) and fatigue are regularly assessed in RA patients. In the present study, we aimed to explore the reliability and smallest detectable differences (SDDs) of these PROs, and whether the time between assessments has an impact on reliability. Forty RA patients on stable treatment reported the three PROs daily over two subsequent months. We assessed the reliability of these measures by calculating intraclass correlation coefficients (ICCs) and the SDDs for 1-, 7-, 14- and 28-day test-retest intervals. Overall, SDD and ICC were 25 mm and 0.67 for pain, 25 mm and 0.71 for PGA and 30 mm and 0.66 for fatigue, respectively. SDD was higher with longer time period between assessments, ranging from 19 mm (1-day intervals) to 30 mm (28-day intervals) for pain, 19 to 33 mm for PGA, and 26 to 34 mm for fatigue; correspondingly, ICC was smaller with longer intervals, and ranged between the 1- and the 28-day interval from 0.80 to 0.50 for pain, 0.83 to 0.57 for PGA and 0.76 to 0.58 for fatigue. The baseline simplified disease activity index did not have any influence on reliability. Lower baseline PRO scores led to smaller SDDs. Reliability of pain, PGA and fatigue measurements is dependent on the tested time interval and the baseline levels. The relatively high SDDs, even for patients in the lowest tertiles of their PROs, indicate potential issues for assessment of the presence of remission. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
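The inverse relationship reported above between ICC and SDD can be illustrated with one common definition, in which the SDD is derived from the standard error of measurement (SEM). The abstract does not state how the SDDs were calculated, so the formula, the function names and the numbers below are assumptions for illustration only.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the between-subject SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_difference(sd, icc, z=1.96):
    """SDD for a change between two measurements.

    z = 1.96 gives a 95% threshold; z = 1.645 gives the 90% threshold that
    some studies report as the minimal detectable change (MDC90).
    """
    return z * math.sqrt(2.0) * sem_from_icc(sd, icc)


# Hypothetical inputs: SD of 10 mm on a visual analogue scale.
sdd_high_icc = smallest_detectable_difference(sd=10.0, icc=0.75)
sdd_low_icc = smallest_detectable_difference(sd=10.0, icc=0.50)
```

Under this definition, a lower ICC at the same SD always produces a larger SDD, which matches the direction of the pattern the abstract describes for longer test-retest intervals.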
Inter-observer variability in diagnosing radiological features of aneurysmal subarachnoid hemorrhage; a preliminary single centre study comparing observers from different specialties and levels of training.

Science.gov (United States)

Siddiqui, Usman T; Khan, Anjum F; Shamim, Muhammad Shahzad; Hamid, Rana Shoaib; Alam, Muhammad Mehboob; Emaduddin, Muhammad

2014-01-01

A noncontrast computed tomography (CT) scan remains the initial radiological investigation of choice for a patient with suspected aneurysmal subarachnoid hemorrhage (aSAH). This initial scan may be used to derive key information about the underlying aneurysm which may aid in further management. The interpretation, however, is subject to the skill and experience of the interpreting individual. The authors here evaluate the interpretation of such CT scans by different individuals at different levels of training, and in two different specialties (Radiology and Neurosurgery). Initial noncontrast CT scans of 35 patients with aSAH were evaluated independently by four different observers. The observers selected for the study included two from Radiology and two from Neurosurgery at different levels of training; a resident currently in mid training and a resident who had recently graduated from training of each specialty. Measured variables included interpreter's suspicion of presence of subarachnoid blood, side of the subarachnoid hemorrhage, location of the aneurysm, the aneurysm's proximity to vessel bifurcation, number of aneurysm(s), contour of aneurysm(s), presence of intraventricular hemorrhage (IVH), intracerebral hemorrhage (ICH), infarction, hydrocephalus and midline shift. To determine the inter-observer variability (IOV), weighted kappa values were calculated. There was moderate agreement on most of the CT scan findings among all observers. Substantial agreement was found amongst all observers for hydrocephalus, IVH, and ICH. Lowest agreement rates were seen in the location of aneurysm being supra or infra tentorial. There were, however, some noteworthy exceptions. There was substantial to almost perfect agreement between the radiology graduate and radiology resident on most CT findings. The lowest agreement was found between the neurosurgery graduate and the radiology graduate. Our study suggests that although agreements were seen in the interpretation of some of
Action research in inter-organisational networks

DEFF Research Database (Denmark)

Goduscheit, René Chester; Rasmussen, Erik Stavnsager; Jørgensen, Jacob Høj

2007-01-01

Traditionally, the literature on action research has been aimed at intra-organisational issues. These studies have distinguished between two researcher roles: The problem-solver and the observer. This article addresses the distinct challenges of action research in inter-organisational projects....... In addition to the problem-solver and observer roles, the researcher in an inter-organisational setting can serve as a legitimiser of the project and manage to involve partners that in an ordinary business-to-business setting would not have participated. Based on an action research project in a Danish inter......-organisational network, this article discusses potential pitfalls in the legitimiser role. Lack of clarity in defining the researcher role and project ownership in relation to the funding organisation and the rest of the network can jeopardise the project and potentially the credibility of the researchers. The article...
Reliability assessments in qualitative health promotion research.

Science.gov (United States)

Cook, Kay E

2012-03-01

This article contributes to the debate about the use of reliability assessments in qualitative research in general, and health promotion research in particular. In this article, I examine the use of reliability assessments in qualitative health promotion research in response to health promotion researchers' commonly held misconception that reliability assessments improve the rigor of qualitative research. All qualitative articles published in the journal Health Promotion International from 2003 to 2009 employing reliability assessments were examined. In total, 31.3% (20/64) articles employed some form of reliability assessment. The use of reliability assessments increased over the study period, ranging from qualitative articles decreased. The articles were then classified into four types of reliability assessments, including the verification of thematic codes, the use of inter-rater reliability statistics, congruence in team coding and congruence in coding across sites. The merits of each type were discussed, with the subsequent discussion focusing on the deductive nature of reliable thematic coding, the limited depth of immediately verifiable data and the usefulness of such studies to health promotion and the advancement of the qualitative paradigm.
Reliability and safety of a new upper cervical spine injury treatment algorithm

Directory of Open Access Journals (Sweden)

Andrei Fernandes Joaquim

In the present study, we evaluated the reliability and safety of a new upper cervical spine injury treatment algorithm to help in the selection of the best treatment modality for these injuries. Methods: Thirty cases, previously treated according to the new algorithm, were presented to four spine surgeons who were questioned about their personal suggestion for treatment, and the treatment suggested according to the application of the algorithm. After four weeks, the same questions were asked again to evaluate reliability (intra- and inter-observer) using the Kappa index. Results: The reliability of the treatment suggested by applying the algorithm was superior to the reliability of the surgeons’ personal suggestion for treatment. When applying the upper cervical spine injury treatment algorithm, an agreement with the treatment actually performed was obtained in more than 89% of the cases. Conclusion: The system is safe and reliable for treating traumatic upper cervical spine injuries. The algorithm can be used to help surgeons in the decision between conservative versus surgical treatment of these injuries.
Approach to assurance of reliability of linear accelerator operation observations

International Nuclear Information System (INIS)

Bakov, S.M.; Borovikov, A.A.; Kavkun, S.L.

1994-01-01

A systems approach to assuring the reliability of observations of linear accelerator operation is proposed. The basic principles of this method consist in applying the dependences between the facility parameters and decreasing the number of apparatus channels used for data acquisition, without replacing a failed channel with a reserve one. The signal commutation unit, whose introduction into the data acquisition system substantially increases the reliability of the measurement system by providing active reserve, is considered in detail. 8 refs. 6 figs
Seven Reliability Indices for High-Stakes Decision Making: Description, Selection, and Simple Calculation

Science.gov (United States)

Smith, Stacey L.; Vannest, Kimberly J.; Davis, John L.

2011-01-01

The reliability of data is a critical issue in decision-making for practitioners in the school. Percent Agreement and Cohen's kappa are the two most widely reported indices of inter-rater reliability; however, a recent Monte Carlo study on the reliability of multi-category scales found other indices to be more trustworthy given the type of data…
The reliability, validity, and feasibility of physical activity measurement in adults with traumatic brain injury: an observational study.

Science.gov (United States)

Hassett, Leanne; Moseley, Anne; Harmer, Alison; van der Ploeg, Hidde P

2015-01-01

To determine the reliability and validity of the Physical Activity Scale for Individuals with a Physical Disability (PASIPD) in adults with severe traumatic brain injury (TBI) and estimate the proportion of the sample participants who fail to meet the World Health Organization guidelines for physical activity. A single-center observational study recruited a convenience sample of 30 community-based ambulant adults with severe TBI. Participants completed the PASIPD on 2 occasions, 1 week apart, and wore an accelerometer (ActiGraph GT3X; ActiGraph LLC, Pensacola, Florida) for the 7 days between these 2 assessments. The PASIPD test-retest reliability was substantial (intraclass correlation coefficient = 0.85; 95% confidence interval, 0.70-0.92), and the correlation with the accelerometer ranged from too low to be meaningful (R = 0.09) to moderate (R = 0.57). From device-based measurement of physical activity, 56% of participants failed to meet the World Health Organization physical activity guidelines. The PASIPD is a reliable measure of the type of physical activity people with severe TBI participate in, but it is not a valid measure of the amount of moderate to vigorous physical activity in which they engage. Accelerometers should be used to quantify moderate to vigorous physical activity in people with TBI.
Inter- and Intra-Observer Repeatability of Quantitative Whole-Body, Diffusion-Weighted Imaging (WBDWI) in Metastatic Bone Disease.

Directory of Open Access Journals (Sweden)

Matthew D Blackledge

Quantitative whole-body diffusion-weighted MRI (WB-DWI) is now possible using semi-automatic segmentation techniques. The method enables whole-body estimates of global Apparent Diffusion Coefficient (gADC) and total Diffusion Volume (tDV), both of which have demonstrated considerable utility for assessing treatment response in patients with bone metastases from primary prostate and breast cancers. Here we investigate the agreement (inter-observer repeatability) between two radiologists in their definition of Volumes Of Interest (VOIs) and subsequent assessment of tDV and gADC on an exploratory patient cohort of nine. Furthermore, each radiologist was asked to repeat his or her measurements on the same patient data sets one month later to identify the intra-observer repeatability of the technique. Using a Markov Chain Monte Carlo (MCMC) estimation method provided full posterior probabilities of repeatability measures along with maximum a-posteriori values and 95% confidence intervals. Our estimates of the inter-observer Intraclass Correlation Coefficient (ICCinter) for log-tDV and median gADC were 1.00 (0.97-1.00) and 0.99 (0.89-0.99) respectively, indicating excellent observer agreement for these metrics. Mean gADC values were found to have ICCinter = 0.97 (0.81-0.99), indicating a slight sensitivity to outliers in the derived distributions of gADC. Of the higher order gADC statistics, skewness was demonstrated to have good inter-user agreement with ICCinter = 0.99 (0.86-1.00), whereas gADC variance and kurtosis performed relatively poorly: 0.89 (0.39-0.97) and 0.96 (0.69-0.99) respectively. Estimates of intra-observer repeatability (ICCintra) demonstrated similar results: 0.99 (0.95-1.00) for log-tDV, 0.98 (0.89-0.99) and 0.97 (0.83-0.99) for median and mean gADC respectively, 0.64 (0.25-0.88) for gADC variance, 0.85 (0.57-0.95) for gADC skewness and 0.85 (0.57-0.95) for gADC kurtosis. Further investigation of two anomalous patient cases revealed that a very small
Inter-reader agreement of multi-parametric MR imaging for the detection of prostate cancer. Evaluation of a scoring system

International Nuclear Information System (INIS)

Quentin, M.; Roehlen, S.; Klasen, J.; Antoch, G.; Blondin, D.; Arsov, C.; Albers, P.

2012-01-01

Purpose: Functional prostate MR is performed in varying combinations of T2-weighted images with diffusion-weighted imaging (DWI), dynamic contrast-enhanced MRI (DCE-MRI), and spectroscopic imaging (MRSI). Recently, a European consensus meeting proposed the use of a simple 5-point scale for estimating the probability of a lesion being malignant. The aim of the present study was to determine the inter-reader agreement of MR imaging using a scoring system based on the recommendations of the consensus. Materials and Methods: The appearance of 108 predefined lesions in three different MR sequences (T2-weighted images, DWI, and DCE-MRI) in 50 functional prostate MR examinations were retrospectively scored by three blinded radiologists using a 5-point scale for each MR sequence. After scoring T2/DWI and T2/DWI/DCE-MRI, every lesion was graded based on its probability for malignancy. The inter-observer reliability was evaluated using Kappa statistics (Κ). Results: With respect to T2-weighted images, DWI and DCE-MRI Κ was 0.49, 0.97, and 0.77, respectively. Combined scoring of T2-weighted images and DWI demonstrated correct tumor diagnosis (true positive) in 71 - 88 % (depending on reader) of cases (Κ = 0.78). The accuracy was further improved to 88 - 96 % after scoring all three MR sequences including DCE-MRI (Κ = 0.90). Conclusion: The use of a simple 5-point scoring system for T2-weighted images, DWI, and DCE-MRI is feasible in functional prostate MRI and has high inter-observer reliability.
Clinical observed performance evaluation: a prospective study in final year students of surgery.

LENUS (Irish Health Repository)

Markey, G C

2010-06-24

We report a prospective study of clinical observed performance evaluation (COPE) for 197 medical students in the pre-qualification year of clinical education. Psychometric quality was the main endpoint. Students were assessed in groups of 5 in 40-min patient encounters, with each student the focus of evaluation for 8 min. Each student had a series of assessments in a 25-week teaching programme. Over time, several clinicians from a pool of 16 surgical consultants and registrars evaluated each student by direct observation. A structured rating form was used for assessment data. Variance component analysis (VCA), internal consistency and inter-rater agreement were used to estimate reliability. The predictive and convergent validity of COPE in relation to summative OSCE, long case, and overall final examination was estimated. Median number of COPE assessments per student was 7. Generalisability of a mean score over 7 COPE assessments was 0.66, equal to that of an 8 x 7.5 min station final OSCE. Internal consistency was 0.88-0.97 and inter-rater agreement 0.82. Significant correlations were observed with OSCE performance (R = 0.55 disattenuated) and long case (R = 0.47 disattenuated). Convergent validity was 0.81 by VCA. Overall final examination performance was linearly related to mean COPE score with standard error 3.7%. COPE permitted efficient serial assessment of a large cohort of final year students in a real world setting. Its psychometric quality compared well with conventional assessments and with other direct observation instruments as reported in the literature. Effect on learning, and translation to clinical care, are directions for future research.
Reliability of new software in measuring cervical multifidus diameters and shoulder muscle strength in a synchronized way; an ultrasonographic study

Directory of Open Access Journals (Sweden)

Leila Rahnama

2015-08-01

OBJECTIVES: This study was conducted with the purpose of evaluating the inter-session reliability of new software to measure the diameters of the cervical multifidus muscle (CMM), both at rest and during isometric contractions of the shoulder abductors, in subjects with neck pain and in healthy individuals. METHOD: In the present study, the reliability of measuring the diameters of the CMM with the Sonosynch software was evaluated by using 24 participants, including 12 subjects with chronic neck pain and 12 healthy individuals. The anterior-posterior diameter (APD) and the lateral diameter (LD) of the CMM were measured in a resting state and then repeated during isometric contraction of the shoulder abductors. Measurements were taken on separate occasions 3 to 7 days apart in order to determine inter-session reliability. Intraclass correlation coefficient (ICC), standard error of measurement (SEM), and smallest detectable difference (SDD) were used to evaluate the relative and absolute reliability, respectively. RESULTS: The Sonosynch software was shown to be highly reliable in measuring the diameters of the CMM both in healthy subjects and in those with neck pain. The ICCs 95% CI for APD ranged from 0.84 to 0.94 in subjects with neck pain and from 0.86 to 0.94 in healthy subjects. For LD, the ICC 95% CI ranged from 0.64 to 0.95 in subjects with neck pain and from 0.82 to 0.92 in healthy subjects. CONCLUSIONS: Ultrasonographic measurement of the diameters of the CMM using Sonosynch has proved to be reliable, especially for APD, in healthy subjects as well as subjects with neck pain.
Inter- and intra-examiner reliability of footprint pattern analysis obtained from diabetics using the Harris Mat

Directory of Open Access Journals (Sweden)

Lígia L. Cisneros

2010-06-01

INTRODUCTION: High plantar pressure is a proven risk factor for ulceration among individuals with diabetes mellitus. The Harris and Beath footprinting mat is one of the tools used in screening for foot ulceration risk among these subjects. There are no reports in the literature on the reliability of footprint analysis using print pattern criteria. OBJECTIVE: The aim of this study was to evaluate the inter- and intra-examiner reliability of the analysis of footprint patterns obtained using the Harris and Beath footprinting mat. METHODS: Footprints were taken from 41 subjects using the footprinting mat. The images were analyzed by three independent examiners. To assess intra-examiner reliability, one of the examiners repeated the analysis after one week. RESULTS: The weighted kappa coefficient was excellent (Kp > 0.80) for the inter- and intra-examiner analyses for most of the points studied on both feet. CONCLUSION: The criterion of analysis by footprint patterns obtained with the Harris and Beath footprinting mat showed good reliability and high to excellent agreement for the inter- and intra-examiner conditions. This method is reliable for analyses involving one or more examiners.
Reliability of the Alzheimer's disease assessment scale (ADAS-Cog) in longitudinal studies.

Science.gov (United States)

Khan, Anzalee; Yavorsky, Christian; DiClemente, Guillermo; Opler, Mark; Liechti, Stacy; Rothman, Brian; Jovic, Sofija

2013-11-01

Considering the scarcity of longitudinal assessments of reliability, there is need for a more precise understanding of cognitive decline in Alzheimer's Disease (AD). The primary goal was to assess longitudinal changes in inter-rater reliability, test-retest reliability and internal consistency of scores of the ADAS-Cog. 2,618 AD subjects were enrolled in seven randomized, double-blind, placebo-controlled, multicenter trials from 1986 to 2009. Reliability, internal consistency and cross-sectional analysis of ADAS-Cog and MMSE across seven visits were examined. Intra-class correlation (ICC) for ADAS-Cog was moderate to high, supporting its reliability. Absolute agreement ICCs, ranging from 0.392 (Visit 7) to 0.806 (Visit 2), showed a progressive decrease in correlations across time. Item analysis revealed a decrease in item correlations, with the lowest correlations at Visit 7 for Commands (ICC=0.148), Comprehension (ICC=0.092), and Spoken Language (ICC=0.044). Suitable assessment of AD treatments is maintained through accurate measurement of clinically significant outcomes. Targeted rater education on ADAS-Cog items over time can improve raters' ability to administer and score the scale.
A study of operational and testing reliability in software reliability analysis

International Nuclear Information System (INIS)

Yang, B.; Xie, M.

2000-01-01

Software reliability is an important aspect of any complex equipment today. Software reliability is usually estimated based on reliability models such as nonhomogeneous Poisson process (NHPP) models. A software system improves during the testing phase, while it normally does not change during the operational phase. Depending on whether the reliability is to be predicted for the testing phase or the operational phase, a different measure should be used. In this paper, two different reliability concepts, namely, the operational reliability and the testing reliability, are clarified and studied in detail. These concepts have been mixed up or even misused in some existing literature. Using a different reliability concept leads to different reliability values and, in turn, to different reliability-based decisions. The difference between the estimated reliabilities is studied and the effect on the optimal release time is investigated.
The health preoccupation diagnostic interview: inter-rater reliability of a structured interview for diagnostic assessment of DSM-5 somatic symptom disorder and illness anxiety disorder.

Science.gov (United States)

Axelsson, Erland; Andersson, Erik; Ljótsson, Brjánn; Wallhed Finn, Daniel; Hedman, Erik

2016-06-01

Somatic symptom disorder (SSD) and illness anxiety disorder (IAD) are two new diagnoses introduced in the DSM-5. There is a need for reliable instruments to facilitate the assessment of these disorders. We therefore developed a structured diagnostic interview, the Health Preoccupation Diagnostic Interview (HPDI), which we hypothesized would reliably differentiate between SSD, IAD, and no diagnosis. Persons with clinically significant health anxiety (n = 52) and healthy controls (n = 52) were interviewed using the HPDI. Diagnoses were then compared with those made by an independent assessor, who listened to audio recordings of the interviews. Ratings generally indicated moderate to almost perfect inter-rater agreement, as illustrated by an overall Cohen's κ of .85. Disagreements primarily concerned (a) the severity of somatic symptoms, (b) the differential diagnosis of panic disorder, and (c) SSD specifiers. We conclude that the HPDI can be used to reliably diagnose DSM-5 SSD and IAD.
The reliability and reproducibility of the Hertel classification for comminuted proximal humeral fractures compared with the Neer classification

NARCIS (Netherlands)

Iordens, Gijs I. T.; Mahabier, Kiran C.; Buisman, Florian E.; Schep, Niels W. L.; Muradin, Galied S. R.; Beenen, Ludo F. M.; Patka, Peter; van Lieshout, Esther M. M.; den Hartog, Dennis

2016-01-01

The Neer classification is the most commonly used fracture classification system for proximal humeral fractures. Inter- and intra-observer agreement is limited, especially for comminuted fractures. A possibly more straightforward and reliable classification system is the Hertel classification. The
Intra- and inter-observer reproducibility of global and regional magnetic resonance feature tracking derived strain parameters of the left and right ventricle

Energy Technology Data Exchange (ETDEWEB)

Schmidt, Björn; Dick, Anastasia; Treutlein, Melanie [Department of Radiology, University Hospital of Cologne, Kerpener Str. 62, D-50937, Cologne (Germany)]; Schiller, Petra [Institute of Medical Statistics, Informatics and Epidemiology, University of Cologne, Kerpener Str. 62, D-50937, Cologne (Germany)]; Bunck, Alexander C.; Maintz, David; Baeßler, Bettina [Department of Radiology, University Hospital of Cologne, Kerpener Str. 62, D-50937, Cologne (Germany)]

2017-04-15

Highlights: • Left and right ventricular CMR feature tracking is highly reproducible. • The only exception is radial strain and strain rate. • Sample size estimations are presented as a practical reference for future studies. - Abstract: Objectives: To investigate the reproducibility of regional and global strain and strain rate (SR) parameters of both ventricles and to determine sample sizes for all investigated strain and SR parameters in order to generate a practical reference for future studies. Materials and methods: The study population consisted of 20 healthy individuals and 20 patients with acute myocarditis. Cine sequences in three horizontal long axis views and a stack of short axis views covering the entire left and right ventricle (LV, RV) were retrospectively analysed using a dedicated feature tracking (FT) software algorithm (TOMTEC). For intra-observer analysis, one observer analysed CMR images of all patients and volunteers twice. For inter-observer analysis, three additional blinded observers analysed the same datasets once. Intra- and inter-observer reproducibility were tested in all patients and controls using Bland-Altman analyses, intra-class correlation coefficients (ICCs) and coefficients of variation. Results: Intra-observer reproducibility of global LV strain and SR parameters was excellent (range of ICCs: 0.81–1.00), the only exception being global radial SR with a poor reproducibility (ICC 0.23). On a regional level, basal and midventricular strain and SR parameters were more reproducible when compared to apical parameters. Inter-observer reproducibility of all LV parameters was slightly lower than intra-observer reproducibility, yet still good to excellent for all global and regional longitudinal and circumferential strain and SR parameters (range of ICCs: 0.66–0.93). Similar to the LV, all global RV longitudinal and circumferential strain and SR parameters showed excellent reproducibility (range of ICCs: 0.75–0

Examiner Reliability of Fluorosis Scoring: A Comparison of Photographic and Clinical Examination Findings

Science.gov (United States)

Cruz-Orcutt, Noemi; Warren, John J.; Broffitt, Barbara; Levy, Steven M.; Weber-Gasparoni, Karin

2012-01-01

Objective: To assess and compare examiner reliability of clinical and photographic fluorosis examinations using the Fluorosis Risk Index (FRI) among children in the Iowa Fluoride Study (IFS). Methods: The IFS examined 538 children for fluorosis and dental caries at age 13 and obtained intra-oral photographs from nearly all of them. To assess examiner reliability, duplicate clinical examinations were conducted for 40 of the subjects. In addition, 200 of the photographs were scored independently for fluorosis by two examiners in a standardized manner. Fluorosis data were compared between examiners for the clinical exams and separately for the photographic exams, and a comparison was made between clinical and photographic exams. For all 3 comparisons, examiner reliability was assessed using kappa statistics at the tooth level. Results: Inter-examiner reliability for the duplicate clinical exams on the sample of 40 subjects as measured by kappa was 0.59, while the repeat exams of the 200 photographs yielded a kappa of 0.64. For the comparison of photographic and clinical exams, inter-examiner reliability, as measured by weighted kappa, was 0.46. FRI scores obtained using the photographs were higher on average than those obtained from the clinical exams. Fluorosis prevalence was higher for photographs (33%) than found for clinical exam (18%). Conclusion: Results suggest inter-examiner reliability is greater and fluorosis scores higher when using photographic compared to clinical examinations. PMID:22316120
Assessing environmental features related to mental health: a reliability study of visual streetscape images.

Science.gov (United States)

Wu, Yu-Tzu; Nash, Paul; Barnes, Linda E; Minett, Thais; Matthews, Fiona E; Jones, Andy; Brayne, Carol

2014-10-22

An association between depressive symptoms and features of built environment has been reported in the literature. A remaining research challenge is the development of methods to efficiently capture pertinent environmental features in relevant study settings. Visual streetscape images have been used to replace traditional physical audits and directly observe the built environment of communities. The aim of this work is to examine the inter-method reliability of the two audit methods for assessing community environments with a specific focus on physical features related to mental health. Forty-eight postcodes in urban and rural areas of Cambridgeshire, England were randomly selected from an alphabetical list of streets hosted on a UK property website. The assessment was conducted in July and August 2012 by both physical and visual image audits based on the items in Residential Environment Assessment Tool (REAT), an observational instrument targeting the micro-scale environmental features related to mental health in UK postcodes. The assessor used the images of Google Street View and virtually "walked through" the streets to conduct the property and street level assessments. Gwet's AC1 coefficients and Bland-Altman plots were used to compare the concordance of two audits. The results of conducting the REAT by visual image audits generally correspond to direct observations. More variations were found in property level items regarding physical incivilities, with broad limits of agreement which importantly lead to most of the variation in the overall REAT score. Postcodes in urban areas had lower consistency between the two methods than rural areas. Google Street View has the potential to assess environmental features related to mental health with fair reliability and provide a less resource intense method of assessing community environments than physical audits.
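For readers unfamiliar with Gwet's AC1, the agreement coefficient mentioned in this abstract, the sketch below shows the standard two-rater form as commonly stated: AC1 = (pa − pe) / (1 − pe), where pa is observed agreement and pe = (1 / (q − 1)) · Σ πk(1 − πk), with πk the mean of the two raters' marginal proportions for category k and q the number of categories. The function name and the example data are hypothetical; this is not the study's code or data.

```python
def gwet_ac1(ratings_a, ratings_b, categories):
    """Gwet's AC1 for two raters (or two audit methods) scoring the same items.

    `categories` lists every possible category, including any that happened
    not to be used, because the chance term depends on their number.
    """
    n = len(ratings_a)
    q = len(categories)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both rating lists must be non-empty and equal length")
    if q < 2:
        raise ValueError("at least two categories are required")

    # Observed agreement: share of items given the same category.
    p_agree = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement, built from the average marginal proportion per category.
    p_chance = 0.0
    for k in categories:
        pi_k = (ratings_a.count(k) + ratings_b.count(k)) / (2 * n)
        p_chance += pi_k * (1 - pi_k)
    p_chance /= q - 1

    return (p_agree - p_chance) / (1 - p_chance)


# Hypothetical item: 1 = feature present, 0 = absent, scored for ten
# locations by a physical audit and by a virtual (image-based) audit.
physical_audit = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
image_audit = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

print(gwet_ac1(physical_audit, image_audit, categories=[0, 1]))
```

AC1 is often preferred when one category dominates, because its chance-agreement term stays small in that situation.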
Reliability of the Phi angle to assess rotational alignment of the talar component in total ankle replacement.

Science.gov (United States)

Manzi, Luigi; Villafañe, Jorge Hugo; Indino, Cristian; Tamini, Jacopo; Berjano, Pedro; Usuelli, Federico Giuseppe

2017-11-08

The purpose of this study was to investigate the test-retest reliability of the Phi angle in patients undergoing total ankle replacement (TAR) for end stage ankle osteoarthritis (OA) to assess the rotational alignment of the talar component. Retrospective observational cross-sectional study of prospectively collected data. Post-operative anteroposterior radiographs of the foot of 170 patients who underwent TAR for the ankle OA were evaluated. Three physicians measured Phi on the 170 randomly sorted and anonymized radiographs on two occasions, one week apart (test and retest conditions), inter and intra-observer agreement were evaluated. Test-retest reliability of Phi angle measurement was excellent for patients with Hintegra TAR (ICC=0.995; pPhi angle measurement between patients with Hintegra vs. Zimmer implants (p>0.05). Measurement of Phi angle on weight-bearing dorsoplantar radiograph showed an excellent reliability among orthopaedic surgeons in determining the position of the talar component in the axial plane. Level II, cross sectional study. Copyright © 2017 European Foot and Ankle Society. Published by Elsevier Ltd. All rights reserved.
The reliability of the Brazilian version of the Composite International Diagnostic Interview (CIDI 2.1)

Directory of Open Access Journals (Sweden)

Quintana M.I.

2004-01-01

The objective of the present study was to determine the reliability of the Brazilian version of the Composite International Diagnostic Interview 2.1 (CIDI 2.1) in clinical psychiatry. The CIDI 2.1 was translated into Portuguese using WHO guidelines and reliability was studied using the inter-rater reliability method. The study sample consisted of 186 subjects from psychiatric hospitals and clinics, primary care centers and community services. The interviewers consisted of a group of 13 lay and three non-lay interviewers submitted to the CIDI training. The average interview time was 2 h and 30 min. General reliability ranged from kappa 0.50 to 1. For lifetime diagnoses the reliability ranged from kappa 0.77 (Bipolar Affective Disorder) to 1 (Substance-Related Disorder, Alcohol-Related Disorder, Eating Disorders). Previous year reliability ranged from kappa 0.66 (Obsessive-Compulsive Disorder) to 1 (Dissociative Disorders, Maniac Disorders, Eating Disorders). The poorest reliability rate was found for Mild Depressive Episode (kappa = 0.50) during the previous year. Training proved to be a fundamental factor for maintaining good reliability. Technical knowledge of the questionnaire compensated for the lack of psychiatric knowledge of the lay personnel. Inter-rater reliability was good to excellent for persons in psychiatric practice.
Reliability of lip prints in personal identification: An inter-racial pilot study.

Science.gov (United States)

Kumar, Laliytha Bijai; Jayaraman, Venkatesh; Mathew, Philips; Ramasamy, S; Austin, Ravi David

2016-01-01

Forensic science is a branch of science that deals with the application of science and technology in solving a crime, and this requires a multidisciplinary team effort. The word "forensic" is derived from the Latin word "forensis", which means "of the forum", that is, public. Dental professionals should develop interests in contributing to legal issues. The aim was to study the lip prints among people of different races in a descriptive study. The present study comprised ninety subjects: Group A comprised Africans, Group B Dravidians, and Group C subjects of Mongoloid race. Each group was further divided into 15 males and 15 females, for whom the lip prints were recorded and evaluated. ANOVA was used to compare the African, Dravidian, and Mongoloid groups. The observed data among males and females were found to be significant with P = 0.000492. The present study showed a significant difference in lip pattern among the three races. Future studies with a larger sample size and comparisons between more races may allow better personal identification.
MR imaging of alar ligament in whiplash-associated disorders: an observer study

Energy Technology Data Exchange (ETDEWEB)

Wilmink, J.T. [Dept. of Neuroradiology, University of Maastricht (Netherlands); Patijn, J. [Dept. of Anesthesiology, University Hospital Maastricht (Netherlands)

2001-10-01

Rotational CT studies have been previously used in whiplash-associated disorders (WAD) to document rotatory instability of the upper cervical spine thought to be due to alar ligamentous injury. More recently MR imaging has been employed to image such injury more directly. Our study aimed to assess the reliability and reproducibility of such MRI findings. In 12 WAD patients and six asymptomatic controls the alar ligaments were imaged in the coronal plane with an 0.5-T MRI system using a quadrature neck coil and applying a fast spin echo proton density/T2-weighted sequence (TR/TE/ETL 2,500/18 ms/16, FOV 140 mm, matrix 200 x 256, 16 x 3 mm slices, scan time 25 min). Images were graded for symmetry of imaging plane using a 3-point scale and also for presence of ligamentous injury with a 4-point scale, by two independent observers on two separate occasions. The alar ligaments could be identified in all cases. Asymmetry of the imaging plane was found to some degree in over half of the cases. Such images were much more likely to be graded as indicating injury. Of a total of 72 assessments, clearly and probably normal grades were given in 75%, and clearly or probably abnormal grades in 25%. Kappa values for intra- and inter-observer agreement were moderate to very poor, however, and the grading system could not reliably distinguish between patients and controls. It was concluded that with MRI techniques presently employed, alar ligamentous damage as a causative factor in WAD has not been proven.
Reliability, validity and minimal detectable change of the Mini-BESTest in Greek participants with chronic stroke.

Science.gov (United States)

Lampropoulou, Sofia I; Billis, Evdokia; Gedikoglou, Ingrid A; Michailidou, Christina; Nowicky, Alexander V; Skrinou, Dimitra; Michailidi, Fotini; Chandrinou, Danae; Meligkoni, Margarita

2018-02-23

This study aimed to investigate the psychometric characteristics of reliability, validity and ability to detect change of a newly developed balance assessment tool, the Mini-BESTest, in Greek patients with stroke. A prospective, observational design study with test-retest measures was conducted. A convenience sample of 21 Greek patients with chronic stroke (14 male, 7 female; age of 63 ± 16 years) was recruited. Two independent examiners administered the scale, for the inter-rater reliability, twice within 10 days for the test-retest reliability. Bland-Altman analysis for repeated measures assessed the absolute reliability, and the Standard Error of Measurement (SEM) and the Minimum Detectable Change at 95% confidence interval (MDC 95%) were established. The Greek Mini-BESTest (Mini-BESTest GR) was correlated with the Greek Berg Balance Scale (BBS GR) for assessing the concurrent validity and with the Timed Up and Go (TUG), the Functional Reach Test (FRT) and the Greek Falls Efficacy Scale-International (FES-I GR) for the convergent validity. The Mini-BESTest GR demonstrated excellent inter-rater reliability (ICC (95% CI) = 0.997 (0.995-0.999), SEM = 0.46) with the scores of two raters within the limits of agreement (mean dif = -0.143 ± 0.727, p > 0.05) and test-retest reliability (ICC (95% CI) = 0.966 (0.926-0.988), SEM = 1.53). Additionally, the Mini-BESTest GR yielded very strong to moderate correlations with BBS GR (r = 0.924, p ... reliability and the equally good validity of the Mini-BESTest GR strongly support its utility in Greek people with chronic stroke. Its ability to identify clinically meaningful changes and falls risk needs further investigation.
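The SEM and MDC values reported in this abstract are usually derived from the ICC and the sample standard deviation. The sketch below assumes the conventional formulas SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM. The abstract does not state which formulas the authors used, the function names are hypothetical, and the SD value in the usage lines is invented purely for illustration.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence, assuming 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


# Assumed MDC formula applied to the test-retest SEM quoted in the abstract (1.53).
print(round(mdc95(1.53), 2))

# SD = 8.0 is an invented value, used only to demonstrate the SEM formula.
print(round(sem_from_icc(8.0, 0.966), 2))
```

Under these assumptions, a change in score smaller than the printed MDC value could not be distinguished from measurement error with 95% confidence.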
Relationship between severe obesity and depth to the cricothyroid membrane in third-trimester non-labouring parturients: a prospective observational study.

Science.gov (United States)

Gadd, K; Wills, K; Harle, R; Terblanche, N

2018-05-01

Severely obese parturients have increased 'cannot intubate, cannot oxygenate' risk during Caesarean section under general anaesthesia. Front-of-neck access (FONA) at the cricothyroid membrane (CTM) is definitive management; however, attempted FONA can fail. Point-of-care ultrasonography may provide useful information about CTM depth to aid FONA in obesity. This study determined the difference in CTM depth between severely obese and non-obese parturients, utilising ultrasonography. In this prospective observational study, two anaesthetists performed airway ultrasonography on 15 severely obese (BMI >45 kg m⁻²) and 15 normal-weight (BMI ≤25 kg m⁻²) parturients in the third trimester, using the transverse and longitudinal planes, sniffing and extended head positions, and nil and firm transducer pressures. The primary outcome was CTM depth (millimetres) measured in the transverse plane with the head extended and nil transducer pressure. Secondary outcomes included CTM depth measurements using other factor configurations. Intra-class correlation coefficients assessed the inter-observer reliability. CTM depth measured in the transverse plane with head extended and nil transducer pressure was significantly greater in severely obese parturients, mean 18.0 mm (95% confidence interval 16.3-19.8), vs 10.6 mm (8.81-12.4) in non-obese (P<0.001); mean difference 7.4 mm (4.9-9.9; P<0.001). CTM depths were increased in the severely obese group regardless of scanning plane, head and neck position, or transducer pressure (all P<0.001). There was excellent inter-observer reliability. Cricothyroid membrane depth is significantly increased in severely obese vs normal-weight parturients independently of scanning plane, head and neck position, or transducer pressure. Copyright © 2018 British Journal of Anaesthesia. All rights reserved.
Construct validity and inter-rater reliability of the Dutch activity measure for post-acute care "6-clicks" basic mobility form to assess the mobility of hospitalized patients.

Science.gov (United States)

Geelen, Sven Jacobus Gertruda; Valkenet, Karin; Veenhof, Cindy

2018-05-12

To evaluate the construct validity and the inter-rater reliability of the Dutch Activity Measure for Post-Acute Care "6-clicks" Basic Mobility short form measuring the patient's mobility in Dutch hospital care. First, the "6-clicks" was translated by using a forward-backward translation protocol. Next, 64 patients were assessed by the physiotherapist to determine the validity while being admitted to the Internal Medicine wards of a university medical center. Six hypotheses were tested regarding the construct "mobility", which showed that better "6-clicks" scores were related to less restrictive pre-admission living situations (p = 0.011), less restrictive discharge locations (p = 0.001), more independence in activities of daily living (p = 0.001) and fewer physiotherapy visits (p ... Dutch "6-clicks" shows a good construct validity and moderate-to-excellent inter-rater reliability when used to assess the mobility of hospitalized patients. Implications for Rehabilitation: Even though various measurement tools have been developed, it appears the majority of physiotherapists working in a hospital currently do not use these tools as a standard part of their care. The Activity Measure for Post-Acute Care "6-clicks" Basic Mobility is the only tool which is designed to be short, easy to use within usual care and has been validated in the entire hospital population. This study shows that the Dutch version of the Activity Measure for Post-Acute Care "6-clicks" Basic Mobility form is a valid, easy to use, quick tool to assess the basic mobility of Dutch hospitalized patients.
Reliability of Single-Leg Balance and Landing Tests in Rugby Union; Prospect of Using Postural Control to Monitor Fatigue.

Science.gov (United States)

Troester, Jordan C; Jasmin, Jason G; Duffield, Rob

2018-06-01

The present study examined the inter-trial (within test) and inter-test (between test) reliability of single-leg balance and single-leg landing measures performed on a force plate in professional rugby union players using commercially available software (SpartaMARS, Menlo Park, USA). Twenty-four players undertook test - re-test measures on two occasions (7 days apart) on the first training day of two respective pre-season weeks following 48h rest and similar weekly training loads. Two 20s single-leg balance trials were performed on a force plate with eyes closed. Three single-leg landing trials were performed by jumping off two feet and landing on one foot in the middle of a force plate 1m from the starting position. Single-leg balance results demonstrated acceptable inter-trial reliability (ICC = 0.60-0.81, CV = 11-13%) for sway velocity, anterior-posterior sway velocity, and mediolateral sway velocity variables. Acceptable inter-test reliability (ICC = 0.61-0.89, CV = 7-13%) was evident for all variables except mediolateral sway velocity on the dominant leg (ICC = 0.41, CV = 15%). Single-leg landing results only demonstrated acceptable inter-trial reliability for force based measures of relative peak landing force and impulse (ICC = 0.54-0.72, CV = 9-15%). Inter-test results indicate improved reliability through the averaging of three trials with force based measures again demonstrating acceptable reliability (ICC = 0.58-0.71, CV = 7-14%). Of the variables investigated here, total sway velocity and relative landing impulse are the most reliable measures of single-leg balance and landing performance, respectively. These measures should be considered for monitoring potential changes in postural control in professional rugby union.
Reliability of Single-Leg Balance and Landing Tests in Rugby Union; Prospect of Using Postural Control to Monitor Fatigue

Directory of Open Access Journals (Sweden)

Jordan C. Troester, Jason G. Jasmin, Rob Duffield

2018-06-01

The present study examined the inter-trial (within test) and inter-test (between test) reliability of single-leg balance and single-leg landing measures performed on a force plate in professional rugby union players using commercially available software (SpartaMARS, Menlo Park, USA). Twenty-four players undertook test – re-test measures on two occasions (7 days apart) on the first training day of two respective pre-season weeks following 48h rest and similar weekly training loads. Two 20s single-leg balance trials were performed on a force plate with eyes closed. Three single-leg landing trials were performed by jumping off two feet and landing on one foot in the middle of a force plate 1m from the starting position. Single-leg balance results demonstrated acceptable inter-trial reliability (ICC = 0.60-0.81, CV = 11-13%) for sway velocity, anterior-posterior sway velocity, and mediolateral sway velocity variables. Acceptable inter-test reliability (ICC = 0.61-0.89, CV = 7-13%) was evident for all variables except mediolateral sway velocity on the dominant leg (ICC = 0.41, CV = 15%). Single-leg landing results only demonstrated acceptable inter-trial reliability for force based measures of relative peak landing force and impulse (ICC = 0.54-0.72, CV = 9-15%). Inter-test results indicate improved reliability through the averaging of three trials with force based measures again demonstrating acceptable reliability (ICC = 0.58-0.71, CV = 7-14%). Of the variables investigated here, total sway velocity and relative landing impulse are the most reliable measures of single-leg balance and landing performance, respectively. These measures should be considered for monitoring potential changes in postural control in professional rugby union.
Impact of image quality on reliability of the measurements of left ventricular systolic function and global longitudinal strain in 2D echocardiography.

Science.gov (United States)

Nagata, Yasufumi; Kado, Yuichiro; Onoue, Takeshi; Otani, Kyoko; Nakazono, Akemi; Otsuji, Yutaka; Takeuchi, Masaaki

2018-03-01

Left ventricular ejection fraction (LVEF) and global longitudinal strain (GLS) play important roles in diagnosis and management of cardiac diseases. However, the issue of the accuracy and reliability of LVEF and GLS remains to be solved. Image quality is one of the most important factors affecting measurement variability. The aim of this study was to investigate whether improved image quality could reduce observer variability. Two sets of three apical images were acquired using relatively old- and new-generation ultrasound imaging systems (Vivid 7 and Vivid E95) in 308 subjects. Image quality was assessed by endocardial border delineation index (EBDI) using a 3-point scoring system. Three observers measured the LVEF and GLS, and these values and inter-observer variability were investigated. Image quality was significantly better with Vivid E95 (EBDI: 26.8 ± 5.9) than that with Vivid 7 (22.8 ± 6.3, P ... image quality yielded benefits to both LVEF and GLS measurement reliability. Multivariate analysis showed that image quality was indeed an important factor of observer variability in the measurement of LVEF and GLS. The new-generation ultrasound imaging system offers improved image quality and reduces inter-observer variability in the measurement of LVEF and GLS. © 2018 The authors.
Reliability of Autism-Tics, AD/HD, and other Comorbidities (A-TAC) inventory in a test-retest design.

Science.gov (United States)

Larson, Tomas; Kerekes, Nóra; Selinus, Eva Norén; Lichtenstein, Paul; Gumpert, Clara Hellner; Anckarsäter, Henrik; Nilsson, Thomas; Lundström, Sebastian

2014-02-01

The Autism-Tics, AD/HD, and other Comorbidities (A-TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A-TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A-TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's kappa. A-TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A-TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's kappa indicated acceptable reliability. The current study provides statistical evidence that the A-TAC yields good test-retest reliability in a population-based cohort of children.
Inter and intraday reliability of a test of muscle power (Fidedignidade inter e intradias de um teste de potência muscular)

Directory of Open Access Journals (Sweden)

Roberto Simão

2001-08-01

Aging induces fast and relevant decrements in muscle power (MP), reducing autonomy and quality of life, which makes it useful to assess MP. The aim of the study was to determine the inter- and intraday reliability of a simple MP test performed with a previously individualized load. The authors evaluated 18 healthy young adults (12 women) unaccustomed to strengthening exercises. Initially, 1-RM was determined with simultaneous measurement of velocity and power (Fitrodyne, Bratislava) in an upright row exercise, performed up to mesosternal level in standing position, and the load at which the highest MP was achieved was also obtained. In the following week, on five consecutive days, they performed 2x2 repetitions on four days and 10x2 repetitions on one day (3 s interval between repetitions), as fast as possible in the concentric phase, with the MP load. Comparing the results by repeated-measures ANOVA and the Bonferroni test, maximal MP did not differ, with means between 262 and 267 W (p = 0.69). For the variability of individual data, values of 3% and 8% were found, respectively, for the coefficient of variation (CV) and for the mean variation of the results relative to the individuals' mean (M-m/X). In the ten consecutive sets the values ranged between 242 and 263 W, with differences identified only between sets 1 and 4 and 6 (p ...
RELIABILITY OF THE DYNAMIC OCCUPATIONAL THERAPY COGNITIVE ASSESSMENT FOR CHILDREN (DOTCA-CH): THAI VERSION OF ORIENTATION, SPATIAL PERCEPTION, AND THINKING OPERATIONS SUBTESTS

Directory of Open Access Journals (Sweden)

Suchitporn Lersilp

2014-06-01

The Dynamic Occupational Therapy Cognitive Assessment for Children (DOTCA-Ch) is a tool for identifying cognitive problems in school-aged children. However, the DOTCA-Ch was developed in English for Western children. For this reason, it is not appropriate for Thai children because of differences in culture and language. The objectives of this study were to translate the Orientation, Spatial Perception, and Thinking Operations subtests of the DOTCA-Ch into a Thai version with a World Health Organization back-translation process, and to examine its internal consistency, inter-rater reliability and test-retest reliability. The participants consisted of 38 intellectually impaired and learning disabled individuals between the ages of 6–12. Results from this study revealed high internal consistency in the Orientation subtest (α=.83), Spatial Perception subtest (α=.82) and Thinking Operations subtest (α=.82); high inter-rater reliability in the Orientation subtest (ICC=.83), Spatial Perception subtest (ICC=.84) and Thinking Operations subtest (ICC=.74); and high test-retest reliability in the Orientation subtest (ICC=.84), Spatial Perception subtest (ICC=.86) and Thinking Operations subtest (ICC=.85). These results indicate that the Thai version of the DOTCA-Ch in Orientation, Spatial Perception, and Thinking Operations subtests might be used as an appropriate assessment tool for Thai children, based on psychometric evidence including internal consistency, inter-rater reliability and test-retest reliability. However, additional study of other psychometric properties, including predictive validity, concurrent reliability, and inter-rater reliability during the mediation process of this assessment tool, needs to be carried out.
Reliability of Semiautomated Computational Methods for Estimating Tibiofemoral Contact Stress in the Multicenter Osteoarthritis Study

Directory of Open Access Journals (Sweden)

Donald D. Anderson

2012-01-01

Recent findings suggest that contact stress is a potent predictor of subsequent symptomatic osteoarthritis development in the knee. However, much larger numbers of knees (likely on the order of hundreds, if not thousands) need to be reliably analyzed to achieve the statistical power necessary to clarify this relationship. This study assessed the reliability of new semiautomated computational methods for estimating contact stress in knees from large population-based cohorts. Ten knees of subjects from the Multicenter Osteoarthritis Study were included. Bone surfaces were manually segmented from sequential 1.0 Tesla magnetic resonance imaging slices by three individuals on two nonconsecutive days. Four individuals then registered the resulting bone surfaces to corresponding bone edges on weight-bearing radiographs, using a semi-automated algorithm. Discrete element analysis methods were used to estimate contact stress distributions for each knee. Segmentation and registration reliabilities (day-to-day and interrater) for peak and mean medial and lateral tibiofemoral contact stress were assessed with Shrout-Fleiss intraclass correlation coefficients (ICCs). The segmentation and registration steps of the modeling approach were found to have excellent day-to-day (ICC 0.93–0.99) and good inter-rater reliability (0.84–0.97). This approach for estimating compartment-specific tibiofemoral contact stress appears to be sufficiently reliable for use in large population-based cohorts.
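Shrout-Fleiss ICCs such as those named in this abstract are computed from the mean squares of a two-way ANOVA on a subjects × raters table. The following is a minimal sketch of the single-measure forms ICC(2,1) and ICC(3,1). Which ICC form the authors used is not stated in the abstract, the function name is hypothetical, and the 6 × 4 table is illustrative data, not data from the study.

```python
def icc_two_way(ratings):
    """Return (ICC(2,1), ICC(3,1)) for a complete subjects x raters table."""
    n = len(ratings)      # number of subjects (rows)
    k = len(ratings[0])   # number of raters (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subjects mean square
    msc = ss_cols / (k - 1)                # between-raters mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square

    # ICC(2,1): raters treated as random, absolute agreement.
    icc21 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    # ICC(3,1): raters treated as fixed, consistency.
    icc31 = (msr - mse) / (msr + (k - 1) * mse)
    return icc21, icc31


# Illustrative table: 6 subjects rated by 4 raters (not study data).
table = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc21, icc31 = icc_two_way(table)
print(round(icc21, 2), round(icc31, 2))
```

The gap between the two values in this example comes from systematic differences between raters, which ICC(2,1) penalizes and ICC(3,1) ignores.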
Training less-experienced faculty improves reliability of skills assessment in cardiac surgery.

Science.gov (United States)

Lou, Xiaoying; Lee, Richard; Feins, Richard H; Enter, Daniel; Hicks, George L; Verrier, Edward D; Fann, James I

2014-12-01

Previous work has demonstrated high inter-rater reliability in the objective assessment of simulated anastomoses among experienced educators. We evaluated the inter-rater reliability of less-experienced educators and the impact of focused training with a video-embedded coronary anastomosis assessment tool. Nine less-experienced cardiothoracic surgery faculty members from different institutions evaluated 2 videos of simulated coronary anastomoses (1 by a medical student and 1 by a resident) at the Thoracic Surgery Directors Association Boot Camp. They then underwent a 30-minute training session using an assessment tool with embedded videos to anchor rating scores for 10 components of coronary artery anastomosis. Afterward, they evaluated 2 videos of a different student and resident performing the task. Components were scored on a 1 to 5 Likert scale, yielding an average composite score. Inter-rater reliabilities of component and composite scores were assessed using intraclass correlation coefficients (ICCs) and overall pass/fail ratings with kappa. All components of the assessment tool exhibited improvement in reliability, with 4 (bite, needle holder use, needle angles, and hand mechanics) improving the most from poor (ICC range, 0.09-0.48) to strong (ICC range, 0.80-0.90) agreement. After training, inter-rater reliabilities for composite scores improved from moderate (ICC, 0.76) to strong (ICC, 0.90) agreement, and for overall pass/fail ratings, from poor (kappa = 0.20) to moderate (kappa = 0.78) agreement. Focused, video-based anchor training facilitates greater inter-rater reliability in the objective assessment of simulated coronary anastomoses. Among raters with less teaching experience, such training may be needed before objective evaluation of technical skills. Published by Elsevier Inc.
Reliability of the International Spinal Cord Injury Musculoskeletal Basic Data Set

DEFF Research Database (Denmark)

Baunsgaard, C B; Chhabra, H S; Harvey, L A

2016-01-01

STUDY DESIGN: Psychometric study. OBJECTIVES: To determine the intra- and inter-rater reliability and content validity of the International Spinal Cord Injury (SCI) Musculoskeletal Basic Data Set (ISCIMSBDS). SETTING: Four centers with one in each of the countries in Australia, England, India and...

Validity and reliability of a low-cost digital dynamometer for measuring isometric strength of lower limb.

Science.gov (United States)

Romero-Franco, Natalia; Jiménez-Reyes, Pedro; Montaño-Munuera, Juan A

2017-11-01

Lower limb isometric strength is a key parameter to monitor the training process or recognise muscle weakness and injury risk. However, valid and reliable methods to evaluate it often require high-cost tools. The aim of this study was to analyse the concurrent validity and reliability of a low-cost digital dynamometer for measuring isometric strength in lower limb. Eleven physically active and healthy participants performed maximal isometric strength for: flexion and extension of ankle, flexion and extension of knee, flexion, extension, adduction, abduction, internal and external rotation of hip. Data obtained by the digital dynamometer were compared with the isokinetic dynamometer to examine its concurrent validity. Data obtained by the digital dynamometer from 2 different evaluators and 2 different sessions were compared to examine its inter-rater and intra-rater reliability. Intra-class correlation (ICC) for validity was excellent in every movement (ICC > 0.9). Intra and inter-tester reliability was excellent for all the movements assessed (ICC > 0.75). The low-cost digital dynamometer demonstrated strong concurrent validity and excellent intra and inter-tester reliability for assessing isometric strength in the main lower limb movements.
Fresh biological reference materials. Use in inter laboratory studies and as CRMs

International Nuclear Information System (INIS)

De Boer, J.

1999-01-01

Biological reference materials were prepared and packed in tins and glass jars to be used in inter laboratory studies on chlorobiphenyls and organochlorine pesticides, and trace metals, respectively. The materials were homogenised, sterilised and packed as wet tissue, which is unique for the purpose of inter laboratory studies and offers the advantage of studying the extraction and destruction steps of the analytical methods. In addition to their use in inter laboratory studies, some materials have been prepared or are being prepared as certified reference material for chlorobiphenyl analysis. (author)
The reliability of language performance measurement in language sample analysis of children aged 5-6 years

Directory of Open Access Journals (Sweden)

Zahra Soleymani

2014-04-01

Background and Aim: Language sample analysis (LSA) is more commonly used in languages other than Persian to study language development and assess language pathology. In this research we studied some psychometric properties of language sample analysis, such as content validity of the written story and its pictures, test-retest reliability, and inter-rater reliability. Methods: We wrote a story based on Persian culture from Schneider’s study. The validity of the written story and drawn pictures was approved by experts. To study test-retest reliability, 30 children looked at the pictures and told their own story twice with a 7-10 day interval. Children generated the story themselves and the tester did not give any cue about the story. Their audio-taped stories were transcribed and analyzed. Sentence and word structures were detected in the analysis. Results: Mean of experts' agreement with the validity of the written story was 92.28 percent. Experts scored the quality of pictures high and excellent. There was correlation between variables in sentence and word structure (p<0.05) in test-retest, except complex sentences (p=0.137). The agreement rate was 97.1 percent in inter-rater reliability assessment of transcription. The results of inter-rater reliability of language analysis showed that correlation coefficients were significant. Conclusion: The results confirmed that the tool was valid for eliciting a language sample. The consistency of language performance in repeated measurement varied from mild to high in the language sample analysis approach.
Reliability of injury grading systems for patients with blunt splenic trauma.

Science.gov (United States)

Olthof, D C; van der Vlies, C H; Scheerder, M J; de Haan, R J; Beenen, L F M; Goslings, J C; van Delden, O M

2014-01-01

The most widely used grading system for blunt splenic injury is the American Association for the Surgery of Trauma (AAST) organ injury scale. In 2007 a new grading system was developed. This 'Baltimore CT grading system' is superior to the AAST classification system in predicting the need for angiography and embolization or surgery. The objective of this study was to assess inter- and intraobserver reliability between radiologists in classifying splenic injury according to both grading systems. CT scans of 83 patients with blunt splenic injury admitted between 1998 and 2008 to an academic Level 1 trauma centre were retrospectively reviewed. Inter and intrarater reliability were expressed in Cohen's or weighted Kappa values. Overall weighted interobserver Kappa coefficients for the AAST and 'Baltimore CT grading system' were respectively substantial (kappa=0.80) and almost perfect (kappa=0.85). Average weighted intraobserver Kappa's values were in the 'almost perfect' range (AAST: kappa=0.91, 'Baltimore CT grading system': kappa=0.81). The present study shows that overall the inter- and intraobserver reliability for grading splenic injury according to the AAST grading system and 'Baltimore CT grading system' are equally high. Because of the integration of vascular injury, the 'Baltimore CT grading system' supports clinical decision making. We therefore recommend use of this system in the classification of splenic injury. Copyright © 2012 Elsevier Ltd. All rights reserved.
Validity and reliability of balance assessment software using the Nintendo Wii balance board: usability and validation.

Science.gov (United States)

Park, Dae-Sung; Lee, GyuChang

2014-06-10

A balance test provides important information such as the standard to judge an individual's functional recovery or make the prediction of falls. The development of a tool for a balance test that is inexpensive and widely available is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but there is little software used in balance tests, and there are few studies on reliability and validity. Thus, we developed a balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Twenty healthy adults participated in our study. The participants participated in the test for inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with balance assessment software using the Nintendo Wii balance board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed by an intraclass correlation coefficient (ICC) value and a standard error of measurement (SEM). The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. The balance assessment software incorporating the Nintendo Wii balance board was used in our study and was found to be a reliable assessment device. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for the balance assessment.
Reliability of an Automated High-Resolution Manometry Analysis Program across Expert Users, Novice Users, and Speech-Language Pathologists

Science.gov (United States)

Jones, Corinne A.; Hoffman, Matthew R.; Geng, Zhixian; Abdelhalim, Suzan M.; Jiang, Jack J.; McCulloch, Timothy M.

2014-01-01

Purpose: The purpose of this study was to investigate inter- and intrarater reliability among expert users, novice users, and speech-language pathologists with a semiautomated high-resolution manometry analysis program. We hypothesized that all users would have high intrarater reliability and high interrater reliability. Method: Three expert…
Values of a Patient and Observer Scar Assessment Scale to Evaluate the Facial Skin Graft Scar.

Science.gov (United States)

Chae, Jin Kyung; Kim, Jeong Hee; Kim, Eun Jung; Park, Kun

2016-10-01

The patient and observer scar assessment scale (POSAS) recently emerged as a promising method, reflecting both observer's and patient's opinions in evaluating scars. This tool was shown to be consistent and reliable in burn scar assessment, but it has not been tested in the setting of skin graft scars in skin cancer patients. The aim was to evaluate facial skin graft scars with the POSAS and to compare it with objective scar assessment tools. Twenty-three patients who were diagnosed with facial cutaneous malignancy and received skin grafts after Mohs micrographic surgery were recruited. Observer assessment was performed by three independent raters using the observer component of the POSAS and the Vancouver scar scale (VSS). Patient self-assessment was performed using the patient component of the POSAS. To quantify scar color and scar thickness more objectively, spectrophotometry and ultrasonography were applied. Inter-observer reliability was substantial with both VSS and the observer component of the POSAS (average measure intraclass correlation coefficient, 0.76 and 0.80, respectively). The observer component consistently showed significant correlations with patients' ratings for the parameters of the POSAS (all p-values ... skin graft scar assessment in skin cancer patients, the POSAS showed acceptable inter-observer reliability. This tool was more comprehensive and had higher correlation with patient's opinion.
Reliability of Patient-Led Screening with the Malnutrition Screening Tool: Agreement between Patient and Health Care Professional Scores in the Cancer Care Ambulatory Setting.

Science.gov (United States)

Di Bella, Alexandra; Blake, Claire; Young, Adrienne; Pelecanos, Anita; Brown, Teresa

2018-02-01

The prevalence of malnutrition in patients with cancer is reported as high as 60% to 80%, and malnutrition is associated with lower survival, reduced response to treatment, and poorer functional status. The Malnutrition Screening Tool (MST) is a validated tool when administered by health care professionals; however, it has not been evaluated for patient-led screening. This study aims to assess the reliability of patient-led MST screening through assessment of inter-rater reliability between patient-led and dietitian-researcher-led screening and intra-rater reliability between an initial and a repeat patient screening. This cross-sectional study included 208 adults attending ambulatory cancer care services in a metropolitan teaching hospital in Queensland, Australia, in October 2016 (n=160 inter-rater reliability; n=48 intra-rater reliability measured in a separate sample). Primary outcome measures were MST risk categories (MST 0-1: not at risk, MST ≥2: at risk) as determined by screening completed by patients and a dietitian-researcher, patient test-retest screening, and patient acceptability. Percent and chance-corrected agreement (Cohen's kappa coefficient, κ) were used to determine agreement between patient-MST and dietitian-MST (inter-rater reliability) and MST completed by patient on admission to unit (patient-MSTA) and MST completed by patient 1 to 3 hours after completion of initial MST (patient-MSTB) (intra-rater reliability). High inter-rater reliability and intra-rater reliability were observed. Agreement between patient-MST and dietitian-MST was 96%, with "almost perfect" chance-adjusted agreement (κ=0.92, 95% CI 0.84 to 0.97). Agreement between repeated patient-MSTA and patient-MSTB was 94%, with "almost perfect" chance-adjusted agreement (κ=0.88, 95% CI 0.71 to 1.00). Based on dietitian-MST, 33% (n=53) of patients were identified as being at risk for malnutrition, and 40% of these reported not seeing a dietitian. Of 156 patients who provided
Inter-rater agreement of the PEWS tools used in Central Denmark Region

DEFF Research Database (Denmark)

Jensen, Claus Sixtus; Aagaard, Hanne; Olesen, Hanne Vebert

2017-01-01

BACKGROUND: Paediatric early warning score (PEWS) assessment tools can assist healthcare providers in the timely detection and recognition of subtle patient condition changes signalling clinical deterioration. However, PEWS tools instrument data are only as reliable and accurate as the caregivers...... agreement. The nurses assigned the exact same aggregated score for both PEWS models in 76% of the cases. In 98% of the PEWS assessments, the aggregated PEWS scores assigned by the nurses were equal to or below 1 point in both models. CONCLUSION: The study showed good to very good inter-rater reliability...
The reliability and validity of the Saliba Postural Classification System.

Science.gov (United States)

Collins, Cristiana Kahl; Johnson, Vicky Saliba; Godwin, Ellen M; Pappas, Evangelos

2016-07-01

To determine the reliability and validity of the Saliba Postural Classification System (SPCS). Two physical therapists classified pictures of 100 volunteer participants standing in their habitual posture for inter- and intra-tester reliability. For validity, 54 participants stood on a force plate in a habitual and a corrected posture, while a vertical force was applied through the shoulders until the clinician felt a postural give. Data were extracted at the time the give was felt and at a time in the corrected posture that matched the peak vertical ground reaction force (VGRF) in the habitual posture. Inter-tester reliability demonstrated 75% agreement with a Kappa = 0.64 (95% CI = 0.524-0.756, SE = 0.059). Intra-tester reliability demonstrated 87% agreement with a Kappa = 0.8 (95% CI = 0.702-0.898, SE = 0.05) and 80% agreement with a Kappa = 0.706 (95% CI = 0.594-0.818, SE = 0.057). The examiner applied a significantly higher (p < 0.001) peak vertical force in the corrected posture prior to a postural give when compared to the habitual posture. Within the corrected posture, the %VGRF was higher when the test was ongoing vs. when a postural give was felt (p < 0.001). The %VGRF was not different between the two postures when comparing the peaks (p = 0.214). The SPCS has substantial agreement for inter- and intra-tester reliability and is largely a valid postural classification system as determined by the larger vertical forces in the corrected postures. Further studies on the correlation between the SPCS and diagnostic classifications are indicated.
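The confidence intervals quoted in this abstract are consistent with the usual normal approximation, kappa ± 1.96 × SE. The following minimal sketch recomputes them from the reported kappa and SE values. That the authors used this approximation is an assumption, and the function name is illustrative only.

```python
def kappa_confidence_interval(kappa, se, z=1.96):
    """Normal-approximation interval: kappa +/- z * SE (assumed method)."""
    half_width = z * se
    return kappa - half_width, kappa + half_width


# (kappa, SE) pairs as reported in the abstract above
reported = [(0.64, 0.059), (0.80, 0.05), (0.706, 0.057)]

for kappa, se in reported:
    low, high = kappa_confidence_interval(kappa, se)
    print(f"kappa = {kappa:.3f}, SE = {se:.3f}, 95% CI = {low:.3f}-{high:.3f}")
```

Rounded to three decimals, the three intervals come out as 0.524-0.756, 0.702-0.898 and 0.594-0.818.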
The quadrant method measuring four points is as reliable and accurate as the quadrant method in the evaluation after anatomical double-bundle ACL reconstruction.

Science.gov (United States)

Mochizuki, Yuta; Kaneko, Takao; Kawahara, Keisuke; Toyoda, Shinya; Kono, Norihiko; Hada, Masaru; Ikegami, Hiroyasu; Musha, Yoshiro

2017-11-20

The quadrant method was described by Bernard et al. and it has been widely used for postoperative evaluation of anterior cruciate ligament (ACL) reconstruction. The purpose of this research is to further develop the quadrant method measuring four points, which we named four-point quadrant method, and to compare with the quadrant method. Three-dimensional computed tomography (3D-CT) analyses were performed in 25 patients who underwent double-bundle ACL reconstruction using the outside-in technique. The four points in this study's quadrant method were defined as point1-highest, point2-deepest, point3-lowest, and point4-shallowest, in femoral tunnel position. Value of depth and height in each point was measured. Antero-medial (AM) tunnel is (depth1, height2) and postero-lateral (PL) tunnel is (depth3, height4) in this four-point quadrant method. The 3D-CT images were evaluated independently by 2 orthopaedic surgeons. A second measurement was performed by both observers after a 4-week interval. Intra- and inter-observer reliability was calculated by means of intra-class correlation coefficient (ICC). Also, the accuracy of the method was evaluated against the quadrant method. Intra-observer reliability was almost perfect for both AM and PL tunnel (ICC > 0.81). Inter-observer reliability of AM tunnel was substantial (ICC > 0.61) and that of PL tunnel was almost perfect (ICC > 0.81). The AM tunnel position was 0.13% deep, 0.58% high and PL tunnel position was 0.01% shallow, 0.13% low compared to quadrant method. The four-point quadrant method was found to have high intra- and inter-observer reliability and accuracy. This method can evaluate the tunnel position regardless of the shape and morphology of the bone tunnel aperture for use of comparison and can provide measurement that can be compared with various reconstruction methods. The four-point quadrant method of this study is considered to have clinical relevance in that it is a detailed and accurate tool for
Study on the reliability of large coal-fired and nuclear power plants. Factors affecting power plant reliability. Volume I. Final report

International Nuclear Information System (INIS)

1975-01-01

The study consisted of a comparative evaluation of 2 nuclear units (Indian Point 2 - Consolidated Edison of New York, Turkey Point 4 - Florida Power and Light Company) and 2 coal-fired units (Bull Run and Widows Creek Unit 8 - Tennessee Valley Authority). The purpose of the study was to identify and assess the underlying causes of unit reliability and the causes of the observed differences in reliability performance of the units. Recommended actions for improving the reliability of one of the study units were to be presented in a format useful to other utility companies for improving reliability of their generating units. The emphasis of the study was on the aspects of management, manning, operations, and maintenance which had a significant impact on unit reliability. Volume 1 includes a summary, a description of the major findings from the comparative evaluation, conclusions based on these findings, and recommendations for improving the reliability of the below-average units.
Development of a valid and reliable test to assess trauma radiograph interpretation performance

International Nuclear Information System (INIS)

Neep, M.J.; Steffens, T.; Riley, V.; Eastgate, P.; McPhail, S.M.

2017-01-01

Objectives: The purpose of this investigation was to develop and examine the preliminary validity and reliability among radiographers of a test to assess trauma radiograph interpretation performance suitable for use among health professionals. Methods: Stage 1 examined 14,159 consecutive appendicular and axial examinations from a hospital emergency department over a 12 month period to quantify a typical anatomical region case-mix of trauma radiographs. A sample of radiographic cases representative of affected anatomical regions was then developed into the Image Interpretation Test (IIT). Stage 2 involved prospective investigations of the IIT's reliability (inter-rater, intra-rater, internal consistency) and validity (concurrent) among 41 radiographers. Results: The IIT included 60 cases. The median (interquartile range) clinical experience of participants was 5 (2–10) years. Case scores were internally consistent (Cronbach's alpha = 0.90). Favourable inter-rater reliability (kappa > 0.70 for 58/60 cases, Intra-class correlation coefficient (ICC) > 0.99 for total score) and intra-rater reliability (kappa > 0.90 for 60/60 cases, ICC > 0.99 for total score) was observed. There was a positive association between radiographers' confidence in image interpretation and IIT score (coefficient = 1.52, r-squared = 0.60, p < 0.001). Conclusions: The IIT developed during this investigation included a selection of radiographic cases consistent with anatomical regions represented in an adult trauma case-mix. This study has also provided foundational preliminary evidence to support the reliability and validity of the IIT among radiographers. The findings suggest that it is possible to assess image interpretation performance of adult trauma radiographs with this test. - Highlights: • Development of an Image Interpretation Test (IIT). • Cases consistent with anatomical regions represented in a typical adult trauma case-mix. • Development of a
Overview of the InterGroup protocols

Energy Technology Data Exchange (ETDEWEB)

Berket, Karlo [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Agarwal, Deborah A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Melliar-Smith, P. Michael [Univ. of California, Santa Barbara, CA (United States); Moser, Louise E. [Univ. of California, Santa Barbara, CA (United States)

2001-03-01

Existing reliable ordered group communication protocols have been developed for local-area networks and do not, in general, scale well to large numbers of nodes and wide-area networks. The InterGroup suite of protocols is a scalable group communication system that introduces a novel approach to handling group membership, and supports a receiver-oriented selection of service. The protocols are intended for a wide-area network, with a large number of nodes, that has highly variable delays and a high message loss rate, such as the Internet. The levels of the message delivery service range from unreliable unordered to reliable group timestamp ordered.
Intra- and inter-rater reliability of movement and palpation tests in patients with neck pain: A systematic review.

Science.gov (United States)

Jonsson, Anders; Rasmussen-Barr, Eva

2018-03-01

Neck pain is common and often becomes chronic. Various clinical tests of the cervical spine are used to direct and evaluate treatment. This systematic review aimed to identify studies examining the intra- and/or interrater reliability of tests used in clinical examination of patients with neck pain. A database search up to April 2016 was conducted in PubMed, CINAHL, and AMED. The Quality Appraisal of Reliability Studies Checklist (QAREL) was used to assess risk of bias. Eleven studies were included, comprising tests of active and passive movement and pain evaluating participants with ongoing neck pain. One study was assessed with a low risk of bias, three with medium risk, while the rest were assessed with high risk of bias. The results showed differing reliabilities for the included tests ranging from poor to almost perfect. In conclusion, active movement and pain for pain or mobility overall presented acceptable to very good reliability (Kappa >0.40); while passive intervertebral tests had lower Kappa values, suggesting poor reliability. It may be a coincidence that the studies indicating very good reliability tended to be of higher quality (low to moderate risk of bias), while studies finding poor reliability tended to be of lower quality (high risk of bias). Regardless, the current recommendation from this review would suggest the clinical use of tests with acceptable reliability and avoiding the use of tests that have been shown to not be reliable. Finally, it is critical that all future reliability studies are of higher quality with low risk of bias.
A Structured Clinical Interview for Kleptomania (SCI-K): preliminary validity and reliability testing.

Science.gov (United States)

Grant, Jon E; Kim, Suck Won; McCabe, James S

2006-06-01

Kleptomania presents difficulties in diagnosis for clinicians. This study aimed to develop and test a DSM-IV-based diagnostic instrument for kleptomania. To assess for current kleptomania the Structured Clinical Interview for Kleptomania (SCI-K) was administered to 112 consecutive subjects requesting psychiatric outpatient treatment for a variety of disorders. Reliability and validity were determined. Classification accuracy was examined using the longitudinal course of illness. The SCI-K demonstrated excellent test-retest (Phi coefficient = 0.956 (95% CI = 0.937, 0.970)) and inter-rater reliability (phi coefficient = 0.718 (95% CI = 0.506, 0.848)) in the diagnosis of kleptomania. Concurrent validity was observed with a self-report measure using DSM-IV kleptomania criteria (phi coefficient = 0.769 (95% CI = 0.653, 0.850)). Discriminant validity was observed with a measure of depression (point biserial coefficient = -0.020 (95% CI = -0.205, 0.166)). The SCI-K demonstrated both high sensitivity and specificity based on longitudinal assessment. The SCI-K demonstrated excellent reliability and validity in diagnosing kleptomania in subjects presenting with various psychiatric problems. These findings require replication in larger groups, including non-psychiatric populations, to examine their generalizability. Copyright (c) 2006 John Wiley & Sons, Ltd.
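For readers unfamiliar with the phi coefficient used above, it is the Pearson correlation between two binary variables and can be computed directly from a 2x2 table. The sketch below uses hypothetical counts. They are not taken from the SCI-K study, and the function name is illustrative.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table:
           rater 2 yes | rater 2 no
    yes         a      |     b
    no          c      |     d
    """
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts for two raters diagnosing the same 112 subjects
print(round(phi_coefficient(a=20, b=2, c=3, d=87), 3))
```

Phi equals 1 when the two ratings agree perfectly and 0 when they are statistically independent.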
Functional leg length discrepancy between theories and reliable instrumental assessment: a study about newly invented NPoS system.

Science.gov (United States)

Mahmoud, Asmaa; Abundo, Paolo; Basile, Luisanna; Albensi, Caterina; Marasco, Morena; Bellizzi, Letizia; Galasso, Franco; Foti, Calogero

2017-01-01

In spite of the social and financial impact of Leg Length Discrepancy (LLD), controversial and conflicting results still exist regarding a reliable assessment/correction method. For proper management it is essential to discriminate between anatomical and functional Leg Length Discrepancy (FLLD). With the newly invented NPoS (New Postural Solution), developed through the collaboration of the PRM Department, Tor Vergata University, with Baro Postural Instruments srl, positive results were observed in both measuring and compensating the hemi-pelvic antero-medial rotation in FLLD through personalized bilateral heel raise using two NPoS components: Foot Image System (FIS) and Postural Optimizer System (POS). This led our research interest to test the validity of NPoS as a preliminary step before evaluating its implementations in postural disorders. After clinical evaluation, 4 subjects with FLLD were assessed by NPoS. Over a period of 2 months, every subject was evaluated 12 times by two different operators (48 measurements in total), and the results were verified against BTS GaitLab results. Intra-operator and inter-operator variability analysis showed no statistically significant differences, while inter-method variability between NPoS and BTS parameters expressed a linear correlation. Results suggest a significant validity of NPoS in assessment and correction of FLLD, with a high degree of reproducibility and minimal operator dependency. This can be considered a base for promising clinical implications of NPoS as a reliable, cost-effective postural assessment/corrective tool.
The reliability of the Adelaide in-shoe foot model.

Science.gov (United States)

Bishop, Chris; Hillier, Susan; Thewlis, Dominic

2017-07-01

Understanding the biomechanics of the foot is essential for many areas of research and clinical practice such as orthotic interventions and footwear development. Despite the widespread attention paid to the biomechanics of the foot during gait, what largely remains unknown is how the foot moves inside the shoe. This study investigated the reliability of the Adelaide In-Shoe Foot Model, which was designed to quantify in-shoe foot kinematics and kinetics during walking. Intra-rater reliability was assessed in 30 participants over five walking trials whilst wearing shoes during two data collection sessions, separated by one week. Sufficient reliability for use was interpreted as a coefficient of multiple correlation and intra-class correlation coefficient of >0.61. Inter-rater reliability was investigated separately in a second sample of 10 adults by two researchers with experience in applying markers for the purpose of motion analysis. The results indicated good consistency in waveform estimation for most kinematic and kinetic data, as well as good inter- and intra-rater reliability. The exceptions were the peak medial ground reaction force, the minimum abduction angle and the peak abduction/adduction external hindfoot joint moments, which resulted in less than acceptable repeatability. Based on our results, the Adelaide In-Shoe Foot Model can be used with confidence for 24 commonly measured biomechanical variables during shod walking. Copyright © 2017 Elsevier B.V. All rights reserved.
Once is not enough : Establishing reliability criteria for teacher evaluation based on classroom observations

NARCIS (Netherlands)

van der Lans, Rikkert; van de Grift, Wim; van Veen, Klaas

2016-01-01

Classroom observation is the most implemented method to evaluate teaching. To ensure reliability, researchers often train observers extensively. However, schools have limited resources to train observers and often lesson observation is performed by limitedly trained or untrained colleagues. In this
Inter-observer variation in estimates by nuclear angiography of left ventricular ejection fraction and ejection rate

International Nuclear Information System (INIS)

Young, K.C.; Railton, R.

1980-01-01

The recent decline in the cost of computing has led to the introduction of data processing of gamma-camera images in many medical centres, allowing the development and widespread use of radionuclide techniques for assessing left ventricular performance. Methods such as ECG-gated blood-pool imaging have the advantage of being less invasive than contrast ventriculography and do not rely on geometrical assumptions about the shape of the ventricle. A study has been made of the inter-observer variation in estimates of ejection fraction and average and maximum systolic contraction rates using a micro-computer (VIP-450 Video Image Processor, Ohio-Nuclear Limited, Rugby) to analyse gated blood-pool images of the left ventricle. (author)

Radiographic measurement reliability of lumbar lordosis in ankylosing spondylitis.

Science.gov (United States)

Lee, Jung Sub; Goh, Tae Sik; Park, Shi Hwan; Lee, Hong Seok; Suh, Kuen Tak

2013-04-01

Intraobserver and interobserver reliabilities of the several different methods to measure lumbar lordosis have been reported. However, it has not been studied so far in patients with ankylosing spondylitis (AS). We evaluated the inter- and intraobserver reliabilities of six specific measures of global lumbar lordosis in patients with AS. Ninety-one consecutive patients with AS who met the most recently modified New York criteria were enrolled and underwent anteroposterior and lateral radiographs of whole spine. The radiographs were divided into non-ankylosis (no bony bridge in the lumbar spine), incomplete ankylosis (lumbar spines were partially connected by bony bridge) and complete ankylosis groups to evaluate the reliability of the Cobb L1-S1, Cobb L1-L5, centroid, posterior tangent L1-S1, posterior tangent L1-L5, and TRALL methods. The radiographs comprised 39 non-ankylosis, 27 incomplete ankylosis and 25 complete ankylosis cases. Intra- and inter-class correlation coefficients (ICCs) of all six methods were generally high. The ICCs were all ≥0.77 (excellent) for the six radiographic methods in the combined group. However, a comparison of the ICCs, 95% confidence intervals and mean absolute difference (MAD) between groups with varying degrees of ankylosis showed that the reliability of the lordosis measurements decreased in proportion to the severity of ankylosis. The Cobb L1-S1, Cobb L1-L5 and posterior tangent L1-S1 methods demonstrated higher ICCs for both inter- and intraobserver comparisons, and the other methods showed lower ICCs in all groups. The intraobserver MAD was similar in the Cobb L1-S1 and Cobb L1-L5 (2.7°-4.3°), but the other methods showed higher intraobserver MAD. Only the Cobb L1-L5 method showed a low interobserver MAD in all groups. These results are the first to provide a reliability analysis of different global lumbar lordosis measurement methods in AS. The findings in this study demonstrated that the Cobb L1-L5 method is reliable for measuring
OMERACT Rheumatoid Arthritis Magnetic Resonance Imaging Studies. Exercise 5: an international multicenter reliability study using computerized MRI erosion volume measurements

DEFF Research Database (Denmark)

Bird, P; Ejbjerg, B; McQueen, F

2003-01-01

with metacarpophalangeal (MCP) joints 2 to 5 of the dominant hand included in the field of view. Three readers were instructed to grade MCP 2 and 3 using the OMERACT grading system and then to measure the erosion volume of the same joints using OSIRIS software. The inter-reader reliability of the grading method...
Reliability of length measurements collected by community nurses and health volunteers in rural growth monitoring and promotion services.

Science.gov (United States)

Laar, Matilda E; Marquis, Grace S; Lartey, Anna; Gray-Donald, Katherine

2018-02-17

Length measurements are important in growth monitoring and promotion (GMP) for the surveillance of a child's weight-for-length and length-for-age. These two indices provide an indication of a child's risk of becoming wasted or stunted, and are more informative about a child's growth than the widely used weight-for-age index (underweight). Although the introduction of length measurements in GMP is recommended by the World Health Organization, concerns about the reliability of length measurements collected in rural outreach settings have been expressed by stakeholders. Our aim was to describe the reliability and challenges associated with community health personnel measuring length for rural outreach GMP activities. Two reliability studies (A and B), using 10 children less than 24 months each, were conducted in the GMP services of a rural district in Ghana. Fifteen nurses and 15 health volunteers (HV) with no prior experience in length measurements were trained. Intra- and inter-observer technical error of measurement (TEM), average bias from expert anthropometrist, and coefficient of reliability (R) of length measurements were assessed and compared across sessions. Observations and interviews were used to understand the ability and experiences of health personnel with measuring length at outreach GMP. Inter-observer TEM was larger than intra-observer TEM for both nurses and HV at both sessions and was unacceptably high (compared to error standards) in both groups at both time points. Average biases from the expert's measurements were within acceptable limits; however, both groups tended to underestimate length measurements. The R for lengths collected by nurses (92.3%) was higher at session B compared to that of HV (87.5%). Length measurements taken by nurses and HV, and those taken by an experienced anthropometrist at GMP sessions were of moderate agreement (kappa = 0.53, p reliability of length measurements improved after two refresher trainings for nurses but
Observational constraints on the inter-binary stellar flare hypothesis for the gamma-ray bursts

Science.gov (United States)

Rao, A. R.; Vahia, M. N.

1994-01-01

The Gamma Ray Observatory/Burst and Transient Source Experiment (GRO/BATSE) results on the Gamma Ray Bursts (GRBs) have given an internally consistent set of observations of about 260 GRBs which have been released for analysis by the BATSE team. Using this database we investigate our earlier suggestion (Vahia and Rao, 1988) that GRBs are inter-binary stellar flares from a group of objects classified as Magnetically Active Stellar Systems (MASS) which includes flare stars, RS CVn binaries and cataclysmic variables. We show that there exists an observationally consistent parameter space for the number density, scale height and flare luminosity of MASS which explains the complete log(N) - log(P) distribution of GRBs as also the observed isotropic distribution. We further use this model to predict anisotropy in the GRB distribution at intermediate luminosities. We make definite predictions under the stellar flare hypothesis that can be tested in the near future.
Reliability of photogrammetry in the evaluation of the postural aspects of individuals with structural scoliosis.

Science.gov (United States)

Saad, Karen Ruggeri; Colombo, Alexandra Siqueira; Ribeiro, Ana Paula; João, Sílvia Maria Amado

2012-04-01

The purpose of this study was to investigate the reliability of photogrammetry in the measurement of the postural deviations in individuals with idiopathic scoliosis. Twenty participants with scoliosis (17 women and three men), with a mean age of 23.1 ± 9 yrs, were photographed from the posterior and lateral views. The postural aspects were measured with CorelDRAW software. High inter-rater and test-retest reliability indices were found. It was observed that the more severe the scoliosis, the greater the variation between the thoracic kyphosis and lumbar lordosis measures obtained by the same examiner from the left lateral view photographs. A greater body mass index (BMI) was associated with greater variability of the trunk rotation measures obtained by two independent examiners from the right lateral view (r = 0.656; p = 0.002). The severity of scoliosis was also associated with greater inter-rater variability in the measures of trunk rotation obtained from the left lateral view (r = 0.483; p = 0.036). Photogrammetry proved to be a reliable method for the measurement of postural deviations from the posterior and lateral views of individuals with idiopathic scoliosis and could be employed as a complement to assessment procedures, which could reduce the number of X-rays used in follow-up assessments of these individuals. Copyright © 2011 Elsevier Ltd. All rights reserved.
An observation tool for studying patient-oriented workflow in hospital emergency departments.

Science.gov (United States)

Ozkaynak, M; Brennan, P

2013-01-01

Studying workflow is a critical step in designing, implementing and evaluating informatics interventions in complex sociotechnical settings, such as hospital emergency departments (EDs). Known approaches to studying workflow in clinical settings attend to the activities of individual clinicians, thus being inadequate to characterize patient care as a cooperative work. The purpose of this paper is twofold. First, we introduce a novel, theory-driven patient-oriented workflow methodology, which better addresses the complex, multiple-provider nature of patient care. Second, we report the development of an observational tool and protocol for use in studies of this type, and the results of an evaluation study. We created a tablet computer implementation of an instrument to efficiently capture patient-oriented workflow, and evaluated it through a field study in three EDs. We focused on activities occurring over time during a single patient care episode as well as the roles of the ED staff members who conducted the activities. The evidence generated supports the validity, viability, and reliability of the tool. The coverage of the tool in terms of activities and roles was satisfactory. The tool was able to capture the sequence of activity-role pairs for 108 patient care episodes. The inter-rater reliability assessment yielded a high kappa value (0.79). The patient-oriented workflow methodology has the potential to facilitate modeling patient care in EDs by characterizing both roles and activities in sequence. The methodology also provides researchers and practitioners a more realistic and comprehensive workflow perspective that can inform the design, implementation and evaluation of health information technology interventions.
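The kappa value reported above is a chance-corrected agreement statistic. A minimal sketch of Cohen's kappa for two raters assigning categorical codes to the same items is given below. The activity codes are hypothetical and are not data from the study, and the abstract does not say which kappa variant was used.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical activity codes recorded by two observers for six events
observer_1 = ["triage", "exam", "exam", "lab", "discharge", "exam"]
observer_2 = ["triage", "exam", "lab", "lab", "discharge", "exam"]
print(round(cohens_kappa(observer_1, observer_2), 3))
```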
Inter-organizational network studies - a literature review

DEFF Research Database (Denmark)

Bergenholtz, Carsten; Waldstrøm, Christian

literature review of the last 12 years' research on inter-organizational networks, with a focus on the methodological aspects. The finding of this paper is that few of the previous studies have used the full methodological (and thus theoretical) scope of the available data and that the most cited papers...
Reliable and valid assessment of competence in endoscopic ultrasonography and fine-needle aspiration for mediastinal staging of non-small cell lung cancer.

Science.gov (United States)

Konge, L; Vilmann, P; Clementsen, P; Annema, J T; Ringsted, C

2012-10-01

Fine-needle aspiration (FNA) guided by endoscopic ultrasonography (EUS) is important in mediastinal staging of non-small cell lung cancer (NSCLC). Training standards and implementation strategies of this technique are currently under discussion. The aim of this study was to explore the reliability and validity of a newly developed EUS Assessment Tool (EUSAT) designed to measure competence in EUS - FNA for mediastinal staging of NSCLC. A total of 30 patients with proven or suspected NSCLC underwent EUS - FNA for mediastinal staging by three trainees and three experienced physicians. Their performances were assessed prospectively by three experts in EUS under direct observation and again 2 months later in a blinded fashion using digital video-recordings. Based on the assessments, intra-rater reliability, inter-rater reliability, and construct validity were explored. The intra-rater reliability was good (Cronbach's α = 0.80), but comparison of results based on direct observations and blinded video-recordings indicated a significant bias favoring consultants (P = 0.022). Inter-rater reliability was very good (Cronbach's α = 0.93). However, one rater assessing five procedures or two raters each assessing four procedures were necessary to secure a generalizability coefficient of 0.80. The assessment tool demonstrated construct validity by discriminating between trainees and experienced physicians (P = 0.034). Competency in mediastinal staging of NSCLC using EUS and EUS - FNA can be assessed in a reliable and valid way using the EUSAT assessment tool. Measuring and defining competency and training requirements could improve EUS quality and benefit patient care. © Georg Thieme Verlag KG Stuttgart · New York.
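Cronbach's α, used above as the reliability index across raters, can be computed from a score matrix with one row per assessed procedure and one column per rater. The sketch below uses hypothetical scores rather than EUSAT data, and assumes the standard variance-based formula with raters treated as "items".

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: list of rows, one per procedure; each row has one score per rater."""
    n_rows, k = len(scores), len(scores[0])
    if n_rows < 2 or k < 2:
        raise ValueError("need at least two procedures and two raters")
    # Sample variance of each rater's column of scores
    column_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the per-procedure total across raters
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all totals are identical")
    return k / (k - 1) * (1 - sum(column_variances) / total_variance)


# Hypothetical scores from three raters for five procedures
ratings = [
    [30, 32, 31],
    [45, 44, 47],
    [38, 40, 37],
    [50, 49, 52],
    [28, 30, 29],
]
print(round(cronbach_alpha(ratings), 3))
```

α approaches 1 as the raters' scores rise and fall together across procedures.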
Inter-decadal change of the lagged inter-annual relationship between local sea surface temperature and tropical cyclone activity over the western North Pacific

Science.gov (United States)

Zhao, Haikun; Wu, Liguang; Raga, G. B.

2018-02-01

This study documents the inter-decadal change of the lagged inter-annual relationship between the TC frequency (TCF) and the local sea surface temperature (SST) in the western North Pacific (WNP) during 1979-2014. An abrupt shift of the lagged relationship between them is observed to occur in 1998. Before the shift (1979-1997), a moderately positive correlation (0.35) between previous-year local SST and TCF is found, while a significantly negative correlation (-0.71) is found since the shift (1998-2014). The inter-decadal change of the lagged relationship between TCF and local SST over the WNP is also accompanied by an inter-decadal change in the lagged inter-annual relationship between large-scale factors affecting TCs and local SST over the WNP. During 1998-2014, the previous-year local SST shows a significant negative correlation with the mid-level moisture and a significant positive correlation with the vertical wind shear over the main development region of WNP TC genesis. Almost opposite relationships are seen during 1979-1997, with a smaller magnitude of the correlation coefficients. These changes are consistent with the changes of the lagged inter-annual relationship between upper- and lower-level winds and local SST over the WNP. Analyses further suggest that the inter-decadal shift of the lagged inter-annual relationship between WNP TCF and local SST may be closely linked to the inter-decadal change of inter-annual SST transition over the tropical central-eastern Pacific associated with the climate regime shift in the late 1990s. Details on the underlying physical process need further investigation using observations and simulations.
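The lagged correlation described above pairs the SST of year t-1 with the TC frequency of year t and is evaluated separately for each period. A minimal sketch of that calculation follows. The series are synthetic placeholders, not the study's data, and the function names are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def lagged_correlation(years, sst, tcf, first_year, last_year, lag=1):
    """Correlate SST in year t - lag with TCF in year t, for t in the period."""
    sst_by_year = dict(zip(years, sst))
    tcf_by_year = dict(zip(years, tcf))
    pairs = [
        (sst_by_year[t - lag], tcf_by_year[t])
        for t in range(first_year, last_year + 1)
        if (t - lag) in sst_by_year and t in tcf_by_year
    ]
    previous_sst, current_tcf = zip(*pairs)
    return pearson(previous_sst, current_tcf)


# Synthetic placeholder series (NOT observations): SST anomaly and TC counts
years = list(range(2000, 2010))
sst = [0.1, -0.2, 0.3, 0.0, -0.1, 0.4, 0.2, -0.3, 0.1, 0.0]
tcf = [25, 24, 27, 22, 26, 26, 21, 23, 28, 24]

print(round(lagged_correlation(years, sst, tcf, 2001, 2004), 2))
print(round(lagged_correlation(years, sst, tcf, 2005, 2009), 2))
```

Splitting the record at a candidate shift year and comparing the two coefficients mirrors the before/after comparison in the abstract.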
[Reliability and Validity of the Behavioral Check List for Preschool Children to Measure Attention Deficit Hyperactivity Behaviors].

Science.gov (United States)

Tsuno, Kanami; Yoshimasu, Kouichi; Hayashi, Takashi; Tatsuta, Nozomi; Ito, Yuki; Kamijima, Michihiro; Nakai, Kunihiko

2018-01-01

Nowadays, attention deficit hyperactivity (ADH) problems are observed commonly among school-age children. However, very few questionnaires are specific to ADH behaviors among preschool children. The aim of this study was to investigate the reliability and validity of the 25-item Behavioral Check List (BCL), which was developed from interviews of parents with children who were diagnosed as having attention-deficit/hyperactivity disorder (ADHD) and measures ADH behaviors in preschool age. We recruited 22 teachers from 10 nurseries/kindergartens in Miyagi Prefecture, Japan. A total of 138 preschool children were assessed using the BCL. To investigate inter-rater reliability, two teachers from each facility assessed seven to twenty children in their class, and intraclass correlation coefficients (ICCs) were calculated. The teachers additionally answered questions in the 1½-5 Caregiver-Teacher Report Form (C-TRF) to investigate the criterion validity of the BCL. To investigate structural validity, exploratory factor analysis with promax rotation and confirmatory factor analysis were performed. The internal consistency reliability of the BCL was good (α = 0.92) and correlation analyses also confirmed its excellent criterion validity. Although exploratory factor analysis for the BCL yielded a five-factor model that consisted of a factor structure different from that of the original one, the results were similar to the original six factors. The ICCs of the BCL ranged from 0.38 to 0.99, which was not high enough for adequate inter-rater reliability in some facilities. However, it may be possible to improve this by giving raters adequate explanations when using the BCL. The present study showed acceptable levels of reliability and validity of the BCL among Japanese preschool children.
Validity and Reliability of 10-Hz Global Positioning System to Assess In-line Movement and Change of Direction.

Science.gov (United States)

Nikolaidis, Pantelis T; Clemente, Filipe M; van der Linden, Cornelis M I; Rosemann, Thomas; Knechtle, Beat

2018-01-01

The objectives of the present study were to examine the validity and reliability of the 10 Hz Johan GPS unit in assessing in-line movement and change of direction. The validity was tested against the criterion measure of 200 m track-and-field (track-and-field athletes, n = 8) and 20 m shuttle run endurance test (female soccer players, n = 20). Intra-unit and inter-unit reliability was tested by intra-class correlation coefficient (ICC) and coefficient of variation (CV), respectively. An analysis of variance examined differences between the GPS measurement and five laps of 200 m at 15 km/h, and a t-test examined differences between the GPS measurement and 20 m shuttle run endurance test. The difference between the GPS measurement and 200 m distance ranged from -0.13 ± 3.94 m (95% CI -3.42; 3.17) in the first lap to 2.13 ± 2.64 m (95% CI -0.08; 4.33) in the fifth lap. A good intra-unit reliability was observed in 200 m (ICC = 0.833, 95% CI 0.535; 0.962). Inter-unit CV ranged from 1.31% (fifth lap) to 2.20% (third lap). The difference between the GPS measurement and 20 m shuttle run endurance test ranged from 0.33 ± 4.16 m (95% CI -10.01; 10.68) in 11.5 km/h to 9.00 ± 5.30 m (95% CI 6.44; 11.56) in 8.0 km/h. A moderate intra-unit reliability was shown in the second and third stage of the 20 m shuttle run endurance test (ICC = 0.718, 95% CI 0.222; 0.898) and good reliability in the fifth, sixth, seventh and eighth (ICC = 0.831, 95% CI -0.229; 0.996). Inter-unit CV ranged from 2.08% (11.5 km/h) to 3.92% (8.5 km/h). Based on these findings, it was concluded that the 10 Hz Johan system offers an affordable, valid and reliable tool for coaches and fitness trainers to monitor training and performance.
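The abstract does not spell out how the inter-unit CV was computed. A common convention, assumed here, is the sample standard deviation of simultaneous readings from several units divided by their mean, expressed as a percentage. The readings below are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(readings):
    """CV in percent: sample standard deviation divided by the mean."""
    return stdev(readings) / mean(readings) * 100.0

# Hypothetical distances (m) logged by two units over the same 200 m lap.
lap_readings = [198.0, 202.0]
cv_percent = coefficient_of_variation(lap_readings)
```

A lower CV means the units agree more closely on the same movement.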
Inter-observer variation in delineation of the heart and left anterior descending coronary artery in radiotherapy for breast cancer: a multi-centre study from Denmark and the UK

DEFF Research Database (Denmark)

Lorenzen, Ebbe L; Taylor, Carolyn W; Maraldo, Maja

2013-01-01

BACKGROUND AND PURPOSE: To determine the extent of inter-observer variation in delineation of the heart and left anterior descending coronary artery (LADCA) and its impact on estimated doses. METHODS AND MATERIALS: Nine observers from five centres delineated the heart and LADCA on fifteen patient...... guidelines were used. In contrast, for the LADCA there was substantial variation in the estimated dose, which was not reduced with guidelines....
Density functional theory study of inter-layer coupling in bulk tin selenide

Science.gov (United States)

Song, Hong-Yue; Lü, Jing-Tao

2018-03-01

We study the inter-layer coupling in bulk tin selenide (SnSe) through density functional theory based calculations. Different approximations for the exchange-correlation functionals and the van der Waals interaction are employed. By performing comparison with graphite, MoS2 and black phosphorus, we analyze the inter-layer coupling from different points of view, including the binding energy, the low frequency inter-layer optical phonons, and the inter-layer charge transfer. We find that there is a strong charge transfer between layers of SnSe, resulting in the strongest inter-layer coupling. Moreover, the charge transfer renders the inter-layer coupling in SnSe not of van der Waals type. Mechanical exfoliation has been used to fabricate mono- or few-layer graphene, MoS2 and black phosphorus, but our results show that it may be difficult to apply a similar technique to SnSe.
The reliability of a maximal isometric hip strength and simultaneous surface EMG screening protocol in elite, junior rugby league athletes.

Science.gov (United States)

Charlton, Paula C; Mentiplay, Benjamin F; Grimaldi, Alison; Pua, Yong-Hao; Clark, Ross A

2017-02-01

Firstly to describe the reliability of assessing maximal isometric strength of the hip abductor and adductor musculature using a hand held dynamometry (HHD) protocol with simultaneous wireless surface electromyographic (sEMG) evaluation of the gluteus medius (GM) and adductor longus (AL). Secondly, to describe the correlation between isometric strength recorded with the HHD protocol and a laboratory standard isokinetic device. Reliability and correlational study. A sample of 24 elite, male, junior, rugby league athletes, age 16-20 years participated in repeated HHD and isometric Kin-Com (KC) strength testing with simultaneous sEMG assessment, on average (range) 6 (5-7) days apart by a single assessor. Strength tests included; unilateral hip abduction (ABD) and adduction (ADD) and bilateral ADD assessed with squeeze (SQ) tests in 0 and 45° of hip flexion. HHD demonstrated good to excellent inter-session reliability for all outcome measures (ICC (2,1) =0.76-0.91) and good to excellent association with the laboratory reference KC (ICC (2,1) =0.80-0.88). Whilst intra-session, inter-trial reliability of EMG activation and co-activation outcome measures ranged from moderate to excellent (ICC (2,1) =0.70-0.94), inter-session reliability was poor (all ICC (2,1) Isometric strength testing of the hip ABD and ADD musculature using HHD may be measured reliably in elite, junior rugby league athletes. Due to the poor inter-session reliability of sEMG measures, it is not recommended for athlete screening purposes if using the techniques implemented in this study. Copyright © 2016 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
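The ICC (2,1) notation used above conventionally denotes the Shrout and Fleiss two-way random-effects, absolute-agreement, single-measure coefficient. As a sketch of how such a value can be obtained from a subjects-by-sessions (or subjects-by-raters) table, the Python below derives it from the ANOVA mean squares. The function name `icc_2_1` and the small score table are illustrative assumptions and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` is a list of rows, one per subject, each holding k scores
    (one per rater or session).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Illustrative table: 6 subjects scored by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
```

Because the rater (column) variance enters the denominator, systematic offsets between raters or sessions lower this coefficient even when the rank order of subjects is preserved.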
Computerized system to measure interproximal alveolar bone levels in epidemiologic, radiographic investigations. II. Intra- and inter-examiner variation study

Energy Technology Data Exchange (ETDEWEB)

Wouters, F.R.; Frithiof, L.; Soeder, P.Oe.; Hellden, L.; Lavstedt, S.; Salonen, L.

1988-01-01

The study was aimed at analyzing intra- and inter-examiner variations in computerized measurement and in non-measurability of alveolar bone level in a cross-sectional, epidemiologic material. At each interproximal tooth surface, alveolar bone height in percentage of root length (B/R) and tooth length (B/T) were determined twice by one examiner and once by a second examiner from X5-magnified periapical radiographs. The overall intra- and inter-examiner variations in measurement were 2.85% and 3.84% of root length and 1.97% and 2.82% of tooth length, respectively. The variations were different for different tooth groups and for different degrees of severity of marginal periodontitis. The overall proportions of non-measurable tooth surfaces varied with examiner from 32% to 39% and from 43% to 48% of the available interproximal tooth surfaces for B/R and B/T, respectively. With regard to the level of reliability, the computerized method reported is appropriate to cross-sectional, epidemiologic investigations from radiographs.
Software reliability studies

Science.gov (United States)

Hoppa, Mary Ann; Wilson, Larry W.

1994-01-01

There are many software reliability models which try to predict future performance of software based on data generated by the debugging process. Our research has shown that by improving the quality of the data one can greatly improve the predictions. We are working on methodologies which control some of the randomness inherent in the standard data generation processes in order to improve the accuracy of predictions. Our contribution is twofold in that we describe an experimental methodology using a data structure called the debugging graph and apply this methodology to assess the robustness of existing models. The debugging graph is used to analyze the effects of various fault recovery orders on the predictive accuracy of several well-known software reliability algorithms. We found that, along a particular debugging path in the graph, the predictive performance of different models can vary greatly. Similarly, just because a model 'fits' a given path's data well does not guarantee that the model would perform well on a different path. Further we observed bug interactions and noted their potential effects on the predictive process. We saw that not only do different faults fail at different rates, but that those rates can be affected by the particular debugging stage at which the rates are evaluated. Based on our experiment, we conjecture that the accuracy of a reliability prediction is affected by the fault recovery order as well as by fault interaction.
Rater reliability and construct validity of a mobile application for posture analysis.

Science.gov (United States)

Szucs, Kimberly A; Brown, Elena V Donoso

2018-01-01

[Purpose] Measurement of posture is important for those with a clinical diagnosis as well as researchers aiming to understand the impact of faulty postures on the development of musculoskeletal disorders. A reliable, cost-effective and low tech posture measure may be beneficial for research and clinical applications. The purpose of this study was to determine rater reliability and construct validity of a posture screening mobile application in healthy young adults. [Subjects and Methods] Pictures of subjects were taken in three standing positions. Two raters independently digitized the static standing posture image twice. The app calculated posture variables, including sagittal and coronal plane translations and angulations. Intra- and inter-rater reliability were calculated using the appropriate ICC models for complete agreement. Construct validity was determined through comparison of known groups using repeated measures ANOVA. [Results] Intra-rater reliability ranged from 0.71 to 0.99. Inter-rater reliability was good to excellent for all translations. ICCs were stronger for translations versus angulations. The construct validity analysis found that the app was able to detect the change in the four variables selected. [Conclusion] The posture mobile application has demonstrated strong rater reliability and preliminary evidence of construct validity. This application may have utility in clinical and research settings.
Assessment of the nursing care product (APROCENF): a reliability and construct validity study.

Science.gov (United States)

Cucolo, Danielle Fabiana; Perroca, Márcia Galan

2017-04-06

To verify the reliability and construct validity estimates of the "Assessment of nursing care product" scale (APROCENF) and its applicability. This validation study included a sample of 40 (inter-rater reliability) and 172 (construct validity) assessments performed by nurses at the end of the work shift at nine inpatient services of a teaching hospital in the Brazilian Southeast. The data were collected between February and September 2014, with interruptions. Cronbach's alpha and Spearman's correlation coefficients (internal consistency) were calculated, as well as the intraclass correlation and the weighted kappa index (inter-rater reliability). Exploratory factor analysis was used with principal component extraction and varimax rotation (construct validity). The internal consistency revealed an alpha coefficient of 0.85, item-item correlation ranging between 0.13 and 0.61 and item-total correlation between 0.43 and 0.69. Inter-rater equivalence was obtained and all items evidenced significant factor loadings. This research evidenced the reliability and construct validity of the scale to assess the nursing care product. Its application in nursing practice permits identifying improvements needed in the production process, contributing to management and care decisions.
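For readers unfamiliar with the internal-consistency statistic reported above, the following sketch computes Cronbach's alpha from a respondents-by-items score table using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the scores are hypothetical and unrelated to the APROCENF data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per assessment
    and one column per scale item (sample variances throughout)."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores: 5 assessments on a 3-item scale.
scores = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(scores)
```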
Inter-Ethnic/Racial Facial Variations: A Systematic Review and Bayesian Meta-Analysis of Photogrammetric Studies

Science.gov (United States)

Wen, Yi Feng; Wong, Hai Ming; Lin, Ruitao; Yin, Guosheng; McGrath, Colman

2015-01-01

Background: Numerous facial photogrammetric studies have been published around the world. We aimed to critically review these studies so as to establish population norms for various angular and linear facial measurements; and to determine inter-ethnic/racial facial variations. Methods and Findings: A comprehensive and systematic search of PubMed, ISI Web of Science, Embase, and Scopus was conducted to identify facial photogrammetric studies published before December, 2014. Subjects of eligible studies were either Africans, Asians or Caucasians. A Bayesian hierarchical random effects model was developed to estimate posterior means and 95% credible intervals (CrI) for each measurement by ethnicity/race. Linear contrasts were constructed to explore inter-ethnic/racial facial variations. We identified 38 eligible studies reporting 11 angular and 18 linear facial measurements. Risk of bias of the studies ranged from 0.06 to 0.66. At the significance level of 0.05, African males were found to have smaller nasofrontal angle (posterior mean difference: 8.1°, 95% CrI: 2.2°–13.5°) compared to Caucasian males and larger nasofacial angle (7.4°, 0.1°–13.2°) compared to Asian males. Nasolabial angle was more obtuse in Caucasian females than in African (17.4°, 0.2°–35.3°) and Asian (9.1°, 0.4°–17.3°) females. Additional inter-ethnic/racial variations were revealed when the level of statistical significance was set at 0.10. Conclusions: A comprehensive database for angular and linear facial measurements was established from existing studies using the statistical model and inter-ethnic/racial variations of facial features were observed. The results have implications for clinical practice and highlight the need and value for high quality photogrammetric studies. PMID:26247212

Reliability and validity of the international dementia alliance schedule for the assessment and staging of care in China.

Science.gov (United States)

Wang, Xiao; Sun, Zhenghai; Xiong, Lingchuan; Semrau, Maya; He, Jianhua; Li, Yang; Zhu, Jianzhong; Zhang, Nan; Wang, Aimin; Jiang, Qinpu; Mu, Nan; Zhao, Yuping; Chen, Wei; Wu, Donghui; Zheng, Zhanjie; Sun, Yongan; Zhang, Jing; Xu, Jun; Meng, Xue; Zhao, Mei; Zhang, Haifeng; Lv, Xiaozhen; Sartorius, Norman; Li, Tao; Yu, Xin; Wang, Huali

2017-11-21

Clinical and social services both are important for dementia care. The International Dementia Alliance (IDEAL) Schedule for the Assessment and Staging of Care was developed to guide clinical and social care for dementia. Our study aimed to assess the validity and reliability of the IDEAL schedule in China. Two hundred eighty-two dementia patients and their caregivers were recruited from 15 hospitals in China. Each patient-caregiver dyad was assessed with the IDEAL schedule by a rater and an observer simultaneously. The Clinical Dementia Rating (CDR), Mini-Mental Status Examination (MMSE), and Caregiver Burden Inventory (CBI) were assessed for criterion validity. IDEAL repeated assessment was conducted 7-10 days after the initial interview for 62 dyads. Two hundred seventy-seven patient-caregiver dyads completed the IDEAL assessment. Inter-rater reliability for the total score of the IDEAL schedule was 0.93 (95%CI = 0.92-0.95). The inter-class coefficient for the total score of IDEAL was 0.95 for the interviewers and 0.93 for the silent raters. The IDEAL total score correlated with the global CDR score (ρ = 0.72, p valid and reliable tool for the staging of care for dementia in the Chinese population.
Photographic assessment of burn size and depth: reliability and validity

NARCIS (Netherlands)

Hop, M.; Moues, C.; Bogomolova, K.; Nieuwenhuis, M.; Oen, I.; Middelkoop, E.; Breederveld, R.; de Baar, M.

2014-01-01

Objective: The aim of this study was to examine the reliability and validity of using photographs of burns to assess both burn size and depth. Method: Fifty randomly selected photographs taken on day 0-1 post burn were assessed by seven burn experts and eight referring physicians. Inter-rater
Target volume delineation in external beam partial breast irradiation: less inter-observer variation with preoperative- compared to postoperative delineation

NARCIS (Netherlands)

Leij, F. van der; Elkhuizen, P.H.M.; Janssen, T.M.; Poortmans, P.M.P.; Sangen, M. van der; Scholten, A.N.; Vliet-Vroegindeweij, C. van; Boersma, L.J.

2014-01-01

The challenge of adequate target volume definition in external beam partial breast irradiation (PBI) could be overcome with preoperative irradiation, due to less inter-observer variation. We compared the target volume delineation for external beam PBI on preoperative versus postoperative CT scans of
Intra- and inter-rater reliability of the Sollerman hand function test in patients with chronic stroke

DEFF Research Database (Denmark)

Brogårdh, Christina; Persson, Ann L; Sjölund, Bengt H

2007-01-01

PURPOSE: To examine whether the Sollerman hand function test is reliable in a test-retest situation in patients with chronic stroke. METHOD: Three independent examiners observed each patient at three experimental sessions; two days in week 1 (short-term test-retest) and one day in week 4 (long-term test-retest). A total of 24 patients with chronic stroke (mean age; 59.7 years, mean time since stroke onset 29.6 months) participated. The examiners simultaneously assessed the patients' ability to perform 20 subtests. Both ordinal data (generalized kappa) and total sum scores (Spearman's rank...... test seems to be a reliable test in patients with chronic stroke, but we recommend that the same examiner evaluates a patient's hand function pre- and post-treatment.
Inter- and intrarater reliability of two proprioception tests using clinical applicable measurement tools in subjects with and without knee osteoarthritis.

Science.gov (United States)

Baert, Isabel A C; Lluch, Enrique; Struyf, Thomas; Peeters, Greta; Van Oosterwijck, Sophie; Tuynman, Joanna; Rufai, Salim; Struyf, Filip

2018-06-01

The therapeutic value of proprioceptive-based exercises in knee osteoarthritis (KOA) management warrants investigation of proprioceptive testing methods easily accessible in clinical practice. To estimate inter- and intrarater reliability of the knee joint position sense (KJPS) test and knee force sense (KFS) test in subjects with and without KOA. Cross-sectional test-retest design. Two blinded raters independently performed repeated measures of the KJPS and KFS test, using an analogue inclinometer and handheld dynamometer, respectively, in eight KOA patients (12 symptomatic knees) and 26 healthy controls (52 asymptomatic knees). Intraclass correlation coefficients (ICCs; model 2,1), standard error of measurement (SEM) and minimal detectable change with 95% confidence bounds (MDC95) were calculated. For KJPS, results showed good to excellent test-retest agreement (ICCs 0.70-0.95 in KOA patients; ICCs 0.65-0.85 in healthy controls). A 2° measurement error (SEM 1°) was reported when measuring KJPS in multiple test positions and calculating mean repositioning error. When testing KOA patients pre and post therapy, a repositioning error larger than 4° (MDC95) is needed to consider true change. Measuring KFS using handheld dynamometry showed poor to fair interrater and poor to excellent intrarater reliability in subjects with and without KOA. Measuring KJPS in multiple test positions using an analogue inclinometer and calculating mean repositioning error is reliable and can be used in clinical practice. We do not recommend the use of the KFS test to clinicians. Further research is required to establish diagnostic accuracy and validity of our KJPS test in larger knee pain populations. Copyright © 2017 Elsevier Ltd. All rights reserved.
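The abstract does not state how SEM and MDC95 were derived. The sketch below uses the formulas most commonly applied in reliability studies, SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM, as an assumption. The input values are hypothetical and are not meant to reproduce the figures reported above.

```python
from math import sqrt

def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    return sd * sqrt(1.0 - icc)

def minimal_detectable_change_95(sem):
    """Smallest change exceeding measurement error with 95% confidence,
    assuming two measurements (hence the sqrt(2) factor)."""
    return 1.96 * sqrt(2.0) * sem

# Hypothetical inputs: SD of repositioning error 2.0 degrees, ICC 0.75.
sem = standard_error_of_measurement(sd=2.0, icc=0.75)
mdc95 = minimal_detectable_change_95(sem)
```

Under these assumed inputs, a pre-to-post difference smaller than `mdc95` could not be distinguished from measurement noise.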
Inter-particle and interfacial interaction of magnetic nanoparticles

International Nuclear Information System (INIS)

Bae, Che Jin; Hwang, Yosun; Park, Jongnam; An, Kwangjin; Lee, Youjin; Lee, Jinwoo; Hyeon, Taeghwan; Park, J.-G.

2007-01-01

In order to understand inter-particle as well as interfacial interaction of magnetic nanoparticles, we have prepared several Fe3O4 nanoparticles with sizes ranging from 3 to 50 nm. These nanoparticles are particularly well characterized in terms of size distribution, with a standard deviation (σ) in size of less than 0.4 nm. We investigated the inter-particle interaction by measuring the magnetic properties of the nanoparticles while controlling inter-particle distances by diluting the samples with solvents. According to this study, blocking temperatures dropped by 8-17 K as the inter-particle distance increased from a few nm to 140 nm, while the overall shape and qualitative behavior of the magnetization remained unchanged. This implies that most features observed in the magnetic properties of the nanoparticles are due to the intrinsic properties of the nanoparticles, not to the inter-particle interaction. We then examined possible interfacial magnetic interaction in the core-shell structure of our Fe3O4 nanoparticles.
Seasonal, annual and inter-annual features of turbulence parameters over the tropical station Pune (18°32' N, 73°51' E) observed with UHF wind profiler

Directory of Open Access Journals (Sweden)

N. Singh

2008-11-01

The present study is specifically focused on the seasonal, annual and inter-annual variations of the refractive index structure parameter ($C_n^2$) using three years of radar observations. Energy dissipation rates (ε) during different seasons for a particular year are also computed over a tropical station, Pune. Doppler spectral width measurements made by the Wind Profiler, under various atmospheric conditions, are utilized to estimate the turbulence parameters. The refractive index structure parameter varies from $10^{-17.5}$ to $10^{-13}$ m$^{-2/3}$ under clear air to precipitation conditions in the height region of 1.05 to 10.35 km. During the monsoon months, observed $C_n^2$ values are up to 1–2 orders of magnitude higher than those during pre-monsoon and post-monsoon seasons. Spectral width corrections for various non-turbulent spectral broadenings, such as beam broadening and shear broadening, are made in the observed spectral width for reliable estimation of ε under non-precipitating conditions. It is found that in the lower tropospheric height region, values of ε are in the range of $10^{-6}$ to $10^{-3}$ m$^2$ s$^{-3}$. In summer and monsoon seasons the observed values of ε are larger than those in post-monsoon and winter seasons in the lower troposphere. A comparison of $C_n^2$ observed with the wind profiler and that estimated using Radio Sonde/Radio Wind (RS/RW) data of nearby Met station Chikalthana has been made for the month of July 2003.
Translation, adaptation and inter-rater reliability of the administration manual for the Fugl-Meyer assessment

Directory of Open Access Journals (Sweden)

Stella M Michaelsen

2011-02-01

a single evaluator who applied the test. When different raters apply the scale, the reliability may depend on the interpretation given to the assessment sheet. In such cases, a clear administration manual is essential for ensuring homogeneity of application. OBJECTIVES: To translate and adapt the French Canadian version of the FMA administration manual into Brazilian Portuguese and to evaluate the inter-rater reliability when different evaluators apply the FMA on the basis of the information contained in the manual. METHODS: Eighteen adults (59±10 years) with chronic hemiparesis (38±35 months after a stroke) took part in this study. Eight patients participated in the first part of the study and 10 in the second part. Based on analyzing the results from part 1, an adapted version was developed, in which information and photos were added to illustrate the positions of the patient and evaluator. The inter-rater reliability was assessed using the intraclass correlation coefficient (ICC). RESULTS: The reliability of the FMA based on the adapted version of the manual was excellent for the total motor scores for the upper limbs (ICC=0.98) and lower limbs (ICC=0.90), as well as for movement sense (ICC=0.98) and upper and lower-limb passive range of motion (ICC=0.84 and 0.90, respectively). The reliability was moderate for tactile sensitivity (0.75). The joint pain assessment presented low reliability. CONCLUSIONS: The results showed that, except for pain assessment, application of the FMA based on the adapted version of the application manual for Brazilian Portuguese presented adequate inter-rater reliability.
Inter-observer reproducibility before and after web-based education in the Gleason grading of the prostate adenocarcinoma among the Iranian pathologists.

Directory of Open Access Journals (Sweden)

Alireza Abdollahi

2014-05-01

This study was aimed at determining intra- and inter-observer concordance rates in the Gleason scoring of prostatic adenocarcinoma, before and after a web-based educational course. In this self-controlled study, 150 tissue samples of prostatic adenocarcinoma were re-examined and scored according to the Gleason scoring system. All pathologists then attended a free web-based course. Afterwards, the same 150 samples [with different codes compared to the previous ones] were distributed differently among the pathologists to be assigned Gleason scores. After gathering the data, the concordance rate between the first and second reports of the pathologists was determined. Before the web-based education, the mean kappa value of inter-observer agreement was 0.25 [fair agreement]. After the web-based education, it improved significantly, with a mean kappa value of 0.52 [moderate agreement]. Using weighted kappa values, significant improvement was observed in inter-observer agreement for the higher Gleason scores; for score 10, the mean kappa value after the web-based education was 0.68 [substantial agreement] compared to 0.25 [fair agreement] before it. Web-based training courses are attractive to pathologists as they do not need to spend much time and money. Therefore, such training courses are strongly recommended for significant pathological issues, including the grading of prostate adenocarcinoma. Through web-based education, pathologists can exchange views and contribute to raising the level of reproducibility. Such programs need to be included in post-graduation programs.
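The agreement labels quoted above (fair, moderate, substantial) are consistent with the widely used Landis and Koch bands for kappa. As a simplified sketch, the Python below computes an unweighted Cohen's kappa for two raters and maps it to those bands. The study itself reports mean and weighted kappa values across several pathologists, which this example does not reproduce; the function names and the Gleason scores are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)

def agreement_label(kappa):
    """Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical Gleason scores assigned by two pathologists to 8 samples.
pathologist_1 = [6, 7, 7, 8, 9, 6, 7, 10]
pathologist_2 = [6, 7, 8, 8, 9, 7, 7, 10]
kappa = cohen_kappa(pathologist_1, pathologist_2)
label = agreement_label(kappa)
```

A weighted kappa would additionally give partial credit when two scores differ by only one grade, which matters for ordinal scales such as the Gleason score.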
Reliability and Validity of the Turkish Version of Patient and Observer Scar Assessment Scale in Patients with Burns

Directory of Open Access Journals (Sweden)

Ayşe Kabuk

2017-12-01

Objective: To evaluate the reliability and validity of the Turkish version of the Patient and Observer Scar Assessment Scale (POSAS) in patients with burns. Methods: This is a methodological study. Data were collected using the POSAS, a survey form and plexiglas. The Patient Scar Assessment Scale (PSAS) was completed by patients (n=53) and the Observer Scar Assessment Scale (OSAS) was completed by two observers separately. Test-retest reliability was measured by applying the scales to 25 patients after two weeks. Data were analyzed by Kruskal-Wallis and Mann-Whitney U tests. Content validity was determined using the Kaiser-Meyer-Olkin and Bartlett's tests, and structure validity was assessed by exploratory factor analysis (EFA) and confirmatory factor analysis (CFA); reliability was evaluated using internal consistency, Cronbach's alpha and the intraclass correlation coefficient (ICC). Results: Factor weights were in the appropriate range according to EFA; the 6-item single-factor structure of the original scale was valid and had a high consistency index according to CFA; ICC between the 7th item and the total points was proportional; internal consistency was highly reliable (PSAS α=0.992, OSAS α=0.993); and consistency between the observers was high (α=0.952, r=0.909). OSAS scores increased as the burn degree increased (p<0.05). Conclusion: POSAS was determined to be a valid and reliable scale in patients with burns in the Turkish society.
Reliability of infrared thermometric measurements of skin temperature in the hand.

Science.gov (United States)

Packham, Tara L; Fok, Diana; Frederiksen, Karen; Thabane, Lehana; Buckley, Norman

2012-01-01

Clinical measurement study. Skin temperature asymmetries (STAs) are used in the diagnosis of complex regional pain syndrome (CRPS), but little evidence exists for reliability of the equipment and methods. This study examined the reliability of an inexpensive infrared (IR) thermometer and measurement points in the hand for the study of STA. ST was measured three times at five points on both hands with an IR thermometer by two raters in 20 volunteers (12 normals and 8 CRPS). ST measurement results using IR thermometers support inter-rater reliability: intraclass correlation coefficient (ICC) estimate for single measures 0.80; all ST measurement points were also highly reliable (ICC single measures, 0.83-0.91). The equipment demonstrated excellent reliability, with little difference in the reliability of the five measurement sites. These preliminary findings support their use in future CRPS research. Not applicable. Copyright © 2012 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
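A single-measures ICC of the kind reported above can be sketched from a subjects-by-raters table. The sketch assumes the two-way random-effects, absolute-agreement form often written ICC(2,1); the abstract does not state which ICC model was used, and the function name is hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per subject and one column per rater.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Mean squares from the two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_err) / denom
```

In this form a systematic offset between raters (a large column mean square) lowers the coefficient, which is why it is usually described as an absolute-agreement rather than a consistency measure.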
Reliability and smallest real difference of the ankle lunge test post ankle fracture.

Science.gov (United States)

Simondson, David; Brock, Kim; Cotton, Susan

2012-02-01

This study aimed to determine the reliability and the smallest real difference of the Ankle Lunge test in an ankle fracture patient population. In the post immobilisation stage of ankle fracture, ankle dorsiflexion is an important measure of progress and outcome. The Ankle Lunge test measures weight bearing dorsiflexion, resulting in negative scores (knee to wall distance) and positive scores (toe to wall distance), for which the latter has proven reliability in normal subjects only. A consecutive sample of ankle fracture patients with permission to commence weight bearing, were recruited to the study. Three measurements of the Ankle Lunge Test were performed each by two raters, one senior and one junior physiotherapist. These occurred prior to therapy sessions in the second week after plaster removal. A standardised testing station was utilised and allowed for both knee to wall distance and toe to wall distance measurement. Data was collected from 10 individuals with ankle fracture, with an average age of 36 years (SD 14.8). Seventy seven percent of observations were negative. Intra and inter-rater reliability yielded intra class correlations at or above 0.97, p Ankle Lunge test is a practical and reliable tool for measuring weightbearing dorsiflexion post ankle fracture. Copyright © 2011 Elsevier Ltd. All rights reserved.
Reliability of sonographic assessment of tendinopathy in tennis elbow.

Science.gov (United States)

Poltawski, Leon; Ali, Syed; Jayaram, Vijay; Watson, Tim

2012-01-01

To assess the reliability and compute the minimum detectable change using sonographic scales to quantify the extent of pathology and hyperaemia in the common extensor tendon in people with tennis elbow. The lateral elbows of 19 people with tennis elbow were assessed sonographically twice, 1-2 weeks apart. Greyscale and power Doppler images were recorded for subsequent rating of abnormalities. Tendon thickening, hypoechogenicity, fibrillar disruption and calcification were each rated on four-point scales, and scores were summed to provide an overall rating of structural abnormality; hyperaemia was scored on a five point scale. Inter-rater reliability was established using the intraclass correlation coefficient (ICC) to compare scores assigned independently to the same set of images by a radiologist and a physiotherapist with training in musculoskeletal imaging. Test-retest reliability was assessed by comparing scores assigned by the physiotherapist to images recorded at the two sessions. The minimum detectable change (MDC) was calculated from the test-retest reliability data. ICC values for inter-rater reliability ranged from 0.35 (95% CI: 0.05, 0.60) for fibrillar disruption to 0.77 (0.55, 0.88) for overall greyscale score, and 0.89 (0.79, 0.95) for hyperaemia. Test-retest reliability ranged from 0.70 (0.48, 0.84) for tendon thickening to 0.82 (0.66, 0.90) for overall greyscale score and 0.86 (0.73, 0.93) for calcification. The MDC for the greyscale total score was 2.0/12 and for the hyperaemia score was 1.1/5. The sonographic scoring system used in this study may be used reliably to quantify tendon abnormalities and change over time. A relatively inexperienced imager can conduct the assessment and use the rating scales reliably.
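The abstract derives a minimum detectable change from test-retest reliability but does not give the formula. A common convention, assumed here, is SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM; the sketch below follows that convention, and its function names and inputs are hypothetical rather than taken from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), using the between-subject SD of the scores."""
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("sd must be non-negative and icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def minimum_detectable_change(sd, icc, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%).

    The sqrt(2) factor reflects that a change score combines the
    measurement error of two sessions.
    """
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)
```

For a given spread of scores, a higher test-retest ICC gives a smaller SEM and therefore a smaller change that can be distinguished from measurement error.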
[French version of structured interviews for the Glasgow Outcome Scale: guidelines and first studies of validation].

Science.gov (United States)

Fayol, P; Carrière, H; Habonimana, D; Preux, P-M; Dumond, J-J

2004-05-01

The Glasgow Outcome Scale (GOS) is the most widely used outcome measure after traumatic brain injury. The GOS's reliability is improved by a structured interview. The two aims of this paper were to present a French version of the structured interview for the five-point Glasgow Outcome Scale and the extended eight-point GOS (GOSE) and to study their validity. The French version was developed using back-translation. Concurrent validity was studied by comparison with GOS/GOSE without structured interview. Inter-rater reliability was studied by comparison between assignments made by untrained head injury observers and trained head injury observers. Strength of agreement between ratings was assessed using the Kappa statistic. The French version and the guidelines for their use are given in the Appendix. Ratings were made for 25 brain injured patients and 25 relatives. Concurrent validity was good and inter-rater reliability was excellent. Using the structured interview for the GOS will give a more reliable assessment of the outcome of brain injured patients by French-speaking rehabilitation teams and a more precise assessment with the extended GOS.
Reliability, Construct Validity and Interpretability of the Brazilian version of the Rapid Upper Limb Assessment (RULA) and Strain Index (SI).

Science.gov (United States)

Valentim, Daniela Pereira; Sato, Tatiana de Oliveira; Comper, Maria Luiza Caíres; Silva, Anderson Martins da; Boas, Cristiana Villas; Padula, Rosimeire Simprini

There are very few observational methods for analysis of biomechanical exposure available in Brazilian Portuguese. This study aimed to cross-culturally adapt and test the measurement properties of the Rapid Upper Limb Assessment (RULA) and Strain Index (SI). The cross-cultural adaptation and the testing of measurement properties followed Beaton et al. and the COSMIN guidelines, respectively. Several tasks that required static posture and/or repetitive motion of upper limbs were evaluated (n>100). The intra-rater reliability for the RULA ranged from poor to almost perfect (k: 0.00-0.93), and for the SI from poor to excellent (ICC2,1: 0.05-0.99). The inter-rater reliability was very poor for the RULA (k: -0.12 to 0.13) and ranged from very poor to moderate for the SI (ICC2,1: 0.00-0.53). The agreement was good for the RULA (75-100% intra-rater, and 42.24-100% inter-rater) and for the SI (EPM: -1.03% to 1.97% intra-rater, and -0.17% to 1.51% inter-rater). The internal consistency was appropriate for the RULA (α=0.88), and low for the SI (α=0.65). Moderate construct validity was observed between the RULA and SI, in wrist/hand-wrist posture (rho: 0.61) and strength/intensity of exertion (rho: 0.39). The adapted versions of the RULA and SI presented semantic and cultural equivalence for Brazilian Portuguese. The RULA and SI had reliability estimates ranging from very poor to almost perfect. The internal consistency for the RULA was better than for the SI. The correlation between methods was moderate only for muscle request/movement repetition. Prior training is mandatory for the use of observational methods for biomechanical exposure assessment, although it does not guarantee good reproducibility of these measures. Copyright © 2017 Associação Brasileira de Pesquisa e Pós-Graduação em Fisioterapia. Published by Elsevier Editora Ltda. All rights reserved.
Inter- and intra-observer variability in prostate definition with tissue harmonic and brightness mode imaging.

Science.gov (United States)

Sandhu, Gurpreet Kaur; Dunscombe, Peter; Meyer, Tyler; Pavamani, Simon; Khan, Rao

2012-01-01

The objective of this study was to compare the relative utility of tissue harmonic (H) and brightness (B) transrectal ultrasound (TRUS) images of the prostate by studying interobserver and intraobserver variation in prostate delineation. Ten patients with early-stage disease were randomly selected. TRUS images of prostates were acquired using B and H modes. The prostates on all images were contoured by an experienced radiation oncologist (RO) and five equally trained observers. The observers were blinded to information regarding patient and imaging mode. The volumes of prostate glands and areas of midgland slices were calculated. Volumes contoured were compared among the observers and between observer group and RO. Contours on one patient were repeated five times by four observers to evaluate the intraobserver variability. A one-sample Student t-test showed the volumes outlined by five observers are in agreement (p > 0.05) with the RO. Paired Student t-test showed prostate volumes (p = 0.008) and midgland areas (p = 0.006) with H mode were significantly smaller than that with B mode. Two-factor analysis of variances showed significant interobserver variability (p standard deviation of mean volumes and areas, and concordance indices. It was found that for small glands (≤35 cc) H mode provided greater interobserver consistency; however, for large glands (≥35 cc), B mode provided more consistent estimates. H mode provided superior inter- and intraobserver agreement in prostate volume definition for small to medium prostates. In large glands, H mode does not exhibit any additional advantage. Although harmonic imaging has not proven advantageous for all cases, its utilization seems to be judicious for small prostates. Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.
Accuracy and inter-observer variability of 3D versus 4D cone-beam CT based image-guidance in SBRT for lung tumors

International Nuclear Information System (INIS)

Sweeney, Reinhart A; Seubert, Benedikt; Stark, Silke; Homann, Vanessa; Müller, Gerd; Flentje, Michael; Guckenberger, Matthias

2012-01-01

To analyze the accuracy and inter-observer variability of image-guidance (IG) using 3D or 4D cone-beam CT (CBCT) technology in stereotactic body radiotherapy (SBRT) for lung tumors. Twenty-one consecutive patients treated with image-guided SBRT for primary and secondary lung tumors were basis for this study. A respiration correlated 4D-CT and planning contours served as reference for all IG techniques. Three IG techniques were performed independently by three radiation oncologists (ROs) and three radiotherapy technicians (RTTs). Image-guidance using respiration correlated 4D-CBCT (IG-4D) with automatic registration of the planning 4D-CT and the verification 4D-CBCT was considered gold-standard. Results were compared with two IG techniques using 3D-CBCT: 1) manual registration of the planning internal target volume (ITV) contour and the motion blurred tumor in the 3D-CBCT (IG-ITV); 2) automatic registration of the planning reference CT image and the verification 3D-CBCT (IG-3D). Image quality of 3D-CBCT and 4D-CBCT images was scored on a scale of 1–3, with 1 being best and 3 being worst quality for visual verification of the IGRT results. Image quality was scored significantly worse for 3D-CBCT compared to 4D-CBCT: the worst score of 3 was given in 19 % and 7.1 % observations, respectively. Significant differences in target localization were observed between 4D-CBCT and 3D-CBCT based IG: compared to the reference of IG-4D, tumor positions differed by 1.9 mm ± 0.9 mm (3D vector) on average using IG-ITV and by 3.6 mm ± 3.2 mm using IG-3D; results of IG-ITV were significantly closer to the reference IG-4D compared to IG-3D. Differences between the 4D-CBCT and 3D-CBCT techniques increased significantly with larger motion amplitude of the tumor; analogously, differences increased with worse 3D-CBCT image quality scores. Inter-observer variability was largest in SI direction and was significantly larger in IG using 3D-CBCT compared to 4D-CBCT: 0.6 mm versus 1.5 mm
Accuracy and inter-observer variability of 3D versus 4D cone-beam CT based image-guidance in SBRT for lung tumors

Directory of Open Access Journals (Sweden)

Sweeney Reinhart A

2012-06-01

Background: To analyze the accuracy and inter-observer variability of image-guidance (IG) using 3D or 4D cone-beam CT (CBCT) technology in stereotactic body radiotherapy (SBRT) for lung tumors. Materials and methods: Twenty-one consecutive patients treated with image-guided SBRT for primary and secondary lung tumors were basis for this study. A respiration correlated 4D-CT and planning contours served as reference for all IG techniques. Three IG techniques were performed independently by three radiation oncologists (ROs) and three radiotherapy technicians (RTTs). Image-guidance using respiration correlated 4D-CBCT (IG-4D) with automatic registration of the planning 4D-CT and the verification 4D-CBCT was considered gold-standard. Results were compared with two IG techniques using 3D-CBCT: 1) manual registration of the planning internal target volume (ITV) contour and the motion blurred tumor in the 3D-CBCT (IG-ITV); 2) automatic registration of the planning reference CT image and the verification 3D-CBCT (IG-3D). Image quality of 3D-CBCT and 4D-CBCT images was scored on a scale of 1–3, with 1 being best and 3 being worst quality for visual verification of the IGRT results. Results: Image quality was scored significantly worse for 3D-CBCT compared to 4D-CBCT: the worst score of 3 was given in 19 % and 7.1 % observations, respectively. Significant differences in target localization were observed between 4D-CBCT and 3D-CBCT based IG: compared to the reference of IG-4D, tumor positions differed by 1.9 mm ± 0.9 mm (3D vector) on average using IG-ITV and by 3.6 mm ± 3.2 mm using IG-3D; results of IG-ITV were significantly closer to the reference IG-4D compared to IG-3D. Differences between the 4D-CBCT and 3D-CBCT techniques increased significantly with larger motion amplitude of the tumor; analogously, differences increased with worse 3D-CBCT image quality scores. Inter-observer variability was largest in SI direction and was
Reliability analysis of the epidural spinal cord compression scale.

Science.gov (United States)

Bilsky, Mark H; Laufer, Ilya; Fourney, Daryl R; Groff, Michael; Schmidt, Meic H; Varga, Peter Paul; Vrionis, Frank D; Yamada, Yoshiya; Gerszten, Peter C; Kuklo, Timothy R

2010-09-01

The evolution of imaging techniques, along with highly effective radiation options has changed the way metastatic epidural tumors are treated. While high-grade epidural spinal cord compression (ESCC) frequently serves as an indication for surgical decompression, no consensus exists in the literature about the precise definition of this term. The advancement of the treatment paradigms in patients with metastatic tumors for the spine requires a clear grading scheme of ESCC. The degree of ESCC often serves as a major determinant in the decision to operate or irradiate. The purpose of this study was to determine the reliability and validity of a 6-point, MR imaging-based grading system for ESCC. To determine the reliability of the grading scale, a survey was distributed to 7 spine surgeons who participate in the Spine Oncology Study Group. The MR images of 25 cervical or thoracic spinal tumors were distributed consisting of 1 sagittal image and 3 axial images at the identical level including T1-weighted, T2-weighted, and Gd-enhanced T1-weighted images. The survey was administered 3 times at 2-week intervals. The inter- and intrarater reliability was assessed. The inter- and intrarater reliability ranged from good to excellent when surgeons were asked to rate the degree of spinal cord compression using T2-weighted axial images. The T2-weighted images were superior indicators of ESCC compared with T1-weighted images with and without Gd. The ESCC scale provides a valid and reliable instrument that may be used to describe the degree of ESCC based on T2-weighted MR images. This scale accounts for recent advances in the treatment of spinal metastases and may be used to provide an ESCC classification scheme for multicenter clinical trial and outcome studies.

Analysis of the reliability and reproducibility of goniometry compared to hand photogrammetry

Science.gov (United States)

de Carvalho, Rosana Martins Ferreira; Mazzer, Nilton; Barbieri, Claudio Henrique

2012-01-01

Objective: To evaluate the intra- and inter-examiner reliability and reproducibility of goniometry in relation to photogrammetry of the hand, comparing the angles of thumb abduction, PIP joint flexion of the II finger and MCP joint flexion of the V finger. Methods: The study included 30 volunteers, who were divided into three groups: one group of 10 physiotherapy students, one group of 10 physiotherapists, and a third group of 10 hand therapists. Each examiner performed the measurements on the same hand mold, using the goniometer followed by two photogrammetry software programs: CorelDraw® and ALCimagem®. Results: The groups and the methods proposed showed inter-examiner reliability generally rated as excellent (ICC 0.998, 95% CI 0.995-0.999). In the intra-examiner evaluation, an excellent level of reliability was found in all three groups. In the comparison between groups for each angle and each method, no significant differences were found between the groups for most of the measurements. Conclusion: Goniometry and photogrammetry are reliable and reproducible methods for evaluating measurements of the hand. However, due to the lack of similar references, detailed studies are needed to define the normal parameters between the methods in the joints of the hand. Level of Evidence II, Diagnostic Study. PMID:24453594
Inter- and intrarater reliability of goniometry and hand held dynamometry for patients with subacromial impingement syndrome.

Science.gov (United States)

Fieseler, Georg; Laudner, Kevin G; Irlenbusch, Lars; Meyer, Henrike; Schulze, Stephan; Delank, Karl-Stefan; Hermassi, Souhail; Bartels, Thomas; Schwesig, René

2017-12-01

The purpose of this study was to examine the intra- and interrater reliability of measuring shoulder range of motion (ROM) and strength among patients diagnosed with subacromial impingement syndrome (SAIS). Twenty-five patients (14 female patients; mean age, 60.4± 7.84 years) diagnosed with SAIS were assessed to determine the intrarater reliability for glenohumeral ROM. Twenty-five patients (16 female patients; mean age, 60.4± 7.80 years) and 76 asymptomatic volunteers (52 female volunteers; mean age, 29.4± 14.1 years) were assessed for interrater reliability. Dependent variables were active shoulder ROM and isometric strength. Intrarater reliability was fair-to-excellent for the SAIS patients (intraclass correlation coefficient [ICC], 0.52-0.97; standard error of measurement [SEM], 4.4°-9.9° N; coefficient of variation [CV], 7.1%-44.9%). Based on the ICC, 11 of 12 parameters (92%) displayed an excellent reliability (ICC> 0.75). The interrater reliability showed fair-to-excellent results (SAIS patients: ICC, 0.13-0.98; SEM, 2.3°-8.8°; CV, 3.6%-37.0%; controls: ICC, 0.11-0.96; SEM, 3.0°-35.4°; CV, 5.6%-26.4%). In accordance with the intrarater reliability, glenohumeral adduction ROM was the only parameter with an ICC below 0.75 for both samples. Painful shoulder ROM in the SAIS patients showed no influence on the quality of reliability for measurement. Therefore, these protocols should be considered reliable assessment techniques in the prevention, diagnosis, and treatment of painful shoulder conditions such as SAIS.
Target volume delineation in external beam partial breast irradiation: Less inter-observer variation with preoperative- compared to postoperative delineation

International Nuclear Information System (INIS)

Leij, Femke van der; Elkhuizen, Paula H.M.; Janssen, Tomas M.; Poortmans, Philip; Sangen, Maurice van der; Scholten, Astrid N.; Vliet-Vroegindeweij, Corine van; Boersma, Liesbeth J.

2014-01-01

The challenge of adequate target volume definition in external beam partial breast irradiation (PBI) could be overcome with preoperative irradiation, due to less inter-observer variation. We compared the target volume delineation for external beam PBI on preoperative versus postoperative CT scans of twenty-four breast cancer patients
Inter-professional relationships issues among iranian nurses and physicians: A qualitative study

Directory of Open Access Journals (Sweden)

Samaneh Nakhaee

2017-01-01

Introduction: Nurse–physician inter-professional relationship is an important issue in the health care system that can affect job satisfaction and patient care quality. The present study explores the major issues of nurse–physician inter-professional relationships in Iran. Materials and Methods: In this in-depth qualitative content analysis study conducted in 2014, 12 participants (5 physicians and 7 nurses) were recruited from two educational hospitals. The data were collected from deep, open, and unstructured interviews, and analyzed based on content analysis. Results: The participants in this study included 12 individuals, 6 females and 6 males, with ages ranging from 27 to 48 years and tenure ranging from 4 to 17 years. Four themes were identified, namely, divergent attitudes, uneven distribution of power, mutual trust destructors, and prudence imposed on nurses. Conclusions: The results revealed some major inter-professional issues and challenges in nurse–physician relationships, some of which are context-specific whereas others should be regarded as universal. It is through a deep knowledge of these issues that nurses and physicians can establish better collaborative inter-professional relationships.
Reliability of hand-held dynamometry for measurement of lower limb muscle strength in children with Duchenne and Becker muscular dystrophy

Directory of Open Access Journals (Sweden)

Wei SHI

2015-05-01

Objective: To determine the reliability of hand-held dynamometry (HHD) for lower limb isometric muscle strength measurement in children with Duchenne and Becker muscular dystrophy (DMD/BMD). Methods: A total of 21 children [20 males and one female; mean age was (7.88 ± 2.87) years, ranging between 3.96-14.09 years; mean age at diagnosis was (5.88 ± 2.88) years, ranging between 1.35-12.89 years; mean height was (120.64 ± 16.30) cm, ranging between 97-153 cm; mean body weight was (24.62 ± 9.05) kg, ranging between 14-50 kg] with DMD (19/21) and BMD (2/21) were recruited from the Rehabilitation Center of Children's Hospital of Fudan University. The muscle strength of hip, knee and ankle was measured by HHD under standardized test methods. The test-retest results were compared to determine the inter-test reliability, and the results among testers were compared to determine the inter-tester reliability. Results: HHD showed fine inter-tester reliability (ICC = 0.762-0.978) and inter-test reliability (ICC = 0.690-0.938) in measuring lower limb muscle strength of children with DMD/BMD. Results also showed relatively poor reliability in distal muscle groups (foot plantar flexion and dorsiflexion). Conclusions: HHD, showing fine inter-tester and inter-test reliability in measuring the lower limb muscle strength of children with DMD/BMD, can be used in monitoring changes in muscle strength and assessing the effects of clinical interventions. DOI: 10.3969/j.issn.1672-6731.2015.05.009
Reliability of the Matson Evaluation of Social Skills with Youngsters (MESSY) for Children with Autism Spectrum Disorders

Science.gov (United States)

Matson, Johnny L.; Horovitz, Max; Mahan, Sara; Fodstad, Jill

2013-01-01

The purpose of this paper was to update the psychometrics of the "Matson Evaluation of Social Skills for Youngsters" ("MESSY") with children with Autism Spectrum Disorders (ASD), specifically with respect to internal consistency, split-half reliability, and inter-rater reliability. In Study 1, 114 children with ASD (Autistic Disorder, Asperger's…
The reliability of magnetic resonance imaging in traumatic brain injury lesion detection

NARCIS (Netherlands)

Geurts, B.H.J.; Andriessen, T.M.J.C.; Goraj, B.M.; Vos, P.E.

2012-01-01

Objective: This study compares inter-rater-reliability, lesion detection and clinical relevance of T2-weighted imaging (T2WI), Fluid Attenuated Inversion Recovery (FLAIR), T2*-gradient recalled echo (T2*-GRE) and Susceptibility Weighted Imaging (SWI) in Traumatic Brain Injury (TBI). Methods: Three
Adaptation of My Classroom Activities Scale to Turkish Culture: Validity and Reliability Study

Directory of Open Access Journals (Sweden)

Kaan Zülfikar DENİZ

2017-06-01

Student interest in class activities, their enjoyment of activity topics, their ability to make choices about the activity topics, and opportunities for students to challenge themselves during activities are among the basic components that support their higher level learning. Properties of educational activities that make them interesting, enjoyable, and challenging while offering students choices are also known to be necessary in all educational content, processes, and products within educational systems of the 21st century. Consequently, measuring these properties is also of great importance. The goal of this study is to perform the Turkish adaptation of the My Class Activities Scale, developed by Gentry and Gable (2001) in the United States and subsequently adapted to the Korean, Chinese, and Arabic languages. To this end, data were collected from 214 students attending 3rd, 4th, 5th, 6th, 7th, and 8th grades during the 2015-2016 academic year. As part of the validity study for the scale, the factor structure obtained from the original development of the scale was tested using the Confirmatory Factor Analysis (CFA) method. Moreover, item-total correlation and inter-dimensional correlation analyses were also performed as part of the validity study. In studying the reliability of the scale, the Cronbach alpha reliability coefficients were estimated (Cronbach alpha values ranged between 0.82-0.90). Based on the results, the factor structure of the scale was verified in parallel with the original development work for the scale. In conclusion, the validity and reliability of using the scale in Turkey was established, contributing a new scale adaptation to the Turkish literature for use in different studies.
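Cronbach's alpha, the internal-consistency coefficient reported above, can be sketched from item-level scores as alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and data layout below are hypothetical; the abstract does not describe how the authors computed the coefficient.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of item score lists, one list per item, each holding the
    scores of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if n < 2 or any(len(item) != n for item in items):
        raise ValueError("every item needs scores from the same >= 2 respondents")
    # Total score per respondent, summed across items.
    totals = [sum(item[r] for item in items) for r in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)
```

The coefficient rises when items covary strongly relative to their individual variances, which is the sense in which values such as 0.82-0.90 are read as good internal consistency.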
Feasibility and reliability of digital imaging for estimating food selection and consumption from students' packed lunches.

Science.gov (United States)

Taylor, Jennifer C; Sutter, Carolyn; Ontai, Lenna L; Nishina, Adrienne; Zidenberg-Cherr, Sheri

2018-01-01

Although increasing attention is placed on the quality of foods in children's packed lunches, few studies have examined the capacity of observational methods to reliably determine both what is selected and consumed from these lunches. The objective of this project was to assess the feasibility and inter-rater reliability of digital imaging for determining selection and consumption from students' packed lunches, by adapting approaches previously applied to school lunches. Study 1 assessed feasibility and reliability of data collection among a sample of packed lunches (n = 155), while Study 2 further examined reliability in a larger sample of packed (n = 386) as well as school (n = 583) lunches. Based on the results from Study 1, it was feasible to collect and code most items in packed lunch images; missing data were most commonly attributed to packaging that limited visibility of contents. Across both studies, there was satisfactory reliability for determining food types selected, quantities selected, and quantities consumed in the eight food categories examined (weighted kappa coefficients 0.68-0.97 for packed lunches, 0.74-0.97 for school lunches), with lowest reliability for estimating condiments and meats/meat alternatives in packed lunches. In extending methods predominately applied to school lunches, these findings demonstrate the capacity of digital imaging for the objective estimation of selection and consumption from both school and packed lunches. Copyright © 2017 Elsevier Ltd. All rights reserved.
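The weighted kappa coefficients reported above credit partial agreement on ordered categories. The sketch below assumes linear or quadratic disagreement weights over an explicit category order; the abstract does not state which weighting the authors used, and the function name and arguments are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two raters on ordered categories.

    categories: the ordered list of possible ratings, lowest to highest.
    weights: "linear" or "quadratic" disagreement weights.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    # Joint proportions of (rater A category, rater B category).
    joint = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        joint[index[a]][index[b]] += 1.0 / n
    row = [sum(joint[i]) for i in range(k)]
    col = [sum(joint[i][j] for i in range(k)) for j in range(k)]
    power = 1 if weights == "linear" else 2

    def w(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the widest gap.
        return (abs(i - j) / (k - 1)) ** power

    observed = sum(w(i, j) * joint[i][j] for i in range(k) for j in range(k))
    expected = sum(w(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected
```

With weighting, a rating that is one category off counts as a smaller disagreement than one several categories off, which suits ordinal quantity estimates of the kind coded from lunch images.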
Inter-organizational network studies – a literature review

DEFF Research Database (Denmark)

Bergenholtz, Carsten; Waldstrøm, Christian

2011-01-01

of the methodological issues (e.g. unit of analysis and boundary specification) are more easily addressed. In order to map the different methodological approaches in the field of inter-organizational networks, this paper presents a large-scale systematic literature review of the last 12 years’ research on inter...
The reliability and validity of the Turkish version of Fullerton Advanced Balance (FAB-T) scale.

Science.gov (United States)

Iyigun, Gozde; Kirmizigil, Berkiye; Angin, Ender; Oksuz, Sevim; Can, Filiz; Eker, Levent; Rose, Debra J

2018-06-04

The aim of this study was to evaluate the reliability and validity of the Turkish version of the FAB (FAB-T) scale in older Turkish adults. The reliability and validity of the scale were tested on 200 community-dwelling older adults. The FAB-T scale was scored by different physiotherapists on different days to evaluate inter-rater and intra-rater reliability. The Berg Balance Scale (BBS) was used for the evaluation of convergent validity, and the content validity of the FAB-T scale was investigated. The FAB-T scale showed very high inter- and intra-rater reliability. For inter-rater agreement, ICC values on the individual test items and total score were 0.92 (95% CI; 0.90-0.94) and 0.96 (95% CI; 0.95-0.97) respectively. For intra-rater agreement, ICC values on the individual test items and total score were 0.93 (95% CI; 0.91-0.95) and 0.96 (95% CI; 0.95-0.97) respectively. There was good agreement between the FAB-T and BBS scales. A high correlation was found between the BBS and FAB-T scales [rho = 0.70 (95% CI; 0.62-0.76)], indicating good convergent validity. Considering the content validity of the FAB-T scale, no floor (floor score: 0%) or ceiling (ceiling score: 6.5%) effect was detected. The FAB-T scale was successfully translated from the original English version (FAB) and demonstrated strong psychometric features. It was found that the FAB-T scale has very high inter-rater and intra-rater reliability. Considering the convergent validity, the scale has high correlation with the BBS. The FAB-T has no floor or ceiling effect. Copyright © 2018 Elsevier B.V. All rights reserved.
Reliability and minimal detectable change of a modified passive neck flexion test in patients with chronic nonspecific neck pain and asymptomatic subjects.

Science.gov (United States)

López-de-Uralde-Villanueva, Ibai; Acuyo-Osorio, Mario; Prieto-Aldana, María; La Touche, Roy

2017-04-01

The Passive Neck Flexion Test (PNFT) can diagnose meningitis and potential spinal disorders. Little evidence is available concerning the use of a modified version of the PNFT (mPNFT) in patients with chronic nonspecific neck pain (CNSNP). To assess the reliability of the mPNFT in subjects with and without CNSNP. The secondary objective was to assess the differences in the symptoms provoked by the mPNFT between these two populations. We used a repeated-measures concordance design for the main objective and a cross-sectional design for the secondary objective. A total of 30 asymptomatic subjects and 34 patients with CNSNP were recruited. The following measures were recorded: the range of motion at the onset of symptoms (OS-mPNFT), the range of motion at the submaximal pain (SP-mPNFT), and evoked pain intensity on the mPNFT (VAS-mPNFT). Good to excellent reliability was observed for OS-mPNFT and SP-mPNFT in the asymptomatic group (intra-examiner reliability: 0.95-0.97; inter-examiner reliability: 0.86-0.90; intra-examiner test-retest reliability: 0.84-0.87). In the CNSNP group, good to excellent reliability was obtained for the OS-mPNFT (intra-examiner reliability: 0.89-0.96; inter-examiner reliability: 0.83-0.86; intra-examiner test-retest reliability: 0.83-0.85) and the SP-mPNFT (intra-examiner reliability: 0.94-0.98; inter-examiner reliability: 0.80-0.82; intra-examiner test-retest reliability: 0.88-0.91). The CNSNP group showed statistically significant differences in OS-mPNFT (t = 4.92; P reliable tool regardless of the examiner and the time factor. Patients with CNSNP have a decreased range of motion and more pain than asymptomatic subjects in the mPNFT. This exceeds the minimal detectable changes for OS-mPNFT and VAS-mPNFT. Copyright © 2017 Elsevier Ltd. All rights reserved.
Validity and reliability of the European Portuguese version of neuropsychiatric inventory in an institutionalized sample.

Science.gov (United States)

Ferreira, Ana Rita; Martins, Sonia; Ribeiro, Orquidea; Fernandes, Lia

2015-01-01

Neuropsychiatric symptoms are very common in dementia and have been associated with patient and caregiver distress, increased risk of institutionalization and higher costs of care. In this context, the neuropsychiatric inventory (NPI) is the most widely used comprehensive tool designed to measure neuropsychiatric symptoms in geriatric patients with dementia. The aim of this study was to present the validity and reliability of the European Portuguese version of NPI. A cross-sectional study was carried out with a convenience sample of institutionalized patients (≥ 50 years old) in three nursing homes in Portugal. All patients were also assessed with mini-mental state examination (MMSE) (cognition), geriatric depression scale (GDS) (depression) and adults and older adults functional assessment inventory (IAFAI) (functionality). NPI was administered to a formal caregiver, usually from the clinical staff. Inter-rater and test-retest reliability were assessed in a subsample of 25 randomly selected subjects. The sample included 166 elderly, with a mean age of 80.9 (standard deviation: 10.2) years. Three out of the NPI behavioral items had negative correlations with MMSE: delusions (rs = -0.177, P = 0.024), disinhibition (rs = -0.174, P = 0.026) and aberrant motor activity (rs = -0.182, P = 0.020). The NPI subsection of depression/dysphoria correlated positively with GDS total score (rs = 0.166, P = 0.038). NPI showed good internal consistency (overall α = 0.766; frequency α = 0.737; severity α = 0.734). The inter-rater reliability was excellent (intraclass correlation coefficient (ICC): 1.00, 95% confidence interval (CI) 1.00 - 1.00), as well as test-retest reliability (ICC: 0.91, 95% CI 0.80 - 0.96). The results found for convergent validity, inter-rater and test-retest reliability, showed that this version appears to be a valid and reliable instrument for evaluation of neuropsychiatric symptoms in institutionalized elderly.
Handheld mechanical nociceptive threshold testing in dairy cows – intra-individual variation, inter-observer agreement and variation over time

Science.gov (United States)

Raundal, Peter M; Andersen, Pia H; Toft, Nils; Forkman, Björn; Munksgaard, Lene; Herskin, Mette S

2014-01-01

Objective: To examine the use of handheld methodology to assess mechanical nociceptive threshold (MNT) on cows kept loose-housed. Study design: Prospective randomized partial cross-over experimental study. A one-factor (test day) design was used to evaluate MNT over time. Animals: One hundred and fifteen healthy, loose-housed Danish Holstein cattle. Methods: We evaluated intra-individual variation, inter-observer agreement and variation over time of MNT using two handheld devices and two stimulation sites. Mechanical, ramped stimulations were performed with an algometer (6.5 mm diameter steel probe, 0–10.0 kgf) or an electronic von Frey device (plastic tip with diameter 0.8 mm, 0–1000 gf). Each cow received 5–6 consecutive stimulations within a 2 × 5 cm skin area on the dorsal or lateral aspect of the left third metatarsus until an avoidance reaction occurred. We investigated the difference in precision [expressed as coefficient of variation (CV)] between the combinations of devices and stimulation sites. The inter-observer agreement and the difference in MNT between test day 1, 3, 7, 10 and 24 were investigated for selected combinations. Data were analysed in mixed models and Bland-Altman as relevant. Results: The CVs did not differ [range 0.34–0.52 (p = 0.1)]. Difference between observers (95% limits) was 0.2 kgf (2.8) and 4 gf (369) for the algometer and von Frey device, respectively. Mechanical nociceptive threshold increased from 361 on test day one to 495 gf on test day 24 (p < 0.01). Conclusion and clinical relevance: All methods showed a high degree of intra-individual variation, and no combination of device and stimulation site showed superior precision. Mean difference between observers was low, and MNT was not consistent over time. Further development of the methods is required before they can be used in research to investigate possible relations between claw lesions and hyperalgesia. PMID:24734991
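The observer difference with 95% limits reported in the abstract above is the kind of summary a Bland-Altman analysis produces: the mean of the paired differences (bias) plus and minus 1.96 standard deviations of those differences. The following is a minimal Python sketch under stated assumptions: the readings and the function name bland_altman are hypothetical, and the study's own mixed-model treatment of repeated stimulations is not reproduced.

```python
import statistics


def bland_altman(obs_a, obs_b):
    """Return (bias, lower limit, upper limit) for paired readings.

    bias is the mean of the paired differences; the limits are
    bias -/+ 1.96 * sample SD of the differences.
    """
    if len(obs_a) != len(obs_b) or len(obs_a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [a - b for a, b in zip(obs_a, obs_b)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical algometer thresholds (kgf) from two observers on the same cows.
observer_1 = [4.1, 5.0, 3.6, 6.2, 4.8]
observer_2 = [3.9, 5.3, 3.5, 5.8, 4.9]

bias, lower, upper = bland_altman(observer_1, observer_2)
print(f"bias = {bias:.2f} kgf, 95% limits = [{lower:.2f}, {upper:.2f}] kgf")
```

A small bias with wide limits, as in the abstract, would mean the observers agree on average while individual paired readings can still differ substantially.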
Intra-observer and interobserver reliability of One Leg Stand Test as a measure of postural balance in low back pain patients

DEFF Research Database (Denmark)

Maribo, Thomas; Iversen, Elena; Andersen, Niels Trolle

2009-01-01

Objective: To determine the absolute and relative reliability of intra-observer and interobserver measurements of postural balance using the One Leg Stand Test in patients with low back pain. Patients and methods...... to stand for the maximum time, and no further analysis was done. Eyes closed: intra-observer reliability was tested in 21 patients; absolute reliability showed a standard error of the measurement (SEM) of 2.48 s and a minimal detectable change (MDC) of 6.88. The relative reliability was acceptable...... with an intra class correlation coefficient (ICC) of 0.86. Interobserver reliability was tested in 27 patients; absolute reliability showed a SEM of 1.42 s and a MDC of 3.95. The relative reliability was acceptable with an ICC of 0.91. Conclusions: The One Leg Stand Test can be used to test postural balance...
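The SEM and MDC pairs in the abstract above are consistent with the commonly used relation MDC95 = 1.96 × √2 × SEM. The abstract does not state which formula the authors used, so the following Python sketch is an assumption-labelled check rather than their method; the function name mdc95 is hypothetical.

```python
import math


def mdc95(sem):
    """Minimal detectable change at 95% confidence: 1.96 * sqrt(2) * SEM.

    Assumed formula; the abstract reports SEM and MDC but not the relation.
    """
    if sem < 0:
        raise ValueError("SEM must be non-negative")
    return 1.96 * math.sqrt(2) * sem


# SEM values (seconds) as reported in the abstract.
for label, sem in [("intra-observer", 2.48), ("interobserver", 1.42)]:
    print(f"{label}: SEM = {sem:.2f} s, MDC95 = {mdc95(sem):.2f} s")
```

With the reported SEM values this gives about 6.87 s and 3.94 s, within rounding of the published 6.88 and 3.95, which suggests (but does not prove) that the SEM values were rounded before reporting.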
Optical observations of Magnetosphere-Ionosphere coupling: Inter-hemispheric electron reflections within pulsating aurora

Science.gov (United States)

Samara, M.; Michell, R.; Khazanov, G. V.; Grubbs, G. A., II

2017-12-01

Magnetosphere-Ionosphere coupling is exhibited in reflected primary and secondary electrons which constitute the second step in the formation of the total precipitating electron distribution. While they have largely been missing from the current theoretical studies of particle precipitation, ground based observations point to the existence of a reflected electron population. We present evidence that pulsating aurora is caused by electrons bouncing back and forth between the two hemispheres. This means that these electrons are responsible for some of the total light in the aurora, a possibility that has largely been ignored in theoretical models. Pulsating auroral events imaged optically at high time resolution present direct observational evidence in agreement with the inter-hemispheric electron bouncing predicted by the SuperThermal Electron Transport (STET) model. Immediately following each of the 'pulsation-on' times are equally spaced, and subsequently fainter pulsations, which can be explained by the primary precipitating electrons reflecting upwards from the ionosphere, traveling to the opposite hemisphere, and reflecting upwards again. The high time-resolution of these data, combined with the short duration of the 'pulsation-on' time ( 1 s) and the relatively long spacing between pulsations ( 6 to 9 s) made it possible to observe the faint optical pulses caused by the reflected electrons coming from the opposite hemisphere. These results are significant and have broad implications because they highlight that the formation of the auroral electron distributions within regions of diffuse and pulsating aurora contain contributions from reflected primary and secondary electrons. These processes can ultimately lead to larger fluxes than expected when considering only the primary injection of magnetospheric electrons.
Reliability and accuracy of three imaging software packages used for 3D analysis of the upper airway on cone beam computed tomography images.

Science.gov (United States)

Chen, Hui; van Eijnatten, Maureen; Wolff, Jan; de Lange, Jan; van der Stelt, Paul F; Lobbezoo, Frank; Aarab, Ghizlane

2017-08-01

The aim of this study was to assess the reliability and accuracy of three different imaging software packages for three-dimensional analysis of the upper airway using CBCT images. To assess the reliability of the software packages, 15 NewTom 5G® (QR Systems, Verona, Italy) CBCT data sets were randomly and retrospectively selected. Two observers measured the volume, minimum cross-sectional area and the length of the upper airway using Amira® (Visage Imaging Inc., Carlsbad, CA), 3Diagnosys® (3diemme, Cantu, Italy) and OnDemand3D® (CyberMed, Seoul, Republic of Korea) software packages. The intra- and inter-observer reliability of the upper airway measurements were determined using intraclass correlation coefficients and Bland & Altman agreement tests. To assess the accuracy of the software packages, one NewTom 5G® CBCT data set was used to print a three-dimensional anthropomorphic phantom with known dimensions to be used as the "gold standard". This phantom was subsequently scanned using a NewTom 5G® scanner. Based on the CBCT data set of the phantom, one observer measured the volume, minimum cross-sectional area, and length of the upper airway using Amira®, 3Diagnosys®, and OnDemand3D®, and compared these measurements with the gold standard. The intra- and inter-observer reliability of the measurements of the upper airway using the different software packages were excellent (intraclass correlation coefficient ≥0.75). There was excellent agreement between all three software packages in volume, minimum cross-sectional area and length measurements. All software packages underestimated the upper airway volume by -8.8% to -12.3%, the minimum cross-sectional area by -6.2% to -14.6%, and the length by -1.6% to -2.9%. All three software packages offered reliable volume, minimum cross-sectional area and length measurements of the upper airway. The length measurements of the upper airway were the most accurate results in all software packages. All
Assessing the reliability of the borderline regression method as a standard setting procedure for objective structured clinical examination

Directory of Open Access Journals (Sweden)

Sara Mortaz Hejri

2013-01-01

Background: One of the methods used for standard setting is the borderline regression method (BRM). This study aims to assess the reliability of BRM when the pass-fail standard in an objective structured clinical examination (OSCE) was calculated by averaging the BRM standards obtained for each station separately. Materials and Methods: In nine stations of the OSCE with direct observation, the examiners gave each student a checklist score and a global score. Using a linear regression model for each station, we calculated the checklist score cut-off on the regression equation for the global scale cut-off set at 2. The OSCE pass-fail standard was defined as the average of all stations' standards. To determine the reliability, the root mean square error (RMSE) was calculated. The R² coefficient and the inter-grade discrimination were calculated to assess the quality of the OSCE. Results: The mean total test score was 60.78. The OSCE pass-fail standard and its RMSE were 47.37 and 0.55, respectively. The R² coefficients ranged from 0.44 to 0.79. The inter-grade discrimination score varied greatly among stations. Conclusion: The RMSE of the standard was very small, indicating that BRM is a reliable method of setting the standard for an OSCE, which has the advantage of providing data for quality assurance.
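The procedure described in the abstract above (regress checklist scores on global scores per station, read off the checklist score at the borderline global grade, then average across stations) can be sketched as follows. This is a minimal Python illustration, not the authors' code: the station data and the names fit_line, station_standard and osce_standard are hypothetical, and only the borderline grade of 2 comes from the abstract.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least 2 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("global scores must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def station_standard(global_scores, checklist_scores, borderline=2):
    """Checklist cut-off predicted at the borderline global grade."""
    intercept, slope = fit_line(global_scores, checklist_scores)
    return intercept + slope * borderline


def osce_standard(stations, borderline=2):
    """Average the per-station cut-offs into one pass-fail standard."""
    standards = [station_standard(g, c, borderline) for g, c in stations]
    return sum(standards) / len(standards)


# Hypothetical (global scores, checklist scores) for two stations.
stations = [
    ([1, 2, 3, 4, 5], [30, 45, 60, 75, 90]),
    ([1, 2, 2, 3, 4], [35, 50, 48, 62, 80]),
]
print(f"OSCE pass-fail standard: {osce_standard(stations):.2f}")
```

Averaging per-station cut-offs, rather than pooling all stations into one regression, keeps each station's own relation between global grade and checklist score, which matches the design the abstract describes.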
Inter-firm Networks, Organizational Learning and Knowledge Updating: An Empirical Study

Science.gov (United States)

Zhang, Su-rong; Wang, Wen-ping

In the era of the knowledge-based economy, in which information technology develops rapidly, the rate of knowledge updating has become a critical factor for enterprises in gaining competitive advantage. We build an interactional theoretical model of inter-firm networks, organizational learning and knowledge updating, and then test it empirically. The results show that inter-firm networks and organizational learning are the sources of knowledge updating.
Pattern description and reliability parameters of six force-time related indices measured with plantar pressure measurements.

Science.gov (United States)

Deschamps, Kevin; Roosen, Philip; Bruyninckx, Herman; Desloovere, Kaat; Deleu, Paul-Andre; Matricali, Giovanni A; Peeraer, Louis; Staes, Filip

2013-09-01

Functional interpretation of plantar pressure measurements is commonly done through the use of ratios and indices which are preceded by the strategic combination of a subsampling method and selection of physical quantities. However, errors which may arise throughout the determination of these temporal indices/ratio calculations (T-IRC) have not been quantified. The purpose of the current study was therefore to estimate the reliability of T-IRC following semi-automatic total mapping (SATM). Using a repeated-measures design, two experienced therapists performed three subsampling sessions on three left and right pedobarographic footprints of ten healthy participants. Following the subsampling, six T-IRC were calculated: Rearfoot-Forefoot_fti, Rearfoot-Midfoot_fti, Forefoot medial/lateral_fti, First ray_fti, Metatarsal 1-Metatarsal 5_fti, Foot medial-lateral_fti. Patterns of the T-IRC were found to be consistent and in good agreement with corresponding knowledge from the literature. The inter-session errors of both therapists were similar in pattern and magnitude. The lowest peak inter-therapist error was found in the First ray_fti (6.5 a.u.) whereas the highest peak inter-therapist error was observed in the Forefoot medial/lateral_fti (27.0 a.u.) The magnitude of the inter-session and inter-therapist error varied over time, precluding the calculation of a simple numerical value for the error. The difference between both error parameters of all T-IRC was negligible which underscores the repeatability of the SATM protocol. The current study reports consistent patterns for six T-IRC and similar inter-session and inter-therapist error. The proposed SATM protocol and the T-IRC may therefore serve as basis for functional interpretation of footprint data. Copyright © 2013 Elsevier B.V. All rights reserved.

Long-term reliability of the visual EEG Poffenberger paradigm.

Science.gov (United States)

Friedrich, Patrick; Ocklenburg, Sebastian; Mochalski, Lisa; Schlüter, Caroline; Güntürkün, Onur; Genc, Erhan

2017-07-14

The Poffenberger paradigm is a simple perception task that is used to estimate the speed of information transfer between the two hemispheres, the so-called interhemispheric transfer time (IHTT). Although the original paradigm is a behavioral task, it can be combined with electroencephalography (EEG) to assess the underlying neurophysiological processes during task execution. While older studies have supported the validity of both paradigms for investigating interhemispheric interactions, their long-term reliability has not been assessed systematically before. The present study aims to fill this gap by determining both internal consistency and long-term test-retest reliability of IHTTs produced by using the two different versions of the Poffenberger paradigm in a sample of 26 healthy subjects. The results show high reliability for the EEG Poffenberger paradigm. In contrast, reliability measures for the behavioral Poffenberger paradigm were low. Hence, our results indicate that electrophysiological measures of interhemispheric transfer are more reliable than behavioral measures; the latter should be used with caution in research investigating inter-individual differences of neurocognitive measures. Copyright © 2017 Elsevier B.V. All rights reserved.
Validity and Reliability of 10-Hz Global Positioning System to Assess In-line Movement and Change of Direction

Directory of Open Access Journals (Sweden)

Pantelis T. Nikolaidis

2018-03-01

The objectives of the present study were to examine the validity and reliability of the 10 Hz Johan GPS unit in assessing in-line movement and change of direction. The validity was tested against the criterion measure of 200 m track-and-field (track-and-field athletes, n = 8) and 20 m shuttle run endurance test (female soccer players, n = 20). Intra-unit and inter-unit reliability was tested by intra-class correlation coefficient (ICC) and coefficient of variation (CV), respectively. An analysis of variance examined differences between the GPS measurement and five laps of 200 m at 15 km/h, and t-test examined differences between the GPS measurement and 20 m shuttle run endurance test. The difference between the GPS measurement and 200 m distance ranged from −0.13 ± 3.94 m (95% CI −3.42; 3.17) in the first lap to 2.13 ± 2.64 m (95% CI −0.08; 4.33) in the fifth lap. A good intra-unit reliability was observed in 200 m (ICC = 0.833, 95% CI 0.535; 0.962). Inter-unit CV ranged from 1.31% (fifth lap) to 2.20% (third lap). The difference between the GPS measurement and 20 m shuttle run endurance test ranged from 0.33 ± 4.16 m (95% CI −10.01; 10.68) in 11.5 km/h to 9.00 ± 5.30 m (95% CI 6.44; 11.56) in 8.0 km/h. A moderate intra-unit reliability was shown in the second and third stage of the 20 m shuttle run endurance test (ICC = 0.718, 95% CI 0.222; 0.898) and good reliability in the fifth, sixth, seventh and eighth (ICC = 0.831, 95% CI −0.229; 0.996). Inter-unit CV ranged from 2.08% (11.5 km/h) to 3.92% (8.5 km/h). Based on these findings, it was concluded that the 10 Hz Johan system offers an affordable valid and reliable tool for coaches and fitness trainers to monitor training and performance.
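The inter-unit reliability figures in the abstract above are coefficients of variation, that is, the standard deviation of the readings expressed as a percentage of their mean. A minimal Python sketch follows; the unit readings and the function name coefficient_of_variation are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption because the abstract does not say which form was used.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean, times 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


# Hypothetical distances (m) logged by three GPS units over the same 200 m lap.
lap_readings = [198.0, 202.0, 200.0]
print(f"inter-unit CV = {coefficient_of_variation(lap_readings):.2f}%")
```

Because the CV is scaled by the mean, values from laps or stages of different length can be compared directly, which is how the abstract reports a range across laps and running speeds.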
Some Observations on Imaging Inter-aquifer Leakage Using Airborne EM Technologies

DEFF Research Database (Denmark)

Munday, T; Fitzpatrick, A; Auken, Esben

of surface water-groundwater processes. This paper presents results from an examination of hydrogeophysics, specifically airborne electromagnetics (AEM) data acquired by the SkyTEM time domain helicopter EM system, as a means for improving our knowledge of spatial patterns associated with inter-aquifer mixing...
Reliability of magnetic resonance imaging assessment of rotator cuff: the ROW study.

Science.gov (United States)

Jain, Nitin B; Collins, Jamie; Newman, Joel S; Katz, Jeffrey N; Losina, Elena; Higgins, Laurence D

2015-03-01

Physiatrists encounter patients with rotator cuff disorders, and imaging is frequently an important component of their diagnostic assessment. However, there is a paucity of literature on the reliability of magnetic resonance imaging (MRI) assessment between shoulder specialists and musculoskeletal radiologists. We assessed inter- and intrarater reliability of MRI characteristics of the rotator cuff. Cross-sectional secondary analyses in a prospective cohort study. Academic tertiary care centers. Subjects with shoulder pain were recruited from orthopedic and physiatry clinics. Two shoulder-fellowship-trained physicians (a physiatrist and a shoulder surgeon) jointly performed a blinded composite MRI review by consensus of 31 subjects with shoulder pain. Subsequently, MRI was reviewed by one fellowship-trained musculoskeletal radiologist. We calculated the Cohen kappa coefficients and percentage agreement between the 2 reviews (composite review of 2 shoulder specialists versus that of the musculoskeletal radiologist). Intrarater reliability was assessed among the shoulder specialists by performing a repeated blinded composite MRI review. In addition to this repeated composite review, only one of the physiatry shoulder specialists performed an additional review. Interrater reliability (shoulder specialists versus musculoskeletal radiologist) was substantial for the presence or absence of tear (kappa 0.90 [95% confidence interval {CI}, 0.72-1.00]), tear thickness (kappa 0.84 [95% CI, 0.70-0.99]), longitudinal size of tear (kappa 0.75 [95% CI, 0.44-1.00]), fatty infiltration (kappa 0.62 [95% CI, 0.45-0.79]), and muscle atrophy (kappa 0.68 [95% CI, 0.50-0.86]). There was only fair interrater reliability of the transverse size of tear (kappa 0.20 [95% CI, 0.00-0.51]). The kappa for intrarater reliability was high for tear thickness (0.88 [95% CI, 0.72-1.00]), longitudinal tear size (0.61 [95% CI, 0.22-0.99]), fatty infiltration (0.89 [95% CI, 0.80-0.98]), and muscle atrophy
Development and Psychometric Properties of an Assessment for Persons with Intellectual Disability--The InterRAI ID

Science.gov (United States)

Martin, Lynn; Hirdes, John P.; Fries, Brant E.; Smith, Trevor F.

2007-01-01

This paper describes the development of the interRAI-Intellectual Disability (interRAI ID), a comprehensive instrument that assesses all key domains of interest to service providers relative to a person with an intellectual disability (ID). The authors report on the reliability and validity of embedded scales for cognition, self-care, aggression,…
Handheld mechanical nociceptive threshold testing in dairy cows - intra-individual variation, inter-observer agreement and variation over time.

Science.gov (United States)

Raundal, Peter M; Andersen, Pia H; Toft, Nils; Forkman, Björn; Munksgaard, Lene; Herskin, Mette S

2014-11-01

To examine the use of handheld methodology to assess mechanical nociceptive threshold (MNT) on cows kept loose-housed. Prospective randomized partial cross-over experimental study. A one-factor (test day) design was used to evaluate MNT over time. One hundred and fifteen healthy, loose-housed Danish Holstein cattle. We evaluated intra-individual variation, inter-observer agreement and variation over time of MNT using two handheld devices and two stimulation sites. Mechanical, ramped stimulations were performed with an algometer (6.5 mm diameter steel probe, 0-10.0 kgf) or an electronic von Frey device (plastic tip with diameter 0.8 mm, 0-1000 gf). Each cow received 5-6 consecutive stimulations within a 2 × 5 cm skin area on the dorsal or lateral aspect of the left third metatarsus until an avoidance reaction occurred. We investigated the difference in precision [expressed as coefficient of variation (CV)] between the combinations of devices and stimulation sites. The inter-observer agreement and the difference in MNT between test day 1, 3, 7, 10 and 24 were investigated for selected combinations. Data were analysed in mixed models and Bland-Altman as relevant. The CVs did not differ [range 0.34-0.52 (p = 0.1)]. Difference between observers (95% limits) was 0.2 kgf (2.8) and 4 gf (369) for the algometer and von Frey device, respectively. Mechanical nociceptive threshold increased from 361 on test day one to 495 gf on test day 24 (p < 0.01). All methods showed a high degree of intra-individual variation, and no combination of device and stimulation site showed superior precision. Mean difference between observers was low, and MNT was not consistent over time. Further development of the methods is required before they can be used in research to investigate possible relations between claw lesions and hyperalgesia. © 2014 The Authors Veterinary Anaesthesia and Analgesia published by John Wiley & Sons Ltd on behalf of Association of Veterinary Anaesthetists and the
Design, application and testing of the Work Observation Method by Activity Timing (WOMBAT) to measure clinicians' patterns of work and communication.

Science.gov (United States)

Westbrook, Johanna I; Ampt, Amanda

2009-04-01

Evidence regarding how health information technologies influence clinicians' patterns of work and support efficient practices is limited. Traditional paper-based data collection methods are unable to capture clinical work complexity and communication patterns. The use of electronic data collection tools for such studies is emerging yet is rarely assessed for reliability or validity. Our aim was to design, apply and test an observational method which incorporated the use of an electronic data collection tool for work measurement studies which would allow efficient, accurate and reliable data collection, and capture greater degrees of work complexity than current approaches. We developed an observational method and software for personal digital assistants (PDAs) which captures multiple dimensions of clinicians' work tasks, namely what task, with whom, and with what; tasks conducted in parallel (multi-tasking); interruptions and task duration. During field-testing over 7 months across four hospital wards, fifty-two nurses were observed for 250 h. Inter-rater reliability was tested and validity was measured by (i) assessing whether observational data reflected known differences in clinical role work tasks and (ii) by comparing observational data with participants' estimates of their task time distribution. Observers took 15-20 h of training to master the method and data collection process. Only 1% of tasks observed did not match the classification developed and were classified as 'other'. Inter-rater reliability scores of observers were maintained at over 85%. The results discriminated between the work patterns of enrolled and registered nurses consistent with differences in their roles. Survey data (n=27) revealed consistent ratings of tasks by nurses, and their rankings of most to least time-consuming tasks were significantly correlated with those derived from the observational data. Over 40% of nurses' time was spent in direct care or professional communication
Reliability of horizontal and vertical tube shift techniques in the localisation of supernumerary teeth.

Science.gov (United States)

Mallineni, S K; Anthonappa, R P; King, N M

2016-12-01

To assess the reliability of the vertical tube shift technique (VTST) and horizontal tube shift technique (HTST) for the localisation of unerupted supernumerary teeth (ST) in the anterior region of the maxilla. A convenience sample of 83 patients who attended a major teaching hospital because of unerupted ST was selected. Only non-syndromic patients with ST and who had complete clinical and radiographic and surgical records were included in the study. Ten examiners independently rated the paired set of radiographs for each technique. Chi-square test, paired t test and kappa statistics were employed to assess the intra- and inter-examiner reliability. Paired sets of 1660 radiographs (830 pairs for each technique) were available for the analysis. The overall sensitivity for VTST and HTST was 80.6 and 72.1% respectively, with slight inter-examiner and good intra-examiner reliability. Statistically significant differences were evident between the two localisation techniques (p HTST in the anterior region of the maxilla.
Variation in GMC Association Properties across the Bars, Spiral Arms, Inter-arms, and Circumnuclear Region of M100 (NGC 4321) Extracted from ALMA Observations

Energy Technology Data Exchange (ETDEWEB)

Pan, Hsi-An [Academia Sinica, Institute of Astronomy and Astrophysics (ASIAA), P.O. Box 23-141, Taipei 10617, Taiwan (China); Kuno, Nario, E-mail: hapan@asiaa.sinica.edu.tw [Faculty of Pure and Applied Sciences, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, Ibaraki 350-8577 (Japan)

2017-04-20

We study the physical properties of giant molecular cloud associations (GMAs) in M100 (NGC 4321) using the ALMA Science Verification feathered (12 m+ACA) data in ¹²CO (1–0). To examine the environmental dependence of their properties, GMAs are classified based on their locations in various environments as circumnuclear ring (CNR), bar, spiral, and inter-arm GMAs. The CNR GMAs are massive and compact, while the inter-arm GMAs are diffuse, with low surface density. GMA mass and size are strongly correlated, as suggested by Larson. However, the diverse power-law index of the relation implies that the GMA properties are not uniform among the environments. The CNR and bar GMAs show higher velocity dispersion than those in other environments. We find little evidence for a correlation between GMA velocity dispersion and size, which indicates that the GMAs are in diverse dynamical states. Indeed, the virial parameter of the GMAs spans nearly two orders of magnitude. Only the spiral GMAs are generally self-gravitating. Star formation activity decreases in order over the CNR, spiral, bar, and inter-arm GMAs. The diverse GMA and star formation properties in different environments lead to variations in the Kennicutt–Schmidt relation. A combination of multiple mechanisms or gas phase change is necessary to explain the observed slopes. Comparisons of GMA properties acquired with the use of the 12 m array observations with those from the feathered data are also presented. The results show that the missing flux and extended emission cannot be neglected for the study of environmental dependence.
Intra-observer reproducibility and interobserver reliability of the radiographic parameters in the Spinal Deformity Study Group's AIS Radiographic Measurement Manual.

Science.gov (United States)

Dang, Natasha Radhika; Moreau, Marc J; Hill, Douglas L; Mahood, James K; Raso, James

2005-05-01

Retrospective cross-sectional assessment of the reproducibility and reliability of radiographic parameters. To measure the intra-examiner and interexaminer reproducibility and reliability of salient radiographic features. The management and treatment of adolescent idiopathic scoliosis (AIS) depends on accurate and reproducible radiographic measurements of the deformity. Ten sets of radiographs were randomly selected from a sample of patients with AIS, with initial curves between 20 degrees and 45 degrees. Fourteen measures of the deformity were measured from posteroanterior and lateral radiographs by 2 examiners, and were repeated 5 times at intervals of 3-5 days. Intra-examiner and interexaminer differences were examined. The parameters include measures of curve size, spinal imbalance, sagittal kyphosis and alignment, maximum apical vertebral rotation, T1 tilt, spondylolysis/spondylolisthesis, and skeletal age. Intra-examiner reproducibility was generally excellent for parameters measured from the posteroanterior radiographs but only fair to good for parameters from the lateral radiographs, in which some landmarks were not clearly visible. Of the 13 parameters observed, 7 had excellent interobserver reliability. The measurements from the lateral radiograph were less reproducible and reliable and, thus, may not add value to the assessment of AIS. Taking additional measures encourages a systematic and comprehensive assessment of spinal radiographs.
An inter-observer agreement study of autofluorescence endoscopy in Barrett's esophagus among expert and non-expert endoscopists.

Science.gov (United States)

Mannath, J; Subramanian, V; Telakis, E; Lau, K; Ramappa, V; Wireko, M; Kaye, P V; Ragunath, K

2013-02-01

Autofluorescence imaging (AFI), which is a "red flag" technique during Barrett's surveillance, is associated with significant false positive results. The aim of this study was to assess the inter-observer agreement (IOA) in identifying AFI-positive lesions and to assess the overall accuracy of AFI. Anonymized AFI and high resolution white light (HRE) images were prospectively collected. The AFI images were presented in random order, followed by corresponding AFI + HRE images. Three AFI experts and 3 AFI non-experts scored images after a training presentation. The IOA was calculated using kappa and accuracy was calculated with histology as gold standard. Seventy-four sets of images were prospectively collected from 63 patients (48 males, mean age 69 years). The IOA for number of AF positive lesions was fair when AFI images were presented. This improved to moderate with corresponding AFI and HRE images [experts 0.57 (0.44-0.70), non-experts 0.47 (0.35-0.62)]. The IOA for the site of AF lesion was moderate for experts and fair for non-experts using AF images, which improved to substantial for experts [κ = 0.62 (0.50-0.72)] but remained at fair for non-experts [κ = 0.28 (0.18-0.37)] with AFI + HRE. Among experts, the accuracy of identifying dysplasia was 0.76 (0.7-0.81) using AFI images and 0.85 (0.79-0.89) using AFI + HRE images. The accuracy was 0.69 (0.62-0.74) with AFI images alone and 0.75 (0.70-0.80) using AFI + HRE among non-experts. The IOA for AF positive lesions is fair to moderate using AFI images which improved with addition of HRE. The overall accuracy of identifying dysplasia was modest, and was better when AFI and HRE images were combined.
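The kappa statistic named in the abstract above corrects raw agreement for the agreement expected by chance. A minimal sketch for two raters follows; the function name `cohen_kappa` and the scores are hypothetical, and the abstract does not say which kappa variant was used for its groups of three raters, so the two-rater form is an assumption made for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical scores (not study data): 1 = lesion flagged, 0 = not flagged.
expert_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
expert_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
kappa = cohen_kappa(expert_1, expert_2)
print(round(kappa, 2))
```

With more than two raters, a multi-rater statistic or an average of pairwise kappas would be needed instead.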
Children's Physical Activity While Gardening: Development of a Valid and Reliable Direct Observation Tool.

Science.gov (United States)

Myers, Beth M; Wells, Nancy M

2015-04-01

Gardens are a promising intervention to promote physical activity (PA) and foster health. However, because of the unique characteristics of gardening, no extant tool can capture PA, postures, and motions that take place in a garden. The Physical Activity Research and Assessment tool for Garden Observation (PARAGON) was developed to assess children's PA levels, tasks, postures, and motions, associations, and interactions while gardening. PARAGON uses momentary time sampling in which a trained observer watches a focal child for 15 seconds and then records behavior for 15 seconds. Sixty-five children (38 girls, 27 boys) at 4 elementary schools in New York State were observed over 8 days. During the observation, children simultaneously wore Actigraph GT3X+ accelerometers. The overall interrater reliability was 88% agreement, and Ebel was .97. Percent agreement values for activity level (93%), garden tasks (93%), motions (80%), associations (95%), and interactions (91%) also met acceptable criteria. Validity was established by previously validated PA codes and by expected convergent validity with accelerometry. PARAGON is a valid and reliable observation tool for assessing children's PA in the context of gardening.
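Percent agreement, as reported in the abstract above, is the share of observation intervals in which two observers recorded the same code. A minimal sketch, assuming one code per interval and using hypothetical activity codes that are not taken from PARAGON:

```python
def percent_agreement(observer_a, observer_b):
    """Percentage of intervals in which both observers recorded the same code."""
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("records must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * matches / len(observer_a)


# Hypothetical codes for eight consecutive observation intervals.
observer_1 = ["sit", "stand", "walk", "walk", "stand", "kneel", "walk", "stand"]
observer_2 = ["sit", "stand", "walk", "stand", "stand", "kneel", "walk", "stand"]
agreement = percent_agreement(observer_1, observer_2)
print(agreement)
```

Unlike kappa, this figure is not corrected for chance agreement.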
Development and reliability testing of the Nordic Housing Enabler – an instrument for accessibility assessment of the physical housing

DEFF Research Database (Denmark)

Helle, Tina

and adapted according to accessibility norms and guidelines for housing design in Sweden, Denmark, Iceland and Finland. This iterative process involved occupational therapists, architects, building engineers and professional translators, resulting in the Nordic Housing Enabler. For reliability testing...... serious deficits when it comes to accessibility. This study addresses development of a content valid cross-Nordic version of the Housing Enabler and investigation of inter-rater reliability, when used in occupational therapy practice. The instrument was translated from the original Swedish version......, the sample strategy and data collection procedures were the same in all countries. In total, twenty voluntary occupational therapists collected data from 106 cases by means of the Nordic Housing Enabler. Inter-rater reliability was calculated by means of percentage agreement and kappa statistics. Overall...
Reliability and validity of a Chinese version of the Diagnostic Interview for Borderlines-Revised.

Science.gov (United States)

Wang, Lanlan; Yuan, Chenmei; Qiu, Jianying; Gunderson, John; Zhang, Min; Jiang, Kaida; Leung, Freedom; Zhong, Jie; Xiao, Zeping

2014-09-01

Borderline personality disorder (BPD) is the most studied of the axis II disorders. One of the most widely used diagnostic instruments is the Diagnostic Interview for Borderline Patients-Revised (DIB-R). The aim of this study was to test the reliability and validity of DIB-R for use in the Chinese culture. The reliability and validity of the DIB-R Chinese version were assessed in a sample of 236 outpatients with a probable BPD diagnosis. The Structured Clinical Interview for DSM-IV Personality Disorders (SCID-II) was used as a standard. Test-retest reliability was tested six months later with 20 patients, and inter-rater reliability was tested on 32 patients. The Chinese version of the DIB-R showed good internal global consistency (Cronbach's α of 0.916), good test-retest reliability (Pearson correlation of 0.704), and good inter-rater reliability (intra-class correlation coefficient of 0.892 and kappa of 0.861). When compared with the DSM-IV diagnosis as measured by the SCID-II, the DIB-R showed relatively good sensitivity (0.768) and specificity (0.891) at the cutoff of 7, moderate diagnostic convergence (kappa of 0.631), as well as good discriminating validity. The Chinese version of the DIB-R has good psychometric properties, which renders it a valuable method for examining the presence, the severity, and component phenotypes of BPD in Chinese samples. © 2013 Wiley Publishing Asia Pty Ltd.
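Sensitivity and specificity at a score cutoff, as reported in the abstract above, can be sketched as follows. The function name and data are hypothetical, and treating a score at or above the cutoff as a positive screen is an assumption; the abstract does not state the direction of the rule.

```python
def sensitivity_specificity(scores, has_diagnosis, cutoff):
    """Return (sensitivity, specificity) for the rule: score >= cutoff is positive."""
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, has_diagnosis):
        predicted = score >= cutoff  # assumed direction of the cutoff
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the diagnosis")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical interview scores and reference-standard diagnoses (not study data).
scores = [9, 8, 7, 6, 5, 7, 3, 8, 2, 6]
reference = [True, True, True, True, False, False, False, True, False, False]
sens, spec = sensitivity_specificity(scores, reference, cutoff=7)
print(sens, spec)
```

Repeating the call over a range of cutoffs shows the trade-off between the two measures.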
Water vapour inter-comparison effort in the framework of the Hydrological Cycle in the Mediterranean Experiment - Special Observation Period (HyMeX-SOP1)

Science.gov (United States)

Summa, Donato; Di Girolamo, Paolo; Flamant, Cyrille; De Rosa, Benedetto; Cacciani, Marco; Stelitano, Dario

2018-04-01

Accurate measurements of the vertical profiles of water vapour are of paramount importance for most key areas of atmospheric sciences. A comprehensive inter-comparison between different remote sensing and in-situ sensors has been carried out in the framework of the first Special Observing Period of the Hydrological cycle in the Mediterranean Experiment for the purpose of obtaining accurate error estimates for these sensors. The inter-comparison involves a ground-based Raman lidar (BASIL), an airborne DIAL (LEANDRE2), a microwave radiometer, radiosondes and aircraft in-situ sensors.
[Care quality: reliability and usefulness of observation data in benchmarking nursing homes and homes for the aged in the Netherlands].

Science.gov (United States)

Frijters, Dinnus; Gerritsen, Debby; Steverink, Nardi

2003-02-01

Before including quality of care indicators in the Benchmark of Nursing Homes and Homes for the Aged in the Netherlands, the reliability and usefulness of the patient data collection had to be established. The patient data items were derived from the Resident Assessment Instruments (RAI) and a questionnaire on social interaction in elderly people. Three nursing homes and three homes for the aged participated in the test with 550 patients. 279 x 2 assessments were collected by independent raters for an inter-rater reliability test; 259 x 2 by the same rater for a test-retest reliability test; and 24 by a single rater. The scores on paired assessment forms were compared with the weighted Kappa agreement test. The test results allowed 10 of the 13 quality indicators from RAI to be retained. In addition, new quality indicators could be defined on 'giving attention' and 'unrespectful addressing'. We estimate on the basis of a questionnaire for the raters that on average 9 to 12 minutes per patient are needed to collect and enter data for the resulting 12 quality indicators.
The relative and absolute reliability of the Functional Independence and Difficulty Scale in community-dwelling frail elderly Japanese people using long-term care insurance services.

Science.gov (United States)

Saito, Takashi; Izawa, Kazuhiro P; Watanabe, Shuichiro

2017-06-01

The newly developed Functional Independence and Difficulty Scale is a tool for assessing the performance of basic activities of daily living in terms of both independence and difficulty. The reliability of this new scale has not been assessed. The aim of this study was to examine the relative reliability and absolute reliability of the newly developed scale in community-dwelling frail elderly people in Japan. Participants were 47 community-dwelling elderly subjects (22 for assessing test-retest reliability and 25 for assessing inter-rater reliability). As relative reliability indices, intra-class correlation coefficients were used. From an absolute reliability perspective, we conducted Bland-Altman analysis and calculated the limits of agreement or minimal detectable change to determine the acceptable range of error. Intra-class correlation coefficients for test-retest and inter-rater reliability were 0.90 (P reliability was -5.2 to 1.8, representing an increase of over six points for improvement and a decrease of over two points for decline of basic activities of daily living ability. The minimal detectable change for inter-rater reliability was 3.7, indicating that a three-point difference might exist between different raters. The results of this study demonstrated that the FIDS appeared to be a reliable instrument for use in Japanese community-dwelling frail elderly people. While further research using a larger and more diverse sample of participants is needed, our findings support the use of FIDS in clinical practice or clinical research targeting frail elderly Japanese people.
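The two absolute-reliability quantities named in the abstract above can be sketched as below. The abstract does not give its formulas, so the commonly used definitions are assumed here: 95% limits of agreement as bias ± 1.96 × SD of the paired differences, and MDC95 = 1.96 × √2 × SEM with SEM = SD × √(1 − ICC). All names and numbers are hypothetical.

```python
import math
import statistics


def limits_of_agreement(first, second):
    """Bland-Altman bias and 95% limits of agreement for paired measurements."""
    diffs = [b - a for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def minimal_detectable_change(sd_scores, icc):
    """MDC95 under the assumed formula 1.96 * sqrt(2) * SD * sqrt(1 - ICC)."""
    sem = sd_scores * math.sqrt(1.0 - icc)  # standard error of measurement
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical scale totals from two sessions (not study data).
session_1 = [40, 35, 42, 38, 45]
session_2 = [41, 33, 43, 37, 47]
bias, lower, upper = limits_of_agreement(session_1, session_2)
mdc = minimal_detectable_change(sd_scores=4.0, icc=0.91)
print(round(bias, 2), round(lower, 2), round(upper, 2), round(mdc, 2))
```

A change score smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.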
Intra-/inter-laboratory validation study on reactive oxygen species assay for chemical photosafety evaluation using two different solar simulators.

Science.gov (United States)

Onoue, Satomi; Hosoi, Kazuhiro; Toda, Tsuguto; Takagi, Hironori; Osaki, Naoto; Matsumoto, Yasuhiro; Kawakami, Satoru; Wakuri, Shinobu; Iwase, Yumiko; Yamamoto, Toshinobu; Nakamura, Kazuichi; Ohno, Yasuo; Kojima, Hajime

2014-06-01

A previous multi-center validation study demonstrated high transferability and reliability of reactive oxygen species (ROS) assay for photosafety evaluation. The present validation study was undertaken to verify further the applicability of different solar simulators and assay performance. In 7 participating laboratories, 2 standards and 42 coded chemicals, including 23 phototoxins and 19 non-phototoxic drugs/chemicals, were assessed by the ROS assay using two different solar simulators (Atlas Suntest CPS series, 3 labs; and Seric SXL-2500V2, 4 labs). Irradiation conditions could be optimized using quinine and sulisobenzone as positive and negative standards to offer consistent assay outcomes. In both solar simulators, the intra- and inter-day precisions (coefficient of variation; CV) for quinine were found to be below 10%. The inter-laboratory CV for quinine averaged 15.4% (Atlas Suntest CPS) and 13.2% (Seric SXL-2500V2) for singlet oxygen and 17.0% (Atlas Suntest CPS) and 7.1% (Seric SXL-2500V2) for superoxide, suggesting high inter-laboratory reproducibility even though different solar simulators were employed for the ROS assay. In the ROS assay on 42 coded chemicals, some chemicals (ca. 19-29%) were unevaluable because of limited solubility and spectral interference. Although several false positives appeared with positive predictivity of ca. 76-92% (Atlas Suntest CPS) and ca. 75-84% (Seric SXL-2500V2), there were no false negative predictions with either solar simulator. A multi-center validation study on the ROS assay demonstrated satisfactory transferability, accuracy, precision, and predictivity, as well as the availability of other solar simulators. Copyright © 2013 Elsevier Ltd. All rights reserved.
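The coefficient of variation used above as the precision measure is the standard deviation expressed as a percentage of the mean. A minimal sketch with hypothetical readouts; using the sample standard deviation rather than the population one is an assumption, since the abstract does not say.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical readouts for a positive standard on five days (not study data).
inter_day = [100.0, 104.0, 96.0, 102.0, 98.0]
cv = coefficient_of_variation(inter_day)
print(round(cv, 2))
```

The same function applies to an inter-laboratory CV when each value is one laboratory's mean readout.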
Real-Time Observation of Apathy in Long-Term Care Residents With Dementia: Reliability of the Person-Environment Apathy Rating Scale.

Science.gov (United States)

Jao, Ying-Ling; Mogle, Jacqueline; Williams, Kristine; McDermott, Caroline; Behrens, Liza

2018-04-01

Apathy is prevalent in individuals with dementia. Lack of responsiveness to environmental stimulation is a key characteristic of apathy. The Person-Environment Apathy Rating (PEAR) scale consists of environment and apathy subscales, which allow for examination of environmental impact on apathy. The interrater reliability of the PEAR scale was examined via real-time observation. The current study included 45 observations of 15 long-term care residents with dementia. Each participant was observed at three time points for 10 minutes each. Two raters observed the participant and surrounding environment and independently rated the participant's apathy and environmental stimulation using the PEAR scale. Weighted Kappa was 0.5 to 0.82 for the PEAR-Environment subscale and 0.5 to 0.8 for the PEAR-Apathy subscale. Overall, with the exception of three items with relatively weak reliability (0.50 to 0.56), the PEAR scale showed moderate to strong interrater reliability (0.63 to 0.82). The results support the use of the PEAR scale to measure environmental stimulation and apathy via real-time observation in long-term care residents with dementia. [Journal of Gerontological Nursing, 44(4), 23-28.]. Copyright 2018, SLACK Incorporated.
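Weighted kappa, reported in the abstract above, gives partial credit when two raters choose neighbouring categories on an ordinal scale. The sketch below uses disagreement weights |i − j| / (k − 1) raised to a power (1 for linear, 2 for quadratic). The abstract does not state which weighting was used, and the four-point scale and ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, power=1):
    """Weighted Cohen's kappa; power=1 is linear, power=2 quadratic weighting."""
    k, n = len(categories), len(rater_a)
    if k < 2 or n == 0 or n != len(rater_b):
        raise ValueError("need two or more categories and equal-length ratings")
    index = {c: i for i, c in enumerate(categories)}
    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            num += w * observed[i][j]   # weighted observed disagreement
            den += w * row[i] * col[j]  # weighted chance disagreement
    if den == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - num / den


# Hypothetical ratings on an assumed 1-4 ordinal scale (not study data).
rater_1 = [1, 2, 3, 4, 2, 3, 1, 4]
rater_2 = [1, 2, 4, 4, 2, 2, 1, 3]
kw = weighted_kappa(rater_1, rater_2, categories=[1, 2, 3, 4])
print(round(kw, 2))
```

Quadratic weights penalise distant disagreements more heavily and usually give a different value from linear weights on the same data.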
Reliability of the Emergency Severity Index: Meta-analysis

Directory of Open Access Journals (Sweden)

Amir Mirhaghi

2015-01-01

Objectives: Although triage systems based on the Emergency Severity Index (ESI) have many advantages in terms of simplicity and clarity, previous research has questioned their reliability in practice. Therefore, the aim of this meta-analysis was to determine the reliability of ESI triage scales. Methods: This meta-analysis was performed in March 2014. Electronic research databases were searched and articles conforming to the Guidelines for Reporting Reliability and Agreement Studies were selected. Two researchers independently examined selected abstracts. Data were extracted in the following categories: version of scale (latest/older), participants (adult/paediatric), raters (nurse, physician or expert), method of reliability (intra/inter-rater), reliability statistics (weighted/unweighted kappa) and the origin and publication year of the study. The effect size was obtained by the Z-transformation of reliability coefficients. Data were pooled with random-effects models and a meta-regression was performed based on the method of moments estimator. Results: A total of 19 studies from six countries were included in the analysis. The pooled coefficient for the ESI triage scales was substantial at 0.791 (95% confidence interval: 0.787‒0.795). Agreement was higher with the latest and adult versions of the scale and among expert raters, compared to agreement with older and paediatric versions of the scales and with other groups of raters, respectively. Conclusion: ESI triage scales showed an acceptable level of overall reliability. However, ESI scales require more development in order to see full agreement from all rater groups. Further studies concentrating on other aspects of reliability assessment are needed.
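The pooling steps described in the Methods above (Z-transformation, then a random-effects model with a method-of-moments estimator) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the within-study variance 1 / (n − 3) is the standard choice for correlation coefficients and is only assumed to be acceptable for kappa, the between-study variance uses the DerSimonian-Laird form of the moment estimator, and all coefficients and sample sizes are hypothetical.

```python
import math


def pool_random_effects(coefficients, sample_sizes):
    """Pool reliability coefficients on Fisher's z scale with a moment-based tau^2."""
    if len(coefficients) < 2 or len(coefficients) != len(sample_sizes):
        raise ValueError("need two or more studies with matching sample sizes")
    z = [math.atanh(r) for r in coefficients]      # Fisher z-transformation
    v = [1.0 / (n - 3) for n in sample_sizes]      # assumed within-study variance
    w = [1.0 / vi for vi in v]
    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(z) - 1)) / c)
    # Random-effects weights add tau^2 to every study's variance.
    w_star = [1.0 / (vi + tau2) for vi in v]
    z_pooled = sum(wi * zi for wi, zi in zip(w_star, z)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    low, high = z_pooled - 1.96 * se, z_pooled + 1.96 * se
    # Back-transform the pooled estimate and its interval to the original scale.
    return math.tanh(z_pooled), (math.tanh(low), math.tanh(high))


# Hypothetical per-study coefficients and sample sizes (not study data).
kappas = [0.72, 0.81, 0.78, 0.85]
sizes = [120, 200, 90, 300]
pooled, (ci_low, ci_high) = pool_random_effects(kappas, sizes)
print(round(pooled, 3), round(ci_low, 3), round(ci_high, 3))
```

Pooling on the z scale keeps the back-transformed estimate and interval inside the valid range of the coefficient.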
