Jesús F. Salgado
There is criticism in the literature about the use of interrater coefficients to correct for criterion reliability in validity generalization (VG) studies, and dispute over whether .52 is an accurate and non-dubious estimate of the interrater reliability of overall job performance (OJP) ratings. We present a second-order meta-analysis of three independent meta-analytic studies of the interrater reliability of job performance ratings and make a number of comments and reflections on LeBreton et al.'s paper. The results of our meta-analysis indicate that the interrater reliability for a single rater is .52 (k = 66, N = 18,582, SD = .105). Our main conclusions are: (a) the value of .52 is an accurate estimate of the interrater reliability of overall job performance for a single rater; (b) it is not reasonable to conclude that past VG studies that used .52 as the criterion reliability value have a less than secure statistical foundation; (c) based on interrater reliability, test-retest reliability, and coefficient alpha, supervisor ratings are a useful and appropriate measure of job performance and can be confidently used as a criterion; (d) validity correction for criterion unreliability has been unanimously recommended by "classical" psychometricians and I/O psychologists as the proper way to estimate predictor validity, and is still recommended at present; (e) the substantive contribution of VG procedures to inform HRM practices in organizations should not be lost in these technical points of debate.
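The correction for criterion unreliability discussed in this abstract can be sketched with the classical disattenuation formula, in which the observed validity is divided by the square root of the criterion reliability. The sketch below is illustrative only: the function names and the observed validity of 0.25 are hypothetical, and only the single-rater reliability of .52 comes from the abstract. The Spearman-Brown step is a standard companion formula, not something the abstract reports.

```python
import math


def correct_for_criterion_unreliability(r_observed, r_yy):
    """Classical disattenuation: divide by sqrt of the criterion reliability."""
    if not 0.0 < r_yy <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return r_observed / math.sqrt(r_yy)


def spearman_brown(r_single, k):
    """Reliability of the average of k raters, given single-rater reliability."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r_single / (1 + (k - 1) * r_single)


# Hypothetical observed validity; .52 is the single-rater value from the abstract.
observed_validity = 0.25
interrater_reliability = 0.52

corrected = correct_for_criterion_unreliability(observed_validity,
                                                interrater_reliability)
print(round(corrected, 3))                                  # 0.347
print(round(spearman_brown(interrater_reliability, 2), 3))  # 0.684
```

The lower the assumed criterion reliability, the larger the corrected validity, which is why the choice of .52 matters for VG conclusions.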
Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina
We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observation by one observer, as is often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend that the protocol be used for educational purposes only.
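The two acceptability thresholds named above (proportional agreement >0.7 and kappa >0.4) can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa for two observers, which the abstract does not confirm. The ratings are invented for illustration and the function name is hypothetical.

```python
from collections import Counter


def agreement_and_kappa(ratings_a, ratings_b):
    """Return (proportional agreement, unweighted Cohen's kappa) for two raters."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both raters must score the same non-empty set of items")
    # Observed agreement: share of items given the same category by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category only")
    return p_observed, (p_observed - p_expected) / (1 - p_expected)


# Invented yes/no answers (1/0) by two observers on ten videos.
observer_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
observer_2 = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]

p_o, kappa = agreement_and_kappa(observer_1, observer_2)
print(round(p_o, 2), round(kappa, 3))                # 0.8 0.583
print(p_o > 0.7 and kappa > 0.4)                     # True
```

Kappa is lower than raw agreement because it discounts the agreement expected by chance from the raters' marginal frequencies.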
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62–0.71 for existing, and 0.74–0.76 for new PAQs. Median validity coefficients ranged from 0.30–0.39 for existing, and from 0.25–0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument. PMID:22938557
Davies, Kylie; Bulsara, Max K; Ramelet, Anne-Sylvie; Monterosso, Leanne
To establish criterion-related construct validity and test-retest reliability for the Endotracheal Suction Assessment Tool© (ESAT©). Endotracheal tube suction performed in children can significantly affect clinical stability. Previously identified clinical indicators for endotracheal tube suction were used as criteria when designing the ESAT©. Content validity was reported previously. The final stages of psychometric testing are presented. Observational testing was used to measure construct validity and determine whether the ESAT© could guide "inexperienced" paediatric intensive care nurses' decision-making regarding endotracheal tube suction. Test-retest reliability of the ESAT© was performed at two time points. The researchers and paediatric intensive care nurse "experts" developed 10 hypothetical clinical scenarios with predetermined endotracheal tube suction outcomes. "Experienced" (n = 12) and "inexperienced" (n = 14) paediatric intensive care nurses were presented with the scenarios and the ESAT© guiding decision-making about whether to perform endotracheal tube suction for each scenario. Outcomes were compared with those predetermined by the "experts" (n = 9). Test-retest reliability of the ESAT© was measured at two consecutive time points (4 weeks apart) with "experienced" and "inexperienced" paediatric intensive care nurses using the same scenarios and tool to guide decision-making. No differences were observed between endotracheal tube suction decisions made by "experts" (n = 9), "inexperienced" (n = 14) and "experienced" (n = 12) nurses confirming the tool's construct validity. No differences were observed between groups for endotracheal tube suction decisions at T1 and T2. Criterion-related construct validity and test-retest reliability of the ESAT© were demonstrated. Further testing is recommended to confirm reliability in the clinical setting with the "inexperienced" nurse to guide decision-making related to endotracheal tube
Galán-Mercant, Alejandro; Barón-López, Francisco Javier; Labajos-Manzanares, María T; Cuesta-Vargas, Antonio I
Background The capacity to diagnose, quantify and evaluate movement beyond the general confines of a clinical environment under effectiveness conditions may alleviate rampant strain on limited, expensive and highly specialized medical resources. An iPhone 4® incorporates a three-dimensional accelerometer subsystem with highly robust software applications. The present study aimed to evaluate the reliability and concurrent criterion-related validity of the accelerations with an iPhone 4® in an Exte...
Brinklov, Cecilie Fau; Thorsen, Ida Kær; Karstoft, Kristian
Background: Prevention of multi-morbidities following non-communicable diseases requires a systematic registration of adverse modifiable risk factors, including low physical fitness. The aim of the study was to establish criterion validity and reliability of a smartphone app (InterWalk) delivered....... The algorithm was validated using leave-one-out cross validation. Test-retest reliability was tested in a subset of participants (N = 10). Results: The overall VO2peak prediction of the algorithm (R2) was 0.60 and 0.45 when the smartphone was placed in the pockets of the pants and jacket, respectively (p ... calorimetry and the acceleration (vector magnitude) from the smartphone was obtained. The vector magnitude was used to predict VO2peak along with the co-variates weight, height and sex. The validity of the algorithm was tested when the smartphone was placed in the right pocket of the pants or jacket...
Helmerhorst, Hendrik J. F.; Brage, Søren; Warren, Janet; Besson, Herve; Ekelund, Ulf
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and
Jung, Sung-Hoon; Kwon, Oh-Yun; Jeon, In-Cheol; Hwang, Ui-Jae; Weon, Jong-Hyuck
The purposes of this study were to determine the intra-rater test-retest reliability of a smart phone-based measurement tool (SBMT) and a three-dimensional (3D) motion analysis system for measuring the transverse rotation angle of the pelvis during single-leg lifting (SLL) and the criterion validity of the transverse rotation angle of the pelvis measurement using SBMT compared with a 3D motion analysis system (3DMAS). Seventeen healthy volunteers performed SLL with their dominant leg without bending the knee until they reached a target placed 20 cm above the table. This study used a 3DMAS, considered the gold standard, to measure the transverse rotation angle of the pelvis to assess the criterion validity of the SBMT measurement. Intra-rater test-retest reliability was determined using the SBMT and 3DMAS using intra-class correlation coefficient (ICC) [3,1] values. The criterion validity of the SBMT was assessed with ICC [3,1] values. Both the 3DMAS (ICC = 0.77) and SBMT (ICC = 0.83) showed excellent intra-rater test-retest reliability in the measurement of the transverse rotation angle of the pelvis during SLL in a supine position. Moreover, the SBMT showed an excellent correlation with the 3DMAS (ICC = 0.99). Measurement of the transverse rotation angle of the pelvis using the SBMT showed excellent reliability and criterion validity compared with the 3DMAS.
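For readers unfamiliar with the ICC [3,1] statistic used above, the following is a minimal sketch of the Shrout and Fleiss two-way mixed, consistency, single-measure form, computed from ANOVA mean squares as (MS_rows - MS_error) / (MS_rows + (k - 1) MS_error). The scores are invented and the function name is hypothetical; the study's own data and software are not given in the abstract.

```python
def icc_3_1(scores):
    """ICC(3,1) for an n-subjects by k-trials table (list of equal-length rows)."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 trials, with no missing cells")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between trials
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Invented angles (degrees) for four subjects measured on two occasions.
angles = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(icc_3_1(angles), 3))                     # 0.6
```

Because the consistency form removes the between-trial mean difference, a constant offset between the two occasions does not lower ICC(3,1).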
Mungovan, Sean F; Peralta, Paula J; Gass, Gregory C; Scanlan, Aaron T
To examine the test-retest reliability and criterion validity of a high-intensity, netball-specific fitness test. Repeated measures, within-subject design. Eighteen female netball players competing in an international competition completed a trial of the Net-Test, which consists of 14 timed netball-specific movements. Players also completed a series of netball-relevant criterion fitness tests. Ten players completed an additional Net-Test trial one week later to assess test-retest reliability using intraclass correlation coefficient (ICC), typical error of measurement (TEM), and coefficient of variation (CV). The typical error of estimate expressed as CV and Pearson correlations were calculated between each criterion test and Net-Test performance to assess criterion validity. Five movements during the Net-Test displayed moderate ICC (0.84-0.90) and two movements displayed high ICC (0.91-0.93). Seven movements and heart rate taken during the Net-Test held low CV ([...] Test possessed low CV and significant (p [...] Test possesses acceptable reliability for the assessment of netball fitness. Further, the high criterion validity for the Net-Test suggests a range of important netball-specific fitness elements are assessed in combination. Copyright © 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
This paper presents evidence on the reliability and validity of the Serbian adaptation of the Trait Emotional Intelligence Questionnaire (TEIQue), an instrument designed to comprehensively assess emotional intelligence conceived as a constellation of emotion-related self-perceptions. Study participants were 254 adults, who completed the Serbian TEIQue, NEO-FFI, MSCEIT, EQ-short, and RSPWB. The results indicate that the adapted TEIQue is a psychometrically sound assessment tool: internal consistencies were mostly acceptable at facet, generally good at factor, and excellent at whole-scale level; the four-factor structure was confirmed by means of CFA; convergent-discriminant validity was established through meaningful associations with related constructs, indicating that trait EI is closely aligned with affect and self-efficacy related constructs from the realm of personality (i.e., E, N, C, and Empathy), but shows only moderate overlap with ability EI; finally, incremental validity was demonstrated in the prediction of psychological wellbeing, over and above the Big Five. [Project of the Ministry of Science of the Republic of Serbia, No. 179018]
Herrington, Lee; Alenezi, Faisal; Alzhrani, Msaad; Alrayani, Hasan; Jones, Richard
The objective was to assess the intra-tester, within and between day reliability of measurement of hip adduction (HADD) and frontal plane projection angles (FPPA) during single leg squat (SLS) and single leg landing (SLL) using 2D video and the validity of these measurements against those found during 3D motion capture. 15 healthy subjects had their SLS and SLL assessed using 3D motion capture and video analysis. Inter-tester reliability for both SLS and SLL when measuring FPPA and HADD showed excellent correlations (ICC 2,1 0.97-0.99). Within and between day assessment of SLS and SLL showed good to excellent correlations for both variables (ICC 3,1 0.72-0.91). 2D FPPA measures were found to have good correlation with knee abduction angle in 3D (r=0.79, p=0.008) during SLS, and also with knee abduction moment (r=0.65, p=0.009). 2D HADD showed very good correlation with 3D HADD during SLS (r=0.81, p=0.001), and a good correlation during SLL (r=0.62, p=0.013). All other associations were weak (r<0.4). This study suggests that 2D video kinematics have a reasonable association with what is being measured with 3D motion capture. Copyright © 2017 Elsevier Ltd. All rights reserved.
Ravens-Sieberer, U.; Erhart, M.; Rajmil, L.; Herdman, M.; Auquier, P.; Bruil, J.; Power, M.; Duer, W.; Abel, T.; Czemy, L.; Mazur, J.; Czimbalmos, A.; Tountas, Y.; Hagquist, C.; Kilroe, J.
Background To assess the criterion and construct validity of the KIDSCREEN-10 well-being and health-related quality of life (HRQoL) score, a short version of the KIDSCREEN-52 and KIDSCREEN-27 instruments. Methods The child self-report and parent report versions of the KIDSCREEN-10 were tested in a sample of 22,830 European children and adolescents aged 8–18 and their parents (n = 16,237). Correlation with the KIDSCREEN-52 and associations with other generic HRQoL measures, physical and mental health, and socioeconomic status were examined. Score differences by age, gender, and country were investigated. Results Correlations between the 10-item KIDSCREEN score and KIDSCREEN-52 scales ranged from r = 0.24 to 0.72 (r = 0.27–0.72) for the self-report version (proxy-report version). Coefficients below r = 0.5 were observed for the KIDSCREEN-52 dimensions Financial Resources and Being Bullied only. Cronbach alpha was 0.82 (0.78), test–retest reliability was ICC = 0.70 (0.67) for the self- (proxy-)report version. Correlations between other children self-completed HRQoL questionnaires and KIDSCREEN-10 ranged from r = 0.43 to r = 0.63 for the KIDSCREEN children self-report and r = 0.22–0.40 for the KIDSCREEN parent proxy report. Known group differences in HRQoL between physically/mentally healthy and ill children were observed in the KIDSCREEN-10 self and proxy scores. Associations with self-reported psychosomatic complaints were r = −0.52 (−0.36) for the KIDSCREEN-10 self-report (proxy-report). Statistically significant differences in KIDSCREEN-10 self and proxy scores were found by socioeconomic status, age, and gender. Conclusions Our results indicate that the KIDSCREEN-10 provides a valid measure of a general HRQoL factor in children and adolescents, but the instrument does not represent well most of the single dimensions of the original KIDSCREEN-52. Test–retest reliability was slightly below a priori defined thresholds. PMID:20668950
Bannigan, Katrina; Watson, Roger
To explore and explain the different concepts of reliability and validity as they are related to measurement instruments in social science and health care. There are different concepts contained in the terms reliability and validity and these are often explained poorly and there is often confusion between them. To develop some clarity about reliability and validity a conceptual framework was built based on the existing literature. The concepts of reliability, validity and utility are explored and explained. Reliability contains the concepts of internal consistency and stability and equivalence. Validity contains the concepts of content, face, criterion, concurrent, predictive, construct, convergent (and divergent), factorial and discriminant. In addition, for clinical practice and research, it is essential to establish the utility of a measurement instrument. To use measurement instruments appropriately in clinical practice, the extent to which they are reliable, valid and usable must be established.
Hamid, M. R. Ab; Sami, W.; Mohmad Sidek, M. H.
Assessment of discriminant validity is a must in any research that involves latent variables for the prevention of multicollinearity issues. The Fornell and Larcker criterion is the most widely used method for this purpose. However, a new method has emerged for assessing discriminant validity: the heterotrait-monotrait (HTMT) ratio of correlations. This article therefore presents the results of discriminant validity assessment using both methods. Data from a previous study involving 429 respondents were used for empirical validation of a value-based excellence model in higher education institutions (HEI) in Malaysia. From the analysis, convergent, divergent and discriminant validity were established and admissible using the Fornell and Larcker criterion. However, discriminant validity is an issue when employing the HTMT criterion. This shows that the latent variables under study faced the issue of multicollinearity and should be examined in further detail. It also implies that the HTMT criterion is a stringent measure that can detect a possible lack of discrimination among the latent variables. In conclusion, the instrument, which consisted of six latent variables, was still lacking in terms of discriminant validity and should be explored further.
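The contrast between the two criteria can be sketched as follows. The sketch assumes the usual definitions: the Fornell and Larcker check compares the square root of each construct's average variance extracted (AVE) with the inter-construct correlation, and HTMT divides the mean between-construct item correlation by the geometric mean of the two within-construct mean item correlations. All numbers, the function names, and the 0.90 cut-off are illustrative assumptions; none are taken from the study.

```python
import math
from itertools import combinations, product
from statistics import fmean


def fornell_larcker_ok(loadings_a, loadings_b, construct_corr):
    """True if sqrt(AVE) of both constructs exceeds their absolute correlation."""
    ave_a = fmean([l ** 2 for l in loadings_a])
    ave_b = fmean([l ** 2 for l in loadings_b])
    return min(math.sqrt(ave_a), math.sqrt(ave_b)) > abs(construct_corr)


def htmt(item_corr, items_a, items_b):
    """HTMT ratio from an item correlation matrix and two lists of item indices."""
    hetero = fmean([item_corr[i][j] for i, j in product(items_a, items_b)])
    mono_a = fmean([item_corr[i][j] for i, j in combinations(items_a, 2)])
    mono_b = fmean([item_corr[i][j] for i, j in combinations(items_b, 2)])
    return hetero / math.sqrt(mono_a * mono_b)


# Hypothetical item correlations: items 0-1 measure construct A, items 2-3 construct B.
R = [
    [1.0, 0.8, 0.6, 0.6],
    [0.8, 1.0, 0.6, 0.6],
    [0.6, 0.6, 1.0, 0.5],
    [0.6, 0.6, 0.5, 1.0],
]

print(fornell_larcker_ok([0.9, 0.9], [0.8, 0.8], 0.7))   # True
ratio = htmt(R, [0, 1], [2, 3])
print(round(ratio, 3), ratio < 0.90)                     # 0.949 False
```

In this invented case the Fornell and Larcker check passes while HTMT exceeds the assumed cut-off, which mirrors the pattern the abstract describes.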
Sadhinoch, M.; Atzema, E. H.; Perdahcioglu, E. S.; van den Boogaard, A. H.
Most commercial finite element software packages, like Abaqus, have a built-in coupled damage model where a damage evolution needs to be defined in terms of a single fracture energy value for all stress states. The Johnson-Cook criterion has been modified to be Lode parameter dependent and this Modified Johnson-Cook (MJC) criterion is used as a Damage Initiation Surface (DIS) in combination with the built-in Abaqus ductile damage model. An exponential damage evolution law has been used with a single fracture energy value. Ultimately, the simulated force-displacement curves are compared with experiments to validate the MJC criterion. 7 out of 9 fracture experiments were predicted accurately. The limitations and accuracy of the failure predictions of the newly developed damage initiation criterion will be discussed shortly.
Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald
In this technical report, we present the results of a study to gather criterion-related evidence for Grade K-1 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Dynamic Indicators of Basic Early Literacy…
Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald
In this technical report, we present the results of a study to gather criterion-related evidence for Grade 2-5 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Gates-MacGinitie Reading Tests and the Dynamic…
Stice, Eric; Fisher, Melissa; Martinez, Erin
The authors conducted 4 studies investigating the reliability and validity of the Eating Disorder Diagnostic Scale (EDDS; E. Stice, C. F. Telch, & S. L. Rizvi, 2000), a brief self-report measure for diagnosing anorexia nervosa, bulimia nervosa, and binge eating disorder. Study 1 found that the EDDS showed criterion validity with interview-based…
Raykov, Tenko; Marcoulides, George A.
A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
Fang, Shaoji; Leira, Bernt J.; Blanke, Mogens
is achieved using structural reliability indices in a cost function, where both the mean mooring-line tension and dynamic effects are considered. An optimal set-point is automatically produced without need for manual interaction. The parameters of the extreme value distribution are calculated on-line thereby...... mooring lines simultaneously from exceeding a stress threshold, this paper suggests a new algorithm to determine the reference position and an associated control system. The safety of each line is assessed through a structural reliability index. A reference position where all mooring lines are safe...
Kang, Seunghoon; Lim, Woochul; Cho, Su-gil; Park, Sanghyun; Lee, Tae Hee [Hanyang University, Seoul (Korea, Republic of); Lee, Minuk; Choi, Jong-su; Hong, Sup [Korea Research Institute of Ships and Ocean Engineering, Daejeon (Korea, Republic of)
In order to perform estimations with high reliability, it is necessary to deal with the tail part of the cumulative distribution function (CDF) in greater detail compared to an overall CDF. The use of a generalized Pareto distribution (GPD) to model the tail part of a CDF is receiving more research attention with the goal of performing estimations with high reliability. Current studies on GPDs focus on ways to determine the appropriate number of sample points and their parameters. However, even if a proper estimation is made, it can be inaccurate as a result of an incorrect threshold value. Therefore, in this paper, a GPD based on the Akaike information criterion (AIC) is proposed to improve the accuracy of the tail model. The proposed method determines an accurate threshold value using the AIC with the overall samples before estimating the GPD over the threshold. To validate the accuracy of the method, its reliability is compared with that obtained using a general GPD model with an empirical CDF.
van der Wulp, I.
Reliability and validity of triage systems is important because this can affect patient safety. In this thesis, these aspects of two emergency department (ED) triage systems were studied as well as methodological aspects in these types of studies. The consistency, reproducibility, and criterion
Tamboer, Peter; Vorst, Harrie C M
The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. Altogether, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Fewer than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
Corty, E W; Althof, S E; Kurit, D M
The present study assessed the reliability and validity of a measure of sexual functioning, the CMSH-SFQ, for male patients and their partners. The CMSH-SFQ measures erectile and orgasmic functioning, sexual drive, frequency of sexual behavior, and sexual satisfaction. Test-retest reliability was assessed with 19 males and 19 females for the baseline CMSH-SFQ. Criterion validity was measured by comparing the answers of 25 male patients to those of their partners at baseline and follow-up. The majority of items had acceptable levels of reliability and validity. The CMSH-SFQ provides a reliable and valid device that can be used to measure global sexual functioning in men and their partners and may be used to evaluate the efficacy of treatments for sexual dysfunctions. Limitations and suggestions for use of the CMSH-SFQ are addressed.
Jacobs, Nora W; Berduszek, Redmar J; Dijkstra, Pieter U; van der Sluis, Corry K
Purpose To evaluate validity and reliability of the upper extremity work demands (UEWD) scale. Methods Participants from different levels of physical work demands, based on the Dictionary of Occupational Titles categories, were included. A historical database of 74 workers was added for factor analysis. Criterion validity was evaluated by comparing observed and self-reported UEWD scores. To assess structural validity, a factor analysis was executed. For reliability, the difference between two self-reported UEWD scores, the smallest detectable change (SDC), test-retest reliability and internal consistency were determined. Results Fifty-four participants were observed at work and 51 of them filled in the UEWD twice with a mean interval of 16.6 days (SD 3.3, range = 10-25 days). Criterion validity of the UEWD scale was moderate (r = .44, p = .001). Factor analysis revealed that 'force and posture' and 'repetition' subscales could be distinguished with Cronbach's alpha of .79 and .84, respectively. Reliability was good; there was no significant difference between repeated measurements. An SDC of 5.0 was found. Test-retest reliability was good (intraclass correlation coefficient for agreement = .84) and all item-total correlations were >.30. There were two pairs of highly related items. Conclusion Reliability of the UEWD scale was good, but criterion validity was moderate. Based on current results, a modified UEWD scale (2 items removed, 1 item reworded, divided into 2 subscales) was proposed. Since observation appeared to be an inappropriate gold standard, we advise to investigate other types of validity, such as construct validity, in further research.
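The smallest detectable change reported above is commonly derived from the standard error of measurement (SEM) as SDC = 1.96 × √2 × SEM, with SEM = SD × √(1 − ICC). The abstract does not state which formula the authors used, so the sketch below is an assumption. The ICC of .84 is taken from the abstract, while the standard deviation of 10 and the function names are purely hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a test-retest reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must be in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """SDC for an individual: z * sqrt(2) * SEM (sqrt(2) covers two measurements)."""
    return z * math.sqrt(2.0) * sem


# Hypothetical SD of 10 scale points; ICC = .84 as reported in the abstract.
sem = standard_error_of_measurement(sd=10.0, icc=0.84)
print(round(sem, 2))                                 # 4.0
print(round(smallest_detectable_change(sem), 2))     # 11.09
```

A change score smaller than the SDC cannot be distinguished from measurement error at the chosen confidence level.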
Thomas, Katherine M.; Wright, Aidan G. C.; Lukowitsky, Mark R.; Donnellan, M. Brent; Hopwood, Christopher J.
In this study, the authors evaluated aspects of criterion validity and clinical utility of the grandiosity and vulnerability components of the Pathological Narcissism Inventory (PNI) using two undergraduate samples (N = 299 and 500). Criterion validity was assessed by evaluating the correlations of narcissistic grandiosity and narcissistic…
Dogan, Tayfun; Cetin, Bayram
The purpose of the present study was to investigate the reliability and validity of the Turkish version of the Tromso Social Intelligence Scale (TSIS) developed by Silvera, Martinussen, and Dahl (2001). 719 students from Sakarya University participated in the study. Construct validity and criterion related validity and reliability were assessed.…
Semeijn, E.J.; Michielsen, M.; Comijs, H.C.; Deeg, D.J.H.; Beekman, A.T.; Kooij, J.J.
Objective: To identify Attention Deficit Hyperactivity disorder (ADHD) in older adults, a validated screener is needed. This study evaluates the reliability and criterion validity of an ADHD screener for younger adults on its usefulness in a population-based sample of older adults. Methods: Data
Hicks, Jason L; Starns, Jeffrey J
In seven experiments, we explored the potential for strength-based, within-list criterion shifts in recognition memory. People studied a mix of target words, some presented four times (strong) and others studied once (weak). In Experiments 1, 2, 4A, and 4B, the test was organized into alternating blocks of 10, 20, or 40 trials. Each block contained lures intermixed with strong targets only or weak targets only. In strength-cued conditions, test probes appeared in a unique font color for strong and weak blocks. In the uncued conditions of Experiments 1 and 2, similar strength blocks were tested, but strength was not cued with font color. False alarms to lures were lower in blocks containing strong target words, as compared with lures in blocks containing weak targets, but only when strength was cued with font color. Providing test feedback in Experiment 2 did not alter these results. In Experiments 3A-3C, test items were presented in a random order (i.e., not blocked by strength). Of these three experiments, only one demonstrated a significant shift even though strength cues were provided. Overall, the criterion shift was larger and more reliable as block size increased, and the shift occurred only when strength was cued with font color. These results clarify the factors that affect participants' willingness to change their response criterion within a test list.
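One common way to quantify a response-criterion shift of the kind described above is the signal-detection index c, computed from hit and false-alarm rates. The abstract does not say which measure the authors used, so the sketch below is an assumption, and the rates in it are hypothetical (the hit rate is held fixed purely for illustration).

```python
from statistics import NormalDist


def criterion_c(hit_rate: float, false_alarm_rate: float) -> float:
    """Signal-detection criterion c = -0.5 * (z(H) + z(FA)).

    Larger values indicate a more conservative (stricter) criterion.
    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return -0.5 * (z(hit_rate) + z(false_alarm_rate))


# Hypothetical rates, NOT taken from the experiments.
weak_block = criterion_c(hit_rate=0.80, false_alarm_rate=0.25)
strong_block = criterion_c(hit_rate=0.80, false_alarm_rate=0.15)
print(f"c(weak) = {weak_block:.2f}, c(strong) = {strong_block:.2f}")
print(f"shift = {strong_block - weak_block:.2f}")
```

With these made-up numbers, the lower false-alarm rate in the strong block yields a larger c, which is the direction of shift the abstract reports for strength-cued blocks.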
The effect on criterion-related validity of nonfitting response vectors (NRVs) on a predictor test was investigated. Using simulated data, it was shown that there was a substantial decrease in validity when the type of misfit was severe (i.e., guessing the correct answers to all test items), when
Wang, Qian; Qin, Pinquan; Wang, Wen-ge
Based on an analysis of Feynman's path integral formulation of the propagator, a relative criterion is proposed for validity of a semiclassical approach to the dynamics near critical points in a class of systems undergoing quantum phase transitions. It is given by an effective Planck constant, in the relative sense that a smaller effective Planck constant implies better performance of the semiclassical approach. Numerical tests of this relative criterion are given in the XY model and in the Dicke model.
Drost, Ellen A.
In this paper, the author aims to provide novice researchers with an understanding of the general problem of validity in social science research and to acquaint them with approaches to developing strong support for the validity of their research. She provides insight into these two important concepts, namely (1) validity; and (2) reliability, and…
Lin, Yi-Kuei; Yeh, Cheng-Ta
From the perspective of supply chain management, the selected carrier plays an important role in freight delivery. This article proposes a new criterion of multi-commodity reliability and optimises the carrier selection based on such a criterion for logistics networks with routes and nodes, over which multiple commodities are delivered. Carrier selection concerns the selection of exactly one carrier to deliver freight on each route. The capacity of each carrier has several available values associated with a probability distribution, since some of a carrier's capacity may be reserved for various orders. Therefore, the logistics network, given any carrier selection, is a multi-commodity multi-state logistics network. Multi-commodity reliability is defined as a probability that the logistics network can satisfy a customer's demand for various commodities, and is a performance indicator for freight delivery. To solve this problem, this study proposes an optimisation algorithm that integrates genetic algorithm, minimal paths and Recursive Sum of Disjoint Products. A practical example in which multi-sized LCD monitors are delivered from China to Germany is considered to illustrate the solution procedure.
Minner, Daphne Diane
The intention of this research project was to bridge the gap between social science research and application to the environmental domain through the development of a theoretically derived instrument designed to give educators a template by which to evaluate environmental education curricula. The theoretical base for instrument development was provided by several developmental theories such as Piaget's theory of cognitive development, Developmental Systems Theory, Life-span Perspective, as well as curriculum research within the area of environmental education. This theoretical base fueled the generation of a list of components which were then translated into a questionnaire with specific questions relevant to the environmental education domain. The specific research question for this project is: Can a valid assessment instrument based largely on human development and education theory be developed that reliably discriminates high, moderate, and low quality in environmental education curricula? The types of analyses conducted to answer this question were interrater reliability (percent agreement, Cohen's Kappa coefficient, Pearson's Product-Moment correlation coefficient), test-retest reliability (percent agreement, correlation), and criterion-related validity (correlation). Face validity and content validity were also assessed through thorough reviews. Overall results indicate that 29% of the questions on the questionnaire demonstrated a high level of interrater reliability and 43% of the questions demonstrated a moderate level of interrater reliability. Seventy-one percent of the questions demonstrated a high test-retest reliability and 5% a moderate level. Fifty-five percent of the questions on the questionnaire were reliable (high or moderate) both across time and raters. Only eight questions (8%) did not show either interrater or test-retest reliability. The global overall rating of high, medium, or low quality was reliable across both coders and time, indicating
Objective: The Discomfort Intolerance Scale (DIS) was developed by Norman B. Schmidt et al. (2006) to assess individual differences in the capacity to withstand physical perturbations or uncomfortable bodily states. The aim of this study is to investigate the validity and reliability of the Discomfort Intolerance Scale-Turkish Version (RDÖ). Method: A total of 225 students (male = 167, female = 58) from two different universities participated in this study. To determine criterion validity, the Beck Anxiety Inventory (BAI) and State-Trait Anxiety Inventory (STAI) were used. Construct validity was evaluated by factor analysis after the Kaiser-Meyer-Olkin (KMO) and Bartlett tests had been performed. To assess test-retest reliability, the scale was re-administered to 54 participants 6 weeks later. Results: To assess the construct validity of the DIS, factor analyses were performed using principal components analysis with varimax rotation. The factor analysis resulted in two factors named “discomfort (in)tolerance” and “discomfort avoidance”. The Cronbach’s alpha coefficients for the entire scale, the discomfort (in)tolerance subscale, and the discomfort avoidance subscale were .592, .670, and .600, respectively. Correlations between the two factors of the DIS (discomfort intolerance and discomfort avoidance) and the Trait Anxiety Inventory of the STAI were statistically significant at the .05 level. Test-retest reliability was statistically significant at the .01 level. Conclusion: The analysis demonstrated that the DIS had a satisfactory level of reliability and validity in Turkish university students.
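For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the item scores are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for a scale.

    items[i][p] is the score of respondent p on item i.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(items)
    # Total score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 items, 5 respondents.
example_items = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
print(f"alpha = {cronbach_alpha(example_items):.3f}")
```

Alpha rises when items covary strongly relative to their individual variances, which is why subscales with few or loosely related items tend to produce lower values.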
Ganestam, Ann; Barfod, Kristoffer; Klit, Jakob
The best treatment of acute Achilles tendon rupture remains debated. Patient-reported outcome measures have become cornerstones in treatment evaluations. The Achilles tendon total rupture score (ATRS) has been developed for this purpose but requires additional validation. The purpose of the present study was to validate a Danish translation of the ATRS. The ATRS was translated into Danish according to internationally adopted standards. Of 142 patients, 90 with previous rupture of the Achilles tendon participated in the validity study and 52 in the reliability study. The ATRS showed moderately... = .07). The limits of agreement were ±18.53. A strong correlation was found between test and retest (intercorrelation coefficient .908); the standard error of measurement was 6.7, and the minimal detectable change was 18.5. The Danish version of the ATRS showed moderately strong criterion validity...
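The reported standard error of measurement (6.7) and minimal detectable change (18.5) are consistent with the conventional relation MDC95 = 1.96 × √2 × SEM. The abstract does not state the formula the authors used, so the sketch below assumes that convention; the function name is illustrative.

```python
import math


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """MDC at the confidence level implied by z (1.96 for 95%).

    The factor sqrt(2) accounts for measurement error in both
    the test and the retest score.
    """
    return z * math.sqrt(2.0) * sem


mdc = minimal_detectable_change(6.7)  # SEM as reported in the abstract
print(f"MDC95 = {mdc:.2f}")
```

The result lands within about 0.1 point of the reported 18.5; the small difference is plausibly due to rounding of the SEM.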
The purpose of this study was to determine the test-retest reliability and concurrent validity of the short form (Form B) of the Coopersmith Self-Esteem Inventory. Criterion measures for validity included: (1) sociometric measures; (2) teacher's popularity ranking; and, (3) self-esteem rating. (Author/LMO)
Interest differentiation and elevation are supposed to provide important information about a person's state of interest development, yet little is known about their development and criterion validity. The present study explored these constructs among a group of Swiss adolescents. Study 1 applied a cross-sectional design with 210 students in 11th…
Strum, Irene; Shapiro, Madelaine
The purpose of this study was to validate the Prescriptive Instructional Program for Educational Readiness (PIPER) for utilization as a criterion referenced test (CRT) among learning disabled children. The program consisted of behavioral objectives and diagnostic and/or mastery tasks and activities for each objective in the area of gross motor…
Lunde Pedersen, Eva Sophie; Mortensen, L H; Brage, S
BACKGROUND: The Physical Activity Scale (PAS2) was developed to measure physical activity (PA) during work, transportation and leisure time, in the Danish adult population. The objective of this study was to assess the criterion validity of PAS2 against a combined accelerometer and heart rate mon...
Ng, Thomas W H; Feldman, Daniel C
This study examines the criterion-related and incremental validity of ethical leadership (EL) with meta-analytic data. Across 101 samples published over the last 15 years (N = 29,620), we observed that EL demonstrated acceptable criterion-related validity with variables that tap followers' job attitudes, job performance, and evaluations of their leaders. Further, followers' trust in the leader mediated the relationships of EL with job attitudes and performance. In terms of incremental validity, we found that EL significantly, albeit weakly in some cases, predicted task performance, citizenship behavior, and counterproductive work behavior, even after controlling for the effects of such variables as transformational leadership, use of contingent rewards, management by exception, interactional fairness, and destructive leadership. The article concludes with a discussion of ways to strengthen the incremental validity of EL. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
MacKillop, James; Acker, John D; Bollinger, Jared; Clifton, Allan; Miller, Joshua D; Campbell, W Keith; Goodie, Adam S
Alcohol misuse is substantially influenced by social factors, but systematic assessments of social network drinking are typically lengthy. The goal of the present study was to provide further validation of a brief measure of social network alcohol use, the Brief Alcohol Social Density Assessment (BASDA), in a sample of emerging adults. Specifically, the study sought to examine the BASDA's convergent, criterion, and incremental validity in relation to well-established measures of drinking motives and problematic drinking. Participants were 354 undergraduates who were assessed using the BASDA, the Alcohol Use Disorders Identification Test (AUDIT), and the Drinking Motives Questionnaire. Significant associations were observed between the BASDA index of alcohol-related social density and alcohol misuse, social motives, and conformity motives, supporting convergent validity. Criterion-related validity was supported by evidence that significantly greater alcohol involvement was present in the social networks of individuals scoring at or above an AUDIT score of 8, a validated criterion for hazardous drinking. Finally, the BASDA index was significantly associated with alcohol misuse above and beyond drinking motives in relation to AUDIT scores, supporting incremental validity. Taken together, these findings provide further support for the BASDA as an efficient measure of drinking in an individual's social network. Methodological considerations as well as recommendations for future investigations in this area are discussed.
Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J
To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals to medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p [...]). [...] articles in pharmacy education compared to medical education, validity and reliability reporting were limited in the pharmacy education literature.
Aven, Terje; Heide, Bjornar
In this paper we investigate to what extent risk analysis meets the scientific quality requirements of reliability and validity. We distinguish between two types of approaches within risk analysis, relative frequency-based approaches and Bayesian approaches. The former category includes both traditional statistical inference methods and the so-called probability of frequency approach. Depending on the risk analysis approach, the aim of the analysis is different, the results are presented in different ways and consequently the meaning of the concepts reliability and validity are not the same.
Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H
Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was ICC(2,3) = 0.700 (p [...]). Test-retest reliability was ICC(3,3) = 0.953 (p [...]). The criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. The test-retest reliability data suggest that the DNS-HS test was a reliable test for core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.
This article reviews three topics from test theory that continue to raise discussion and controversy and capture test theorists' and constructors' interest. The first topic concerns the discussion of the methodology of investigating and establishing construct validity; the second topic concerns reliability and its misuse, alternative definitions…
Ahmet Emre SARGIN
Objective: The Distress Tolerance Scale (DTS) was developed by Simons and Gaher to measure individual differences in the capacity for distress tolerance. The aim of this study is to assess the reliability and validity of the Turkish version of the DTS. Method: One hundred and sixty-seven university students (male = 66, female = 101) participated in this study. The Beck Anxiety Inventory (BAI), State-Trait Anxiety Inventory (STAI), and Discomfort Intolerance Scale (DIS) were used to determine criterion validity. Construct validity was evaluated with factor analysis after the Kaiser-Meyer-Olkin (KMO) and Bartlett tests had been performed. To assess test-retest reliability, the scale was re-administered to 79 participants six weeks later. Results: To assess construct validity, factor analyses were performed using principal components analysis with varimax rotation. While there were factors in the original study, our factor analysis resulted in three factors. Cronbach’s alpha coefficients for the entire scale and the tolerance, regulation, and self-efficacy subscales were .89, .90, .80, and .64, respectively. There were correlations at the .01 level between the Trait Anxiety Inventory of the STAI and the BAI and all the subscales of the DTS, and also between the State Anxiety Inventory and the regulation subscale. Both subscales of the DIS were correlated at the .05 level with the entire scale and with all the subscales except regulation. Test-retest reliability was statistically significant at the .01 level. Conclusion: The analysis demonstrated that the DTS had a satisfactory level of reliability and validity in Turkish university students.
Doctor, S.R.; Deffenbaugh, J.D.; Good, M.S.; Green, E.R.; Heasler, P.G.; Hutton, P.H.; Reid, L.D.; Simonen, F.A.; Spanner, J.C.; Vo, T.V.
This paper reports on progress for three programs: (1) evaluation and improvement in nondestructive examination reliability for inservice inspection of light water reactors (LWR) (NDE Reliability Program), (2) field validation acceptance, and training for advanced NDE technology, and (3) evaluation of computer-based NDE techniques and regional support of inspection activities. The NDE Reliability Program objectives are to quantify the reliability of inservice inspection techniques for LWR primary system components through independent research and establish means for obtaining improvements in the reliability of inservice inspections. The areas of significant progress will be described concerning ASME Code activities, re-analysis of the PISC-II data, the equipment interaction matrix study, new inspection criteria, and PISC-III. The objectives of the second program are to develop field procedures for the AE and SAFT-UT techniques, perform field validation testing of these techniques, provide training in the techniques for NRC headquarters and regional staff, and work with the ASME Code for the use of these advanced technologies. The final program's objective is to evaluate the reliability and accuracy of interpretation of results from computer-based ultrasonic inservice inspection systems, and to develop guidelines for NRC staff to monitor and evaluate the effectiveness of inservice inspections conducted on nuclear power reactors. This program started in the last quarter of FY89, and the extent of the program was to prepare a work plan for presentation to and approval from a technical advisory group of NRC staff
Pigford, T.H.; Chambre, P.L.
The objective of predicting long-term performance should be to make reliable determinations of whether the prediction falls within the criteria for acceptable performance. Establishing reliable predictions of long-term performance of a waste repository requires emphasis on valid theories to predict performance. The validation process must establish the validity of the theory, the parameters used in applying the theory, the arithmetic of calculations, and the interpretation of results; but validation of such performance predictions is not possible unless there are clear criteria for acceptable performance. Validation programs should emphasize identification of the substantive issues of prediction that need to be resolved. Examples relevant to waste package performance are predicting the life of waste containers and the time distribution of container failures, establishing the criteria for defining container failure, validating theories for time-dependent waste dissolution that depend on details of the repository environment, and determining the extent of congruent dissolution of radionuclides in the UO2 matrix of spent fuel. Prediction and validation should go hand in hand and should be done and reviewed frequently, as essential tools for the programs to design and develop repositories.
Bourke-Taylor, Helen M; Cordier, Reinie; Pallant, Julie F
The Child's Challenging Behavior Scale, Version 2 (CCBS-2), measures maternal rating of a child's challenging behaviors that compromise maternal mental health. The CCBS-2, the Child Behavior Checklist (CBCL), and the Strengths and Difficulties Questionnaire (SDQ) were compared in a sample of typically developing young Australian children. Criterion validity was investigated by correlating the CCBS-2 with "gold standard" measures (CBCL and SDQ subscales). Data were collected in a cross-sectional survey of mothers (N = 336) of children ages 3-9 yr. Correlations with the CBCL externalizing subscales demonstrated moderate (ρ = .46) to strong (ρ = .66) correlations. Correlations with the SDQ externalizing behaviors subscales were moderate (ρ = .35) to strong (ρ = .60). The criterion validity established in this study strengthens the psychometric properties that support ongoing development of the CCBS-2 as an efficient tool that may identify children in need of further evaluation. Copyright © 2018 by the American Occupational Therapy Association, Inc.
Jeremy T. Goldbach
Sexual minority adolescents (SMA) consistently report health disparities compared to their heterosexual counterparts, yet the underlying mechanisms of these negative health outcomes remain unclear. The predominant explanatory model is the minority stress theory; however, this model was developed largely with adults, and no valid and comprehensive measure of minority stress has been developed for adolescents. The present study validated a newly developed instrument to measure minority stress among racially and ethnically diverse SMA. A sample of 346 SMA aged 14–17 was recruited and surveyed between February 2015 and July 2016. The focal measure of interest was the 64-item, 11-factor Sexual Minority Adolescent Stress Inventory (SMASI) developed in the initial phase of this study. Criterion validation measures included measures of depressive symptoms, suicidality and self-harm, youth problem behaviors, and substance use; the general Adolescent Stress Questionnaire (ASQ) was included as a measure of divergent validity. Analyses included Pearson and tetrachoric correlations to establish criterion and divergent validity and structural equation modeling to assess the explanatory utility of the SMASI relative to the ASQ. SMASI scores were significantly associated with all outcomes but only moderately associated with the ASQ (r = −0.13 to 0.51). Analyses revealed significant associations of a latent minority stress variable with both proximal and distal health outcomes beyond the variation explained by general stress. Results show that the SMASI is the first instrument to validly measure minority stress among SMA.
Tosun, Betül; Aslan, Özlem; Tunay, Servet; Akyüz, Aygül; Özkan, Hüseyin; Bek, Doğan; Açıksöz, Semra
The purpose of this study was to determine the validity and reliability of the Turkish version of the Immobilization Comfort Questionnaire (ICQ). The sample used in this methodological study consisted of 121 patients undergoing lower extremity arthroscopy in a training and research hospital. The validity study of the questionnaire assessed language validity, structural validity and criterion validity. Structural validity was evaluated via exploratory factor analysis. Criterion validity was evaluated by assessing the correlation between the visual analog scale (VAS) scores (i.e., the comfort and pain VAS scores) and the ICQ scores using Spearman's correlation test. The Kaiser-Meyer-Olkin coefficient and Bartlett's test of sphericity were used to determine the suitability of the data for factor analysis. Internal consistency was evaluated to determine reliability. The data were analyzed with SPSS version 15.00 for Windows. Descriptive statistics were presented as frequencies, percentages, means and standard deviations. A p value ≤ .05 was considered statistically significant. A moderate positive correlation was found between the ICQ scores and the VAS comfort scores; a moderate negative correlation was found between the ICQ and the VAS pain measures in the criterion validity analysis. Cronbach α values of .75 and .82 were found for the first and second measurements, respectively. The findings of this study reveal that the ICQ is a valid and reliable tool for assessing the comfort of patients in Turkey who are immobilized because of lower extremity orthopedic problems. Copyright © 2015. Published by Elsevier B.V.
Macagnino, Sandro; Steinert, Tilman; Uhlmann, Carmen
Examination of in-hospital suicide risk levels concerning their validity and reliability. The internal suicide risk levels were evaluated in a cross-sectional study of 163 inpatients. A reliability check was performed by determining the interrater reliability of the senior physician, the therapist, and the responsible nurse. Within the scope of the validity check, we conducted analyses of criterion validity and construct validity. For the total sample, an "acceptable" to "good" interrater reliability (Kendall's W = .77) of suicide risk levels was obtained. Schizophrenic disorders showed the lowest values; for personality disorders we found the highest level of interrater reliability. When examining criterion validity, Item 9 of the BDI-II was substantially correlated with our suicide risk levels (ρ m = .54, p [...]). [...] validity check, affective disorders showed the highest correlation (ρ = .77), compatible also with "convergent validity". They differed from schizophrenic disorders, which showed the least concordance (ρ = .43). In-hospital suicide risk levels may represent an important contribution to the assessment of suicidal behavior of inpatients experiencing psychiatric treatment due to their overall good validity and reliability. © Georg Thieme Verlag KG Stuttgart · New York.
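Kendall's W, the concordance coefficient reported above for three raters, can be sketched as follows. This is the textbook formula without a correction for tied ranks; ordinal risk levels would normally produce ties, so the authors presumably used a tie-corrected variant, and the rankings below are hypothetical.

```python
def kendalls_w(ranks: list[list[int]]) -> float:
    """Kendall's coefficient of concordance W (no tie correction).

    ranks[r][i] is the rank (1..n) that rater r assigns to subject i.
    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared
    deviations of the per-subject rank totals from their mean.
    """
    m = len(ranks)        # number of raters
    n = len(ranks[0])     # number of subjects
    totals = [sum(rater[i] for rater in ranks) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of five patients by three raters
# (e.g. senior physician, therapist, nurse).
example_ranks = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
print(f"W = {kendalls_w(example_ranks):.2f}")
```

W ranges from 0 (no agreement) to 1 (identical rankings across raters).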
Woodburn, Jim; Sutcliffe, Nick
The Objective Structured Clinical Examination (OSCE), initially developed for undergraduate medical education, has been adapted for assessment of clinical skills in podiatry students. A 12-month pilot study found the test had relatively low levels of reliability, high construct and criterion validity, and good stability of performance over time.…
Huang, X N; Zhang, Y; Feng, W W; Wang, H S; Cao, B; Zhang, B; Yang, Y F; Wang, H M; Zheng, Y; Jin, X M; Jia, M X; Zou, X B; Zhao, C X; Robert, J; Jing, Jin
Objective: To evaluate the reliability and validity of the warning signs checklist developed by the National Health and Family Planning Commission of the People's Republic of China (NHFPC), so as to determine the screening effectiveness of warning signs for developmental problems of early childhood. Method: Stratified random sampling was used to assess the reliability and validity of the warning signs checklist, and 2,110 children 0 to 6 years of age (1,513 low-risk subjects and 597 high-risk subjects) were recruited from 11 provinces of China. The reliability evaluation for the warning signs included test-retest reliability and interrater reliability. With the Ages and Stages Questionnaire (ASQ) and the Gesell Development Diagnosis Scale (GESELL) as the criterion scales, criterion validity was assessed by determining the correlation and consistency between the screening results of the warning signs and the criterion scales. Result: For the warning signs, the screening positive rates at different ages ranged from 10.8% (21/141) to 26.2% (51/137). The median (interquartile) testing time for each subject was 1 (0.6) minute. Both the test-retest reliability and the interrater reliability of the warning signs reached 0.7 or above, indicating good stability. In terms of validity assessment, there was remarkable consistency between the ASQ and the warning signs, with a Kappa value of 0.63. With GESELL as the criterion, the sensitivity of the warning signs in children with suspected developmental delay was 82.2%, and the specificity was 77.7%. The overall Youden index was 0.6. Conclusion: The reliability and validity of the warning signs checklist for screening early childhood developmental problems have met the basic requirements of psychological screening scales, with the characteristics of short testing time and easy operation. Thus, this warning signs checklist can be used for screening psychological and behavioral problems of early childhood.
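The Youden index reported above follows directly from the stated sensitivity and specificity, J = sensitivity + specificity − 1. A minimal check of that arithmetic (the function name is illustrative):

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic; inputs are proportions in [0, 1]."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract.
j = youden_index(0.822, 0.777)
print(f"J = {j:.3f}")
```

The value rounds to the 0.6 given in the abstract.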
Kane, Michael; Case, Susan
The scores on two distinct tests (e.g., essay and objective) are often combined into a composite score, which is used to make decisions. The validity of the observed composite can sometimes be evaluated relative to a separate criterion. In cases where no criterion is available, the observed composite has generally been evaluated in terms of its…
John C Sieverdes (1), Eric E Wickel (2), Gregory A Hand (3), Marco Bergamin (4), Robert R Moran (5), Steven N Blair (3, 5)
Affiliations: (1) Medical University of South Carolina, College of Nursing and Medicine, Charleston, SC; (2) University of Tulsa, Exercise and Sport Science, Tulsa, OK; (3) University of South Carolina, Department of Exercise Science, Division of Health Aspects of Physical Activity, Arnold School of Public Health, Columbia, SC, USA; (4) University of Padova, Department of Medicine, Sports Medicine Division, Padova, Italy; (5) University of South Carolina, Department of Epidemiology and Biostatistics, Arnold School of Public Health, Columbia, SC, USA
Background: This study evaluated the reliability and criterion validity of the Mywellness Key accelerometer (MWK) using treadmill protocols and indirect calorimetry. Methods: Twenty-five participants completed two four-stage 20-minute treadmill protocols while wearing two MWK accelerometers. Reliability was assessed using raw counts. Validity was assessed by comparing the estimated VO2 calculated from the MWK with values from respiratory gas exchange. Results: Good overall and point estimates of reliability were found for the MWK (all intraclass correlations > 0.93). Generalizability theory coefficients showed lower values for running speed (0.70) versus walking speed (all > 0.84), with the majority of the overall percentage of variability derived from the participant (68%–88% of the total 100%). Acceptable validity was found overall (Pearson’s r = 0.895–0.902, P < 0.0001), with an overall mean absolute error of 16.22% and a coefficient of variance of 16.92%. Bland-Altman plots showed an overestimation of energy expenditure during the running speed, but total kilocalories were underestimated during the protocol by approximately 10%. Conclusion: Good validity was found during light and moderate walking, while running was slightly overestimated. The MWK may be useful for clinicians and researchers interested in promotion or assessment
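The Bland-Altman analysis mentioned above summarizes agreement between two methods as a mean difference (bias) with 95% limits of agreement at bias ± 1.96 standard deviations of the differences. The sketch below assumes that standard definition; the VO2 values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a: list[float], method_b: list[float]):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are taken as method_a - method_b, so a positive bias
    means method_a reads higher on average.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the differences
    return bias, bias - half_width, bias + half_width


# Hypothetical VO2 values (ml/kg/min): device estimate vs. calorimetry.
device = [12.1, 18.4, 25.0, 33.7, 40.2]
calorimetry = [11.5, 17.9, 26.1, 31.0, 37.8]
bias, lower, upper = bland_altman(device, calorimetry)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A bias that grows with the magnitude of the measurement, as the abstract describes for running, shows up in the plot as differences trending away from zero at higher mean values.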
Lam, Benjamin; Middleton, Laura E; Masellis, Mario; Stuss, Donald T; Harry, Robin D; Kiss, Alex; Black, Sandra E
To compare the validity of the Montreal Cognitive Assessment (MoCA) with the criterion standard of standardized neuropsychological testing and to compare the convergent validity of the MoCA with that of existing screening tools and global measures of cognition. Cross-sectional observational study. Tertiary care hospital-based cognitive neurology subspecialty clinic. A convenience sample of 107 individuals with mild Alzheimer's disease (AD, n=75) or mild cognitive impairment (MCI, n=32) from the Sunnybrook Dementia Study. In addition to the MoCA, all participants completed the Mini-Mental State Examination (MMSE), the Mattis Dementia Rating Scale (DRS), and detailed neuropsychological testing. Convergent validity was supported, with MoCA scores correlating well with the MMSE (correlation coefficient (r)=0.66, P [...]). Criterion validity was supported, with MoCA subscores according to cognitive domain correlating well with analogous neuropsychological tests and, in the case of memory (area under the receiver operating characteristic curve (AUC)=0.86), executive (AUC=0.79), and visuospatial function (AUC=0.79), being reasonably sensitive to impairment in those domains. The MoCA is a valid assessment of cognition that shows good agreement with existing screening tools and global measures (convergent validity) and was superior to the MMSE in this regard. The MoCA domain-specific subscores align with performance on more-detailed neuropsychological tests, suggesting not only good criterion validity for the MoCA, but also that it may be useful in guiding further neuropsychological testing. © 2013, Copyright the Authors Journal compilation © 2013, The American Geriatrics Society.
Maffini, Cara S; Wong, Y Joel
Although measures of cultural identity, values, and behavior exist in the multicultural psychological literature, there is currently no measure that explicitly assesses ethnic minority individuals' positive and negative affect toward culture. Therefore, we developed 2 new measures called the Feelings About Culture Scale--Ethnic Culture and Feelings About Culture Scale--Mainstream American Culture and tested their psychometric properties. In 6 studies, we piloted the measures, conducted factor analyses to clarify their factor structure, and examined reliability and validity. The factor structure revealed 2 dimensions reflecting positive and negative affect for each measure. Results provided evidence for convergent, discriminant, criterion-related, and incremental validity as well as the reliability of the scales. The Feelings About Culture Scales are the first known measures to examine both positive and negative affect toward an individual's ethnic culture and mainstream American culture. The focus on affect captures dimensions of psychological experiences that differ from cognitive and behavioral constructs often used to measure cultural orientation. These measures can serve as a valuable contribution to both research and counseling by providing insight into the nuanced affective experiences ethnic minority individuals have toward culture. (c) 2015 APA, all rights reserved.
Kong, Feng; You, Xuqun; Zhao, Jingjing
The Gratitude Questionnaire (GQ; McCullough et al., 2002) is one of the most widely used instruments to assess dispositional gratitude. The purpose of this study was to validate a Chinese version of the GQ by examining internal consistency, factor structure, convergent validity, and measurement invariance across sex. A total of 1151 Chinese adults were recruited to complete the GQ, Positive Affect and Negative Affect Scales, and Satisfaction with Life Scale. Confirmatory factor analysis indicated that the original unidimensional model fitted well, which is in accordance with the findings in Western populations. Furthermore, the GQ had satisfactory composite reliability and criterion-related validity with measures of life satisfaction and affective well-being. Evidence of configural, metric and scalar invariance across sex was obtained. Tests of the latent mean differences found females had higher latent mean scores than males. These findings suggest that the Chinese version of GQ is a reliable and valid tool for measuring dispositional gratitude and can generally be utilized across sex in the Chinese context.
Watson, David; O'Hara, Michael W.; Chmielewski, Michael; McDade-Montez, Elizabeth A.; Koffel, Erin; Naragon, Kristin; Stuart, Scott
The authors explicated the validity of the Inventory of Depression and Anxiety Symptoms (IDAS; D. Watson et al., 2007) in 2 samples (306 college students and 605 psychiatric patients). The IDAS scales showed strong convergent validity in relation to parallel interview-based scores on the Clinician Rating version of the IDAS; the mean convergent…
Cafiero, Carlo; Melgar-Quiñonez, Hugo R; Ballard, Terri J; Kepple, Anne W
This paper reviews some of the existing food security indicators, discussing the validity of the underlying concept and the expected reliability of measures under reasonably feasible conditions. The main objective of the paper is to raise awareness on existing trade-offs between different qualities of possible food security measurement tools that must be taken into account when such tools are proposed for practical application, especially for use within an international monitoring framework. The hope is to provide a timely, useful contribution to the process leading to the definition of a food security goal and the associated monitoring framework within the post-2015 Development Agenda. © 2014 New York Academy of Sciences.
Ruch, Willibald; Heintz, Sonja
How strongly does humor (i.e., the construct-relevant content) in the Humor Styles Questionnaire (HSQ; Martin et al., 2003) determine the responses to this measure (i.e., construct validity)? Also, how much does humor influence the relationships of the four HSQ scales, namely affiliative, self-enhancing, aggressive, and self-defeating, with personality traits and subjective well-being (i.e., criterion validity)? The present paper answers these two questions by experimentally manipulating the 32 items of the HSQ to only (or mostly) contain humor (i.e., construct-relevant content) or to substitute the humor content with non-humorous alternatives (i.e., only assessing construct-irrelevant context). Study 1 (N = 187) showed that the HSQ affiliative scale was mainly determined by humor, self-enhancing and aggressive were determined by both humor and non-humorous context, and self-defeating was primarily determined by the context. This suggests that humor is not the primary source of the variance in three of the HSQ scales, thereby limiting their construct validity. Study 2 (N = 261) showed that the relationships of the HSQ scales to the Big Five personality traits and subjective well-being (positive affect, negative affect, and life satisfaction) were consistently reduced (personality) or vanished (subjective well-being) when the non-humorous contexts in the HSQ items were controlled for. For the HSQ self-defeating scale, the pattern of relationships to personality was also altered, supporting a positive rather than a negative view of the humor in this humor style. The present findings thus call for a reevaluation of the role that humor plays in the HSQ (construct validity) and in the relationships to personality and well-being (criterion validity).
Jahn, Rebecca; Baumgartner, Josef S; van den Nest, Miriam; Friedrich, Fabian; Alexandrowicz, Rainer W; Wancata, Johannes
The "Center of Epidemiologic Studies - Depression scale" (CES-D) is a well-known screening tool for depression. Until now, the criterion validity of the German version of the CES-D had not been investigated in a sample of the adult general population. 508 study participants from the Austrian general population completed the CES-D. ICD-10 diagnoses were established by using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN). Receiver Operating Characteristics (ROC) analysis was conducted. Possible gender differences were explored. Overall discriminating performance of the CES-D was sufficient (ROC-AUC 0.836). Using the traditional cut-off values of 15/16 and 21/22, the sensitivity was 43.2 % and 32.4 %, respectively. The cut-off value developed on the basis of our sample was 9/10, with a sensitivity of 81.1 % and a specificity of 74.3 %. There were no significant gender differences. This is the first study investigating the criterion validity of the German version of the CES-D in the general population. The optimal cut-off values yielded sufficient sensitivity and specificity, comparable to the values of other screening tools. © Georg Thieme Verlag KG Stuttgart · New York.
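The sensitivity and specificity figures reported at each cut-off can be illustrated with a minimal Python sketch. It assumes a score at or above the cut-off counts as screen-positive, and it selects a cut-off by maximising Youden's J (sensitivity + specificity - 1), which is one common rule and not necessarily the one the authors used. The scores and diagnoses below are invented toy data, not values from the study.

```python
def sens_spec(scores, is_case, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is screen-positive."""
    pairs = list(zip(scores, is_case))
    tp = sum(1 for s, case in pairs if case and s >= cutoff)
    fn = sum(1 for s, case in pairs if case and s < cutoff)
    tn = sum(1 for s, case in pairs if not case and s < cutoff)
    fp = sum(1 for s, case in pairs if not case and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, is_case):
    """Pick the observed score that maximises Youden's J (an assumed rule)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sens_spec(scores, is_case, c)) - 1)


# Hypothetical questionnaire scores and reference diagnoses (True = case).
toy_scores = [3, 5, 8, 9, 12, 14, 18, 22, 25, 30]
toy_cases = [False, False, False, False, True, False, True, False, True, True]

cutoff = best_cutoff(toy_scores, toy_cases)
sensitivity, specificity = sens_spec(toy_scores, toy_cases, cutoff)
print(cutoff, round(sensitivity, 3), round(specificity, 3))
```

Lowering the cut-off trades specificity for sensitivity, which is the pattern the abstract describes when moving from 15/16 to 9/10.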
Guirao-Goris, Silamani J; Ferrer Ferrandis, Esperanza; Montejano Lozoya, Raimunda
The aim of the study is to identify the construct and criterion validity of the nursing diagnosis label Sedentary Lifestyle. A cross-sectional study was conducted in a nursing consultation in primary health care. Participants were all people over 50 attended during one year who voluntarily wished to participate in the study (n=85). Objective weekly physical activity was measured in METs with an accelerometer, and objective performance was measured by gait speed from the EPESE battery (both measures were used as the gold standard); a physical activity questionnaire (RAPA) and the COOP-WONCA physical fitness chart were also used. Spearman correlation coefficients, mean comparison tests and analysis of sensitivity and specificity were used for the statistical analysis. The diagnosis "Sedentary Lifestyle" showed a positive correlation between its manifestations and physical activity measured in METs (r=0.39) and EPESE gait speed (r=0.35). The diagnosis showed a sensitivity of 85.1% and a specificity of 65.2%, and was able to discriminate active people from inactive people using METs as a measure of physical activity (t=-4.4). The diagnosis "Sedentary Lifestyle" shows criterion and construct validity.
Weber, Ulrich; Zubler, Veronika; Pedersen, Susanne J
OBJECTIVE: To validate an MRI reference criterion for a positive SIJ MRI based on the level of confidence in classification of spondyloarthritis (SpA) by expert MRI readers. METHODS: Four readers assessed SIJ MRI in two inception cohorts (A/B) of 157 consecutive back pain patients ≤50 years, and in 20 healthy controls. Patients were classified according to clinical examination and pelvic radiography as having non-radiographic axial SpA (n=51), ankylosing spondylitis (n=34), or non-specific back pain (n=72). Readers recorded their level of confidence in the classification of SpA on a 0-10 scale ... using two inception cohorts and comparing clinical and MRI-based classification supports the case for including both erosion and BME to define a positive SIJ MRI for the classification of axial SpA. © 2012 by the American College of Rheumatology.
Buri, Hilary M; Daly, Jeanette M; Jogerst, Gerald J
(a) To identify reliable and valid questions that identify elder abuse, (b) to assess the reliability and validity of extant self-reported elder abuse screens in a high-risk elderly population, and (c) to describe difficulties of completing and interpreting screens in a high-need elderly population. All elders referred to research-trained social workers in a community service agency were asked to participate. Of the 70 elders asked, 49 participated, 44 completed the first questionnaire, and 32 completed the duplicate second questionnaire. A research assistant administered the telephone questionnaires. Twenty-nine (42%) persons were judged abused, 12 (17%) had abuse reported, and 4 (6%) had abuse substantiated. The elder abuse screen instruments were not found to be predictive of assessed abuse or as predictors of reported abuse; the measures tended toward being inversely predictive. Two questions regarding harm and taking of belongings were significantly different for the assessed abused group. In this small group of high-need community-dwelling elders, the screens were not effective in discriminating between abused and nonabused groups. Better instruments are needed to assess for elder abuse.
Chung, Mi Ja; Park, Youngrye; Eun, Young
The aim of this study was to examine the validity and reliability of the Korean Version of the Spiritual Care Competence Scale (K-SCCS). A cross-sectional study design was used. The K-SCCS consisted of 26 questions to measure spiritual care competence of nurses. Participants, 228 nurses who had more than 3 years' experience as a nurse, completed the survey. Confirmatory factor analysis was used to examine the construct validity, and correlations of K-SCCS and spiritual well-being (SWB) were used to examine the criterion validity of K-SCCS. Cronbach's alpha was used to test internal consistency. The construct and the criterion-related validity of K-SCCS were supported as measures of spiritual care competence. Cronbach's alpha was .95. Factor loadings of the 26 questions ranged from .60 to .96. Construct validity of K-SCCS was verified by confirmatory factor analysis (RMSEA=.08, CFI=.90, NFI=.85). Criterion validity compared to the SWB showed significant correlation (r=.44, p…) … spiritual care competence with validity and reliability. However, further study is needed to retest the verification of the factor analysis related to factor 2 (professionalisation and improving the quality of spiritual care) and factor 3 (personal support and patient counseling). Therefore, we recommend using the total score without distinguishing subscales.
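Internal consistency of the kind reported here is usually computed with the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The following Python sketch shows that calculation under the assumption of complete responses (no missing items); the response matrix is invented for illustration and is unrelated to the K-SCCS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 3 respondents x 2 items.
toy_responses = [[2, 3], [4, 4], [6, 5]]
print(round(cronbach_alpha(toy_responses), 3))
```

Alpha rises with both the number of items and their intercorrelation, so a 26-item scale can reach a high value even when individual item pairs correlate only moderately.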
Goodwin, Laura D.; Goodwin, William L.
The views of prominent qualitative methodologists on the appropriateness of validity and reliability estimation for the measurement strategies employed in qualitative evaluations are summarized. A case is made for the relevance of validity and reliability estimation. Definitions of validity and reliability for qualitative measurement are presented…
Validation of land cover products is a fundamental task prior to data applications. Current validation schemes and methods are, however, suited only for assessing classification accuracy and disregard the reliability of land cover products. The reliability evaluation of land cover products should be undertaken to provide reliable land cover information. In addition, the lack of high-quality reference data often constrains validation and affects the reliability results of land cover products. This study proposes a validation scheme to evaluate the reliability of land cover products, including two methods, namely, result reliability evaluation and process reliability evaluation. Result reliability evaluation computes the reliability of land cover products using seven reliability indicators. Process reliability evaluation analyzes the reliability propagation in the data production process to obtain the reliability of land cover products. Fuzzy fault tree analysis is introduced and improved in the reliability analysis of a data production process. Research results show that the proposed reliability evaluation scheme is reasonable and can be applied to validate land cover products. Through the analysis of the seven indicators of result reliability evaluation, more information on land cover can be obtained for strategic decision-making and planning, compared with traditional accuracy assessment methods. Process reliability evaluation without the need for reference data can facilitate the validation and reflect the change trends of reliabilities to some extent.
Serel Arslan, S; Demir, N; Karaduman, A A
This study aimed to develop a scale called the Tongue Thrust Rating Scale (TTRS), which categorises tongue thrust in children in terms of its severity during swallowing, and to investigate its validity and reliability. The study describes the developmental phase of the TTRS and presents its content and criterion-based validity and interobserver and intra-observer reliability. For content validation, seven experts assessed the steps in the scale over two Delphi rounds. Two physical therapists evaluated videos of 50 children with cerebral palsy (mean age, 57·9 ± 16·8 months), using the TTRS to test criterion-based validity, interobserver and intra-observer reliability. The Karaduman Chewing Performance Scale (KCPS) and Drooling Severity and Frequency Scale (DSFS) were used for criterion-based validity. All the TTRS steps were deemed necessary. The content validity index was 0·857. A very strong positive correlation was found between two examinations by one physical therapist, which indicated intra-observer reliability (r = 0·938, P…) … reliability (r = 0·892, P…) … validity of the TTRS. The TTRS is a valid, reliable and clinically easy-to-use functional instrument to document the severity of tongue thrust in children. © 2016 John Wiley & Sons Ltd.
Messinis, Lambros; Malegiannaki, Amaryllis-Chryssi; Christodoulou, Tessa; Panagiotopoulos, Vassillis; Papathanasopoulos, Panagiotis
The Color Trails Test (CTT) was developed as a culturally fair analog of the Trail Making Test. In the present study, normative data for the CTT were developed for the Greek adult population and further the criterion validity of the CTT was examined in two clinical groups (29 Parkinson's disease [PD] and 25 acute stroke patients). The instrument was applied to 163 healthy participants, aged 19-75. Stepwise linear regression analyses revealed a significant influence of age and education level on completion time in both parts of the CTT (increased age and decreased educational level contributed to slower completion times for both parts), whereas gender did not influence time to completion of part B. Further, the CTT appears to discriminate adequately between the performance of PD and acute stroke patients and matched healthy controls.
von Porat, Anette; Holmström, Eva; Roos, Ewa
BACKGROUND AND PURPOSE: In clinical practice, visual observation is often used to determine functional impairment and to evaluate treatment following a knee injury. The aim of this study was to evaluate the reliability and validity of observational assessments of knee movement pattern quality ... crossover hop on one leg and one-leg hop. The videos were observed by four physiotherapists, and the knee movement pattern quality, a feature of the loading strategy of the lower extremity, was scored on an 11-point rating scale. To assess the criterion validity, the observational rating was correlated ... obtained between the observers' assessment and knee flexion angle, r = 0.37-0.61. The crossover hop test or one-leg hop test was ranked as the most useful test in 172 of 192 occasions (90%) when assessing knee function. CONCLUSION: The moderate to good inter-observer reliability and the moderate criterion...
Post, Marcel W
Clinimetric studies may use criteria for test-retest reliability and convergent validity such that correlation coefficients as low as .40 are supportive of reliability and validity. It can be argued that moderate (.40-.60) correlations should not be interpreted in this way and that reliability
Powers, Stephen; And Others
Spanish-speaking first graders were administered the Artes de Lenguage (ADL), a Spanish, criterion-referenced language arts test. Reliability analyses indicated the adequacy of three of the four subscales (Phonetic Analysis, Vocabulary Development, Comprehension Skills, and General Skills). A principal factors analysis of the intercorrelation…
Sal'nikov, N.L.; Filimonov, E.V.
Monitoring temperature regimes is an important part of ensuring the operational safety of a nuclear power plant. Therefore, high standards are imposed upon the reliability of the primary information on the heat field of the object obtained from different sensors, and it is urgent to develop methods of evaluating the metrological reliability of these sensors. The main sources of thermometric information at nuclear power plants are contact temperature sensors, the most widely used of these being thermoelectric converters (TEC) and thermal resistance converters (TRC).
Spathis, Jemima Grace; Connick, Mark James; Beckman, Emma Maree; Newcombe, Peter Anthony; Tweedy, Sean Michael
Paralympic throwing events for athletes with physical impairments comprise seated and standing javelin, shot put, discus and seated club throwing. Identification of talented throwers would enable prediction of future success and promote participation; however, a valid and reliable talent identification battery for Paralympic throwing has not been reported. This study evaluates the reliability and validity of a talent identification battery for Paralympic throws. Participants were non-disabled so that impairment would not confound analyses, and results would provide an indication of normative performance. Twenty-eight non-disabled participants (13 M; 15 F) aged 23.6 years (±5.44) performed five kinematically distinct criterion throws (three seated, two standing) and nine talent identification tests (three anthropometric, six motor); 23 were tested a second time to evaluate test-retest reliability. Talent identification test-retest reliability was evaluated using Intra-class Correlation Coefficient (ICC) and Bland-Altman plots (Limits of Agreement). Spearman's correlation assessed strength of association between criterion throws and talent identification tests. Reliability was generally acceptable (mean ICC = 0.89), but two seated talent identification tests require more extensive familiarisation. Correlation strength (mean rs = 0.76) indicated that the talent identification tests can be used to validly identify individuals with competitively advantageous attributes for each of the five kinematically distinct throwing activities. Results facilitate further research in this understudied area.
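The Bland-Altman limits of agreement mentioned for the test-retest analysis are conventionally the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The Python sketch below shows that convention; the multiplier of 1.96 and the sample data are assumptions for illustration, and the abstract does not report the underlying scores.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired test-retest scores."""
    # Difference of retest minus test for each participant.
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = z * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical test and retest scores for four participants.
session_1 = [10, 12, 14, 16]
session_2 = [11, 12, 15, 18]

bias, lower, upper = limits_of_agreement(session_1, session_2)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

A bias far from zero suggests a systematic shift between sessions (for example, a learning effect), which fits the abstract's note that some tests need more extensive familiarisation.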
Oubaid, V; Anheuser, P
Employees represent an important safety factor in high-reliability organizations. The combination of clear organizational structures, a nonpunitive safety culture, and psychological personnel selection guarantees a high level of safety. The cockpit personnel selection process of a major German airline is presented in order to demonstrate a possible transferability into medicine and urology.
For the mechanism with rotating cam and knife-edge follower, an optimization criterion by means of imposed constraints upon the cam's curvature is expressed in a special coordinate system. Thus, stating the optimization criterion in the coordinate system defined by the mechanism's constructive parameters (eccentricity and minimum follower's stroke), a contour is obtained for any position of the mechanism. The optimization criterion assumes establishing the position of the characteristic point of the mechanism with respect to this contour. Fulfillment of the optimization criterion assumes that the characteristic point is positioned in the same manner with respect to all contours. The optimization criterion is simplified when considering the envelope of the contours. The method is exemplified using two mechanisms, with the cams a priori satisfying the criterion.
Haradhan Kumar Mohajan
Reliability and validity are the two most important and fundamental features in the evaluation of any measurement instrument or tool for good research. The purpose of this research is to discuss the validity and reliability of measurement instruments that are used in research. Validity concerns what an instrument measures, and how well it does so. Reliability concerns the faith that one can have in the data obtained from use of an instrument, that is, the degree to which any measuring tool controls for random error. An attempt has been made here to review reliability and validity, and threats to them, in some detail.
The paper presents the need to develop a description of the importance of the technological systems reliability structure elements in terms of security of the system. Basic issues related to the exploration of weak links and important elements in the system as well as a proposal to develop the current approach to assessing the importance of the system components have been presented. Moreover, the differences between the unreliability of suitability and unreliability of safety have been pointed out.
Reliability and Concurrent Validity of the International Personality Item Pool (IPIP) Big-five Factor Markers in Nigeria. ... Nigerian Journal of Psychiatry ... Aims: The aim of this study was to assess the internal consistency and concurrent validity ...
Hale, W D; Fiedler, L R; Cochran, C D
The Generalized Expectancy for Success Scale (GESS; Fibel & Hale, 1978) was revised and assessed for reliability and validity. The revised version was administered to 199 college students along with other conceptually related measures, including the Rosenberg Self-Esteem Scale, the Life Orientation Test, and Rotter's Internal-External Locus of Control Scale. One subsample of students also completed the Eysenck Personality Inventory, while another subsample performed a criterion-related task that involved risk taking. Item analysis yielded 25 items with correlations of .45 or higher with the total score. Results indicated high internal consistency and test-retest reliability.
BACKGROUND: The CogState Schizophrenia Battery (CSB), a computerized cognitive battery, covers all the same cognitive domains as the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) Consensus Cognitive Battery but is briefer to conduct. The aim of the present study was to evaluate the criterion and construct validity of the Japanese language version of the CSB (CSB-J) in Japanese patients with schizophrenia. METHODOLOGY/PRINCIPAL FINDINGS: Forty Japanese patients with schizophrenia and 40 Japanese healthy controls with matching age, gender, and premorbid intelligence quotient were enrolled. The CSB-J and the Brief Assessment of Cognition in Schizophrenia, Japanese-language version (BACS-J) were performed once. The structure of the CSB-J was also evaluated by a factor analysis. Similar to the BACS-J, the CSB-J was sensitive to cognitive impairment in Japanese patients with schizophrenia. Furthermore, there was a significant positive correlation between the CSB-J composite score and the BACS-J composite score. A factor analysis showed a three-factor model consisting of memory, speed, and social cognition factors. CONCLUSIONS/SIGNIFICANCE: This study suggests that the CSB-J is a useful and rapid automatically administered computerized battery for assessing broad cognitive domains in Japanese patients with schizophrenia.
Banhato, Eliane Ferreira Carvalho; Leite, Isabel Cristina Gonçalves; Guedes, Danielle Viveiros; Chaoubah, Alfredo
Although a normative process, changes in cognitive functioning vary among older adults. The differential diagnosis between normal and pathological aging must be made early using psychometrically adequate measures. To assess the evidence of criterion validity of a Short Form (SF) of the Wechsler-III Scale containing eight subtests (SF8) by determining its sensitivity, specificity, positive and negative predictive values and cut-off points for Brazilian elderly from different age groups. 168 individuals, aged 60 years or above, living in the community or in an institution, were assigned to case and control groups, and investigated according to age range. Measures included a sociodemographic questionnaire, the Mini-Mental State Examination (MMSE), Verbal Fluency Test, Clock-Drawing Test and the SF8. More than two thirds of the sample were women (73.8%), mean age was 74.5 years (SD=8.9), mean education was 6.2 years (SD=4.8) and 40.5% were widows/widowers. In the total sample, the best cut-off point for the SF8 was 142, while cut-offs among individuals aged 60 to 69 years, 70 to 79 years, and more than 80 years were 160, 129 and 129, respectively. The results demonstrated the importance of different cut-off points for different age ranges. Sensitivity and specificity values of the SF8 were sufficiently high to warrant the use of the SF8 as an instrument to identify cognitive impairment in the elderly.
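Positive and negative predictive values, named above alongside sensitivity and specificity, all derive from the same 2x2 table of screening result against case status. A minimal Python sketch of those four definitions follows; the counts are invented for illustration, since the abstract does not report the table.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard 2x2 screening metrics from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # cases correctly flagged
        "specificity": tn / (tn + fp),  # controls correctly cleared
        "ppv": tp / (tp + fp),          # flagged people who are cases
        "npv": tn / (tn + fn),          # cleared people who are controls
    }


# Hypothetical counts at one cut-off point.
metrics = screening_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Unlike sensitivity and specificity, the predictive values depend on the proportion of cases in the sample, so values from a case-control design do not transfer directly to populations with a different prevalence.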
Janssens, Annelies; Goossens, Luc; Van Den Noortgate, Wim; Colpin, Hilde; Verschueren, Karine; Van Leeuwen, Karla
Uncertainty persists regarding adequate measurement of parenting behavior during early adolescence. The present study aimed to clarify the conceptual structure of parenting by evaluating three different models that include support, psychological control, and various types of behavioral control (i.e., proactive, punitive, and harsh punitive control). Furthermore, we examined measurement invariance of parenting ratings by 1,111 Flemish adolescents from Grade 7 till 9, their mother, and father. Finally, criterion validity of parenting ratings was estimated in relation to adolescent problem behavior. Results supported a five-factor parenting model indicating multiple aspects of behavioral control, with punitive and harsh punitive control as more intrusive forms and proactive control as a more supportive form. Similar constructs were measured for adolescents, mothers, and fathers (i.e., configural and metric invariance), however on a different scale (i.e., scalar noninvariance). Future research and clinical practices should acknowledge these findings in order to fully grasp the parenting process. © The Author(s) 2014.
Introduction: Objective structured clinical examination (OSCE) is used for the evaluation of clinical competence in medicine, for which it is essential to measure validity and reliability. This study aimed to investigate the validity and reliability of OSCE for residents of obstetrics and gynecology at Kermanshah University of Medical Sciences in 2011. Methods: A descriptive-correlation study was designed and the data of OSCE for obstetrics and gynecology were collected via learning behavior checklists in method stations and multiple choice questions in question stations. The data were analyzed through Pearson correlation coefficient and Cronbach's alpha, using SPSS software (version 16). To determine the criterion validity, correlation of OSCE scores with scores of resident promotion test, direct observation of procedural skills, and theoretical knowledge was determined; for reliability, however, Cronbach's alpha was used. Total sample consisted of 25 participants taking part in 14 stations. P value of less than 0.05 was considered as significant. Results: The mean OSCE score was 22.66 (±6.85). Criterion validity of the stations with resident promotion theoretical test, first theoretical knowledge test, second theoretical knowledge, and direct observation of procedural skills (DOPS) was 0.97, 0.74, 0.49, and 0.79, respectively. In question stations, criterion validity was 0.15, and total validity of OSCE was 0.77. Conclusion: Findings of the present study indicated acceptable validity and reliability of OSCE for residents of obstetrics and gynecology.
Douglas Altamiro Consolo
This paper reports on a process to validate a revised version of a system for coding classroom discourse in foreign language lessons, a context in which the dual role of language (as content and means of communication) and the speakers' specific pedagogical aims lead to a certain degree of ambiguity in language analysis. The language used by teachers and students has been extensively studied, and a framework of concepts concerning classroom discourse well-established. Models for coding classroom language need, however, to be revised when they are applied to specific research contexts. The application and revision of an initial framework can lead to the development of earlier models, and to the re-definition of previously established categories of analysis that have to be validated. The procedures followed to validate a coding system are related here as guidelines for conducting research under similar circumstances. The advantages of using instruments that incorporate two types of data, that is, quantitative measures and qualitative information from raters' metadiscourse, are discussed, and it is suggested that such procedure can contribute to the process of validation itself, towards attaining reliability of research results, as well as indicate some constraints of the adopted research methodology.
Casartelli, Nicola; Müller, Roland; Maffiuletti, Nicola A
The aim of the present study was to verify the validity and reliability of the Myotest accelerometric system (Myotest SA, Sion, Switzerland) for the assessment of vertical jump height. Forty-four male basketball players (age range: 9-25 years) performed series of squat, countermovement and repeated jumps during 2 identical test sessions separated by 2-15 days. Flight height was simultaneously quantified with the Myotest system and validated photoelectric cells (Optojump). Two calculation methods were used to estimate the jump height from Myotest recordings: flight time (Myotest-T) and vertical takeoff velocity (Myotest-V). Concurrent validity was investigated comparing Myotest-T and Myotest-V to the criterion method (Optojump), and test-retest reliability was also examined. As regards validity, Myotest-T overestimated jumping height compared to Optojump (p 0.98), that is, excellent validity. Myotest-V overestimated jumping height compared to Optojump (p 12 cm), high limits of agreement ratios (>36%), and low ICCs (9 cm). In conclusion, Myotest-T is a valid and reliable method for the assessment of vertical jump height, and its use is legitimate for field-based evaluations, whereas Myotest-V is neither valid nor reliable.
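For orientation, the two calculation routes named in the abstract above (flight time and vertical takeoff velocity) correspond to standard projectile-motion relations: h = g·t²/8 from flight time and h = v²/(2g) from takeoff velocity. The sketch below assumes those textbook formulas; the abstract does not state how the Myotest device actually computes height, and the function names and the 0.5 s example value are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def height_from_flight_time(flight_time_s: float) -> float:
    """Jump height in metres from total flight time: h = g * t^2 / 8.

    Assumes takeoff and landing occur at the same body position.
    """
    return G * flight_time_s ** 2 / 8.0


def height_from_takeoff_velocity(v_takeoff_ms: float) -> float:
    """Jump height in metres from vertical takeoff velocity: h = v^2 / (2 g)."""
    return v_takeoff_ms ** 2 / (2.0 * G)


# Hypothetical example: a 0.5 s flight implies a takeoff velocity of g * t / 2,
# so both routes should give the same height for ideal, noise-free input.
t = 0.5
v = G * t / 2.0
h_from_time = height_from_flight_time(t)
h_from_velocity = height_from_takeoff_velocity(v)
```

With ideal input the two estimates coincide, so any systematic difference between them in practice would come from measurement error in the flight time or the velocity signal rather than from the formulas themselves.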
Goode, N; Salmon, P M; Taylor, N Z; Lenné, M G; Finch, C F
One factor potentially limiting the uptake of Rasmussen's (1997) Accimap method by practitioners is the lack of a contributing factor classification scheme to guide accident analyses. This article evaluates the intra- and inter-rater reliability and criterion-referenced validity of a classification scheme developed to support the use of Accimap by led outdoor activity (LOA) practitioners. The classification scheme has two levels: the system level describes the actors, artefacts and activity context in terms of 14 codes; the descriptor level breaks the system level codes down into 107 specific contributing factors. The study involved 11 LOA practitioners using the scheme on two separate occasions to code a pre-determined list of contributing factors identified from four incident reports. Criterion-referenced validity was assessed by comparing the codes selected by LOA practitioners to those selected by the method creators. Mean intra-rater reliability scores at the system (M = 83.6%) and descriptor (M = 74%) levels were acceptable. Mean inter-rater reliability scores were not consistently acceptable for both coding attempts at the system level (M T1 = 68.8%; M T2 = 73.9%), and were poor at the descriptor level (M T1 = 58.5%; M T2 = 64.1%). Mean criterion referenced validity scores at the system level were acceptable (M T1 = 73.9%; M T2 = 75.3%). However, they were not consistently acceptable at the descriptor level (M T1 = 67.6%; M T2 = 70.8%). Overall, the results indicate that the classification scheme does not currently satisfy reliability and validity requirements, and that further work is required. The implications for the design and development of contributing factors classification schemes are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
Cybercrime is a growing and worrisome problem, particularly when it involves minors. Cyber aggression among adolescents in particular can result in negative legal and psychological consequences for the people involved. Therefore, it is important to have instruments to detect these incidents early and understand the problem in order to propose effective measures for prevention and treatment. This paper aims to design a new self-report, the Cyber-Aggression Questionnaire for Adolescents (CYBA), to evaluate the extent to which the respondent conducts aggressions through a mobile phone or the internet, and to analyse the factorial and criterion validity and reliability of its scores in a sample of adolescents from Asturias, Spain. The CYBA was administered to 3,148 youth aged between 12 and 18 years old along with three self-reports to measure aggression at school, impulsivity, and empathy. Regarding factorial validity, the model that best represents the structure of the CYBA consists of three factors (Impersonation, Visual sexual Cyber-aggression, and Verbal Cyber-aggression and Exclusion) and four additional indicators of Visual Cyber-aggression–Teasing/Happy Slapping. Regarding criterion validity, the score on the CYBA correlates positively with aggression at school and impulsivity and negatively with empathy. This is how cyber-aggression correlates with these three variables according to previous empirical evidence. The reliability of the scores on each item and factor of the CYBA is adequate. Therefore, the CYBA offers a valid and reliable measure of cyber-aggression in adolescents.
Stockbrugger, Barry A.; Haennel, Robert G.
Evaluated the validity and reliability of a medicine ball throw test to evaluate explosive power. Data on competitive sand volleyball players who performed a medicine ball throw and a standard countermovement jump indicated that the medicine ball throw test was a valid and reliable way to assess explosive power for an analogous total-body movement…
Alkhamra, Rana A.; Al-Jazi, Aya B.
Background: The Token Test for Children (2nd edition) (TTFC) is a measure for assessing receptive language. In this study we describe the translation process, validity and reliability of the Arabic Token Test for Children (A-TTFC). Aims: The aim of this study is to translate, validate and establish the reliability of the Arabic Token Test for…
Badjadi, Nour El Imane
The current paper on writing assessment surveys the literature on the reliability and validity of essay tests. The paper aims to examine the two concepts in relationship with essay testing as well as to provide a snapshot of the current understandings of the reliability and validity of essay tests as drawn in recent research studies. Bearing in…
Osadebe, P. U.
The study was carried out to construct a valid and reliable test in Economics for secondary school students. Two research questions were drawn to guide the establishment of validity and reliability for the Economics Achievement Test (EAT). It is a multiple choice objective test of five options with 100 items. A sample of 1000 students was randomly…
The aim of this research is to develop the Mobbing Scale and examine its validity and reliability. The sample of the study consisted of 515 persons from Sakarya and Bursa. In this study, construct validity, internal consistency, test-retest reliability, and item analysis of the scale were examined. As a result of factor analysis for construct…
Yule, S; Gupta, A; Gazarian, D; Geraghty, A; Smink, D S; Beard, J; Sundt, T; Youngson, G; McIlhenny, C; Paterson-Brown, S
Surgeons' non-technical skills are an important part of surgical performance and surgical education. The most widely adopted assessment tool is the Non-Technical Skills for Surgeons (NOTSS) behaviour rating system. Psychometric analysis of this tool to date has focused on inter-rater reliability and feasibility rather than validation. NOTSS assessments were collected from two groups of consultant/attending surgeons in the UK and USA, who rated behaviours of the lead surgeon during a video-based simulated crisis scenario after either online or classroom instruction. The process of validation consisted of assessing construct validity, scale reliability and concurrent criterion validity, and undertaking a sensitivity analysis. Central to this was confirmatory factor analysis to evaluate the structure of the NOTSS taxonomy. Some 255 consultant surgeons participated in the study. The four-category NOTSS model was found to have robust construct validity evidence, and a superior fit compared with alternative models. Logistic regression and sensitivity analysis revealed that, after adjusting for technical skills, for every 1-point increase in NOTSS score of the lead surgeon, the odds of having a higher versus lower patient safety score was 2·29 times. The same pattern of results was obtained for a broad mix of surgical specialties (UK) as well as a single discipline (cardiothoracic, USA). The NOTSS tool can be applied in research and education settings to measure non-technical skills in a valid and efficient manner. © 2018 BJS Society Ltd Published by John Wiley & Sons Ltd.
Dong, Lijuan; Liu, Na; Tian, Xiaoyu; Qiao, Xiaoxia; Gobbens, Robbert J J; Kane, Robert L; Wang, Cuili
To translate the Tilburg Frailty Indicator (TFI) into Chinese and assess its reliability and validity. A sample of 917 community-dwelling older people, aged ≥60 years, in a Chinese city was included between August 2015 and March 2016. Construct validity was assessed using alternative measures corresponding to the TFI items, including self-rated health status (SRH), unintentional weight loss, walking speed, timed-up-and-go tests (TUGT), making telephone calls, grip strength, exhaustion, Short Portable Mental Status Questionnaire (SPMSQ), Geriatric Depression scale (GDS-15), emotional role, Adaptability Partnership Growth Affection and Resolve scale (APGAR) and Social Support Rating Scale (SSRS). Fried's phenotype and frailty index were measured to evaluate criterion validity. Adverse health outcomes (ADL and IADL disability, healthcare utilization, GDS-15, SSRS) were used to assess predictive (concurrent) validity. The internal consistency reliability was good (Cronbach's α=0.71). The test-retest reliability was strong (r=0.88). Kappa coefficients showed agreements between the TFI items and corresponding alternative measures. Alternative measures correlated as expected with the three domains of TFI, with an exclusion that alternative psychological measures had similar correlations with psychological and physical domains of the TFI. The Chinese TFI had excellent criterion validity with the AUCs regarding physical phenotype and frailty index of 0.87 and 0.86, respectively. The predictive (concurrent) validities of the adverse health outcomes and healthcare utilization were acceptable (AUCs: 0.65-0.83). The Chinese TFI has good validity and reliability as an integral instrument to measure frailty of older people living in the community in China. Copyright © 2017 Elsevier B.V. All rights reserved.
Vendrig, A A; Schaafsma, F G
Purpose The purpose of this study is to measure the psychometric properties of the Work and Wellbeing Inventory (WBI) (in Dutch: VAR-2), a screening tool that is used within occupational health care and rehabilitation. Our research question focused on the reliability and validity of this inventory. Methods Over the years seven different samples of workers, patients and sick listed workers varying in size between 89 and 912 participants (total: 2514), were used to measure the test-retest reliability, the internal consistency, the construct and concurrent validity, and the criterion and predictive validity. Results The 13 scales displayed good internal consistency and test-retest reliability. The constructive validity of the WBI could clearly be demonstrated in both patients and healthy workers. Confirmative factor analyses revealed a CFI >.90 for all scales. The depression scale predicted future work absenteeism (>6 weeks) because of a common mental disorder in healthy workers. The job strain scale and the illness behavior scale predicted long term absenteeism (>3 months) in workers with short-term absenteeism. The illness behavior scale moderately predicted return to work in rehab patients attending an intensive multidisciplinary program. Conclusions The WBI is a valid and reliable tool for occupational health practitioners to screen for risk factors for prolonged or future sickness absence. With this tool they will have reliable indications for further advice and interventions to restore the work ability.
Eliane Ferreira Carvalho Banhato
Abstract Although a normative process, changes in cognitive functioning vary among older adults. The differential diagnosis between normal and pathological aging must be made early using psychometrically adequate measures. Objectives: To assess the evidence of criterion validity of a Short Form (SF) of the Wechsler-III Scale containing eight subtests (SF8) by determining its sensitivity, specificity, positive and negative predictive values and cut-off points for Brazilian elderly from different age groups. Methods: 168 individuals, aged 60 years or above, living in the community or in an institution, were assigned to case and control groups, and investigated according to age range. Measures included a sociodemographic questionnaire, the Mini-Mental State Examination (MMSE), Verbal Fluency Test, Clock-Drawing Test and the SF8. Results: More than two thirds of the sample were women (73.8%), mean age was 74.5 years (SD=8.9), mean education was 6.2 years (SD=4.8) and 40.5% were widows/widowers. In the total sample, the best cut-off point for the SF8 was 142, while cut-offs among individuals aged 60 to 69 years, 70 to 79 years, and more than 80 years were 160, 129 and 129, respectively. Conclusions: The results demonstrated the importance of different cut-off points for different age ranges. Sensitivity and specificity values of the SF8 were sufficiently high to warrant the use of the SF8 as an instrument to identify cognitive impairment in the elderly.
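As a rough illustration of how sensitivity, specificity and predictive values follow from a chosen cut-off, the sketch below classifies a score as screen-positive when it is at or below the cut-off. That direction is an assumption (the abstract does not say whether the cut-off is inclusive), and the scores, case labels and the function name `diagnostic_metrics` are hypothetical; only the cut-off value 142 is taken from the abstract.

```python
def diagnostic_metrics(scores, is_case, cutoff):
    """Return sensitivity, specificity, PPV and NPV for a given cut-off.

    Assumption: lower scores indicate impairment, so score <= cutoff
    counts as a positive screen. A metric is None when its denominator is 0.
    """
    tp = fp = tn = fn = 0
    for score, case in zip(scores, is_case):
        positive = score <= cutoff
        if positive and case:
            tp += 1
        elif positive and not case:
            fp += 1
        elif not positive and case:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # cases correctly flagged
        "specificity": ratio(tn, tn + fp),  # controls correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged who are true cases
        "npv": ratio(tn, tn + fn),          # cleared who are true controls
    }


# Hypothetical scores and case labels, for illustration only.
example_scores = [120, 130, 150, 160, 140, 170]
example_is_case = [True, True, False, False, False, True]
metrics = diagnostic_metrics(example_scores, example_is_case, cutoff=142)
```

Repeating the calculation over a range of candidate cut-offs, separately within each age band, is one way age-specific cut-off points such as those reported could be compared.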
CIE 2015 August 2-5, 2015, Boston, Massachusetts, USA [DRAFT] DETC2015-46982 DEVELOPMENT OF A CONSERVATIVE MODEL VALIDATION APPROACH FOR RELIABLE...obtain a conservative simulation model for reliable design even with limited experimental data. Very little research has taken into account the...3, the proposed conservative model validation is briefly compared to the conventional model validation approach. Section 4 describes how to account
Both qualitative and quantitative paradigms try to find the same result: the truth. Qualitative studies are tools used in understanding and describing the world of human experience. Since we maintain our humanity throughout the research process, it is largely impossible to escape the subjective experience, even for the most experienced of researchers. Reliability and validity are issues that have been described in great detail by advocates of quantitative research. The validity and the norms of rigor that are applied to quantitative research are not entirely applicable to qualitative research. Validity in qualitative research means the extent to which the data are plausible, credible and trustworthy, and thus can be defended when challenged. Reliability and validity remain appropriate concepts for attaining rigor in qualitative research. Qualitative researchers have to salvage responsibility for reliability and validity by implementing verification strategies that are integral and self-correcting during the conduct of inquiry itself. This ensures the attainment of rigor using strategies inherent within each qualitative design, and moves the responsibility for incorporating and maintaining reliability and validity from external reviewers’ judgments to the investigators themselves. There are different opinions on validity, with some suggesting that the concept of validity is incompatible with qualitative research and should be abandoned, while others argue that efforts should be made to ensure validity so as to lend credibility to the results. This paper is an attempt to clarify the meaning and use of reliability and validity in the qualitative research paradigm.
The relation between the excess entropy production criterion of thermodynamic stability for nonequilibrium states and the kinetic linear stability principle is discussed. It is shown that the condition required by the excess entropy production criterion generally is sufficient, but not necessary, to judge the system stability. The condition required by the excess entropy production criterion is stronger than that of the linear stability principle. Only when the product matrix between the linearized matrix of kinetic equations and the matrix of the quadratic form of second-order excess entropy is symmetric is the condition required by the excess entropy production criterion that the steady state is asymptotically stable ($\delta_x P > 0$) necessary and sufficient. The counterexample given by Fox to prove that the excess entropy, $(\delta^2 S)_{ss}$, is not a Liapunov function is incorrect. Contrary to his conclusion, the counterexample is in fact a positive one that proves that the excess entropy is a Liapunov function. Moreover, the excess entropy production criterion is not limited by symmetry conditions of the linearized matrix of kinetic equations. The excess entropy around nonequilibrium steady states, $(\delta^2 S)_{ss}$, is a Liapunov function of the thermodynamic system.
Ammerman Alice S
Background: Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods: To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results: For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion: This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. We therefore recommend the Nutrition and Physical Activity Self-Assessment for
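To make the two agreement statistics named in the record above concrete, the following sketch computes percent agreement and a weighted kappa for two raters answering one ordinal question. Linear disagreement weights are an assumption, since the abstract does not say which weighting scheme was used, and the function names and ratings are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of paired ratings that are identical."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def weighted_kappa(rater_a, rater_b, n_categories):
    """Weighted kappa with linear disagreement weights (assumed scheme).

    Ratings must be integers 0 .. n_categories - 1, with n_categories >= 2.
    Raises ZeroDivisionError if expected disagreement is zero, e.g. when
    every rating from both raters falls in a single category.
    """
    n = len(rater_a)
    k = n_categories
    # Joint proportion table of rater A (rows) against rater B (columns).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row_marginals = [sum(observed[i]) for i in range(k)]
    col_marginals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_marginals[i] * col_marginals[j]
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings on a 3-point item (coded 0, 1, 2).
director = [0, 1, 2, 1]
staff = [0, 1, 2, 1]
agreement_pct = percent_agreement(director, staff)
kappa = weighted_kappa(director, staff, n_categories=3)
```

The two statistics can diverge sharply: when nearly all answers fall in one category, percent agreement can be high while kappa stays low, which is consistent with the wide kappa ranges reported alongside high percent agreement.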
Sanders, James L; Williams, Robert J
Most tests of video game addiction have weak construct validity and limited ability to correctly identify people in denial. The purpose of the present research was to investigate the reliability and validity of a new test of video game addiction (Behavioral Addiction Measure-Video Gaming [BAM-VG]) that was developed in part to address these deficiencies. Regular adult video gamers (n = 506) were recruited from a Canadian online panel and completed a survey containing three measures of excessive video gaming (BAM-VG; DSM-5 criteria for Internet Gaming Disorder [IGD]; and the IGD-20), as well as questions concerning extensiveness of video game involvement and self-report of problems associated with video gaming. One month later, they were reassessed for the purposes of establishing test-retest reliability. The BAM-VG demonstrated good internal consistency as well as 1 month test-retest reliability. Criterion-related validity was demonstrated by significant correlations with the following: time spent playing, self-identification of video game problems, and scores on other instruments designed to assess video game addiction (DSM-5 IGD, IGD-20). Consistent with the theory, principal component analysis identified two components underlying the BAM-VG that roughly correspond with impaired control and significant negative consequences deriving from this impaired control. Together with its excellent construct validity and other technical features, the BAM-VG represents a reliable and valid test of video game addiction.
Christiansen, H; Kis, B; Hirsch, O; Matthies, S; Hebebrand, J; Uekermann, J; Abdel-Hamid, M; Kraemer, M; Wiltfang, J; Graf, E; Colla, M; Sobanski, E; Alm, B; Rösler, M; Jacob, C; Jans, T; Huss, M; Schimmelmann, B G; Philipsen, A
The German version of the Conners Adult ADHD Rating Scales (CAARS) has proven to show very high model fit in confirmative factor analyses with the established factors inattention/memory problems, hyperactivity/restlessness, impulsivity/emotional lability, and problems with self-concept in both large healthy control and ADHD patient samples. This study now presents data on the psychometric properties of the German CAARS-self-report (CAARS-S) and observer-report (CAARS-O) questionnaires. CAARS-S/O and questions on sociodemographic variables were filled out by 466 patients with ADHD, 847 healthy control subjects that already participated in two prior studies, and a total of 896 observer data sets were available. Cronbach's-alpha was calculated to obtain internal reliability coefficients. Pearson correlations were performed to assess test-retest reliability, and concurrent, criterion, and discriminant validity. Receiver Operating Characteristics (ROC-analyses) were used to establish sensitivity and specificity for all subscales. Coefficient alphas ranged from .74 to .95, and test-retest reliability from .85 to .92 for the CAARS-S, and from .65 to .85 for the CAARS-O. All CAARS subscales, except problems with self-concept correlated significantly with the Barrett Impulsiveness Scale (BIS), but not with the Wender Utah Rating Scale (WURS). Criterion validity was established with ADHD subtype and diagnosis based on DSM-IV criteria. Sensitivity and specificity were high for all four subscales. The reported results confirm our previous study and show that the German CAARS-S/O do indeed represent a reliable and cross-culturally valid measure of current ADHD symptoms in adults. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
Admiraal, W.; Hoeksma, M.; van de Kamp, M.-T.; van Duin, G.
The richness and complexity of video portfolios endanger both the reliability and validity of the assessment of teacher competencies. In a post-graduate teacher education program, the assessment of video portfolios was evaluated for its reliability, construct validity, and consequential validity.
Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra
Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Methodological and cross sectional study. A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain.
Hazel Ekin Akmaz
Background: Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. Aims: To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Study Design: Methodological and cross sectional study. Methods: A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. Results: The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. Conclusion: The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance
Robbins, Mandy; Francis, Leslie J; Bradford, Amanda
A sample of 16 male and 30 female undergraduates completed the Greer and Francis Scale of Rejection of Christianity. The data support the internal consistency reliability and construct validity of the scale for this sample.
McDonald, Ann E; Vigen, Cheryl
This study examined the ability of a two-part self-report instrument, the McDonald Play Inventory, to reliably and validly measure the play activities and play styles of 7- to 11-yr-old children and to discriminate between the play of neurotypical children and children with known learning and developmental disabilities. A total of 124 children ages 7-11 recruited from a sample of convenience and a subsample of 17 parents participated in this study. Reliability estimates yielded moderate correlations for internal consistency, total test intercorrelations, and test-retest reliability. Validity estimates were established for content and construct validity. The results suggest that a self-report instrument yields reliable and valid measures of a child's perceived play performance and discriminates between the play of children with and without disabilities. Copyright © 2012 by the American Occupational Therapy Association, Inc.
Previous research funded by Florida Department of Transportation (FDOT) developed a method for estimating travel time reliability for arterials. This method was not initially implemented or validated using field data. This project evaluated and r...
Khushhal, Alaa; Nichols, Simon; Evans, Will; Gleadall-Siddall, Damien; Page, Richard; O'Doherty, Alasdair; Carroll, Sean; Ingle, Lee; Abt, Grant
We examined the validity and reliability of the Apple Watch heart rate sensor during and in recovery from exercise. Twenty-one males completed treadmill exercise while wearing two Apple Watches (left and right wrists) and a Polar S810i monitor (criterion). Exercise involved 5-min bouts of walking, jogging, and running at speeds of 4 km·h⁻¹, 7 km·h⁻¹, and 10 km·h⁻¹, followed by 11 min of rest between bouts. At all exercise intensities the mean bias was trivial. There were very good correl...
OBJECTIVE: Aphasia assessment is the first step towards a well-founded language therapy. Language tests need to consider cultural as well as typological linguistic aspects of a given language. This study was designed to determine the standardization, validity and reliability of the Language Assessment Test for Aphasia, which consists of eight subtests including spontaneous speech and language, auditory comprehension, repetition, naming, reading, grammar, speech acts, and writing. METHODS: The test was administered to 282 healthy participants and 92 aphasic participants in age, education and gender matched groups. The validity of the test was investigated with analysis of content, structure and criterion-related validity. For reliability of the test, the analysis of internal consistency, stability and equivalence reliability was conducted. The influence of variables on healthy participants’ subtest scores, test score and language score was examined. According to significant differences, norms and cut-off scores based on language score were determined. RESULTS: The group with aphasia performed considerably lower than healthy participants on subtest, test and language scores. The test scores of the healthy group were mostly affected by age and educational level but not by gender. According to significant differences, age and educational level for both groups were determined. Considering age and educational levels, the reference values for the cut-off scores were presented. CONCLUSION: The test was found to be a highly reliable and valid aphasia test for Turkish-speaking aphasic patients either in Turkey or other Turkish communities around the world.
Nakano, Hideki; Kodama, Takayuki; Ukai, Kazumasa; Kawahara, Satoru; Horikawa, Shiori; Murata, Shin
In this study, we aimed to (1) translate the English version of the Kinesthetic and Visual Imagery Questionnaire (KVIQ), which assesses motor imagery ability, into Japanese, and (2) investigate the reliability and validity of the Japanese KVIQ. We enrolled 28 healthy adults in this study. We used Cronbach’s alpha coefficients to assess reliability reflected by the internal consistency. Additionally, we assessed validity reflected by the criterion-related validity between the Japanese KVIQ and the Japanese version of the Movement Imagery Questionnaire-Revised (MIQ-R) with Spearman’s rank correlation coefficients. The Cronbach’s alpha coefficients for the KVIQ-20 were 0.88 (Visual) and 0.91 (Kinesthetic), which indicates high reliability. There was a significant positive correlation between the Japanese KVIQ-20 (Total) and the Japanese MIQ-R (Total) (r = 0.86, p < 0.01). Our results suggest that the Japanese KVIQ is an assessment that is a reliable and valid index of motor imagery ability.
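Cronbach's alpha, the internal-consistency coefficient used in the record above and in several others in this list, can be computed directly from an item-by-respondent score table as α = k/(k−1) · (1 − Σ item variances / variance of total scores). The sketch below is a minimal version under that standard definition; the function name and the toy data are hypothetical and unrelated to any of the studies listed.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances. Requires at least two items and two respondents,
    and a non-zero variance of the total scores.
    """
    k = len(responses[0])
    item_columns = list(zip(*responses))           # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy data: three respondents, two perfectly consistent items.
toy_responses = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy_responses)
```

Because alpha rises with the number of items as well as with inter-item correlation, coefficients reported for long scales and short subscales are not directly comparable.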
The purpose of this research is to adapt the Scale of Happiness Orientations, which was developed by Peterson, Park, and Seligman (2005), into Turkish and examine the psychometric properties of the scale. The participants of the research consist of 489 students. The psychometric properties of the scale were examined with the following methods: linguistic equivalence, descriptive factor analysis, confirmatory factor analysis, criterion-related validity, internal consistency, and test-retest. For criterion-related validity (concurrent validity), the Oxford Happiness Questionnaire-Short Form was used. Items resulting from the descriptive factor analysis for structural validity of the scale were grouped into three factors (life of meaning, life of pleasure, life of engagement) in accordance with the original form. Confirmatory factor analysis of the 18-item, three-factor model yielded the following fit indexes: χ2/df=1.94, RMSEA= .059, CFI= .96, GFI= .95, IFI= .95, NFI= .96, RFI= .95 and SRMR= .044. Factor loadings of the scale range from .36 to .59. In the criterion-validity analysis, strong positive relations were found between the Scale of Happiness Orientations and the Oxford Happiness Questionnaire at the p<.01 significance level. The Cronbach Alpha internal consistency coefficient was .88 for the life of meaning sub-scale, .84 for the life of pleasure sub-scale, and .81 for the life of engagement sub-scale. In addition, corrected item-total correlations range from .39 to .61. According to these results, it can be said that the scale is a valid and reliable assessment instrument for positive psychology, educational psychology, and other fields.
Park, Yu Kyung; Ju, Hyeon Ok; Na, Hunjoo
The Perinatal Post-Traumatic Stress Disorder Questionnaire (PPQ) was designed to measure post-traumatic symptoms related to childbirth and symptoms during the postnatal period. The purpose of this study was to develop a Korean translation of the PPQ and to evaluate the reliability and validity of the Korean PPQ. Participants were 196 mothers at one to 18 months after childbirth, and data were collected by e-mail. The PPQ was translated into Korean using the translation guideline from the World Health Organization. Cronbach's alpha and split-half reliability were used to evaluate the reliability of the PPQ. Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), and known-group validity were used to examine construct validity. Correlations of the PPQ with the Impact of Event Scale (IES), Beck Depression Inventory II (BDI-II), and Beck Anxiety Inventory (BAI) were used to test the criterion validity of the PPQ. Cronbach's alpha and the Spearman-Brown split-half correlation coefficient were 0.91 and 0.77, respectively. EFA identified a 3-factor solution including arousal, avoidance, and intrusion factors, and CFA revealed the strongest support for the 3-factor model. The correlations of the PPQ with IES, BDI-II, and BAI were .99, .60, and .72, respectively, pointing to a high level of criterion validity. The Korean version of the PPQ is a useful tool for screening and assessing mothers experiencing emotional distress related to childbirth and during the postnatal period. The PPQ also reflects the diagnostic standards for Post Traumatic Stress Disorder well.
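The Spearman-Brown split-half coefficient mentioned above steps up the correlation between two half-tests to an estimate for the full-length test. A minimal sketch follows; the function name `spearman_brown` and the input value are hypothetical, and how the study split its items is not stated in the abstract.

```python
def spearman_brown(r, factor=2.0):
    """Spearman-Brown prophecy formula.

    r      -- reliability (e.g. correlation between two half-tests)
    factor -- how many times longer the target test is (2 for split-half)
    """
    return factor * r / (1 + (factor - 1) * r)


# Hypothetical half-test correlation, not a value from the study.
print(spearman_brown(0.6))
```

With `factor=2` this reduces to the familiar split-half form 2r / (1 + r).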
Kara, Kerime C; Çıtak Karakaya, İlkim; Tunalı, Nur; Karakaya, Mehmet G
The aim of this study was to investigate the reliability and validity of the Turkish version of the Incontinence Quiz, which was developed by Branch et al. (1994), to assess women's knowledge of and attitudes toward urinary incontinence. Comprehensibility of the Turkish version of the 14-item Incontinence Quiz, which was prepared following translation-back translation procedures, was tested on a pilot group of eight women, and its internal reliability, test-retest reliability and construct validity were assessed in 150 women who attended the gynecology clinics of three hospitals in İçel, Turkey. Physical and sociodemographic characteristics and presence of incontinence complaints were also recorded. Data were analyzed at the 0.05 alpha level, using SPSS version 22. The scale had good reliability and validity. The internal reliability coefficient (Cronbach α) was 0.80, test-retest correlation coefficients were 0.83-0.94; and with regard to construct validity, the Kaiser-Meyer-Olkin coefficient was 0.76 and Bartlett's test of sphericity was 562.777 (P = 0.000). The Turkish version of the Incontinence Quiz had a four-factor structure, with eigenvalues ranging from 1.17 to 4.08. The Incontinence Quiz-Turkish version is a highly comprehensible, reliable and valid scale, which may be used to assess Turkish-speaking women's knowledge of and attitudes toward urinary incontinence. © 2017 Japan Society of Obstetrics and Gynecology.
Lund, Rikke; Nielsen, Lene Snabe; Henriksen, Pia Wichmann
OBJECTIVE: The aim of the present article is to describe the face and content validity as well as reliability of the Copenhagen Social Relations Questionnaire (CSRQ). METHOD: The face and content validity test was based on focus group discussions and individual interviews with 31 informants...... from the interviews. Two additional themes not covered by CSRQ on dynamics and reciprocity of social relations were identified. DISCUSSION: CSRQ holds satisfactory face and content validity as well as reliability, and is suitable for measuring structure and function of social relations including...
Cuntz, M.; Yeager, K. E.
We challenge the customary assumption that the entering of an Earth-mass planet into the Hill radius (or multiples of the Hill radius) of a giant planet is a valid criterion for its ejection from the star-planet system. This assumption has widely been used in previous studies, especially those with an astrobiological focus. As intriguing examples, we explore the dynamics of the systems HD 20782 and HD 188015. Each system possesses a giant planet that remains in or crosses into the stellar habitable zone, thus effectively thwarting the possibility of habitable terrestrial planets. In the case of HD 188015, the orbit of the giant planet is almost circular, whereas in the case of HD 20782, it is extremely elliptical. Although it is found that Earth-mass planets are eventually ejected from the habitable zones of these systems, the 'Hill Radius Criterion' is identified as invalid for the prediction of when the ejection is actually occurring.
Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C
Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
Background Wolfram syndrome (WFS) is a rare, neurodegenerative disease that typically presents with childhood onset insulin dependent diabetes mellitus, followed by optic atrophy, diabetes insipidus, deafness, and neurological and psychiatric dysfunction. There is no cure for the disease, but recent advances in research have improved understanding of the disease course. Measuring disease severity and progression with reliable and validated tools is a prerequisite for clinical trials of any new intervention for neurodegenerative conditions. To this end, we developed the Wolfram Unified Rating Scale (WURS) to measure the severity and individual variability of WFS symptoms. The aim of this study is to develop and test the reliability and validity of the WURS. Methods A rating scale of disease severity in WFS was developed by modifying a standardized assessment for another neurodegenerative condition (Batten disease). WFS experts scored the representativeness of WURS items for the disease. The WURS was administered to 13 individuals with WFS (6-25 years of age). Motor, balance, mood and quality of life were also evaluated with standard instruments. Inter-rater reliability, internal consistency reliability, concurrent, predictive and content validity of the WURS were calculated. Results The WURS had high inter-rater reliability (ICCs>.93), moderate to high internal consistency reliability (Cronbach’s α = 0.78-0.91) and demonstrated good concurrent and predictive validity. There were significant correlations between the WURS Physical Assessment and motor and balance tests (rs>.67, ps [...] >.76, ps [...] =-.86, p=.001). The WURS demonstrated acceptable content validity (Scale-Content Validity Index=0.83). Conclusions These preliminary findings demonstrate that the WURS has acceptable reliability and validity and captures individual differences in disease severity in children and young adults with WFS.
Gore, Shweta; Blackwood, Jennifer; Guyette, Mary; Alsalaheen, Bara
Reduced physical activity is associated with poor prognosis in chronic obstructive pulmonary disease (COPD). Accelerometers have greatly improved quantification of physical activity by providing information on step counts, body positions, energy expenditure, and magnitude of force. The purpose of this systematic review was to compare the validity and reliability of accelerometers used in patients with COPD. An electronic database search of MEDLINE and CINAHL was performed. Study quality was assessed with the Strengthening the Reporting of Observational Studies in Epidemiology checklist while methodological quality was assessed using the modified Quality Appraisal Tool for Reliability Studies. The search yielded 5392 studies; 25 met inclusion criteria. The SenseWear Pro armband reported high criterion validity under controlled conditions (r = 0.75-0.93) and high reliability (ICC = 0.84-0.86) for step counts. The DynaPort MiniMod demonstrated highest concurrent validity for step count using both video and manual methods. Validity of the SenseWear Pro armband varied between studies especially in free-living conditions, slower walking speeds, and with addition of weights during gait. A high degree of variability was found in the outcomes used and statistical analyses performed between studies, indicating a need for further studies to measure reliability and validity of accelerometers in COPD. The SenseWear Pro armband is the most commonly used accelerometer in COPD, but measurement properties are limited by gait speed variability and assistive device use. DynaPort MiniMod and Stepwatch accelerometers demonstrated high validity in patients with COPD but lack reliability data.
Due, Ulla; Ottesen, Marianne
Objective. To revise, validate and test for reliability an anal sphincter rupture questionnaire in relation to construct, content and face validity. Setting and background. Since 1996 women with anal sphincter rupture (ASR) at one of the public university hospitals in Copenhagen, Denmark have been offered pelvic floor muscle examination and instruction by a specialist physiotherapist. In relation to that, a non-validated questionnaire about anal and urinary incontinence was to be answered six months after childbirth. Method. The original questionnaire was revised and a pilot test was performed... main questions but one. Two questions needed further explanation. Seven women made minor errors. Conclusion. The validated Danish questionnaire has a good construct, content and face validity. It is a well accepted, reliable, simple and clinically relevant screening tool. It reveals physical problems...
Heinik, J; Werner, P; Lin, R
The testament definition scale (TDS) is a specifically designed six-item scale aimed at measuring the respondent's capacity to define "testament." We assessed the reliability and validity of this new short scale in 31 community-dwelling cognitively impaired elderly patients. Interrater reliability for the six items ranged from .87 to .97. The interrater reliability for the total score was .77. Significant correlations were found between the TDS score and the Mini-Mental State Examination (MMSE) and the Cambridge Cognitive Examination scores (r = .71 and .72 respectively, p = .001). Criterion validity yielded significantly different means for subjects with MMSE scores of 24-30 and 0-23: mean 3.9 and 1.6 respectively (t(20) = 4.7, p = .001). Using a cutoff point of 0-2 vs. 3+, 79% of the subjects were correctly classified as severely cognitively impaired, with only 8.3% false positives, and a positive predictive value of 94%. Thus, TDS was found both reliable and valid. This scale, however, is not synonymous with testamentary capacity. The discussion deals with the methodological limitations of this study, and highlights the practical as well as the theoretical relevance of TDS. Future studies are warranted to elucidate the relationships between TDS and existing legal requirements of testamentary capacity.
Hobbelen, Johannes S M; Koopmans, Raymond T C M; Verhey, Frans R J; Habraken, Kitty M; de Bie, Rob A
Paratonia is one of the associated movement disorders characteristic of dementia. The aim of this study was to develop an assessment tool (the Paratonia Assessment Instrument, PAI), based on the new consensus definition of paratonia. An additional aim was to investigate the reliability and validity of the PAI. A three-phase cross-sectional survey was conducted. In the first two phases, the PAI was developed and validated. In the third phase, the inter-observer reliability and feasibility of the instrument was tested. The original PAI consisted of five criteria that all needed to be met in order to make the diagnosis. On the basis of a qualitative analysis, one criterion was reformulated and another was removed. Following this, inter-observer reliability between the two assessors resulted in an improvement of Cohen's kappa from 0.532 in the initial phase to 0.677 in the second phase. This improvement was substantiated in the third phase by two independent assessors with Cohen's kappa ranging from 0.625 to 1. The PAI is a reliable and valid assessment tool for diagnosing paratonia in elderly people with dementia that can be applied easily in daily practice.
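Cohen's kappa, used above for inter-observer reliability, corrects the observed agreement between two assessors for the agreement expected by chance. The following is a minimal sketch under stated assumptions: the function name `cohens_kappa`, the labels and the ratings are hypothetical, and the case where chance agreement equals 1 is not handled.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same subjects."""
    n = len(rater_a)
    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses by two assessors for ten patients.
assessor_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "no"]
assessor_2 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "yes"]
print(round(cohens_kappa(assessor_1, assessor_2), 2))
```

Here the assessors agree on 8 of 10 patients, but half of that agreement would be expected by chance given their marginal frequencies, so kappa is lower than the raw agreement.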
Ramo, Danielle E; Hall, Sharon M; Prochaska, Judith J
The Internet offers many potential benefits to conducting smoking and other health behavior research with young adults. Questions, however, remain regarding the psychometric properties of online self-reported smoking behaviors. The purpose of this study was to examine the reliability and validity of self-reported smoking and smoking-related cognitions obtained from an online survey. Young adults (N = 248) age 18 to 25 who had smoked at least 1 cigarette in the past 30 days were recruited online and completed a survey of tobacco and other substance use. Measures of smoking behavior (quantity and frequency) and smoking-related expectancies demonstrated high internal consistency reliability. Measures of smoking behavior and smoking stage of change demonstrated strong concurrent criterion and divergent validity. Results for convergent validity varied by specific constructs measured. Estimates of smoking quantity, but not frequency, were comparable to those obtained from a nationally representative household interview among young adults. These findings generally support the reliability and validity of online surveys of young adult smokers. Identified limitations may reflect issues specific to the measures rather than the online data collection methodology. Strategies to maximize the psychometric properties of online surveys with young adult smokers are discussed. PsycINFO Database Record (c) 2011 APA, all rights reserved.
Erel, Suat; Şimşek, İbrahim Engin; Özkan, Hüseyin
The aim of this study was to analyze the validity and reliability of the Turkish version (ICOAP-TR) of the intermittent and constant osteoarthritis pain (ICOAP) questionnaire in patients with knee osteoarthritis (OA). Thirty-eight volunteer patients diagnosed with knee OA answered the questionnaire twice with an interval of 2-4 days. The reliability of the measurement was assessed using Cronbach's alpha coefficient and intraclass correlation (ICC) for test-retest reliability. Criterion validity was tested against the Western Ontario and McMaster Universities Arthritis Index (WOMAC) pain score and visual analog scale (VAS) designed to assess the perceived discomfort rated by the patient. Test-retest reliability was found to be ICC=0.942 for total score, 0.902 for constant pain subscale, and 0.945 for intermittent pain subscale. Internal consistency was tested using Cronbach's alpha and was found to be 0.970 for total score, 0.948 for constant pain subscale, and 0.972 for intermittent pain subscale. For criterion validity, the correlation between the total score of ICOAP-TR and WOMAC pain subscale was r=0.779 (p<0.05), and correlation between total score of ICOAP-TR and VAS was r=0.570 (p<0.05). The ICOAP-TR is a reliable and valid instrument to be used with patients with knee OA.
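The abstract above does not say which form of the intraclass correlation was used. As a hedged sketch only, the code below computes two common single-measurement forms from a two-way layout (subjects by sessions or raters): an absolute-agreement version and a consistency version. The function name `icc_two_way` and the ratings table are illustrative assumptions.

```python
def icc_two_way(ratings):
    """Single-measure ICCs from a subjects x raters (or sessions) table.

    Returns (absolute_agreement, consistency), both computed from
    two-way ANOVA mean squares.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subject mean square
    ms_cols = ss_cols / (k - 1)                 # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return agreement, consistency


# Illustrative table: six subjects scored by four raters.
demo_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
agreement, consistency = icc_two_way(demo_ratings)
print(round(agreement, 2), round(consistency, 2))
```

The two forms can differ substantially when raters or sessions differ systematically in level, so the chosen form should be reported alongside the coefficient.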
Cabanas-Sánchez, Verónica; Martínez-Gómez, David; Esteban-Cornejo, Irene; Castro-Piñero, José; Conde-Caveda, Julio; Veiga, Óscar L
To develop a questionnaire able to assess time spent by youth in a wide range of leisure-time sedentary behaviors (SB) and evaluate its test-retest reliability and criterion validity. Cross-sectional observational. The reliability sample included 194 youth, aged 10-18 years, who completed the questionnaire twice, separated by a one-week interval. The validity study comprised 1207 participants aged 8-18 years. Participants wore an accelerometer for 7 consecutive days. The questionnaire was designed to assess the amount of time spent in twelve different SB during weekdays and weekends, separately. In order to avoid the usual phenomenon of time over-reporting, values were adjusted to the real available leisure-time (LT) for each participant. Reliability was assessed by using Intraclass Correlation Coefficients (ICC) and weighted (quadratic) kappa (k), and validity was assessed by using Pearson correlation and Bland-Altman plots. The reliability of the questionnaire showed moderate-to-substantial agreement for most (91%) of the items (k=0.43-0.74; ICC=0.41-0.79), with three items (4%) reaching almost perfect agreement (ICC=0.82-0.83). Only 'sitting and talking' evidenced fair-to-moderate reliability (k=0.27-0.39; ICC=0.34-0.46). The relationship between average sedentary time assessed by the questionnaire and accelerometry was moderate (r=0.36; p ...) ... questionnaire and accelerometer sedentary time for average day (r=0.05; p=0.11) but Bland-Altman plots suggest moderate discrepancies between both methods of SB measurement (mean=19.86; limits of agreement=-280.04 to 319.76). The questionnaire showed moderate to good test-retest reliability and a moderate level of validity for assessing SB in youth, similar to or slightly better than those previously published in this population. Copyright © 2017 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
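The Bland-Altman summary quoted above (a mean difference with limits of agreement) is conventionally the mean of the paired differences plus or minus 1.96 standard deviations of those differences. A minimal sketch, assuming that convention; the function name `bland_altman` and the paired values are hypothetical, not study data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)                   # mean difference between methods
    half_width = 1.96 * stdev(diffs)     # 1.96 x SD of the differences
    return bias, bias - half_width, bias + half_width


# Hypothetical paired measurements from two methods.
questionnaire = [10, 12, 9, 14]
accelerometer = [8, 11, 10, 12]
bias, lower, upper = bland_altman(questionnaire, accelerometer)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

Under this convention the limits are symmetric about the bias, which matches the reported figures: the midpoint of -280.04 and 319.76 is 19.86.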
Todsen, Tobias; Tolsgaard, Martin Grønnebæk; Olsen, Beth Härstedt
OBJECTIVE: To explore the reliability and validity of the Objective Structured Assessment of Ultrasound Skills (OSAUS) scale for point-of-care ultrasonography (POC US) performance. BACKGROUND: POC US is increasingly used by clinicians and is an essential part of the management of acute surgical conditions. However, the quality of performance is highly operator-dependent. Therefore, reliable and valid assessment of trainees' ultrasonography competence is needed to ensure patient safety. METHODS: Twenty-four physicians, representing novices, intermediates, and experts in POC US, scanned 4 different... physicians' OSAUS scores with diagnostic accuracy. RESULTS: The generalizability coefficient was high (0.81) and a D-study demonstrated that 1 assessor and 5 cases would result in similar reliability. The construct validity of the OSAUS scale was supported by a significant difference in the mean scores...
Konge, Lars; Lehnert, Per; Hansen, Henrik Jessen
BACKGROUND: As we move toward competency-based education in medicine, we have lagged in developing competency-based evaluation methods. In the era of minimally invasive surgery, there is a need for a reliable and valid tool dedicated to measure competence in video-assisted thoracoscopic surgery. The purpose of this study is to create such an assessment tool, and to explore its reliability and validity. METHODS: An expert group of physicians created an assessment tool consisting of 10 items rated on a five-point rating scale. The following factors were included: economy and confidence of movement...
Andrei, Federica; Smith, Martin M.; Surcinelli, Paola; Baldaro, Bruno; Saklofske, Donald H.
This study investigated the structure and validity of the Italian translation of the Trait Emotional Intelligence Questionnaire. Data were self-reported from 227 participants. Confirmatory factor analysis supported the four-factor structure of the scale. Hierarchical regressions also demonstrated its incremental validity beyond demographics, the…
Carlsen, C G; Lindorff Larsen, Karen; Funch-Jensen, P
PURPOSE: Lichtenstein hernia repair is a common surgical procedure and one of the first procedures performed by a surgical trainee. However, formal assessment tools developed for this procedure are few and sparsely validated. The aim of this study was to determine the reliability and validity of an assessment tool designed to measure surgical skills in Lichtenstein hernia repair. METHODS: Key issues were identified through a focus group interview. On this basis, an assessment tool with eight items was designed. Ten surgeons and surgical trainees were video recorded while performing Lichtenstein hernia... a significant difference between the three groups which indicates construct validity, p ... skills can be assessed blindly by a single rater in a reliable and valid fashion with the new procedure-specific assessment tool. We recommend this tool for future assessment...
Weda, M; van Riet-Nales, D A; van Aalst, P; de Kaste, D; Lekkerkerker, J F F
In the Netherlands the market share of isosorbide dinitrate 5 mg sublingual tablets is dominated by 2 products (A and B). In the last few years complaints have been received from health care professionals on product B. During patient use the disintegration of the tablet was reported to be slow and/or incomplete, and ineffectiveness was experienced. In the European Pharmacopoeia (Ph. Eur.) no requirement is present for the disintegration time of sublingual tablets. The purpose of this study was to compare the in vitro disintegration time of products A and B, and to establish a suitable test method and acceptance criterion. A and B were tested with the Ph. Eur. method described in the monograph on disintegration of tablets and capsules as well as with 3 modified tests using the same Ph. Eur. apparatus, but without movement of the basket-rack assembly. In modified test 1 and modified test 2 water was used as medium (900 ml and 50 ml respectively), whereas in modified test 3 artificial saliva was used (50 ml). In addition, disintegration was tested in Nessler tubes with 0.5 and 2 ml of water. Finally, the Ph. Eur. method was also applied to other sublingual tablets with other drug substances on the Dutch market. With modified test 3 no disintegration could be achieved within 20 min. With the Ph. Eur. method and modified tests 1 and 2, products A and B differed significantly (p ...) ... disintegration times. These 3 methods were capable of discriminating between products and between batches. The time measured with the Ph. Eur. method was significantly lower compared to modified tests 1 and 2 (p ...) ... tablets the disintegration time should be tested. The Ph. Eur. method is considered suitable for this test. In view of the products currently on the market and taking into consideration requirements in the United States Pharmacopeia and Japanese Pharmacopoeia, an acceptance criterion of not more than 2 min is proposed.
Gromisch, Elizabeth S; Zemon, Vance; Holtzer, Roee; Chiaravalloti, Nancy D; DeLuca, John; Beier, Meghan; Farrell, Eileen; Snyder, Stacey; Schairer, Laura C; Glukhovsky, Lisa; Botvinick, Jason; Sloan, Jessica; Picone, Mary Ann; Kim, Sonya; Foley, Frederick W
Cognitive dysfunction is prevalent in multiple sclerosis. As self-reported cognitive functioning is unreliable, brief objective screening measures are needed. Utilizing widely used full-length neuropsychological tests, this study aimed to establish the criterion validity of highly abbreviated versions of the Brief Visuospatial Memory Test - Revised (BVMT-R), Symbol Digit Modalities Test (SDMT), Delis-Kaplan Executive Function System (D-KEFS) Sorting Test, and Controlled Oral Word Association Test (COWAT) in order to begin developing an MS-specific screening battery. Participants from Holy Name Medical Center and the Kessler Foundation were administered one or more of these four measures. Using test-specific criteria to identify impairment at both -1.5 and -2.0 SD, receiver-operating-characteristic (ROC) analyses of BVMT-R Trial 1, Trial 2, and Trial 1 + 2 raw data (N = 286) were run to calculate the classification accuracy of the abbreviated version, as well as the sensitivity and specificity. The same methods were used for SDMT 30-s and 60-s (N = 321), D-KEFS Sorting Free Card Sort 1 (N = 120), and COWAT letters F and A (N = 298). Using these definitions of impairment, each analysis yielded high classification accuracy (89.3 to 94.3%). BVMT-R Trial 1, SDMT 30-s, D-KEFS Free Card Sort 1, and COWAT F possess good criterion validity in detecting impairment on their respective overall measure, capturing much of the same information as the full version. Along with the first two trials of the California Verbal Learning Test - Second Edition (CVLT-II), these five highly abbreviated measures may be used to develop a brief screening battery.
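For a single cutoff on an abbreviated test, the sensitivity, specificity and classification accuracy mentioned above follow directly from a 2 x 2 table against the full-length measure. The sketch below is an illustration only: the function name `classification_metrics` and the counts are hypothetical, and a full ROC analysis would repeat this calculation across many cutoffs.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from 2 x 2 confusion counts.

    tp/fn -- impaired on the full measure, flagged / missed by the short form
    tn/fp -- unimpaired on the full measure, cleared / flagged by the short form
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, not taken from the study.
sens, spec, acc = classification_metrics(tp=40, fn=10, tn=45, fp=5)
print(sens, spec, acc)
```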
Garnacho-Castaño, Manuel V.; López-Lastra, Silvia; Maté-Muñoz, José L.
The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and the T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong, with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training. Key points: This study determined the validity and reliability of peak velocity, average velocity, peak power and average power measurements made using a linear position transducer. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and power. PMID:25729300
Zalma, Abdul Razak; Safiah, Md Yusof; Ajau, Danis; Khairil Anuar, Md Isa
Interventions to counter the influence of television food advertising amongst children are important. Thus, a reliable and valid instrument to assess its effect is needed. The objective of this study was to determine the reliability and validity of such a questionnaire. The questionnaire was administered twice to 32 primary schoolchildren aged 10-11 years in Selangor, Malaysia. The interval between the first and second administration was 2 weeks. The test-retest method was used to examine the reliability of the questionnaire. Intra-rater reliability was determined by the kappa coefficient and internal consistency by Cronbach's alpha coefficient. Construct validity was evaluated using factor analysis. The test-retest correlation showed moderate-to-high reliability for all scores (r = 0.40, p = 0.02 to r = 0.95, p = 0.00), with one exception, consumption of fast foods (r = 0.24, p = 0.20). The kappa coefficient showed acceptable-to-strong intra-rater reliability (K = 0.40-0.92), except for two items under knowledge on television food advertising (K = 0.26 and K = 0.21) and one item under preference for healthier foods (K = 0.33). Cronbach's alpha coefficient indicated acceptable internal consistency for all scores (0.45-0.60). After deleting two items under Consumption of Commonly Advertised Food, the items showed moderate-to-high loading (0.52, 0.84, 0.42 and 0.42), with the scree plot showing that there was only one factor. The Kaiser-Meyer-Olkin value was 0.60, showing that the sample was adequate for factor analysis. The questionnaire on television food advertising is reliable and valid to assess the effect of media literacy education on television food advertising on schoolchildren. © The Author (2013). Published by Oxford University Press. All rights reserved. For Permissions, please email: email@example.com.
Monbaliu, E; Ortibus, E; Roelens, F; Desloovere, K; Deklerck, J; Prinzie, P; de Cock, P; Feys, H
This study investigated the reliability and validity of the Barry-Albright Dystonia Scale (BADS), the Burke-Fahn-Marsden Movement Scale (BFMMS), and the Unified Dystonia Rating Scale (UDRS) in patients with bilateral dystonic cerebral palsy (CP). Three raters independently scored videotapes of 10 patients (five males, five females; mean age 13 y 3 mo, SD 5 y 2 mo, range 5-22 y). One patient each was classified at levels I-IV in the Gross Motor Function Classification System and six patients were classified at level V. Reliability was measured by (1) intraclass correlation coefficient (ICC) for interrater reliability, (2) standard error of measurement (SEM) and smallest detectable difference (SDD), and (3) Cronbach's alpha for internal consistency. Validity was assessed by Pearson's correlations among the three scales used and by content analysis. Moderate to good interrater reliability was found for total scores of the three scales (ICC: BADS=0.87; BFMMS=0.86; UDRS=0.79). However, many subitems showed low reliability, in particular for the UDRS. SEM and SDD were respectively 6.36% and 17.72% for the BADS, 9.88% and 27.39% for the BFMMS, and 8.89% and 24.63% for the UDRS. High internal consistency was found. Pearson's correlations were high. Content validity showed insufficient accordance with the new CP definition and classification. Our results support the internal consistency and concurrent validity of the scales; however, taking into consideration the limitations in reliability, including the large SDD values and the content validity, further research on methods of assessment of dystonia is warranted.
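The abstract above does not state how the SEM and SDD were derived. A common convention, assumed here, is SEM = SD x sqrt(1 - ICC) and SDD = 1.96 x sqrt(2) x SEM. The reported BFMMS pair (9.88% and 27.39%) is numerically consistent with the second formula. The function names below are hypothetical.

```python
from math import sqrt


def sem_from_icc(sd, icc):
    """Standard error of measurement from the score SD and a reliability ICC."""
    return sd * sqrt(1 - icc)


def sdd_from_sem(sem, z=1.96):
    """Smallest detectable difference at the confidence level implied by z."""
    return z * sqrt(2) * sem


# SDD implied by the BFMMS SEM of 9.88% reported in the abstract.
print(round(sdd_from_sem(9.88), 2))

# Hypothetical example: SD of 10 points and ICC of 0.84.
print(round(sem_from_icc(10, 0.84), 2))
```

The sqrt(2) factor reflects that a change score combines the measurement error of two assessments, which is why the SDD is much larger than the SEM itself.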
Martínez-Gómez, David; Martínez-de-Haro, Vicente; Pozo, Tamara; Welk, Gregory J; Villagra, Ariel; Calle, Marisa E; Marcos, Ascensión; Veiga, Oscar L
Questionnaires are feasible instruments to assess physical activity (PA) in large samples. The aim of the current study was to evaluate the reliability and validity of the PAQ-A questionnaire in Spanish adolescents using the measurement of PA by accelerometer as the criterion. In a sample of 82 adolescents, aged 12 to 17 years, a 1-week PAQ-A test-retest was administered. Reliability was analyzed by the Intraclass Correlation Coefficient (ICC) and internal consistency by Cronbach's alpha coefficient. Two hundred thirty-two adolescents, aged 13-17 years, completed the PAQ-A and wore the ActiGraph GT1M accelerometer for 7 days. The PAQ-A was compared against total PA and moderate to vigorous PA (MVPA) obtained by the accelerometer. Test-retest reliability showed ICC = 0.71 for the final score of the PAQ-A. Internal consistency was alpha = 0.65 in the first self-report and alpha = 0.67 in the retest in the sample of 82 adolescents, and alpha = 0.74 in the sample of 232 adolescents. The PAQ-A was moderately correlated with total PA (rho = 0.39) and MVPA (rho = 0.34) assessed by the accelerometer. Against the accelerometer, the PAQ-A showed significant moderate correlations in boys but not in girls. The PAQ-A questionnaire shows adequate reliability and reasonable validity for assessing PA in Spanish adolescents.
Gosadi, Ibrahim M; Alatar, Abdullah A; Otayf, Mojahed M; AlJahani, Dhaherah M; Ghabbani, Hisham M; AlRajban, Waleed A; Alrsheed, Abdullah M; Al-Nasser, Khalid A
Objectives: To create a food frequency questionnaire specifically designed to capture the dietary habits of Saudis and to test its validity and reliability. Methods: This investigation is a longitudinal, test-retest study conducted in King Saud University, Riyadh, Kingdom of Saudi Arabia between December 2015 and March 2016. A list of 140 food items was included in the questionnaire, which used both closed-ended and open-ended questions. Data on past-year food frequency consumption, 24-hour dietary recall, body weight, and height were collected. Internal consistency, test-retest reliability, completeness of the food list, and criterion validity were assessed. Results: One hundred and thirty-eight participants were interviewed to complete the 24-hour dietary recall and the constructed questionnaire. Approximately 85% of the food items reported in the dietary recall were covered in the food frequency questionnaire. The association of body mass index with the consumption frequency of meats (regression coefficient: 2.28) and dairy products (regression coefficient: 2.31) was statistically significant. A high overall reproducibility rate of the questionnaire was detected (Pearson's correlation coefficient: 0.78, p < 0.001). Conclusion: The developed questionnaire has high reliability and reasonable validity, and is suitable for use in nutritional epidemiological investigations in Saudi Arabia.
Nilipour, Reza; Pourshahbaz, Abbas; Ghoreyshi, Zahra Sadat
In this study, we reported the reliability and validity of Bedside version of Persian WAB (P-WAB-1) adapted from Western Aphasia Battery (WAB-R) (1,2). P-WAB-1 is a clinical linguistic measuring tool to determine severity and type of aphasia in brain damaged patients based on Aphasia Quotient (AQ) as a functional measure. For the purposes of a quick clinical screening of aphasia in Persian, we adapted the bedside version of WAB-R to assess the performance of Persian aphasic patients. The data we reported on adaptation, validity and reliability of P-WAB-1 are based on faithful translation and criterion validity ratio (CVR) taken from the expert panel and the performance of 60 consecutive brain damaged patients referred to different university clinics for rehabilitation and 30 healthy subjects as norms and 40 age-matched epileptic patients as the control group. Based on the results of this study, P-WAB-1 has internal consistency (a=0.71) and test-retest reliability (r=.65 PPersian speaking brain damaged patients. This study is the initial step on adaptation of different versions of WAB-R to measure the severity of aphasia using AQ, LQ and CQ as operational measures and to classify Persian speaking aphasic patients into different types.
Clark, Ross A; Mentiplay, Benjamin F; Pua, Yong-Hao; Bower, Kelly J
The use of force platform technologies to assess standing balance is common across a range of clinical areas. Numerous researchers have evaluated the low-cost Wii Balance Board (WBB) for its utility in assessing balance, with variable findings. This review aimed to systematically evaluate the reliability and concurrent validity of the WBB for assessment of static standing balance. Articles were retrieved from six databases (Medline, SCOPUS, EMBASE, CINAHL, Web of Science, Inspec) from 2007 to 2017. After independent screening by two reviewers, 25 articles were included. Two reviewers performed the data extraction and quality assessment. Test-retest reliability was investigated in 12 studies, with intraclass correlation coefficients or Pearson's correlation values showing a range from poor to excellent reliability (range: 0.27 to 0.99). Concurrent validity (i.e. comparison with another force platform) was examined in 21 studies, and was generally found to be excellent in studies examining the association between the same outcome measures collected on both devices. For studies reporting predominantly poor to moderate validity, potentially influential factors included the choice of 1) criterion reference (e.g. not a common force platform), 2) test duration (e.g. balance. Protocol registration number: PROSPERO 2017: CRD42017058122. Copyright © 2018 Elsevier B.V. All rights reserved.
Muyor, José M
The aims of the current study were 1) to evaluate the validity of the WIMU® system for measuring hamstring muscle extensibility in the passive straight leg raise (PSLR) test using an inclinometer for the criterion and 2) to determine the test-retest reliability of the WIMU® system to measure hamstring muscle extensibility during the PSLR test. 55 subjects were evaluated on 2 separate occasions. Data from a Unilever inclinometer and WIMU® system were collected simultaneously. Intraclass correlation coefficients (ICCs) for the validity were very high (0.983-1); a very low systematic bias (-0.21°--0.42°), random error (0.05°-0.04°) and standard error of the estimate (0.43°-0.34°) were observed (left-right leg, respectively) between the 2 devices (inclinometer and the WIMU® system). The R² between the devices was 0.999 (p<0.001) in both the left and right legs. The test-retest reliability of the WIMU® system was excellent, with ICCs ranging from 0.972-0.995, low coefficients of variation (0.01%), and a low standard error of the estimate (0.19-0.31°). The WIMU® system showed strong concurrent validity and excellent test-retest reliability for the evaluation of hamstring muscle extensibility in the PSLR test. © Georg Thieme Verlag KG Stuttgart · New York.
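Systematic bias between two devices is commonly summarised with a Bland-Altman analysis: the mean of the paired differences plus 95% limits of agreement. The abstract does not say exactly how its bias and random error were computed, so the sketch below is an assumed, generic version with made-up angle readings and a hypothetical function name.

```python
from statistics import mean, stdev

def bland_altman(device_a, device_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)          # systematic difference between devices
    spread = z * stdev(diffs)   # half-width of the limits of agreement
    return bias, bias - spread, bias + spread

# Illustrative angles in degrees, not data from the study.
inclinometer = [71.0, 80.5, 65.2, 90.1, 77.8]
sensor = [71.3, 80.9, 65.4, 90.6, 78.0]
bias, lower, upper = bland_altman(inclinometer, sensor)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

A bias near zero with narrow limits indicates that the two devices can be used interchangeably for practical purposes.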
Hadley, Wendy; Stewart, Angela; Hunter, Heather L; Affleck, Katelyn; Donenberg, Geri; Diclemente, Ralph; Brown, Larry K
We evaluated the reliability and validity of the Dyadic Observed Communication Scale (DOCS) coding scheme, which was developed to capture a range of communication components between parents and adolescents. Adolescents and their caregivers were recruited from mental health facilities for participation in a large, multi-site family-based HIV prevention intervention study. Seventy-one dyads were randomly selected from the larger study sample and coded using the DOCS at baseline. Preliminary validity and reliability of the DOCS was examined using various methods, such as comparing results to self-report measures and examining interrater reliability. Results suggest that the DOCS is a reliable and valid measure of observed communication among parent-adolescent dyads that captures both verbal and nonverbal communication behaviors that are typical intervention targets. The DOCS is a viable coding scheme for use by researchers and clinicians examining parent-adolescent communication. Coders can be trained to reliably capture individual and dyadic components of communication for parents and adolescents and this complex information can be obtained relatively quickly.
Ahmed, Hussam; Chateauneuf, Alaa
The reliability validation of engineering products and systems is mandatory for choosing the best cost-effective design among a series of alternatives. Decisions at early design stages have a large effect on the overall life cycle performance and cost of products. In this paper, an optimization-based formulation is proposed by coupling the costs of product design and validation testing, in order to ensure the product reliability with the minimum number of tests. This formulation addresses the question about the number of tests to be specified through reliability demonstration necessary to validate the product under appropriate confidence level. The proposed formulation takes into account the product cost, the failure cost and the testing cost. The optimization problem can be considered as a decision making system according to the hierarchy of structural reliability measures. The numerical examples show the interest of coupling design and testing parameters. - Highlights: • Coupled formulation for design and testing costs, with lifetime degradation. • Cost-effective testing optimization to achieve reliability target. • Solution procedure for nested aleatoric and epistemic variable spaces
Dueñas, María; Mendonça, Liliane; Sampaio, Rute; Gouvinhas, Cláudia; Oliveira, Daniela; Castro-Lopes, José Manuel; Azevedo, Luís Filipe
The Bowel Function Index (BFI) is a simple and sound bowel function and opioid-induced constipation (OIC) screening tool. We aimed to develop the translation and cultural adaptation of this measure (BFI-P) and to assess its reliability and validity for the Portuguese language and a chronic pain population. The BFI-P was created after a process including translation, back translation and cultural adaptation. Participants (n = 226) were recruited in a chronic pain clinic and were assessed at baseline and after one week. Internal consistency, test-retest reliability, responsiveness, construct (convergent and known groups) and factorial validity were assessed. Test-retest reliability had an intra-class correlation of 0.605 for BFI mean score. Internal consistency of BFI had Cronbach's alpha of 0.865. The construct validity of BFI-P was shown to be excellent and the exploratory factor analysis confirmed its unidimensional structure. The responsiveness of BFI-P was excellent, with a suggested 17-19 point and 8-12 point change in score constituting a clinically relevant change in constipation for patients with and without previous constipation, respectively. This study had some limitations, namely, the criterion validity of BFI-P was not directly assessed; and the absence of a direct criterion for OIC precluded the assessment of the criterion based responsiveness of BFI-P. Nevertheless, BFI may importantly contribute to better OIC screening and its Portuguese version (BFI-P) has been shown to have excellent reliability, internal consistency, validity and responsiveness. Further suggestions regarding statistically and clinically important change cut-offs for this instrument are presented.
Hop, M.; Moues, C.; Bogomolova, K.; Nieuwenhuis, M.; Oen, I.; Middelkoop, E.; Breederveld, R.; de Baar, M.
Objective: The aim of this study was to examine the reliability and validity of using photographs of burns to assess both burn size and depth. Method: Fifty randomly selected photographs taken on day 0-1 post burn were assessed by seven burn experts and eight referring physicians. Inter-rater
Gundogan, Aysun; Ari, Meziyet; Gonen, Mubeccel
The purpose of this study was to investigate validity and reliability of the test of creative imagination. This study was conducted with the participation of 1000 children, aged between 9-14 and were studying in six primary schools in the city center of Denizli Province, chosen by cluster ratio sampling. In the study, it was revealed that the…
Young, Daniel Kim-Wan; Ng, Petrus Y. N.; Pan, Jia-Yan; Cheng, Daphne
Purpose: This study aims to translate and test the reliability and validity of the Internalized Stigma of Mental Illness-Cantonese (ISMI-C). Methods: The original English version of ISMI is translated into the ISMI-C by going through forward and backward translation procedure. A cross-sectional research design is adopted that involved 295…
Kerkhoffs, Gino M. M. J.; Blankevoort, Leendert; Sierevelt, Inger N.; Corvelein, Ruby; Janssen, Guido H. W.; van Dijk, C. Niek
Two test devices were manufactured to objectively measure ankle joint laxity: the dynamic anterior ankle tester (DAAT) and the quasi-static anterior ankle tester (QAAT). The primary aim was to analyse the reliability of both testers; the secondary aim was to assess validity in correlation with TELOS
Halpin, Glennelle; Halpin, Gerald
Research indicating that different cut-off points result from the use of different standard-setting techniques leaves decision makers with a disturbing dilemma: Which standard-setting method is best? This investigation of the reliability and validity of 10 different standard-setting approaches was designed to provide information that might help…
Yildiz, F. Ülkü; Çagdas, Aysel; Kayili, Gökhan
The purpose of this study is to perform the validity-reliability analysis of the three subtests of Basic School Skills Inventory 3--Mathematics, Classroom Behavior and Daily Life skills--and do its adaptation for four to six year-old Turkish children. The sample of the study included 595 four to six year-old Turkish children attending public and…
Tretter, Thomas R.; Brown, Sherri L.; Bush, William S.; Saderholm, Jon C.; Holmes, Vicki-Lynn
Science teachers' content knowledge is an important influence on student learning, highlighting an ongoing need for programs, and assessments of those programs, designed to support teacher learning of science. Valid and reliable assessments of teacher science knowledge are needed for direct measurement of this crucial variable. This paper…
Arevalo Romero, J.; Brinkkemper, T.; van der Heide, A.; Rietjens, J.A.; Ribbe, M.W.; Deliens, L.; Loer, S.A.; Zuurmond, W.W.A.; Perez, R.S.G.M.
Context: Observer-based sedation scales have been used to provide a measurable estimate of the comfort of nonalert patients in palliative sedation. However, their usefulness and appropriateness in this setting has not been demonstrated. Objectives: To study the reliability and validity of
Putnam, Frank W.; And Others
Evaluation of the Child Dissociative Checklist found it to be a reliable and valid observer report measure of dissociation in children, including sexually abused girls and children with dissociative disorder and with multiple personality disorder. The checklist, which is appended, is intended as a clinical screening instrument and research measure…
Kooiman, Thea; Dontje, Manon L.; Sprenger, Siska; Krijnen, Wim; van der Schans, Cees; de Groot, Martijn
Background: Activity trackers can potentially stimulate users to increase their physical activity behavior. The aim of this study was to examine the reliability and validity of ten consumer activity trackers for measuring step count in both laboratory and free-living conditions. Method: Healthy
Radiological assessment of lumbar lordotic curve aids in early diagnosis of conditions even before neurologic changes set in. Objective: To ascertain the level of reliability and validity of subjective assessment of lumbar lordosis in conventional radiography. Design: A blinded, repeated-measures diagnostic test was carried ...
The Automated Body Reaction Test (ABRT) is a new skills and physical assessment instrument that measures the ability to react and to move quickly and accurately in response to a stimulus. A total of 474 subjects aged 7-17 years were randomly selected for the construct validity (n=330) and reliability (n=144) analyses. The ABRT ...
The aim of this study is to develop a useful, valid and reliable measurement tool that will help teacher candidates determine their Turkish metalinguistic awareness. During the development of the scale, a pool of items was created by scanning the relevant literature and examining other awareness scales. The materials prepared were re-examined…
Rocha, Luiz Roberto Martins; Veiga, Daniela Francescato; e Oliveira, Paulo Rocha; Song, Elaine Horibe; Ferreira, Lydia Masako
The Health Service Quality Scale is a multidimensional hierarchical scale that is based on interdisciplinary approach. This instrument was specifically created for measuring health service quality based on marketing and health care concepts. The aim of this study was to translate and culturally adapt the Health Service Quality Scale into Brazilian Portuguese and to assess the validity and reliability of the Brazilian Portuguese version of the instrument. We conducted a cross-sectional, observational study, with public health system patients in a Brazilian university hospital. Validity was assessed using Pearson's correlation coefficient to measure the strength of the association between the Brazilian Portuguese version of the instrument and the SERVQUAL scale. Internal consistency was evaluated using Cronbach's alpha coefficient; the intraclass (ICC) and Pearson's correlation coefficients were used for test-retest reliability. One hundred and sixteen consecutive postoperative patients completed the questionnaire. Pearson's correlation coefficient for validity was 0.20. Cronbach's alpha for the first and second administrations of the final version of the instrument were 0.982 and 0.986, respectively. For test-retest reliability, Pearson's correlation coefficient was 0.89 and ICC was 0.90. The culturally adapted, Brazilian Portuguese version of the Health Service Quality Scale is a valid and reliable instrument to measure health service quality.
Results: Two valid factors emerged, with items 1-3 loading on the first and items 4, 5 and 7 on the second, making the BFSS a two-dimensional (multidimensional) scale that measures two aspects of brain fag (labeled burning sensation and crawling sensation, respectively). The reliability analysis yielded a Cronbach Alpha coefficient of ...
Fuchs, Lynn; And Others
A study was conducted to explore the reliability and validity of three prominent procedures used in informal reading inventories (IRIs): (1) choosing a 95% word recognition accuracy standard for determining student instructional level, (2) arbitrarily selecting a passage to represent the difficulty level of a basal reader, and (3) employing…
Background: The Health Service Quality Scale is a multidimensional hierarchical scale that is based on interdisciplinary approach. This instrument was specifically created for measuring health service quality based on marketing and health care concepts. The aim of this study was to translate and culturally adapt the Health Service Quality Scale into Brazilian Portuguese and to assess the validity and reliability of the Brazilian Portuguese version of the instrument. Methods: We conducted a cross-sectional, observational study, with public health system patients in a Brazilian university hospital. Validity was assessed using Pearson’s correlation coefficient to measure the strength of the association between the Brazilian Portuguese version of the instrument and the SERVQUAL scale. Internal consistency was evaluated using Cronbach’s alpha coefficient; the intraclass (ICC) and Pearson’s correlation coefficients were used for test-retest reliability. Results: One hundred and sixteen consecutive postoperative patients completed the questionnaire. Pearson’s correlation coefficient for validity was 0.20. Cronbach's alpha for the first and second administrations of the final version of the instrument were 0.982 and 0.986, respectively. For test-retest reliability, Pearson’s correlation coefficient was 0.89 and ICC was 0.90. Conclusions: The culturally adapted, Brazilian Portuguese version of the Health Service Quality Scale is a valid and reliable instrument to measure health service quality. PMID: 23327598
In facilitating cross-cultural study in the field of psychology and Logotherapy, the reliability and validity of the logotest which measures inner meaning fulfillment was carried out among 885 University of Ibadan students, 439 males and 434 females, aged between 15 and 60 years old with mean X age of 6.0. Data analyses ...
Chiang, Hui-Ying; Hsiao, Ya-Chu; Lin, Shu-Yuan; Lee, Huan-Fang
To examine the psychometric validity and reliability of the incident reporting culture questionnaire (IRCQ; in Chinese) following an exploration of the reporting culture perceived by hospital nurses in Taiwan. Scale development with psychometric examination and a cross-sectional study. Ten teaching hospitals. A total of 1064 nurses participated with an average response rate of 83% between November 2008 and June 2009. The factorial construct, criterion-related validity, homogeneity and stability of the IRCQ were evaluated. The nurses' perceptions of the IRCQ were also explored. The four-factor structure of the 20-item IRCQ had satisfactory construct validity (explained variance: 49.37%), criterion-related validity (r = 0.42; P = 0.001), reliability (Cronbach's alpha: 0.83) and stability (3-week-interval correlation: r = 0.80; P = 0.001). These factors included 'application of learning from errors', 'readiness to provide feedback on incident reports', 'collegial atmospheres of unpleasantness and punishment' (CA) and 'incident management: confidential and system driven'. The nurses perceived a moderate overall reporting culture (mean positive response = 49.25%; range: 67.2-24.94%). They weakly agreed on the CA factor of five items (mean positive response = 24.94%; range: 33.0-17.2%). This study provides empirical evidence for the psychometric properties of the IRCQ and the reporting culture which nurses perceive in Taiwan. To Taiwanese nurses, the reporting culture within their work environments especially as it relates to coworker relations, inter-professional collaboration and non-punitive atmosphere is their major concern. Healthcare administrators should consider nurses' perceptions related to incident reporting when managing underreporting issues.
We determined the criterion validity and the retest reliability of the ActivPAL™ monitor in young people with diplegic cerebral palsy (CP). Activity monitor data were compared with the criterion of video recording for 10 participants. For the retest reliability, activity monitor data were collected from 24 participants on two occasions. Participants had to have diplegic CP and be between 14 and 22 years of age. They also had to be of Gross Motor Function Classification System level II or III. Outcomes were time spent in standing, number of steps (physical activity) and time spent in sitting (sedentary behaviour). For criterion validity, coefficients of determination were all high (r² ≥ 0.96), and limits of group agreement were relatively narrow, but limits of agreement for individuals were narrow only for number of steps (≥5.5%). Relative reliability was high for number of steps (intraclass correlation coefficient = 0.87) and moderate for time spent in sitting and lying, and time spent in standing (intraclass correlation coefficients = 0.60-0.66). For groups, changes of up to 7% could be due to measurement error with 95% confidence, but for individuals, changes as high as 68% could be due to measurement error. The results support the criterion validity and the retest reliability of the ActivPAL™ to measure physical activity and sedentary behaviour in groups of young people with diplegic CP but not in individuals. Copyright © 2014 John Wiley & Sons, Ltd.
Background: Musculoskeletal physiotherapists routinely assess lumbar segmental motion during the clinical examination of a patient with low back pain. The validity of manual assessment of segmental motion has not, however, been adequately investigated. Methods: In this prospective, multi-centre, pragmatic, diagnostic validity study, 138 consecutive patients with recurrent or chronic low back pain (R/CLBP) were recruited. Physiotherapists with post-graduate training in manual therapy performed passive accessory intervertebral motion tests (PAIVMs) and passive physiological intervertebral motion tests (PPIVMs). Consenting patients were referred for flexion-extension radiographs. Sagittal angular rotation and sagittal translation of each lumbar spinal motion segment were measured from these radiographs, and compared to a reference range derived from a study of 30 asymptomatic volunteers. Motion beyond two standard deviations from the reference mean was considered diagnostic of rotational lumbar segmental instability (LSI) and translational LSI. Accuracy and validity of the clinical assessments were expressed using sensitivity, specificity, and likelihood ratio statistics with 95% confidence intervals (CI). Results: Only translation LSI was found to be significantly associated with R/CLBP (p Conclusion: This study provides the first evidence reporting the concurrent validity of manual tests for the detection of abnormal sagittal planar motion. PAIVMs and PPIVMs are highly specific, but not sensitive, for the detection of translation LSI. Likelihood ratios resulting from positive test results were only moderate. This research indicates that manual clinical examination procedures have moderate validity for detecting segmental motion abnormality.
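The accuracy statistics named in this abstract (sensitivity, specificity, likelihood ratios) all derive from a 2×2 table of test result against reference standard. The sketch below shows the standard definitions; the counts are invented to illustrate a "highly specific but not sensitive" test and are not the study's data, and the function name is hypothetical.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Guard against division by zero for perfect specificity or zero specificity.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_pos,
        "lr_negative": lr_neg,
    }

# Illustrative counts only: few true cases detected, few false alarms.
stats = diagnostic_stats(tp=10, fp=5, fn=30, tn=95)
for name, value in stats.items():
    print(name, round(value, 2))
```

With these invented counts the positive likelihood ratio is moderate even though specificity is high, because the low sensitivity limits the numerator.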
Sachs, Bonnie C; Rush, Beth K; Pedraza, Otto
Confrontation naming is commonly assessed in neuropsychological practice, but few standardized measures of naming exist and those that do are susceptible to the effects of education and culture. The Neuropsychological Assessment Battery (NAB) Naming Test is a 31-item measure used to assess confrontation naming. Despite adequate psychometric information provided by the test publisher, there has been limited independent validation of the test. In this study, we investigated the convergent and discriminant validity, internal consistency, and alternate forms reliability of the NAB Naming Test in a sample of adults (Form 1: n = 247, Form 2: n = 151) clinically referred for neuropsychological evaluation. Results indicate adequate-to-good internal consistency and alternate forms reliability. We also found strong convergent validity as demonstrated by relationships with other neurocognitive measures. We found preliminary evidence that the NAB Naming Test demonstrates a more pronounced ceiling effect than other commonly used measures of naming. To our knowledge, this represents the largest published independent validation study of the NAB Naming Test in a clinical sample. Our findings suggest that the NAB Naming Test demonstrates adequate validity and reliability and merits consideration in the test arsenal of clinical neuropsychologists.
Validity and Reliability of Agoraphobic Cognitions Questionnaire-Turkish Version. Objective: The aim of this study is to investigate the validity and reliability of the Agoraphobic Cognitions Questionnaire-Turkish Version (ACQ). Method: The ACQ was administered to 92 patients with agoraphobia or panic disorder with agoraphobia. The BSQ Turkish version was completed by translation, back-translation and pilot assessment. Reliability of the ACQ was analyzed by test-retest correlation, the split-half technique, and Cronbach’s alpha coefficient. Construct validity was evaluated by factor analysis after the Kaiser-Meyer-Olkin (KMO) and Bartlett tests had been performed. Principal component analysis and varimax rotation were used for factor analysis. Results: 64% of patients evaluated in the study were female and 36% were male. Ages ranged between 18 and 58; mean age was 31.5±10.4. The Cronbach’s alpha coefficient was 0.91. Analysis of test-retest evaluations revealed statistically significant correlations ranging between 24% and 84% for the questionnaire components. In the split-half analysis, the reliability coefficients of the two half-questionnaires were 0.77 and 0.91. The Spearman-Brown coefficient from the same analysis was 0.87. To assess the construct validity of the ACQ, factor analysis was performed and two basic factors were found. These two factors explained 57.6% of the total variance (Factor 1: 34.6%, Factor 2: 23%). Conclusion: Our findings support that the ACQ-Turkish version has a satisfactory level of reliability and validity.
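The Spearman-Brown formula steps a correlation between two half-tests up to an estimate for the full-length test. The sketch below shows the standard formula; feeding it 0.77 is only an assumption about which reported value was the half-test correlation, since the abstract does not say. The function name is hypothetical.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Predicted reliability when test length is multiplied by length_factor."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)

# Assumption: treat 0.77 as the correlation between the two halves.
print(round(spearman_brown(0.77), 2))
```

Under that assumption the stepped-up value is consistent with the reported Spearman-Brown coefficient of 0.87.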
Schlegel, Claudia; Woermann, Ulrich; Rethans, Jan-Joost; van der Vleuten, Cees
In the training of healthcare professionals, one of the advantages of communication training with simulated patients (SPs) is the SP's ability to provide direct feedback to students after a simulated clinical encounter. The quality of SP feedback must be monitored, especially because it is well known that feedback can have a profound effect on student performance. Due to the current lack of valid and reliable instruments to assess the quality of SP feedback, our study examined the validity and reliability of one potential instrument, the 'modified Quality of Simulated Patient Feedback Form' (mQSF). Content validity of the mQSF was assessed by inviting experts in the area of simulated clinical encounters to rate the importance of the mQSF items. Moreover, generalizability theory was used to examine the reliability of the mQSF. Our data came from videotapes of clinical encounters between six simulated patients and six students and the ensuing feedback from the SPs to the students. Ten faculty members judged the SP feedback according to the items on the mQSF. Three weeks later, this procedure was repeated with the same faculty members and recordings. All but two items of the mQSF received importance ratings of > 2.5 on a four-point rating scale. A generalizability coefficient of 0.77 was established with two judges observing one encounter. The findings for content validity and reliability with two judges suggest that the mQSF is a valid and reliable instrument to assess the quality of feedback provided by simulated patients.
Yoshizumi, Takahiro; Murase, Satomi; Murakami, Takashi; Takai, Jiro
The purposes of the present study were to develop a Parenting Scale of Inconsistency and to evaluate its initial reliability and validity. The 12 items assess the inconsistency among parents' moods, behaviors, and attitudes toward children. In the primary study, 517 participants completed three measures: the new Parenting Scale of Inconsistency, the Parental Bonding Instrument, and the Depression Scale of the General Health Questionnaire. The Parenting Scale of Inconsistency had good test-retest reliability of .85 and internal consistency of .88 (Cronbach coefficient alpha). Construct validity was good as Inconsistency scores were significantly correlated with the Care and Overprotection scores of the Parental Bonding Instrument and with the Depression scores. Moreover, Inconsistency scores' relation with a dimension of parenting style distinct from Care and Overprotection suggested that the Parenting Scale of Inconsistency had factorial validity. This scale seems a potential measure for examining the relationships between inconsistent parenting and the mental health of children.
Carvalho, Flávia A; Morelhão, Priscila K; Franco, Marcia R; Maher, Chris G; Smeets, Rob J E M; Oliveira, Crystian B; Freitas Júnior, Ismael F; Pinto, Rafael Z
Although there is some evidence for reliability and validity of self-report physical activity (PA) questionnaires in the general adult population, it is unclear whether we can assume similar measurement properties in people with chronic low back pain (LBP). To determine the test-retest reliability of the International Physical Activity Questionnaire (IPAQ) long-version and the Baecke Physical Activity Questionnaire (BPAQ) and their criterion-related validity against data derived from accelerometers in patients with chronic LBP. Cross-sectional study. Patients with non-specific chronic LBP were recruited. Each participant attended the clinic twice (one week interval) and completed self-report PA. Accelerometer measures >7 days included time spent in moderate-and-vigorous physical activity, steps/day, counts/minute, and vector magnitude counts/minute. Intraclass Correlation Coefficients (ICC) and the Bland and Altman method were used to determine reliability, and Spearman's rho correlations were used for criterion-related validity. A total of 73 patients were included in our analyses. The reliability analyses revealed that the BPAQ and its subscales have moderate to excellent reliability (ICC 2,1 : 0.61 to 0.81), whereas IPAQ and most IPAQ domains (except walking) showed poor reliability (ICC 2,1 : 0.20 to 0.40). The Bland and Altman method revealed larger discrepancies for the IPAQ. For the validity analysis, questionnaire and accelerometer measures showed at best fair correlation (rho reliability than the IPAQ long-version, both questionnaires did not demonstrate acceptable validity against accelerometer data. These findings suggest that questionnaire and accelerometer PA measures should not be used interchangeably in this population. Copyright © 2016 Elsevier Ltd. All rights reserved.
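The ICC(2,1) values quoted above refer to the two-way random-effects, absolute-agreement, single-measurement form of the intraclass correlation. A minimal sketch of that computation from ANOVA mean squares follows; the scores are illustrative and not from the study, and the function name is hypothetical.

```python
def icc_2_1(data):
    """ICC(2,1) for a table with one row per subject, one column per occasion."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-occasion mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Illustrative scores: 6 subjects measured on 4 occasions.
scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(scores), 2))
```

Because this form penalises systematic differences between occasions as well as random error, it is lower than a consistency-only ICC when the column means differ.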
Martinez, Maria Carmen; Latorre, Maria do Rosário Dias de Oliveira; Fischer, Frida Marina
To evaluate the validity and reliability of the Portuguese language version of a work ability index. Cross-sectional survey of a sample of 475 workers from an electrical company in the state of Sao Paulo, Southeastern Brazil (spread across ten municipalities in the Campinas area), carried out in 2005. The following aspects of the Brazilian version of the Work Ability Index were evaluated: construct validity, using exploratory factor analysis, and discriminant capacity, by comparing mean Work Ability Index scores in two groups with different absenteeism levels; criterion validity, by determining the correlation between self-reported health and Work Ability Index score; and reliability, using Cronbach's alpha to determine the internal consistency of the questionnaire. Factor analysis indicated three factors in the work ability construct: issues pertaining to 'mental resources' (20.6% of the variance), self-perceived work ability (18.9% of the variance), and presence of diseases and health-related limitations (18.4% of the variance). The index was capable of discriminating workers according to levels of absenteeism, identifying a significantly lower (p […] index and all dimensions of health status analyzed (p […] index was high, with a Cronbach's alpha of 0.72. The Brazilian version of the Work Ability Index showed satisfactory psychometric properties with respect to construct validity, thus constituting an appropriate option for evaluating work ability in both individual and population-based settings.
Tucker, Wesley J; Bhammar, Dharini M; Sawyer, Brandon J; Buman, Matthew P; Gaesser, Glenn A
The Nike + Fuelband is a commercially available, wrist-worn accelerometer used to track physical activity energy expenditure (PAEE) during exercise. However, validation studies assessing the accuracy of this device for estimating PAEE are lacking. Therefore, this study examined the validity and reliability of the Nike + Fuelband for estimating PAEE during physical activity in young adults. Secondarily, we compared PAEE estimation of the Nike + Fuelband with the previously validated SenseWear Armband (SWA). Twenty-four participants (n = 24) completed two 60-min semi-structured routines consisting of sedentary/light-intensity, moderate-intensity, and vigorous-intensity physical activity. Participants wore a Nike + Fuelband and SWA, while oxygen uptake was measured continuously with an Oxycon Mobile (OM) metabolic measurement system (criterion). The Nike + Fuelband (ICC = 0.77) and SWA (ICC = 0.61) both demonstrated moderate to good validity. PAEE estimates provided by the Nike + Fuelband (246 ± 67 kcal) and SWA (238 ± 57 kcal) were not statistically different from OM (243 ± 67 kcal). Both devices also displayed similar mean absolute percent errors for PAEE estimates (Nike + Fuelband = 16 ± 13%; SWA = 18 ± 18%). Test-retest reliability for PAEE indicated good stability for Nike + Fuelband (ICC = 0.96) and SWA (ICC = 0.90). The Nike + Fuelband provided valid and reliable estimates of PAEE that are similar to the previously validated SWA during a routine that included approximately equal amounts of sedentary/light-, moderate- and vigorous-intensity physical activity.
Ferris, G.R.; Blickle, G.; Schneider, P.B.
made to also identify a single, higher-order factor solution through second-order factor analysis. The present research aims to expand on prior work and report on a two-study investigation of both the construct validity and antecedents and consequences of the political skill construct. Design/methodology...
Lai, CY; Lee, KY; Lams, MHS; Wu, CF; Peake, R; Flint, SW; Li, WHC; Ho, E
Objective: The purpose of this study was to examine the test-retest reliability and the criterion validity of a curl-up test (CUT) as a measure of core stability, core endurance and dynamic stability in kindergarten children. CUT performance was also compared to the half hold lying test (HHLT) and walking time on course (WTC; without obstacle, with low obstacle, and with high obstacle) as measures of core stability, core endurance and dynamic stability. Methods: To estimate reliability, 33...
Yoshii, Hatsumi; Mandai, Nozomu; Saito, Hidemitsu; Akazawa, Kouhei
Self-stigma, defined by a negative attitude toward oneself combined with the consciousness of being a target of prejudice, is a critical problem for psychiatric patients. Self-stigma studies among psychiatric patients have indicated that high stigma is predictive of detrimental effects such as the delay of treatment and decreases in social participation in patients, and levels of self-stigma should be statistically evaluated. In this study, we developed the Workplace Social Distance Scale (WSDS), rephrasing the eight items of the Japanese version of the Social Distance Scale (SDSJ) to apply to the work setting in Japan. We examined the reliability and validity of the WSDS among 83 psychiatric patients. Factor analysis extracted three factors from the scale items: "work relations," "shallow relationships," and "employment." These factors are similar to the assessment factors of the SDSJ. Cronbach's alpha coefficient for the WSDS was 0.753. The split-half reliability for the WSDS was 0.801, indicating significant correlations. In addition, the WSDS was significantly correlated with the SDSJ. These findings suggest that the WSDS represents an approximation of self-stigma in the workplace among psychiatric patients. Our study assessed the reliability and validity of the WSDS for measuring self-stigma in Japan. Future studies should investigate the reliability and validity of the scale in other countries.
Noormohammadpour, Pardis; Hosseini Khezri, Alireza; Farahbakhsh, Farzin; Mansournia, Mohammad Ali; Smuck, Matthew; Kordi, Ramin
The purpose of this study was to evaluate validity and reliability of a new proposed questionnaire for assessment of functional disability in athletes with low back pain (LBP). Validity and reliability study. Elite athletes participating in different fields of sports. Participants were 165 male and female athletes (between 12 and 50 years old) with LBP. Athlete Disability Index (ADI) Questionnaire, which was developed by the authors for assessing LBP-related disability in athletes, Oswestry Disability Index (ODI), and the Roland-Morris Disability Questionnaire (RDQ). Self-reported responses were collected regarding LBP-related disability through ADI, ODI, and RDQ. The test-retest reliability was strong, and intraclass correlation value ranged between 0.74 and 0.94. The Cronbach alpha coefficient value of 0.91 (P […] visual analog scale was r = 0.626 (P […] disability levels were mild in the large majority of subjects (91.5% and 86.0%, respectively). Alternatively, disability assessments by the ADI did not cluster at the mild level and ranged more broadly from mild to very high. The ADI is a reliable and valid instrument for assessing disability in athletes with LBP. Compared with the available LBP disability questionnaires used in the general population, ADI can more precisely stratify the disability levels of athletes due to LBP.
Higgins, Kathryn L; Caze, Todd; Maerlender, Arthur
The Immediate Postconcussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery commonly used to determine cognitive recovery from concussion based on comparing post-injury scores to baseline scores. This model is based on the premise that ImPACT baseline test scores are a valid and reliable measure of optimal cognitive function at baseline. Growing evidence suggests that this premise may not be accurate and a large contributor to invalid and unreliable baseline test scores may be the protocol and environment in which baseline tests are administered. This study examined the effects of a standardized environment and administration protocol on the reliability and performance validity of athletes' baseline test scores on ImPACT by comparing scores obtained in two different group-testing settings. Three hundred sixty-one Division 1 cohort-matched collegiate athletes' baseline data were assessed using a variety of indicators of potential performance invalidity; internal reliability was also examined. Thirty-one to thirty-nine percent of the baseline cases had at least one indicator of low performance validity, but there were no significant differences in validity indicators based on environment in which the testing was conducted. Internal consistency reliability scores were in the acceptable to good range, with no significant differences between administration conditions. These results suggest that athletes may be reliably performing at levels lower than their best effort would produce. © The Author 2017. Published by Oxford University Press. All rights reserved.
Neil, Sarah E; Myring, Alec; Peeters, Mon Jef; Pirie, Ian; Jacobs, Rachel; Hunt, Michael A; Garland, S Jayne; Campbell, Kristin L
Muscular strength is a key parameter of rehabilitation programs and a strong predictor of functional capacity. Traditional methods to measure strength, such as manual muscle testing (MMT) and hand-held dynamometry (HHD), are limited by the strength and experience of the tester. The Performance Recorder 1 (PR1) is a strength assessment tool attached to resistance training equipment and may be a time- and cost-effective tool to measure strength in clinical practice that overcomes some limitations of MMT and HHD. However, reliability and validity of the PR1 have not been reported. Test-retest and inter-rater reliability was assessed using the PR1 in healthy adults (n = 15) during isometric knee flexion and extension. Criterion-related validity was assessed through comparison of values obtained from the PR1 and Biodex® isokinetic dynamometer. Test-retest reliability was excellent for peak knee flexion (intra-class correlation coefficient [ICC] of 0.96, 95% CI: 0.85, 0.99) and knee extension (ICC = 0.96, 95% CI: 0.87, 0.99). Inter-rater reliability was also excellent for peak knee flexion (ICC = 0.95, 95% CI: 0.85, 0.99) and peak knee extension (ICC = 0.97, 95% CI: 0.91, 0.99). Validity was moderate for peak knee flexion (ICC = 0.75, 95% CI: 0.38, 0.92) but poor for peak knee extension (ICC = 0.37, 95% CI: 0, 0.73). The PR1 provides a reliable measure of isometric knee flexor and extensor strength in healthy adults that could be used in the clinical setting, but absolute values may not be comparable to strength assessment by gold-standard measures.
Purpose: Learning-style instruments assist students in developing their own learning strategies and outcomes, in eliminating learning barriers, and in acknowledging peer diversity. Only a few psychometrically validated learning-style instruments are available. This study aimed to develop a valid and reliable learning-style instrument for nursing students. Methods: A cross-sectional survey study was conducted in two nursing schools in two countries. A purposive sample of 156 undergraduate nursing students participated in the study. Face and content validity was obtained from an expert panel. The LSS construct was established using principal axis factoring (PAF) with oblimin rotation, a scree plot test, and parallel analysis (PA). The reliability of LSS was tested using Cronbach’s α, corrected item-total correlation, and test-retest. Results: Factor analysis revealed five components, confirmed by PA and a relatively clear curve on the scree plot. Component strength and interpretability were also confirmed. The factors were labeled as perceptive, solitary, analytic, competitive, and imaginative learning styles. Cronbach’s α was > 0.70 for all subscales in both study populations. The corrected item-total correlations were > 0.30 for the items in each component. Conclusion: The LSS is a valid and reliable inventory for evaluating learning style preferences in nursing students in various multicultural environments.
Carlsen, C G; Lindorff-Larsen, K; Funch-Jensen, P; Lund, L; Charles, P; Konge, L
Lichtenstein hernia repair is a common surgical procedure and one of the first procedures performed by a surgical trainee. However, formal assessment tools developed for this procedure are few and sparsely validated. The aim of this study was to determine the reliability and validity of an assessment tool designed to measure surgical skills in Lichtenstein hernia repair. Key issues were identified through a focus group interview. On this basis, an assessment tool with eight items was designed. Ten surgeons and surgical trainees were video recorded while performing Lichtenstein hernia repair (four experts, three intermediates, and three novices). The videos were blindly and individually assessed by three raters (surgical consultants) using the assessment tool. Based on these assessments, validity and reliability were explored. The internal consistency of the items was high (Cronbach's alpha = 0.97). The inter-rater reliability was very good with an intra-class correlation coefficient (ICC) = 0.93. Generalizability analysis showed a coefficient above 0.8 even with one rater. The coefficient improved to 0.92 if three raters were used. One-way analysis of variance found a significant difference between the three groups which indicates construct validity, p […] fashion with the new procedure-specific assessment tool. We recommend this tool for future assessment of trainees performing Lichtenstein hernia repair to ensure that the objectives of competency-based surgical training are met.
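The step from the single-rater coefficient to the three-rater coefficient reported above can be approximated with the Spearman-Brown prophecy formula, which is what a generalizability projection reduces to in a simple persons-by-raters design. The sketch below is an illustration only: the function name `spearman_brown` is hypothetical, and taking 0.80 as the single-rater value is an assumption, since the abstract states only that the coefficient was "above 0.8".

```python
def spearman_brown(single_reliability, n_raters):
    """Projected reliability of the mean of n_raters parallel raters."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    return n_raters * single_reliability / (
        1 + (n_raters - 1) * single_reliability
    )


# Assumed single-rater coefficient of 0.80 (the abstract says "above 0.8").
projected = spearman_brown(0.80, 3)  # about 0.92
```

Under that assumption the projection lands close to the reported three-rater value of 0.92, which suggests the single-rater coefficient was near the 0.8 mark.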
Abdollahimohammad, Abdolghani; Ja'afar, Rogayah
Learning-style instruments assist students in developing their own learning strategies and outcomes, in eliminating learning barriers, and in acknowledging peer diversity. Only a few psychometrically validated learning-style instruments are available. This study aimed to develop a valid and reliable learning-style instrument for nursing students. A cross-sectional survey study was conducted in two nursing schools in two countries. A purposive sample of 156 undergraduate nursing students participated in the study. Face and content validity was obtained from an expert panel. The LSS construct was established using principal axis factoring (PAF) with oblimin rotation, a scree plot test, and parallel analysis (PA). The reliability of LSS was tested using Cronbach's α, corrected item-total correlation, and test-retest. Factor analysis revealed five components, confirmed by PA and a relatively clear curve on the scree plot. Component strength and interpretability were also confirmed. The factors were labeled as perceptive, solitary, analytic, competitive, and imaginative learning styles. Cronbach's α was >0.70 for all subscales in both study populations. The corrected item-total correlations were >0.30 for the items in each component. The LSS is a valid and reliable inventory for evaluating learning style preferences in nursing students in various multicultural environments.
Arevalo, Jimmy J; Brinkkemper, Tijn; van der Heide, Agnes; Rietjens, Judith A; Ribbe, Miel; Deliens, Luc; Loer, Stephan A; Zuurmond, Wouter W A; Perez, Roberto S G M
Observer-based sedation scales have been used to provide a measurable estimate of the comfort of nonalert patients in palliative sedation. However, their usefulness and appropriateness in this setting has not been demonstrated. To study the reliability and validity of observer-based sedation scales in palliative sedation. A prospective evaluation of 54 patients under intermittent or continuous sedation with four sedation scales was performed by 52 nurses. Included scales were the Minnesota Sedation Assessment Tool (MSAT), Richmond Agitation-Sedation Scale (RASS), Vancouver Interaction and Calmness Scale (VICS), and a sedation score proposed in the Guideline for Palliative Sedation of the Royal Dutch Medical Association (KNMG). Inter-rater reliability was tested with the intraclass correlation coefficient (ICC) and Cohen's kappa coefficient. Correlations between the scales using Spearman's rho tested concurrent validity. We also examined construct, discriminative, and evaluative validity. In addition, nurses completed a user-friendliness survey. Overall moderate to high inter-rater reliability was found for the VICS interaction subscale (ICC = 0.85), RASS (ICC = 0.73), and KNMG (ICC = 0.71). The largest correlation between scales was found for the RASS and KNMG (rho = 0.836). All scales showed discriminative and evaluative validity, except for the MSAT motor subscale and VICS calmness subscale. Finally, the RASS was less time consuming, clearer, and easier to use than the MSAT and VICS. The RASS and KNMG scales stand as the most reliable and valid among the evaluated scales. In addition, the RASS was less time consuming, clearer, and easier to use than the MSAT and VICS. Further research is needed to evaluate the impact of the scales on better symptom control and patient comfort. Copyright © 2012 U.S. Cancer Pain Relief Committee. Published by Elsevier Inc. All rights reserved.
Rodríguez-Marroyo, Jose A; Medina-Carrillo, Javier; García-López, Juan; Morante, Juan C; Villa, José G; Foster, Carl
To analyze the concurrent and construct validity of a volleyball intermittent endurance test (VIET). The VIET's test-retest reliability and sensitivity to assess seasonal changes were also studied. During the preseason, 71 volleyball players of different competitive levels took part in this study. All performed the VIET and a graded treadmill test with gas-exchange measurement (GXT). Thirty-one of the players performed an additional VIET to analyze the test-retest reliability. To test the VIET's sensitivity, 28 players repeated the VIET and GXT at the end of their season. Significant (P […] volleyball players.
Higueras-Fresnillo, Sara; Esteban-Cornejo, Irene; Gasque, Pablo; Veiga, Oscar L; Martinez-Gomez, David
Stair climbing is an activity of daily living that might contribute to increased levels of physical activity (PA). To date, there is no study examining the validity of climbing stairs assessed by self-report. The aim of this study was, therefore, to examine the validity of estimated stair climbing from one question included in a common questionnaire compared to a pattern-recognition activity monitor in older adults. A total of 138 older adults (94 women), aged 65-86 years (70.9 ± 4.7 years), from the IMPACT65+ study participated in this validity study. Estimates of stair climbing were obtained from the European Prospective Investigation into Cancer and Nutrition (EPIC) PA questionnaire. An objective assessment of stair climbing was obtained with the Intelligent Device for Energy Expenditure and Activity (IDEEA) monitor. The correlation between both methods to assess stair climbing was fair (ρ = 0.22, p = 0.008 for PA energy expenditure and ρ = 0.26, p = 0.002 for duration). Mean differences between self-report and the IDEEA were 7.96 ± 10.52 vs. 9.88 ± 3.32 METs-min/day for PA energy expenditure, and 0.99 ± 1.32 vs. 1.79 ± 2.02 min/day for duration (both Wilcoxon test p < 0.001). Results from the Bland-Altman analysis indicate that the biases between the two instruments were -1.91 ± 10.30 METs-min/day and -0.80 ± 1.99 min/day, and corresponding limits of agreement for the two instruments were from 18.27 to -22.10 METs-min/day and from 3.09 to -4.70 min/day, respectively. Our results indicate that self-reported stair climbing has modest validity to accurately rank older participants, and underestimates both PAEE and its duration, as compared with an objectively measured method.
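The Bland-Altman limits of agreement quoted above follow from the reported bias and the standard deviation of the differences as bias ± 1.96 × SD. A short worked check in Python, using only the figures given in the abstract; the function name `limits_of_agreement` is an illustrative assumption.

```python
def limits_of_agreement(bias, sd_diff, z=1.96):
    """Return (lower, upper) 95% limits of agreement: bias -/+ z * SD."""
    half_width = z * sd_diff
    return bias - half_width, bias + half_width


# Bias and SD of the differences as reported in the abstract.
paee_low, paee_high = limits_of_agreement(-1.91, 10.30)  # METs-min/day
dur_low, dur_high = limits_of_agreement(-0.80, 1.99)     # min/day
```

The recomputed limits agree with the published ones to within about 0.01, a difference that is plausibly due to the rounding of the bias and SD in the abstract.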
Cuesta-Vargas, Antonio I; Galan-Mercant, Alejandro; Martín-Borras, Maria Carmen; González-Sánchez, Manuel
Criterion-related validity of a self-administered questionnaire listed as gold standard requires objective testing. The aim of this study was to analyze the Foot Health Status Questionnaire (FHSQ) using functional variable measures (dynamic plantar pressure and foot strength). A total of 22 elderly healthy participants (13 women and 9 men) were screened by interview and physical examination for foot or gait abnormalities. Foot strength, footprint pressure, and foot health status were measured. All the items of the FHSQ show significant correlation with functional variables, but general foot health shows the highest correlation with the 4 physical variables related to plantar pressure (R² = 0.741), followed by foot pain (R² = 0.652). A set of different, directly measured physical variables related to foot strength and plantar pressure significantly correlate with the FHSQ dimensions. Cross-sectional trial.
Schultz-Larsen, K; Avlund, K; Kreiner, S
Criterion-related validity of a new measure of functional ability was conducted according to a causal model based on conceptual models employed in the area of rehabilitative and geriatric medicine. The criteria variables included concurrent diagnosed diseases, global self-rated health, drug...... consumption and general practitioner (GP) consultations. The measure of functional ability was developed with the intention of achieving a high degree of discrimination among a group of community dwelling elderly. Data were derived from a sample survey of 70-year-old men and women conducted in 1984...... different unidimensional index scales of functional ability divided into two types, with reduced speed and tiredness as subdimensions. The two scale types were mobility function and lower limb function. Early losses of ability together with global self-rated health were treated as outcome measures...
Donders, Jacobus; Janke, Kelly
The performance of 40 children with complicated mild to severe traumatic brain injury on the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV; Wechsler, 2003) was compared with that of 40 demographically matched healthy controls. Of the four WISC-IV factor index scores, only Processing Speed yielded a statistically significant group difference (p < .001) as well as a statistically significant negative correlation with length of coma (p < .01). Logistic regression, using Processing Speed to classify individual children, yielded a sensitivity of 72.50% and a specificity of 62.50%, with false positive and false negative rates both exceeding 30%. We conclude that Processing Speed has acceptable criterion validity in the evaluation of children with complicated mild to severe traumatic brain injury but that the WISC-IV should be supplemented with other measures to assure sufficient accuracy in the diagnostic process.
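With 40 children per group, the reported sensitivity and specificity translate into whole-number classification counts. The abstract does not say how the false positive and false negative rates were defined; one reading consistent with "both exceeding 30%" treats them as shares of the positive and negative classifications (1 − PPV and 1 − NPV). The sketch below works through that reading; the function name and dictionary keys are illustrative assumptions.

```python
def classification_summary(n_cases, n_controls, sensitivity, specificity):
    """Rebuild confusion-matrix counts from group sizes and reported rates."""
    tp = round(sensitivity * n_cases)      # injured children flagged
    fn = n_cases - tp                      # injured children missed
    tn = round(specificity * n_controls)   # controls correctly cleared
    fp = n_controls - tn                   # controls wrongly flagged
    return {
        "tp": tp, "fn": fn, "tn": tn, "fp": fp,
        # Share of positive classifications that are wrong (1 - PPV).
        "false_positive_share": fp / (tp + fp),
        # Share of negative classifications that are wrong (1 - NPV).
        "false_negative_share": fn / (tn + fn),
    }


# Group sizes and rates as reported in the abstract.
summary = classification_summary(40, 40, 0.725, 0.625)
```

Under this reading, 15 of 44 positive classifications (about 34%) and 11 of 36 negative classifications (about 31%) are wrong. Under the more common definition relative to true status, the false negative rate would instead be 1 − sensitivity = 27.5%.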
Yildirim, Fatma; Ilhan, Inci Ozgür
Self-efficacy, which is a basic construct in social cognitive theory, has been defined as one's belief in his/her ability to start, continue, and complete an action in a manner that has an impact on his/her environment. This study aimed to investigate the psychometric properties of the General Self-Efficacy Scale-Turkish Form. The General Self-Efficacy Scale-Turkish Form was administered to 895 individuals ≥18 years of age that had at least 5 years of education. Exploratory factor analysis, criterion validity testing (using the Beck Depression Scale, Spielberger Trait Anxiety Inventory, Locus of Control Scale, Learned Resourcefulness Scale, and Coopersmith Self Esteem Inventory), internal consistency analysis, and test-retest reliability analysis were performed. The 3-factor structure of the scale explained 41.5% of the observed variance. Correlations between the General Self-Efficacy Scale-Turkish Form and the other measures were statistically significant. The Cronbach's alpha coefficient for the entire scale was 0.80 and the test-retest reliability coefficient estimated from data for 236 individuals that were contacted for follow-up was 0.69. The General Self-Efficacy Scale-Turkish Form is a valid and reliable instrument for the assessment of general self-efficacy in individuals ≥18 years of age with at least 5 years of education.
Zhang, Peng; Godin, Steven D; Owens, Matthew V
This study aimed to investigate the validity and reliability of the energy expenditure (EE) estimation of Apple Watch among college students. Thirty college students completed two sets of three 10-minute treadmill walking and running trials while wearing three Apple Watches and being connected to indirect calorimetry. The walking trials were at speeds of 54, 80, and 107 m·min⁻¹ while the running trials were at 134, 161, and 188 m·min⁻¹. Energy expenditure comparisons were made using two-way ANOVA with repeated measures. Reliability was analyzed by intraclass correlation. There was no significant device × speed interaction (F(15, 696) = 1.113, p = 0.341) between the indirect calorimetry (criterion) and Apple Watch. The lowest intraclass correlation (ICC) scores were 0.49 (95%CI) at 54 m·min⁻¹ while the highest were 0.72 (95%CI) at 107 and 134 m·min⁻¹. Apple Watch demonstrated low to moderate validity and reliability in measuring EE.
Lewin, John J; LeDroux, Shannon N; Shermock, Kenneth M; Thompson, Carol B; Goodwin, Haley E; Mirski, Erin A; Gill, Randeep S; Mirski, Marek A
To validate The Johns Hopkins Adapted Cognitive Exam designed to assess and quantify cognition in critically ill patients. Prospective cohort study. Neurosciences, surgical, and medical intensive care units at The Johns Hopkins Hospital. One hundred six adult critically ill patients. One expert neurologic assessment and four measurements of the Adapted Cognitive Exam (all patients). Four measurements of the Folstein Mini-Mental State Examination in nonintubated patients only. Adapted Cognitive Exam and Mini-Mental State Examination were performed by 76 different raters. One hundred six patients were assessed, 46 intubated and 60 nonintubated, resulting in 424 Adapted Cognitive Exam and 240 Mini-Mental State Examination measurements. Criterion validity was assessed by comparing Adapted Cognitive Exam with a neurointensivist's assessment of cognitive status (ρ = 0.83, p […] validity was assessed by comparing Adapted Cognitive Exam with Mini-Mental State Examination in nonintubated patients (ρ = 0.81, p […] validity was assessed by surveying raters who used both the Adapted Cognitive Exam and Mini-Mental State Examination and indicated the Adapted Cognitive Exam was an accurate reflection of the patient's cognitive status, more sensitive a marker of cognition than the Mini-Mental State Examination, and easy to use. The Adapted Cognitive Exam demonstrated excellent interrater reliability (intraclass correlation coefficient = 0.997; 95% confidence interval 0.997-0.998) and interitem reliability of each of the five subscales of the Adapted Cognitive Exam and Mini-Mental State Examination (Cronbach's α: range for Adapted Cognitive Exam = 0.83-0.88; range for Mini-Mental State Examination = 0.72-0.81). The Adapted Cognitive Exam is the first valid and reliable examination for the assessment and quantification of cognition in critically ill patients. It provides a useful, objective tool that can be used by any member of the interdisciplinary critical care team to support
O'Hare, L; Santin, O; Winter, K; McGuinness, C
There is a growing impetus across the research, policy and practice communities for children and young people to participate in decisions that affect their lives. Furthermore, there is a dearth of general instruments that measure children and young people's views on their participation in decision-making. This paper presents the reliability and validity of the Child and Adolescent Participation in Decision-Making Questionnaire (CAP-DMQ) and specifically looks at a population of looked-after children, where a lack of participation in decision-making is an acute issue. The participants were 151 looked after children and adolescents between 10-23 years of age who completed the 10 item CAP-DMQ. Of the participants 113 were in receipt of an advocacy service that had an aim of increasing participation in decision-making with the remaining participants not having received this service. The results showed that the CAP-DMQ had good reliability (Cronbach's alpha = 0.94) and showed promising uni-dimensional construct validity through an exploratory factor analysis. The items in the CAP-DMQ also demonstrated good content validity by overlapping with prominent models of child and adolescent participation (Lundy 2007) and decision-making (Halpern 2014). A regression analysis showed that age and gender were not significant predictors of CAP-DMQ scores but receipt of advocacy was a significant predictor of scores (effect size d = 0.88), thus showing appropriate discriminant criterion validity. Overall, the CAP-DMQ showed good reliability and validity. Therefore, the measure has excellent promise for theoretical investigation in the area of child and adolescent participation in decision-making and equally shows empirical promise for use as a measure in evaluating services, which have increasing the participation of children and adolescents in decision-making as an intended outcome. © 2016 John Wiley & Sons Ltd.
Hulteen, Ryan M; Lander, Natalie J; Morgan, Philip J; Barnett, Lisa M; Robertson, Samuel J; Lubans, David R
It has been suggested that young people should develop competence in a variety of 'lifelong physical activities' to ensure that they can be active across the lifespan. The primary aim of this systematic review is to report the methodological properties, validity, reliability, and test duration of field-based measures that assess movement skill competency in lifelong physical activities. A secondary aim was to clearly define those characteristics unique to lifelong physical activities. A search of four electronic databases (Scopus, SPORTDiscus, ProQuest, and PubMed) was conducted between June 2014 and April 2015 with no date restrictions. Studies addressing the validity and/or reliability of lifelong physical activity tests were reviewed. Included articles were required to assess lifelong physical activities using process-oriented measures, as well as report either one type of validity or reliability. Assessment criteria for methodological quality were adapted from a checklist used in a previous review of sport skill outcome assessments. Movement skill assessments for eight different lifelong physical activities (badminton, cycling, dance, golf, racquetball, resistance training, swimming, and tennis) in 17 studies were identified for inclusion. Methodological quality, validity, reliability, and test duration (time to assess a single participant), for each article were assessed. Moderate to excellent reliability results were found in 16 of 17 studies, with 71% reporting inter-rater reliability and 41% reporting intra-rater reliability. Only four studies in this review reported test-retest reliability. Ten studies reported validity results; content validity was cited in 41% of these studies. Construct validity was reported in 24% of studies, while criterion validity was only reported in 12% of studies. Numerous assessments for lifelong physical activities may exist, yet only assessments for eight lifelong physical activities were included in this review
Sorenson, Shawn C.; Romano, Russell; Scholefield, Robin M.; Schroeder, E. Todd; Azen, Stanley P.; Salem, George J.
Context Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. Objective To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Design Descriptive laboratory study. Setting A large National Collegiate Athletic Association Division I university. Patients or Other Participants A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Intervention(s) Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Main Outcome Measure(s) Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Results Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. We found strong evidence of content validity, convergent
Nedelec, Bernadette; Correa, José A; Rachelska, Grazyna; Armour, Alexis; LaSalle, Léo
Research into the pathophysiology and treatment of hypertrophic scar (HSc) remains limited by the heterogeneity of scar and the imprecision with which its severity is measured. The objective of this study was to test the interrater reliability and concurrent validity of the Cutometer measurement of elasticity, the Mexameter measurement of erythema and pigmentation, and total thickness measure of the DermaScan C relative to the modified Vancouver Scar Scale (mVSS) in patient-matched normal skin, normal scar, and HSc. Three independent investigators evaluated 128 sites (severe HSc, moderate or mild HSc, donor site, and normal skin) on 32 burn survivors using all of the above measurement tools. The intraclass correlation coefficient, which was used to measure interrater reliability, reflects the inherent amount of error in the measure and is considered acceptable when it is >0.75. Interrater reliability of the totals of the height, pliability, and vascularity subscales of the mVSS fell below the acceptable limit (approximately 0.50). The individual subscales of the mVSS fell well below the acceptable level (0.89) for each study site with the exception of severe scar. Mexameter and DermaScan C reliability measurements were acceptable for all sites (>0.82). Concurrent validity correlations with the mVSS were significant except for the comparison of the mVSS pliability subscale and the Cutometer maximum deformation measure in severe scar. In conclusion, the Mexameter and DermaScan C measurements of scar color and thickness of all sites, as well as the Cutometer measurement of elasticity in all but the most severe scars, show high interrater reliability. Their significant concurrent validity with the mVSS confirms that these tools are measuring the same traits as the mVSS, and in a more objective way.
Collins, Cristiana Kahl; Johnson, Vicky Saliba; Godwin, Ellen M; Pappas, Evangelos
To determine the reliability and validity of the Saliba Postural Classification System (SPCS). Two physical therapists classified pictures of 100 volunteer participants standing in their habitual posture for inter- and intra-tester reliability. For validity, 54 participants stood on a force plate in a habitual and a corrected posture, while a vertical force was applied through the shoulders until the clinician felt a postural give. Data were extracted at the time the give was felt and at a time in the corrected posture that matched the peak vertical ground reaction force (VGRF) in the habitual posture. Inter-tester reliability demonstrated 75% agreement with a Kappa = 0.64 (95% CI = 0.524-0.756, SE = 0.059). Intra-tester reliability demonstrated 87% agreement with a Kappa = 0.8 (95% CI = 0.702-0.898, SE = 0.05) and 80% agreement with a Kappa = 0.706 (95% CI = 0.594-0.818, SE = 0.057). The examiner applied a significantly higher (p < 0.001) peak vertical force in the corrected posture prior to a postural give when compared to the habitual posture. Within the corrected posture, the %VGRF was higher when the test was ongoing vs. when a postural give was felt (p < 0.001). The %VGRF was not different between the two postures when comparing the peaks (p = 0.214). The SPCS has substantial agreement for inter- and intra-tester reliability and is largely a valid postural classification system as determined by the larger vertical forces in the corrected postures. Further studies on the correlation between the SPCS and diagnostic classifications are indicated.
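The kappa values and confidence intervals reported above can be illustrated with a short sketch. It assumes the intervals are ordinary Wald intervals (estimate ± 1.96 × SE), which reproduces the reported bounds; the 2×2 table and the helper names `cohen_kappa` and `wald_ci` are hypothetical and are not taken from the study.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square agreement table.

    table[i][j] = number of subjects placed in category i by rater A
    and category j by rater B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of subjects on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the raters' marginal proportions.
    row_prop = [sum(row) / n for row in table]
    col_prop = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_expected = sum(r * c for r, c in zip(row_prop, col_prop))
    return (p_observed - p_expected) / (1 - p_expected)


def wald_ci(estimate, se, z=1.96):
    """Symmetric confidence interval: estimate +/- z * SE."""
    return (estimate - z * se, estimate + z * se)


# Hypothetical table (NOT the study data): 85% raw agreement.
example_table = [[40, 10],
                 [5, 45]]
example_kappa = cohen_kappa(example_table)

# Reported inter-tester values from the abstract: kappa = 0.64, SE = 0.059.
low, high = wald_ci(0.64, 0.059)
print(round(example_kappa, 3), round(low, 3), round(high, 3))
```

Applying `wald_ci` to the second intra-tester result (0.706, SE 0.057) gives an upper bound of about 0.818, consistent with the interval quoted in the abstract.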
Manuel V. Garnacho-Castaño
The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and the T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training.
Seguí, María del Mar; Cabrero-García, Julio; Crespo, Ana; Verdú, José; Ronda, Elena
To design and validate a questionnaire to measure visual symptoms related to exposure to computers in the workplace. Our computer vision syndrome questionnaire (CVS-Q) was based on a literature review and validated through discussion with experts and performance of a pretest, pilot test, and retest. Content validity was evaluated by occupational health, optometry, and ophthalmology experts. Rasch analysis was used in the psychometric evaluation of the questionnaire. Criterion validity was determined by calculating the sensitivity and specificity, receiver operator characteristic curve, and cutoff point. Test-retest repeatability was tested using the intraclass correlation coefficient (ICC) and concordance by Cohen's kappa (κ). The CVS-Q was developed with wide consensus among experts and was well accepted by the target group. It assesses the frequency and intensity of 16 symptoms using a single rating scale (symptom severity) that fits the Rasch rating scale model well. The questionnaire has sensitivity and specificity over 70% and achieved good test-retest repeatability both for the scores obtained [ICC = 0.802; 95% confidence interval (CI): 0.673, 0.884] and CVS classification (κ = 0.612; 95% CI: 0.384, 0.839). The CVS-Q has acceptable psychometric properties, making it a valid and reliable tool to control the visual health of computer workers, and can potentially be used in clinical trials and outcome research. Copyright © 2015 Elsevier Inc. All rights reserved.
Helou, Khalil; El Helou, Nour; Mahfouz, Maya; Mahfouz, Yara; Salameh, Pascale; Harmouche-Karaki, Mireille
The International Physical Activity Questionnaire (IPAQ) is a validated tool for physical activity assessment used in many countries; however, no Arabic version of the long form of this questionnaire exists to date. Hence, the aim of this study was to cross-culturally adapt and validate an Arabic version of the long International Physical Activity Questionnaire (A-IPAQ) equivalent to the French version (F-IPAQ) in a Lebanese population. The guidelines for cross-cultural adaptation provided by the World Health Organization and the International Physical Activity Questionnaire committee were followed. One hundred fifty-nine students and staff members from Saint Joseph University of Beirut were randomly recruited to participate in the study. Items of the A-IPAQ were compared to those from the F-IPAQ for concurrent validity using Spearman's correlation coefficient. Content validity of the questionnaire was assessed using factor analysis for the A-IPAQ's items. The physical activity indicators derived from the A-IPAQ were compared with the body mass index (BMI) of the participants for construct validity. The instrument was also evaluated for internal consistency reliability using Cronbach's alpha and Intraclass Correlation Coefficient (ICC). Finally, thirty-one participants were asked to complete the A-IPAQ on two occasions three weeks apart to examine its test-retest reliability. Bland-Altman analyses were performed to evaluate the extent of agreement between the two versions of the questionnaire and its repeated administrations. A high correlation was observed between answers of the F-IPAQ and those of the A-IPAQ, with Spearman's correlation coefficients ranging from 0.91 to 1.00 (p reliability with Cronbach's alpha ranging from 0.769-1.00 (p reliability for most of its items (ICC ranging from 0.66-0.96; p validity and reliability for the assessment of physical activity among Lebanese adults. More studies are necessary in the future to assess its validity compared
Calugi, Simona; Milanese, Chiara; Sartirana, Massimiliano; El Ghoch, Marwan; Sartori, Federica; Geccherle, Eleonora; Coppini, Andrea; Franchini, Cecilia; Dalle Grave, Riccardo
To examine the validity and reliability of a new Italian language version of the latest edition of the Eating Disorder Examination Questionnaire (EDE-Q 6.0). The sixth edition of the EDE-Q was translated into Italian and administered to 264 Italian-speaking inpatients and outpatients (257 females in their mid-20s) with an eating disorder (75.4% anorexia nervosa) and 216 controls (205 females). Internal consistency was high for both the global EDE-Q and all subscale scores. Test-retest reliability was good to excellent (0.66-0.83) for global and subscale scores, and for items assessing key behavioral features of eating disorders (0.55-0.91). Patients with an eating disorder displayed significantly higher EDE-Q scores than controls, demonstrating the good criterion validity of the tool. Confirmatory factor analysis revealed a good fit for a modified seven-item three-factor structure. The study showed the good psychometric properties of the new Italian version of the EDE-Q 6.0, and validated its use in Italian eating disorder patients, particularly in young females with anorexia nervosa.
Erkan Alpsoy; Yeşim Şenol; Aslı Bilgiç Temel; G. Özge Baysal; Ayşe Akman Karakaş
Background and design. Internalized stigma involves endorsing negative feelings and beliefs such as insignificance, shame and withdrawal triggered by applying these negative stereotypes to oneself. The Internalized Stigma Scale has not been applied to psoriasis patients. We aimed to evaluate the reliability and validity of the Internalized Stigma Scale in psoriasis patients. Materials and Methods. 100 consecutive, volunteer psoriasis patients (48 female, 52 male; aged, 40.59±15.44 years) were enro...
Sayed Hadi Sayed Alitabar; Mojtaba Habibi; Maryam Falahatpisheh; Musa Arvin
Background and Objective: Given the increase in substance use in the country, more research on this phenomenon is necessary. This study investigates the validity, reliability and confirmatory factor structure of the Drug Abuse Screening Test (DAST). Materials and Methods: The sample consisted of 381 patients (143 women and 238 men) selected by multi-stage cluster sampling from areas 2, 6 and 12 of Tehran; in each area, 6 drug rehabilitation centers were randomly selected. T...
Wefald, Andrew J; Mills, Maura J; Smith, Michael R; Downey, Ronald G
Engagement is an emerging job attitude that purports to measure employees' psychological presence at and involvement in their work. This research compares three academic approaches to engagement, and makes recommendations regarding the most appropriate conceptualisation and measurement of the construct in future research. The current research also investigates whether any of these three approaches to engagement contribute unique variance to the prediction of turnover intentions above and beyond the predictive capacity of alternative constructs. An online survey was taken by 382 employees and managers from a mid-sized financial institution. Results failed to support either a multi- or unidimensional factor structure for the Utrecht Work Engagement Scale (UWES) engagement measure. For the Shirom-Melamed Vigor Measure (SMVM), a multi-dimensional structure was identified as a good fit, while a unidimensional structure fit poorly. The uni-factorial structure of Britt's engagement measure was confirmed. The Schaufeli measure of engagement was a strong predictor of work outcomes; however, when controlling for job satisfaction and affective commitment, that measure lost its ability to predict intentions to leave. Two components of the Shirom vigor measure held their predictive validity. Collectively, these findings suggest that the Shirom vigor measure may provide better insight into whether and how much a person is 'into' his or her job. The Schaufeli measure was a good predictor of important work outcomes, but when job satisfaction and affective commitment were controlled, it lost its predictive validity. We were not able to confirm the three-factor structure of the Schaufeli measure. Two components of the Shirom vigor measure predicted turnover intentions after controlling for job satisfaction and affective commitment, suggesting less overlap with those constructs than the Schaufeli measure of engagement. This research adds important information on the nature of
Park, Dae-Sung; Lee, GyuChang
A balance test provides important information such as the standard to judge an individual's functional recovery or make the prediction of falls. The development of a tool for a balance test that is inexpensive and widely available is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but there is little software used in balance tests, and there are few studies on reliability and validity. Thus, we developed a balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Twenty healthy adults participated in our study. The participants participated in the test for inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with balance assessment software using the Nintendo Wii balance board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed by an intraclass correlation coefficient (ICC) value and a standard error of measurement (SEM). The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. The balance assessment software incorporating the Nintendo Wii balance board was used in our study and was found to be a reliable assessment device. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for the balance assessment.
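The abstract reports both ICC and standard error of measurement (SEM) without stating how the SEM was obtained. One widely used relationship is SEM = SD × √(1 − ICC); the sketch below assumes that formula and also derives the 95% minimal detectable change (MDC95 = 1.96 × √2 × SEM). The numbers and function names are illustrative only and are not taken from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """Smallest change exceeding measurement error at the given z level."""
    return z * math.sqrt(2.0) * sem


# Hypothetical values: COP path length SD of 10 (arbitrary units), ICC of 0.84.
example_sem = standard_error_of_measurement(sd=10.0, icc=0.84)
example_mdc = minimal_detectable_change(example_sem)
print(round(example_sem, 2), round(example_mdc, 2))
```

The design point is that SEM is expressed in the units of the measurement, so it complements the unitless ICC when judging whether a change in an individual's score is larger than measurement noise.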
Cetin, Fatma Cosar; Sezer, Ayse; Merih, Yeliz Dogan
OBJECTIVE: The objective of this study is to investigate the validity and the reliability of the Birth Satisfaction Scale (BSS) and to adapt it into the Turkish language. This scale is used for measuring maternal satisfaction with birth in order to evaluate women’s birth perceptions. METHODS: The study included 150 women who attended an inpatient postpartum clinic. The participants filled in an information form and the BSS questionnaire forms. The properties of the scale were tested by conducting reliability and validation analyses. RESULTS: BSS entails 30 Likert-type questions. It was developed by Hollins Martin and Fleming. Total scale scores ranged between 30–150 points. Higher scores on the scale indicate greater birth satisfaction. Three overarching themes were identified in the scale: service provision (home assessment, birth environment, support, relationships with health care professionals); personal attributes (ability to cope during labour, feeling in control, childbirth preparation, relationship with baby); and stress experienced during labour (distress, obstetric injuries, receiving sufficient medical care, obstetric intervention, pain, prolonged labour and baby’s health). Cronbach’s alpha coefficient was 0.62. CONCLUSION: According to the present study, BSS entails 30 Likert-type questions and evaluates women’s birth perceptions. The Turkish version of BSS has been proven to be a valid and a reliable scale. PMID:28058355
Hill, C.; Robinson, L.
Mammographers currently score their own images according to criteria set out by Regional Quality Assurance. The criteria used are based on the ‘Perfect, Good, Moderate, Inadequate’ (PGMI) marking criteria established by the National Health Service Breast Screening Programme (NHSBSP) in their Quality Assurance Guidelines of 2006. This document discusses the validity and reliability of the current mammography image assessment scheme. Commencing with a critical review of the literature, this document sets out to highlight problems with the national approach to the use of marking schemes. The findings suggest that the ‘PGMI’ scheme is flawed in terms of reliability and validity and is not universally applied across the UK. There also appear to be differences in schemes used by trainees and qualified mammographers. Initial recommendations are to be made in collaboration with colleagues within the National Health Service Breast Screening Programme (NHSBSP), Higher Education Centres, College of Radiographers and the Royal College of Radiologists in order to identify a mammography image appraisal scheme that is fit for purpose. - Highlights: • Currently no robust evidence based marking tools in use for the assessment of images in mammography. • Is current system valid, reliable and robust? • How can the current image assessment tool be improved? • Should students and qualified mammographers use the same tool? • What marking criteria are available for image assessment?
Monalize Salete Mota
This study aimed to evaluate the biocontrol potential of bacteria isolated from different plant species and soils. The production of compounds related to phytopathogen biocontrol and/or promotion of plant growth in bacterial isolates was evaluated by measuring the production of antimicrobial compounds (ammonia and antibiosis) and hydrolytic enzymes (amylases, lipases, proteases, and chitinases) and phosphate solubilization. Of the 1219 bacterial isolates, 92% produced one or more of the eight compounds evaluated, but only 1% of the isolates produced all the compounds. Proteolytic activity was most frequently observed among the bacterial isolates. Among the compounds which often determine the success of biocontrol, 43% produced compounds which inhibit mycelial growth of Monilinia fructicola, but only 11% hydrolyzed chitin. Bacteria from different plant species (rhizosphere or phylloplane) exhibited differences in the ability to produce the compounds evaluated. Most bacterial isolates with biocontrol potential were isolated from rhizospheric soil. The most efficient bacteria (producing at least five compounds related to phytopathogen biocontrol and/or plant growth), 86 in total, were evaluated for their biocontrol potential by observing their ability to kill juvenile Mesocriconema xenoplax. Thus, we clearly observed that bacteria that produced more compounds related to phytopathogen biocontrol and/or plant growth had a higher efficacy for nematode biocontrol, which validated the selection strategy used.
Guise, Brian J; Thompson, Matthew D; Greve, Kevin W; Bianchini, Kevin J; West, Laura
The current study assessed performance validity on the Stroop Color and Word Test (Stroop) in mild traumatic brain injury (TBI) using criterion-groups validation. The sample consisted of 77 patients with a reported history of mild TBI. Data from 42 moderate-severe TBI and 75 non-head-injured patients with other clinical diagnoses were also examined. TBI patients were categorized on the basis of Slick, Sherman, and Iverson (1999) criteria for malingered neurocognitive dysfunction (MND). Classification accuracy is reported for three indicators (Word, Color, and Color-Word residual raw scores) from the Stroop across a range of injury severities. With false-positive rates set at approximately 5%, sensitivity was as high as 29%. The clinical implications of these findings are discussed. © 2012 The British Psychological Society.
Ganestam, Ann; Barfod, Kristoffer; Klit, Jakob; Troelsen, Anders
The best treatment of acute Achilles tendon rupture remains debated. Patient-reported outcome measures have become cornerstones in treatment evaluations. The Achilles tendon total rupture score (ATRS) has been developed for this purpose but requires additional validation. The purpose of the present study was to validate a Danish translation of the ATRS. The ATRS was translated into Danish according to internationally adopted standards. Of 142 patients, 90 with previous rupture of the Achilles tendon participated in the validity study and 52 in the reliability study. The ATRS showed moderately strong correlations with the physical subscores of the Medical Outcomes Study 36-item Short-Form Health Survey (r = .70 to .75; p questionnaire (r = .71; p validity. For study and follow-up purposes, the ATRS seems reliable for comparisons of groups of patients. Its usability is limited for repeated assessment of individual patients. The development of analysis guidelines would be desirable. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Curry, M A; Campbell, R A; Christian, M
Two studies of low-income pregnant women (N = 179) were done to examine the validity and reliability of the Prenatal Psychosocial Profile (PPP). The PPP, a composite of the Rosenberg Self-Esteem Scale, the Support Behaviors Inventory, and a newly developed measure of stress, is a brief, comprehensive clinical assessment of psychosocial risk during pregnancy. Construct validity of the stress scale was supported by theoretically predicted negative correlations with self-esteem, partner support, and support from others (N = 91). Convergent validity of the stress scale was demonstrated by a correlation of .71 with the Difficult Life Circumstances Scale. Adequate levels of internal consistency were found. Interrelationships between the four subscales were consistent with the underlying conceptualization, and there was beginning evidence of the factorial independence of the subscales.
Anderson Kathryn L
The Quality of Life Scale (QOLS), created originally by American psychologist John Flanagan in the 1970s, has been adapted for use in chronic illness groups. This paper reviews the development and psychometric testing of the QOLS. A descriptive review of the published literature was undertaken and findings summarized in the frequently asked questions format. Reliability, content and construct validity testing has been performed on the QOLS and a number of translations have been made. The QOLS has low to moderate correlations with physical health status and disease measures. However, content validity analysis indicates that the instrument measures domains that diverse patient groups with chronic illness define as quality of life. The QOLS is a valid instrument for measuring quality of life across patient groups and cultures and is conceptually distinct from health status or other causal indicators of quality of life.
The purpose of this study was to assess the validity and reliability of the Sense of Contribution Scale (SCS), a newly developed, 7-item questionnaire used to measure sense of contribution in the workplace. Workers at 272 organizations answered questionnaires that included the SCS. Because of non-participation or missing data, the number of subjects included in the analyses for internal consistency and validity varied from 1,675 to 2,462 (response rates 54.6%–80.2%). Fifty-four workers were included in the analysis of test–retest reliability (response rate, 77.1%). The SCS showed high internal consistency (Cronbach’s α coefficients in men and women were 0.85 and 0.86, respectively) and test–retest reliability (intraclass correlation coefficient = 0.91). Significant (p < 0.001), positive, moderate correlations were found between the SCS score and scores for organization-based self-esteem and work engagement in both genders, which support the SCS’s convergent and discriminant validity. The criterion validity of the SCS was supported by the finding that in both genders, the SCS scores were significantly (p < 0.05) and inversely associated with psychological distress and sleep disturbance in crude and in multivariable analyses that adjusted for demographics, organization-based self-esteem, work engagement, effort–reward ratio, workplace bullying, and procedural and interactional justice. The SCS is a psychometrically satisfactory measure of sense of contribution in the workplace. The SCS provides a new and useful instrument to measure sense of contribution, which is independently associated with mental health in workers, for studies in organizational science, occupational health psychology and occupational medicine.
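Internal consistency of the kind reported above is usually summarized with Cronbach's α. The following is a minimal sketch of the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores); the response matrix is invented for illustration and the function name `cronbach_alpha` is hypothetical, not part of any study's analysis code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses[r][i] = score of respondent r on item i.
    Uses sample variances (n - 1 denominator) throughout.
    """
    n_items = len(responses[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 4 respondents x 3 items (NOT data from any abstract here).
example_responses = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 5, 5],
]
example_alpha = cronbach_alpha(example_responses)
print(round(example_alpha, 2))
```

When every item gives each respondent the same score, the items are perfectly consistent and the formula returns 1; lower values indicate that item variance is large relative to the variance of the totals.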
Tsuno, Kanami; Yoshimasu, Kouichi; Hayashi, Takashi; Tatsuta, Nozomi; Ito, Yuki; Kamijima, Michihiro; Nakai, Kunihiko
Nowadays, attention deficit hyperactivity (ADH) problems are observed commonly among school-age children. However, questionnaires specific to ADH behaviors among preschool children are very few. The aim of this study was to investigate the reliability and validity of the 25-item Behavioral Check List (BCL), which was developed from interviews of parents with children who were diagnosed as having attention-deficit/hyperactivity disorder (ADHD) and measures ADH behaviors in preschool age. We recruited 22 teachers from 10 nurseries/kindergartens in Miyagi Prefecture, Japan. A total of 138 preschool children were assessed using the BCL. To investigate inter-rater reliability, two teachers from each facility assessed seven to twenty children in their class, and intraclass correlation coefficients (ICCs) were calculated. The teachers additionally answered questions in the 1.5-5 Caregiver-Teacher Report Form (C-TRF) to investigate the criterion validity of the BCL. To investigate structural validity, exploratory factor analysis with promax rotation and confirmatory factor analysis were performed. The internal consistency reliability of the BCL was good (α = 0.92) and correlation analyses also confirmed its excellent criterion validity. Although exploratory factor analysis for the BCL yielded a five-factor model that consisted of a factor structure different from that of the original one, the results were similar to the original six factors. The ICCs of the BCL were 0.38-0.99, which was not high enough for inter-rater reliability in some facilities. However, there is a possibility to improve it by giving raters adequate explanations when using the BCL. The present study showed acceptable levels of reliability and validity of the BCL among Japanese preschool children.
Stephan, Astrid; Mayer, Herbert; Renom Guiteras, Anna; Meyer, Gabriele
Instruments measuring caregiver reactions usually disregard positive aspects, and focus predominately on home care. The Caregiver Reaction Assessment (CRA) scale is an exception. Until now, no German version has been available. We translated the instrument to German (G-CRA) and evaluated its psychometric properties and feasibility. Face-to-face interviews with 234 informal caregivers of persons with dementia were performed. Half of the persons with dementia (n = 118) had been recently admitted to institutional long-term care (iLTC); the remainder (n = 116) lived at home. Exploratory factor analysis (EFA) was performed. Subscales were intercorrelated and further correlated with the Zarit Burden Interview (ZBI), the General Health Questionnaire (GHQ-12), and the EuroQol (EQ-5D). Internal consistency was measured (Cronbach's α), and interviewers (n = 9) appraised feasibility. The time needed to apply the scale was measured in 20 interviews. The EFA yielded six factors (Kaiser criterion), but a scree plot supported the five dimensions of the original version that explained 56.2% of variance. Low-to-moderate subscales' inter-correlation was revealed. Highest correlation (r = 0.5) was found between impact on health and impact on daily schedule, indicating slight overlap. Criterion validity was supported by reasonable correlations between subscales and ZBI and GHQ-12 (r = -0.21-0.71). Subscale impact on health was negatively correlated with the EQ-5D. The internal consistency was sufficient (α = 0.67 − 0.78). Interviewers judged the G-CRA to be appropriate. Completion took 6.50 min (median value). Our results suggest that the G-CRA is sufficiently valid and internally reliable. The instrument is applicable in home care and iLTC as well as in the transitional phase.
Condon, David; Revelle, William
Separating the signal in a test from the irrelevant noise is a challenge for all measurement. Low test reliability limits test validity, attenuates important relationships, and can lead to regression artifacts. Multiple approaches to the assessment and improvement of reliability are discussed. The advantages and disadvantages of several different approaches to reliability are considered. Practical advice on how to assess reliability using open source software is provided.
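The claim that low reliability attenuates relationships can be made concrete with the classical (Spearman) attenuation formula: the observed correlation equals the true-score correlation multiplied by the square root of the product of the two reliabilities. The sketch below uses invented values; the function names are hypothetical and this is not the open source software the abstract refers to.

```python
import math


def attenuated_correlation(true_r, rel_x, rel_y):
    """Expected observed correlation given true-score correlation and reliabilities."""
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_correlation(observed_r, rel_x=1.0, rel_y=1.0):
    """Correct an observed correlation for unreliability in x and/or y."""
    return observed_r / math.sqrt(rel_x * rel_y)


# Hypothetical example: true correlation 0.50, reliabilities 0.81 and 0.64.
observed = attenuated_correlation(0.50, rel_x=0.81, rel_y=0.64)
recovered = disattenuated_correlation(observed, rel_x=0.81, rel_y=0.64)
print(round(observed, 2), round(recovered, 2))
```

Leaving `rel_x` at 1.0 corrects for unreliability in the criterion only, which is the form of correction typically applied when the predictor is to be evaluated as actually used.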
In the field of state of charge (SOC) estimation, the Kalman filter has been widely used for many years, although its performance strongly depends on the accuracy of the battery model as well as the noise covariance. The Kalman gain determines the confidence coefficient of the battery model by adjusting the weight of open circuit voltage (OCV) correction, and has a strong correlation with the measurement noise covariance (R). In this paper, the online identification method is applied to acquire the real model parameters under different operation conditions. A criterion based on the OCV error is proposed to evaluate the reliability of online parameters. Besides, the equivalent circuit model produces an intrinsic model error which is dependent on the load current, and the property that a high battery current or a large current change induces a large model error can be observed. Based on the above prior knowledge, a fuzzy model is established to compensate the model error through updating R. Combining the positive strategy (i.e., online identification) and negative strategy (i.e., fuzzy model), a more reliable and robust SOC estimation algorithm is proposed. The experiment results verify the proposed reliability criterion and SOC estimation method under various conditions for LiFePO4 batteries.
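The link between the Kalman gain and the measurement noise covariance R can be seen in a one-state toy filter. This is only a sketch under strong assumptions: SOC is the single state, prediction is plain coulomb counting, and the voltage measurement follows a made-up linear OCV curve (`ocv_offset + ocv_slope * soc`) with no internal resistance term. None of these modelling choices, parameter values, or names come from the paper, and the paper's fuzzy rule for updating R is not reproduced.

```python
def kalman_soc_step(soc, p, current_a, voltage_v, dt_s, capacity_as,
                    q, r, ocv_slope=0.7, ocv_offset=3.2):
    """One predict/update cycle of a scalar Kalman filter for SOC.

    soc, p      : prior state estimate (0..1) and its variance
    current_a   : load current in amperes (discharge positive)
    voltage_v   : measured terminal voltage, treated here as OCV (assumption)
    q, r        : process and measurement noise covariances
    """
    # Predict: coulomb counting plus process noise.
    soc_pred = soc - current_a * dt_s / capacity_as
    p_pred = p + q

    # Update: hypothetical linear measurement model v = offset + slope * soc.
    h = ocv_slope
    innovation = voltage_v - (ocv_offset + h * soc_pred)
    gain = p_pred * h / (h * p_pred * h + r)
    soc_new = soc_pred + gain * innovation
    p_new = (1.0 - gain * h) * p_pred
    return soc_new, p_new, gain


# Same inputs, two values of R: a larger R means the voltage correction is trusted less.
common = dict(soc=0.5, p=0.01, current_a=1.0, voltage_v=3.55,
              dt_s=1.0, capacity_as=3600.0, q=1e-6)
_, _, gain_small_r = kalman_soc_step(r=1e-4, **common)
_, _, gain_large_r = kalman_soc_step(r=1e-1, **common)
print(gain_small_r > gain_large_r)
```

Raising R shrinks the gain, so the estimate leans on the model prediction rather than the voltage correction; this is the lever the abstract describes adjusting when the model error is expected to be large.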
Objective: The Temperament and Character Inventory (TCI) was developed to assess temperament including Novelty Seeking (NS), Harm Avoidance (HA), Reward Dependence (RD), Persistence (PS), and Character including Self-Directedness (SD), Cooperativeness (CO) and Self Transcendence (ST) dimensions of Cloninger's biopsychosocial model of personality in adults. The purpose of this study was to evaluate the reliability and validity of this inventory. Materials & Methods: In this validity test and standardization study, after translation of the TCI into Farsi and back translation, the final form was prepared and administered to 220 students who were selected via simple sampling. Cronbach's alpha and the test-retest method were used to assess the reliability, and factor analysis with promax rotation was utilized to determine the validity of the inventory. Correlations among the scales and between age and the TCI scales were calculated by Pearson correlation. Comparisons of TCI scores between sexes and across cultures were done using independent t-tests. Results: The alpha coefficients for the inventory ranged from 0.44 for the Persistence scale to 0.81 for the ST scale with a median of 0.68. The overall alpha coefficient for the whole inventory was 0.74. The Pearson correlation coefficient for the test-retest on 31 students after two months ranged from 0.53 for Novelty Seeking and Persistence to 0.82 for Harm Avoidance scales and from 0.24 for disorderliness vs regimentation (NS4) to 0.86 for fear of uncertainty vs self-confidence (HA2) subscales. The factor analysis showed six factors. Significant correlations were obtained between the scales of Self-Directedness and Harm Avoidance (0.57) and Self-Directedness and Cooperativeness (0.46). Conclusion: The current study confirms that the Persian version of the Temperament and Character Inventory has satisfactory psychometric properties and acceptable reliability and validity for use in a university student population.
Aiyegbusi, Ayoola Ibifubara; Akodu, Ashiyat Kehinde; Agbede, Eniolorunda Olajide
Low back pain (LBP) is a major cause of disability, and the Oswestry Disability Index (ODI) is a validated assessment tool for evaluating disability in LBP patients. Cross-cultural adaptation of the ODI is important because not all populations are proficient in English. The Yoruba language is an indigenous language spoken by 40 million people in the Western part of Nigeria and some countries in West Africa and Latin America. Currently, no validated Yoruba version of ODI is available. The aim of the study was to translate, culturally adapt and validate the ODI in Yoruba language for participants with LBP. The ODI was translated into Yoruba, and this translated version was analysed in terms of semantics and linguistics. Then, the Yoruba version was translated back into English and both versions administered to 160 participants with LBP. The internal consistency using Cronbach's alpha coefficient, criterion validity and test-retest reliability were assessed using Spearman's rank correlation with significance set at Pdisability in LBP patients.
Conclusion: The APA shows good internal reliability, test–retest reliability, discriminant validity, and construct validity. However, evidence of psychometric properties was limited by a small sample size. Psychometric properties such as interrater reliability as well as concurrent validity and construct validity need to be tested using a larger sample size with representative demographics.
Ely, E Wesley; Truman, Brenda; Shintani, Ayumi; Thomason, Jason W W; Wheeler, Arthur P; Gordon, Sharon; Francis, Joseph; Speroff, Theodore; Gautam, Shiva; Margolin, Richard; Sessler, Curtis N; Dittus, Robert S; Bernard, Gordon R
Goal-directed delivery of sedative and analgesic medications is recommended as standard care in intensive care units (ICUs) because of the impact these medications have on ventilator weaning and ICU length of stay, but few of the available sedation scales have been appropriately tested for reliability and validity. To test the reliability and validity of the Richmond Agitation-Sedation Scale (RASS). Prospective cohort study. Adult medical and coronary ICUs of a university-based medical center. Thirty-eight medical ICU patients enrolled for reliability testing (46% receiving mechanical ventilation) from July 21, 1999, to September 7, 1999, and an independent cohort of 275 patients receiving mechanical ventilation were enrolled for validity testing from February 1, 2000, to May 3, 2001. Interrater reliability of the RASS, Glasgow Coma Scale (GCS), and Ramsay Scale (RS); validity of the RASS correlated with reference standard ratings, assessments of content of consciousness, GCS scores, doses of sedatives and analgesics, and bispectral electroencephalography. In 290-paired observations by nurses, results of both the RASS and RS demonstrated excellent interrater reliability (weighted kappa, 0.91 and 0.94, respectively), which were both superior to the GCS (weighted kappa, 0.64; P<.001 for both comparisons). Criterion validity was tested in 411-paired observations in the first 96 patients of the validation cohort, in whom the RASS showed significant differences between levels of consciousness (P<.001 for all) and correctly identified fluctuations within patients over time (P<.001). In addition, 5 methods were used to test the construct validity of the RASS, including correlation with an attention screening examination (r = 0.78, P<.001), GCS scores (r = 0.91, P<.001), quantity of different psychoactive medication dosages 8 hours prior to assessment (eg, lorazepam: r = - 0.31, P<.001), successful extubation (P =.07), and bispectral electroencephalography (r = 0.63, P
Sayed Hadi Sayed Alitabar
Background and Objective: Given the increase in substance use in the country, more research on this phenomenon is necessary. This study investigates the validity, reliability and confirmatory factor structure of the Drug Abuse Screening Test (DAST). Materials and Methods: The sample consisted of 381 patients (143 women and 238 men) selected by multi-stage cluster sampling from areas 2, 6 and 12 of Tehran, with 6 drug rehabilitation centers randomly selected in each area. The DAST was used as the instrument. Divergent and convergent validity of this scale were assessed with the Problems Assessment for Substance Using Psychiatric Patients (PASUPP) and the Relapse Prediction Scale (RPS). Results: The factor structure of the DAST was confirmed using confirmatory factor analysis. The DAST had good internal consistency (Cronbach's alpha) and one-week test-retest reliability (0.9 and 0.8). The scale also correlated positively with the Problems Assessment for Substance Using Psychiatric Patients and the Relapse Prediction Scale (P<0.01). Conclusion: The overall results showed that the Drug Abuse Screening Test is valid in Iranian society. It can be said that this self-report scale is a useful tool for research purposes and in the addiction field.
Haggerty, Greg; Zodan, Jennifer; Mehra, Ashwin; Zubair, Ayyan; Ghosh, Krishnendu; Siefert, Caleb J; Sinclair, Samuel J; DeFife, Jared
The current study investigated the interrater reliability and validity of prototype ratings of 5 common adolescent psychiatric disorders: attention-deficit/hyperactivity disorder, conduct disorder, major depressive disorder, generalized anxiety disorder, and posttraumatic stress disorder. One hundred fifty-seven adolescent inpatient participants consented to participate in this study. We compared ratings from 2 inpatient clinicians, blinded to each other's ratings and patient measures, after their separate initial diagnostic interview to assess interrater reliability. Prototype ratings completed by clinicians after their initial diagnostic interview with adolescent inpatients and outpatients were compared with patient-reported behavior problems and parents' report of their child's behavioral problems. Prototype ratings demonstrated good interrater reliability. Clinicians' prototype ratings showed predicted relationships with patient-reported behavior problems and parent-reported behavior problems. Prototype matching seems to be a possible alternative for psychiatric diagnosis. Prototype ratings showed good interrater reliability based on clinicians' unique experiences with the patient (as opposed to video-/audio-recorded material) with no training.
The aim of this research is to adapt the Workplace Bullying Scale (Tınaz, Gök & Karatuna, 2013) to the Albanian language and to examine its psychometric properties. The research was conducted on 386 persons from different sectors in Albania. Results of exploratory and confirmatory factor analysis demonstrated that the Albanian scale yielded 2 factors, differing from the original form because of cultural differences. Internal consistency coefficients are .890-.801 and split-half reliability coefficients are .864-.808. Confirmatory factor analysis results ranged from .40 to .73. Corrected item-total correlations ranged from .339 to .672, and according to t-test results the differences between each item's means in the upper 27% and lower 27% groups were significant. Thus the Workplace Bullying Scale can be used as a valid and reliable instrument in social sciences in Albania.
Vanwolleghem, Griet; Van Dyck, Delfien; Ducheyne, Fabian; De Bourdeaudhuij, Ilse; Cardon, Greet
Google Street View provides a valuable and efficient alternative to observe the physical environment compared to on-site fieldwork. However, studies on the use, reliability and validity of Google Street View in a cycling-to-school context are lacking. We aimed to study the intra-, inter-rater reliability and criterion validity of EGA-Cycling (Environmental Google Street View Based Audit - Cycling to school), a newly developed audit using Google Street View to assess the physical environment along cycling routes to school. Parents (n = 52) of 11-to-12-year old Flemish children, who mostly cycled to school, completed a questionnaire and identified their child's cycling route to school on a street map. Fifty cycling routes of 11-to-12-year olds were identified and physical environmental characteristics along the identified routes were rated with EGA-Cycling (5 subscales; 37 items), based on Google Street View. To assess reliability, two researchers performed the audit. Criterion validity of the audit was examined by comparing the ratings based on Google Street View with ratings through on-site assessments. Intra-rater reliability was high (kappa range 0.47-1.00). Large variations in the inter-rater reliability (kappa range -0.03-1.00) and criterion validity scores (kappa range -0.06-1.00) were reported, with acceptable inter-rater reliability values for 43% of all items and acceptable criterion validity for 54% of all items. EGA-Cycling can be used to assess physical environmental characteristics along cycling routes to school. However, to assess the micro-environment specifically related to cycling, on-site assessments have to be added.
Bongers, Coen C W G; Daanen, Hein A M; Bogerd, Cornelis P; Hopman, Maria T E; Eijsvogels, Thijs M H
Telemetric temperature capsule systems are wireless, relatively noninvasive, and easily applicable in field conditions and have therefore great advantages for monitoring core body temperature. However, the accuracy and responsiveness of available capsule systems have not been compared previously. Therefore, the aim of this study was to examine the validity, reliability, and inertia characteristics of four ingestible temperature capsule systems (i.e., CorTemp, e-Celsius, myTemp, and VitalSense). Ten temperature capsules were examined for each system in a temperature-controlled water bath during three trials. The water bath temperature gradually increased from 33°C to 44°C in trials 1 and 2 to assess the validity and reliability, and from 36°C to 42°C in trial 3 to assess the inertia characteristics of the temperature capsules. A systematic difference between capsule and water bath temperature was found for CorTemp (0.077°C ± 0.040°C), e-Celsius (-0.081°C ± 0.055°C), myTemp (-0.003°C ± 0.006°C), and VitalSense (-0.017°C ± 0.023°C; P 0.05). Comparable inertia characteristics were found for CorTemp (25 ± 4 s), e-Celsius (21 ± 13 s), and myTemp (19 ± 2 s), whereas the VitalSense system responded more slowly (39 ± 6 s) to changes in water bath temperature (P inertia were observed between capsule systems, an excellent validity, test-retest reliability, and inertia was found for each system between 36°C and 44°C after removal of outliers.
Bonnet, Michael H; Doghramji, Karl; Roehrs, Timothy; Stepanski, Edward J; Sheldon, Stephen H; Walters, Arthur S; Wise, Merrill; Chesson, Andrew L
The reliability and validity of EEG arousals and other types of arousal are reviewed. Brief arousals during sleep had been observed for many years, but the evolution of sleep medicine in the 1980s directed new attention to these events. Early studies at that time in animals and humans linked brief EEG arousals and associated fragmentation of sleep to daytime sleepiness and degraded performance. Increasing interest in scoring of EEG arousals led the ASDA to publish a scoring manual in 1992. The current review summarizes numerous studies that have examined scoring reliability for these EEG arousals. Validity of EEG arousals was explored by review of studies that empirically varied arousals and found deficits similar to those found after total sleep deprivation depending upon the rate and extent of sleep fragmentation. Additional data from patients with clinical sleep disorders prior to and after effective treatment has also shown a continuing relationship between reduction in pathology-related arousals and improved sleep and daytime function. Finally, many suggestions have been made to refine arousal scoring to include additional elements (e.g., CAP), change the time frame, or focus on other physiological responses such as heart rate or blood pressure changes. Evidence to support the reliability and validity of these measures is presented. It was concluded that the scoring of EEG arousals has added much to our understanding of the sleep process but that significant work on the neurophysiology of arousal needs to be done. Additional refinement of arousal scoring will provide improved insight into sleep pathology and recovery.
Introduction: Physical activity (PA) is protective against non-communicable diseases and can reduce premature mortality. However, it is difficult to assess the frequency, duration, type and intensity of PA. The Global Physical Activity Questionnaire (GPAQ) has been developed by the World Health Organization with the aim of obtaining valid and reliable estimates of PA. The primary aim of this study is to assess the repeatability of the GPAQ instrument and the secondary aim is to validate it against the International Physical Activity Questionnaire (IPAQ) and against an objective measure of PA (i.e., using pedometers) in both rural and peri-urban areas of North India. Methods: A total of 262 subjects were recruited by random selection from Ballabgarh Block of Haryana State in India. For test-retest repeatability of GPAQ and IPAQ, the instruments were administered on two occasions separated by at least 3 days. For concurrent validity, both questionnaires were administered in random order, and for criterion validity step counters were used. Spearman's correlation coefficient, intra-class correlation (ICC) and Cohen's kappa were used in the analysis. Results: For GPAQ validity, Spearman's rho ranged from 0.40 to 0.59 and ICC ranged from 0.43 to 0.81, while for IPAQ validity, the Spearman correlation coefficient ranged from 0.42 to 0.43 and ICC ranged from 0.56 to 0.68. The observed concurrent validity coefficients suggested that both questionnaires had reasonable agreement (Spearman rho of >0.90; P < 0.0001; ICC: 0.76-0.91, P < 0.05). Conclusions: GPAQ is similar to IPAQ in measuring PA and can be used for measurement of PA in community settings.
Perraton, Luke G.; Bower, Kelly J.; Adair, Brooke; Pua, Yong-Hao; Williams, Gavin P.; McGaw, Rebekah
Introduction Hand-held dynamometry (HHD) has never previously been used to examine isometric muscle power. Rate of force development (RFD) is often used for muscle power assessment, however no consensus currently exists on the most appropriate method of calculation. The aim of this study was to examine the reliability of different algorithms for RFD calculation and to examine the intra-rater, inter-rater, and inter-device reliability of HHD as well as the concurrent validity of HHD for the assessment of isometric lower limb muscle strength and power. Methods 30 healthy young adults (age: 23±5yrs, male: 15) were assessed on two sessions. Isometric muscle strength and power were measured using peak force and RFD respectively using two HHDs (Lafayette Model-01165 and Hoggan microFET2) and a criterion-reference KinCom dynamometer. Statistical analysis of reliability and validity comprised intraclass correlation coefficients (ICC), Pearson correlations, concordance correlations, standard error of measurement, and minimal detectable change. Results Comparison of RFD methods revealed that a peak 200ms moving window algorithm provided optimal reliability results. Intra-rater, inter-rater, and inter-device reliability analysis of peak force and RFD revealed mostly good to excellent reliability (coefficients ≥ 0.70) for all muscle groups. Concurrent validity analysis showed moderate to excellent relationships between HHD and fixed dynamometry for the hip and knee (ICCs ≥ 0.70) for both peak force and RFD, with mostly poor to good results shown for the ankle muscles (ICCs = 0.31–0.79). Conclusions Hand-held dynamometry has good to excellent reliability and validity for most measures of isometric lower limb strength and power in a healthy population, particularly for proximal muscle groups. To aid implementation we have created freely available software to extract these variables from data stored on the Lafayette device. Future research should examine the reliability
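The abstract above names a "peak 200ms moving window" algorithm for rate of force development (RFD) without giving its details. One plausible reading, shown here as a hedged sketch rather than the authors' published method, is to slide a 200 ms window along the force-time trace and keep the steepest average slope. The function name, argument names, and sampling rate are assumptions for illustration.

```python
def peak_rfd(force_n, sample_rate_hz, window_s=0.2):
    """Largest average slope (N/s) over any window of length window_s.

    force_n: force samples in newtons, evenly spaced at sample_rate_hz.
    Assumed interpretation of a "peak moving window" RFD, not a reference implementation.
    """
    n = int(round(window_s * sample_rate_hz))  # samples spanned by the window
    if n < 1 or len(force_n) <= n:
        raise ValueError("trace is shorter than the window")
    actual_window_s = n / sample_rate_hz
    return max(
        (force_n[i + n] - force_n[i]) / actual_window_s
        for i in range(len(force_n) - n)
    )


# Hypothetical trace sampled at 100 Hz: flat, then a steep rise, then a plateau.
trace = [0.0] * 20 + [15.0 * i for i in range(1, 31)] + [450.0] * 50
print(peak_rfd(trace, sample_rate_hz=100))
```

A window-based peak is less sensitive to the choice of contraction onset than a fixed "first 200 ms" slope, which may be one reason it proved more reliable in the comparison, although the abstract does not say so.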
Remijn, Lianne; Speyer, Renée; Groen, Brenda E; van Limbeek, Jacques; Nijhuis-van der Sanden, Maria W G
The Mastication Observation and Evaluation (MOE) instrument was developed to allow objective assessment of a child's mastication process. It contains 14 items and was developed over three Delphi rounds. The present study concerns the further development of the MOE using the COSMIN (Consensus-based Standards for the Selection of Health Measurement Instruments) and investigated the instrument's internal consistency, inter-observer reliability, construct validity and floor and ceiling effects. Consumption of three bites of bread and biscuit was evaluated using the MOE. Data of 59 healthy children (6-48 mths) and 38 children (bread) and 37 children (biscuit) with cerebral palsy (24-72 mths) were used. Four items were excluded before analysis due to zero variance. Principal Components Analysis showed one factor with 8 items. Internal consistency was >0.70 (Cronbach's alpha) for both food consistencies and for both groups of children. Inter-observer reliability varied from 0.51 to 0.98 (weighted Gwet's agreement coefficient). The total MOE scores for both groups showed normal distribution for the population. There were no floor or ceiling effects. The revised MOE now contains 8 items that (a) have a consistent concept for mastication and can be scored on a 4-point scale with sufficient reliability and (b) are sensitive to stages of chewing development in young children. The removed items are retained as part of a criterion referenced list within the MOE. Copyright © 2014 Elsevier Ltd. All rights reserved.
Wikstrom, Erik A.
Context: Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. Objective: To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Design: Descriptive laboratory study. Setting: Sports medicine research laboratory. Patients or Other Participants: Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Intervention(s): Participants completed a single-limb–stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Main Outcome Measure(s): Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. Results: All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with
Wikstrom, Erik A
Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Descriptive laboratory study. Sports medicine research laboratory. Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Participants completed a single-limb-stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT
Gardiner, Paul A; Clark, Bronwyn K; Healy, Genevieve N; Eakin, Elizabeth G; Winkler, Elisabeth A H; Owen, Neville
With evidence that prolonged sitting has deleterious health consequences, decreasing sedentary time is a potentially important preventive health target. High-quality measures, particularly for use with older adults, who are the most sedentary population group, are needed to evaluate the effect of sedentary behavior interventions. We examined the reliability, validity, and responsiveness to change of a self-report sedentary behavior questionnaire that assessed time spent in behaviors common among older adults: watching television, computer use, reading, socializing, transport and hobbies, and a summary measure (total sedentary time). In the context of a sedentary behavior intervention, nonworking older adults (n = 48, age = 73 ± 8 yr (mean ± SD)) completed the questionnaire on three occasions during a 2-wk period (7 d between administrations) and wore an accelerometer (ActiGraph model GT1M) for two periods of 6 d. Test-retest reliability (for the individual items and the summary measure) and validity (self-reported total sedentary time compared with accelerometer-derived sedentary time) were assessed during the 1-wk preintervention period, using Spearman (ρ) correlations and 95% confidence intervals (CI). Responsiveness to change after the intervention was assessed using the responsiveness statistic (RS). Test-retest reliability was excellent for television viewing time (ρ (95% CI) = 0.78 (0.63-0.89)), computer use (ρ (95% CI) = 0.90 (0.83-0.94)), and reading (ρ (95% CI) = 0.77 (0.62-0.86)); acceptable for hobbies (ρ (95% CI) = 0.61 (0.39-0.76)); and poor for socializing and transport (ρ < 0.45). Total sedentary time had acceptable test-retest reliability (ρ (95% CI) = 0.52 (0.27-0.70)) and validity (ρ (95% CI) = 0.30 (0.02-0.54)). Self-report total sedentary time was similarly responsive to change (RS = 0.47) as accelerometer-derived sedentary time (RS = 0.39). The summary measure of total sedentary time has good repeatability and modest validity and is
Monbaliu, Elegast; Ortibus, Els; Roelens, F; Desloovere, Kaat; Declerck, Jan; Prinzie, Peter; De Cock, Paul; Feys, Hilde
AIM: This study investigated the reliability and validity of the Barry-Albright Dystonia Scale (BADS), the Burke-Fahn-Marsden Movement Scale (BFMMS), and the Unified Dystonia Rating Scale (UDRS) in patients with bilateral dystonic cerebral palsy (CP). METHOD: Three raters independently scored videotapes of 10 patients (five males, five females; mean age 13 y 3 mo, SD 5 y 2 mo, range 5-22 y). One patient each was classified at levels I-IV in the Gross Motor Function Classification System a...
ATEŞ, Hatice KADIOĞLU; ADA, Sefer; BAYSAL, Z. Nurdan
Abstract The aim of this study is to develop a valid and reliable visual presentation attitude rubric for 4th grade students. A total of 218 students from Engin Can Güre, located in Esenler, Istanbul, took part in this study. While preparing this assessment tool with 34 criteria, the views of 6 university lecturers who are experts in their field were taken. The answer key sheet has 4 Likert-type options. The rubric was first tested by the Kaiser-Meyer-Olkin and Bartlett's tests an...
Objective To develop the Chinese Military Personnel Social Support Scale and verify its reliability and validity. Methods The Chinese Military Personnel Social Support Scale was initiated, organized and compiled based upon an open-ended questionnaire survey done in a systematic manner, with previous research taken as reference. A total of 630 military personnel were chosen by random cluster sampling and tested with the Scale; among them, 50 were tested simultaneously with the Social Support Rating Scale (SSRS) and the Chinese Military Psychosomatic Health Scale (CMPHS), and the test was done solely a second time with CMPHS 2 weeks later. The reliability and validity were assessed and verified by exploratory factor analysis, confirmatory factor analysis and correlation analysis. Results The Chinese Military Personnel Social Support Scale comprised three factors, namely subjective support, objective support and utility of social support. Eighteen items were left in the official scale after amendment by factor analysis, and one lying subscale was added. The correlation coefficients between the public factors ranged from 0.477 to 0.589 (P<0.01), and the correlation coefficients between factors and total scale ranged from 0.721 to 0.823 (P<0.01). The test-retest correlation coefficients of the total scale and subscales ranged from 0.622 to 0.803 (P<0.01), the Cronbach α coefficients ranged from 0.624 to 0.874, and the split-half correlation coefficients ranged from 0.551 to 0.828. Significant correlation existed between this Scale and two criterion scales, namely SSRS and CMPHS. Conclusion It is verified that the Chinese Military Personnel Social Support Scale has excellent reliability and validity and complies with psychometric standards; it may be used to evaluate the social support level of Chinese military personnel.
Kurita, H; Miyake, Y
The Tokyo Autistic Behavior Scale (TABS) consisting of 39 items provisionally grouped in four areas--interpersonal-social relationship, language-communication, habit-mannerism and others--is an instrument used by a child's caretaker to rate the child's autistic behaviors on a 3-point scale. Test-retest reliability was satisfactory (i.e., an r for a total score was .94). Among six DSM-III diagnostic groups, infantile autism showed a significantly higher total TABS score than the other five groups, and a taxonomic validity coefficient was .54. An r between total scores of the TABS and the Childhood Autism Rating Scale--Tokyo Version was .59. The area scores showed a lower validity than the total score. The TABS appears to be a useful instrument to assess autistic behavior.
March-Rosselló, G A; Muñoz-Moreno, M F; García-Loygorri-Jordán de Urriés, M C; Bratos-Pérez, M A
Matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry (MALDI-TOF) is a widely used tool in clinical microbiology for rapidly identifying microorganisms. This technique can be applied directly on positive blood cultures without the need for its culturing, thereby, reducing the time required for microbiological diagnosis. The present study proposes an innovative identification protocol applied to positive blood culture bottles using MALDI-TOF. We have processed 100 positive blood culture bottles, of which 36 of 37 Gram-negative bacteria (97.3 %) were correctly identified directly with 100 % of Enterobacteriaceae and other Gram-negative rods and 87.5 % of non-fermenting Gram-negative rods. We also correctly identified directly 62 of 63 of Gram-positive bacteria (98.4 %) with 100 % of Streptococcus, Enterococcus, and Gram-positive bacilli and 98 % of Staphylococcus. Applying the differential centrifugation protocol at the moment the automatic blood culture incubation system gives a positive reading together with the proposed validation criterion offers 98 % sensitivity (95 % confidence interval: 95.2-100 %). The MALDI-TOF system, thus, provides a rapid and reliable system for identifying microorganisms from blood culture growth bottles.
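The figures in the abstract above can be checked with a short calculation. The 36 Gram-negative and 62 Gram-positive correct identifications sum to 98 of 100 bottles, and a normal-approximation (Wald) interval around 98 % comes out close to the reported 95.2-100 %. The abstract does not state which interval method was used, so the Wald formula here is an assumption, and the function name is illustrative.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval, clipped to [0, 1]."""
    p = successes / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


correct = 36 + 62  # Gram-negative + Gram-positive isolates identified directly
p, lower, upper = wald_interval(correct, 100)
print(f"{p:.1%} ({lower:.1%} to {upper:.1%})")
```

The Wald interval is known to behave poorly for proportions near 0 or 1, which is why the upper limit has to be clipped at 100 %; a Wilson or exact interval would give a somewhat different, asymmetric range.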
Müller, Alessandra Bombarda; Valentini, Nadia Cristina; Bandeira, Paulo Felipe Ribeiro
The range of stimuli provided by physical space, toys and care practices contributes to the motor, cognitive and social development of children. However, assessing the quality of child education environments is a challenge, and can be considered a health promotion initiative. This study investigated the criterion, content and construct validity and the reliability of the Affordances in the Home Environment for Motor Development - Infant Scale (AHEMD-IS), version 3-18 months, for use in daycare settings. Content validation was conducted with the participation of seven motor development and health care experts, and face validity was assessed by 20 specialists in health and education. The results indicate the suitability of the adapted AHEMD-IS, evidencing its validity for the daycare setting as a potential tool to assess the opportunities that the collective context offers to child development. Copyright © 2017 Elsevier Inc. All rights reserved.
Aggio, Daniel; Fairclough, Stuart; Knowles, Zoe; Graves, Lee
Adaptation of physical activity self-report questionnaires is sometimes required to reflect the activity behaviours of diverse populations. The processes used to modify self-report questionnaires though are typically underreported. This two-phased study used a formative approach to investigate the validity and reliability of the Physical Activity Questionnaire for Adolescents (PAQ-A) in English youth. Phase one examined test content and response process validity and subsequently informed a modified version of the PAQ-A. Phase two assessed the validity and reliability of the modified PAQ-A. In phase one, focus groups (n = 5) were conducted with adolescents (n = 20) to investigate test content and response processes of the original PAQ-A. Based on evidence gathered in phase one, a modified version of the questionnaire was administered to participants (n = 169, 14.5 ± 1.7 years) in phase two. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and intra-class correlations, respectively. Spearman correlations were used to assess associations between modified PAQ-A scores and accelerometer-derived physical activity, self-reported fitness and physical activity self-efficacy. Phase one revealed that the original PAQ-A was unrepresentative for English youth and that item comprehension varied. Contextual and population/cultural-specific modifications were made to the PAQ-A for use in the subsequent phase. In phase two, modified PAQ-A scores had acceptable internal consistency (α = 0.72) and test-retest reliability (ICC = 0.78). Modified PAQ-A scores were significantly associated with objectively assessed moderate-to-vigorous physical activity (r = 0.39), total physical activity (r = 0.42), self-reported fitness (r = 0.35), and physical activity self-efficacy (r = 0.32) (p ≤ 0.01). The modified PAQ-A had acceptable internal consistency and test-retest reliability. Modified PAQ-A scores
McEwan, Troy E; Shea, Daniel E; Daffern, Michael; MacKenzie, Rachel D; Ogloff, James R P; Mullen, Paul E
This study assessed the reliability and validity of the Stalking Risk Profile (SRP), a structured measure for assessing stalking risks. The SRP was administered at the point of assessment or retrospectively from file review for 241 adult stalkers (91% male) referred to a community-based forensic mental health service. Interrater reliability was high for stalker type, and moderate-to-substantial for risk judgments and domain scores. Evidence for predictive validity and discrimination between stalking recidivists and nonrecidivists for risk judgments depended on follow-up duration. Discrimination was moderate (area under the curve = 0.66-0.68) and positive and negative predictive values good over the full follow-up period (Mdn = 170.43 weeks). At 6 months, discrimination was better than chance only for judgments related to stalking of new victims (area under the curve = 0.75); however, high-risk stalkers still reoffended against their original victim(s) 2 to 4 times as often as low-risk stalkers. Implications for the clinical utility and refinement of the SRP are discussed.
Blazevich, Anthony J; Gill, Nicholas; Newton, Robert U
The purpose of the present study was first to examine the reliability of isometric squat (IS) and isometric forward hack squat (IFHS) tests to determine if repeated measures on the same subjects yielded reliable results. The second purpose was to examine the relation between isometric and dynamic measures of strength to assess validity. Fourteen male subjects performed maximal IS and IFHS tests on 2 occasions and 1 repetition maximum (1-RM) free-weight squat and forward hack squat (FHS) tests on 1 occasion. The 2 tests were found to be highly reliable (intraclass correlation coefficient [ICC](IS) = 0.97 and ICC(IFHS) = 1.00). There was a strong relation between average IS and 1-RM squat performance, and between IFHS and 1-RM FHS performance (r(squat) = 0.77, r(FHS) = 0.76; p […] squat and FHS test performances (r […] squat and FHS test performance can be attributed to differences in the movement patterns of the tests
The aim of this research is to develop a measurement instrument that will determine the cultural responsive teaching readiness level of teacher candidates. The study group consisted of 231 candidate teachers (83 male and 148 female) who were attending their final year of class teacher education programs at various Turkish universities during the 2016-2017 academic year. In the first phase, a 33-item draft form was presented to experts for review. Based on the feedback received, revisions were made and the final scale was administered to the group of 231 candidate teachers. Exploratory Factor Analysis (EFA) was performed on the resulting data. The EFA produced 21 items within a two-factor structure: “Personal Readiness” and “Professional Readiness.” The sub-factors were found to be components of the “cultural responsive teaching readiness” dimension, and the goodness-of-fit measures obtained from the First and Second Level Confirmatory Factor Analyses (CFA) were high. In addition, the reliability coefficients were found to be high. Based on these findings, this study concludes that the Cultural Responsive Teaching Readiness scale is both valid and reliable.
Deng, Weiling; Monfils, Lora
Using simulated data, this study examined the impact of different levels of stringency of the valid case inclusion criterion on item response theory (IRT)-based true score equating over 5 years in the context of K-12 assessment when growth in student achievement is expected. Findings indicate that the use of the most stringent inclusion criterion…
Maljaars, Jarymke; Noens, Ilse; Scholte, Evert; van Berckelaer-Onnes, Ina
The Diagnostic Interview for Social and Communication Disorders (DISCO; Wing, 2006) is a standardized, semi-structured and interviewer-based schedule for diagnosis of autism spectrum disorder (ASD). The objective of this study was to evaluate the criterion and convergent validity of the DISCO-11 ICD-10 algorithm in young and low-functioning…
Bödeker, Malte; Bucksch, Jens; Wallmann-Sperlich, Birgit
The Neighborhood Physical Activity Questionnaire allows to assess physical activity within and outside the neighborhood. Study objectives were to examine the criterion-related validity and health/functioning associations of Neighborhood Physical Activity Questionnaire-derived physical activity in German older adults. A total of 107 adults aged…
Plantinga, E.; Tiesinga, L. J.; van der Schans, C. P.; Middel, B.
Objective: To investigate the criterion or concurrent validity of the Northwick Park Dependency Score (NPDS) for determining nursing dependence in different rehabilitation groups, with the Barthel Index (BI) and the Care Dependency Scale (CDS). Design: Cross-sectional study. Setting: Centre for
Nikjooy, Afsaneh; Jafari, Hassan; Saba, Maryam A; Ebrahimi, Naghmeh; Mirzaei, Rezvan
The Patient Assessment of Constipation Quality of Life (PAC-QOL) questionnaire is the most validated and the most specific tool for measuring the quality of life of patients with constipation. Over 120 million people live in countries whose official language is Persian. There is no reported Persian version of the PAC-QOL questionnaire yet. The aim of this study was to translate and culturally adapt the PAC-QOL questionnaire and to assess its reliability and validity among Persian patients with chronic constipation. Following the translation and cultural adaptation of the PAC-QOL questionnaire to Persian, 100 patients (mean±SD age=40.51±13.67) with constipation were recruited for validity measurement and 20 patients were re-examined for reliability. Content validity was assessed based on the opinions of an expert committee and the floor/ceiling effect. Construct validity was evaluated according to the hypothesis test. The SF-36 questionnaire was used for concurrent criterion validity, intra-class correlation coefficient for reliability, and Cronbach's alpha for internal consistency. The content validity of the PAC-QOL questionnaire was proven, and there was no floor/ceiling effect. Construct validity also was confirmed based on the hypothesis test. The overall Cronbach's alpha of the PAC-QOL questionnaire was 0.92 (range=0.72-0.92), and the overall intra-class correlation coefficient of the questionnaire was 0.88 (range=0.69-0.87). The correlation between the SF-36 and PAC-QOL questionnaires was moderate. The Persian version of the PAC-QOL questionnaire demonstrated good validity and reliability properties in chronic constipation. Accordingly, Persian researchers and clinicians can benefit from this questionnaire in further research and assessment of treatment outcomes.
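As a rough illustration of the internal-consistency statistic reported in the abstract above, Cronbach's alpha can be computed from a respondents-by-items score table as α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The sketch below assumes complete data and sample variances; the function name and the toy scores are hypothetical and are not taken from the PAC-QOL study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of respondents, each a list of k item scores
    (hypothetical input layout; no missing values allowed).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*scores))
    item_variances = [variance(item) for item in items]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Made-up responses from five people to a three-item scale.
    toy_scores = [
        [3, 4, 3],
        [2, 2, 3],
        [5, 4, 5],
        [1, 2, 1],
        [4, 5, 4],
    ]
    print(f"alpha = {cronbach_alpha(toy_scores):.2f}")
```

When every item ranks respondents identically, alpha reaches 1; when item variances sum to the total-score variance, alpha is 0.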
Marc, Linda G; Henderson, Whitney R; Desrosiers, Astrid; Testa, Marcia A; Jean, Samuel E; Akom, Eniko Edit
There is limited information on depression in Haitians and this is partly attributable to the absence of culturally and linguistically adapted measures for depression. To perform a psychometric evaluation of the Haitian-Creole version of the PHQ-9 administered to men who have sex with men (MSM) in the Republic of Haiti. This study uses a cross-sectional design and data are from the Integrated Behavioral and Biological HIV Survey (IBBS) for MSM in Haiti. Inclusion criteria required that participants be male, ≥ 18 years, report sexual relations with a male partner in the last 12 months, and lived in Haiti during the past 3 months. Respondent Driven Sampling was used for participant recruitment. A structured questionnaire was verbally administered in Haitian-Creole capturing information on sociodemographics, sexual behaviors, human immunodeficiency virus (HIV) status and depressive symptomatology using the PHQ-9. Psychometric analyses of the translated PHQ-9 assessed unidimensionality, factor structure, reliability, construct validity, and differential item functioning (DIF) across subgroups (age, educational level, sexual orientation and HIV status). In a study population of 1,028 MSM, the Haitian-Creole version of the PHQ-9 is unidimensional, has moderately high internal consistency reliability (α = 0.78), and shows evidence of construct validity where HIV-positive subjects have greater depression (p = 0.002). There is no evidence of DIF across age, education, sexual orientation or HIV status. HIV-positive MSM are twice as likely to screen positive for moderately severe and severe depressive symptoms compared to their HIV-negative counterparts. There is strong evidence for the psychometric adequacy of the translated PHQ-9 screening tool as a measure of depression with MSM in Haiti. Future research is necessary to examine the predictive validity of depression for subsequent health behaviors or clinical outcomes among Haitian MSM.
Bakhtadze, Maxim A; Vernon, Howard; Zakharova, Olga B; Kuzminov, Kirill O; Bolotov, Dmitry A
Cross-cultural adaptation and psychometric testing. To perform a validated Russian translation and then to evaluate the validity and reliability of the Russian language version of the Neck Disability Index (NDI-RU). Neck pain is highly prevalent and can greatly affect daily activity. The Neck Disability Index (NDI) is the most frequently used scale for self-rating of disability due to neck pain. Its translated versions are applied in many countries. However, the Russian language version of the NDI has not been developed yet. Cross-cultural adaptation of the NDI-RU was performed according to established guidelines. Then, the NDI-RU was evaluated for content validity, concurrent criterion validity, internal consistency, test-retest reliability, factor structure, and minimum detectable change. Two hundred thirty-two patients took part in the study in total: 109 in validity (39.5 ± 10 yr), 123 in reliability (38.4 ± 11 yr; 80 in the test-retest phase). A culturally valid translation was achieved. NDI-RU total scores were distributed normally. Floor/ceiling effects were absent. Good values of Cronbach α were obtained for each item (from 0.80 to 0.84) and for the total NDI-RU (0.83). A 2-factor solution was found for the NDI-RU. The average interitem correlation coefficient was 0.53. Intraclass correlation coefficients for test-retest reliability ranged from 0.65 to 0.92 for different items and 0.91 for the total NDI-RU. Moderate correlation (Spearman rs = 0.62; P […] Russian language version of the Neck Disability Index resulted in a valid, reliable instrument that can be used both in clinical practice and scientific investigations.
Objectives: The aim of this study was to evaluate the psychometric features of the Persian version of the Autism Behavior Checklist (ABC). Method: The International Quality of Life Assessment (IQOLA) approach was used to translate the English ABC into Persian. A total sample of 184 parents of children, including 114 children with autism disorder (mean age = 7.21, SD = 1.65) and 70 typically developing children (mean age = 6.82, SD = 1.75), completed the ABC. Internal consistency, test-retest reliability, concurrent and discriminant validity, and cut-off score were assessed. Results: The results of this study revealed that the Persian version of the ABC has an acceptable degree of internal consistency (.73). Test–retest comparisons using interclass correlation confirmed the instrument’s time stability (.83). The instrument’s concurrent validity with the Gilliam Autism Rating Scale (GARS) was verified; the correlation between total scores was .94. In the discriminant validity analysis, the autism group had significantly higher scores compared to the normal group. Receiver Operating Characteristic (ROC) analysis revealed that individuals with total scores below 25 are less likely to be in the autism group. Conclusion: The Persian version of the ABC can be used as an initial screening tool in clinical contexts.
In educational research that calls itself empirical, the relationship between validity and reliability is that of trade-off: the stronger the bases for validity, the weaker the bases for reliability (and vice versa). Validity and reliability are widely regarded as basic criteria for evaluating research; however, there are ethical implications of…
Radman, Ivan; Ruzic, Lana; Padovan, Viktoria; Cigrovski, Vjekoslav; Podnar, Hrvoje
This study aimed to examine the reliability and validity of the inline skating skill test. Based on previous skating experience, forty-two skaters (26 female and 16 male) were randomized into two groups (competitive level vs. recreational level). They performed the test four times, with a recovery time of 45 minutes between sessions. Prior to testing, the participants rated their skating skill using a scale from 1 to 10. The protocol included performance time measurement through a course, combining different skating techniques. Trivial changes in performance time between the repeated sessions were determined in both competitive females/males and recreational females/males (-1.7% [95% CI: -5.8–2.6%] – 2.2% [95% CI: 0.0–4.5%]). In all four subgroups, the skill test had a low mean within-individual variation (1.6% [95% CI: 1.2–2.4%] – 2.7% [95% CI: 2.1–4.0%]) and high mean inter-session correlation (ICC = 0.97 [95% CI: 0.92–0.99] – 0.99 [95% CI: 0.98–1.00]). The comparison of detected typical errors and smallest worthwhile changes (calculated as standard deviations × 0.2) revealed that the skill test was able to track changes in skaters’ performances. Competitive-level skaters needed shorter time (24.4–26.4%, all p […] skating skills in amateur competitive and recreational level skaters. Further studies are needed to evaluate the reproducibility of this skill test in different populations including elite inline skaters. Key points: The study evaluated the reliability and construct validity of a newly developed inline skating skill test. The evaluated test is a first protocol designed to assess specific inline skating skill. Two groups of amateur skaters with different skating proficiency repeated the skill test on four separate occasions. The results suggest that the evaluated test is reliable and valid for evaluating inline skating skill in amateur skaters. PMID:27803616
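The comparison of typical error against smallest worthwhile change described in the abstract above can be sketched as follows. The abstract states only that the smallest worthwhile change was calculated as standard deviations × 0.2; treating that as the between-subject standard deviation, and computing typical error as the standard deviation of between-trial differences divided by √2, are assumptions of this sketch. All function names and times are hypothetical.

```python
from math import sqrt
from statistics import stdev


def typical_error(trial1, trial2):
    """Typical error between two trials, assumed here to be
    SD(difference scores) / sqrt(2)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    return stdev(diffs) / sqrt(2)


def smallest_worthwhile_change(scores, factor=0.2):
    """Smallest worthwhile change as factor x between-subject SD
    (factor 0.2 as stated in the abstract; choice of SD is an assumption)."""
    return factor * stdev(scores)


if __name__ == "__main__":
    # Made-up course completion times (seconds) for four skaters, two sessions.
    session1 = [10.0, 12.0, 14.0, 16.0]
    session2 = [10.5, 11.5, 14.5, 15.5]
    te = typical_error(session1, session2)
    swc = smallest_worthwhile_change(session1)
    print(f"typical error = {te:.3f} s, smallest worthwhile change = {swc:.3f} s")
    # A test is usually considered able to track change when TE <= SWC.
    print("can track worthwhile changes:", te <= swc)
```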
Conclusion: Only seven studies calculated validity coefficients within the study whereas 47 cited the validity coefficient. Twenty-six calculated a reliability coefficient whereas 47 cited the reliability of the ED measures. Four studies found validity evidence for the EAT, EDI, BULIT-R, QEDD, and EDE-Q in an athlete population. Few studies reviewed calculated validity and reliability coefficients of ED measures. Cross-validation of these measures in athlete populations is clearly needed.
Duruturk, Neslihan; Tonga, Eda; Gabel, Charles Philip; Acar, Manolya; Tekindal, Agah
This study aims to culturally adapt a Turkish version of the Lower Limb Functional Index (LLFI) and to determine its validity, reliability, internal consistency, measurement sensitivity and factor structure in lower limb problems. The LLFI was translated into Turkish and cross-culturally adapted with a double forward-backward protocol that determined face and content validity. Individuals (n = 120) with lower limb musculoskeletal disorders completed the LLFI and Short Form-36 questionnaires and the Timed Up and Go physical test. The psychometric properties were evaluated for all participants from patient-reported outcome measures made at baseline and repeated at day 3 to determine criterion validity between scores (Pearson's r), internal consistency (Cronbach's α) and test-retest reliability (intraclass correlation coefficient, ICC2.1). Error was determined using standard error of the measurement (SEM) and minimal detectable change at the 90% level (MDC90), while factor structure was determined using exploratory factor analysis with maximum likelihood extraction and Varimax rotation. The psychometric characteristics showed strong criterion validity (r = 0.74-0.76), high internal consistency (α = 0.82) and high test-retest reliability (ICC2.1 = 0.97). The SEM of 3.2% gave an MDC90 = 5.8%. The factor structure was uni-dimensional. The Turkish version of the LLFI was found to be valid and reliable for the measurement of lower limb function in a Turkish population. Implications for Rehabilitation: Lower extremity musculoskeletal disorders are common and greatly impact activities among the affected individuals pertaining to daily living, work, leisure and quality of life. Patient-reported outcome (PRO) measures have advantages as they are practical, cost-effective and clinically convenient for use in patient-centered care. The Lower Limb Functional Index is a recently validated PRO measure shown to have strong clinimetric properties.
Hashimoto, Ryusaku; Kashiwagi, Mitsuru; Suzuki, Shuhei
We developed a rapid word reading test for examining the phonological processing ability of Japanese children. We prepared two versions of the test, versions A and B. Each version has word and non-word tasks. Twenty-two healthy third-grade boys from primary schools participated in this validation study. For criterion-related validity, we administered the serial Hiragana reading test, the sentence reading test, Raven's coloured progressive matrices (RCPM), the Token test for children, the Kana word dictation test, the standardized comprehension test of abstract words (SCTAW), and the Trail Circle test. The reading times of the newly developed test correlated moderately or highly with those of the serial Hiragana reading test and the sentence reading test. However, the scores of the other tests (RCPM, Token test for children, Kana word dictation test, SCTAW, Trail Circle test) did not correlate with the reading time of the rapid word reading test. Test-retest reliabilities in the word tasks were more than moderate: 0.52 and 0.76 in versions A and B, while those in the non-word tasks were high: 0.91 and 0.88 in versions A and B. The correlation coefficient between versions A and B was 0.7 for the word tasks and 0.92 for the non-word tasks. This study showed that the rapid word reading test has substantial validity and reliability for testing the phonological processing ability of Japanese children. In addition, the non-word tasks were more suitable for selectively examining the speed of the grapheme to phoneme conversion process.
Background: The use of short screening questionnaires may be a promising option for identifying children at risk for depression in a community setting. The objective of this study was to assess the validity of the Short Mood and Feelings Questionnaire (SMFQ) and one- and two-item screening instruments for depressive disorders in a school-based sample of young adolescents. Methods: Participants were 521 sixth-grade students attending public middle schools. Child and parent versions of the SMFQ were administered to evaluate the child's depressive symptoms. The presence of any depressive disorder during the previous month was assessed using the Diagnostic Interview Schedule for Children (DISC) as the criterion standard. First, we assessed the diagnostic accuracy of child, parent, and combined scores of the full 13-item SMFQ by calculating the area under the receiver operating characteristic curve (AUC), sensitivity and specificity. The same approach was then used to evaluate the accuracy of a two-item scale consisting of only depressed mood and anhedonia items, and a single depressed mood item. Results: The combined child + parent SMFQ score showed the highest accuracy (AUC = 0.86). Diagnostic accuracy was lower for child (AUC = 0.73) and parent (AUC = 0.74) SMFQ versions. Corresponding versions of one- and two-item screens had lower AUC estimates, but the combined versions of the brief screens each still showed moderate accuracy. Furthermore, child and combined versions of the two-item screen demonstrated higher sensitivity (although lower specificity) than either the one-item screen or the full SMFQ. Conclusions: Under conditions where parents accompany children to screening settings (e.g. primary care), use of a child + parent version of the SMFQ is recommended. However, when parents are not available, and the cost of a false positive result is minimal, then a one- or two-item screen may be useful for initial identification of at-risk youth.
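The diagnostic-accuracy measures named in the abstract above (AUC, sensitivity, specificity) can be illustrated with a small sketch. It assumes that higher screening scores indicate greater risk, that a score at or above the cutoff counts as screening positive, and that the criterion diagnosis is coded 1 (present) or 0 (absent). The AUC is computed through its rank interpretation: the probability that a randomly chosen case outscores a randomly chosen non-case, with ties counted as one half. Function names and data are hypothetical, not taken from the SMFQ study.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) for 'score >= cutoff' as a positive screen."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


if __name__ == "__main__":
    # Made-up screening scores and criterion diagnoses for eight children.
    toy_scores = [2, 5, 7, 8, 6, 9, 4, 3]
    toy_labels = [0, 0, 1, 1, 0, 1, 1, 0]
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=6)
    print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
    print(f"AUC = {auc(toy_scores, toy_labels):.3f}")
```

Lowering the cutoff trades specificity for sensitivity, which is the pattern the abstract reports for the two-item screen.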
Granier, Cyril; Hausswirth, Christophe; Dorel, Sylvain; Yann, Le Meur
This study aimed to determine the validity and the reliability of the Stages power meter crank system (Boulder, United States) during several laboratory cycling tasks. Eleven trained participants completed laboratory cycling trials on an indoor cycle fitted with SRM Professional and Stages systems. The trials consisted of an incremental test at 100W, 200W, 300W, 400W and four 7s sprints. The level of pedaling asymmetry was determined for each cycling intensity during a similar protocol completed on a Lode Excalibur Sport ergometer. The reliability of Stages and SRM power meters was compared by repeating the incremental test during a test-retest protocol on a Cyclus 2 ergometer. Over power ranges of 100-1250W the Stages system produced trivial to small differences compared to the SRM (standardized typical error values of 0.06, 0.24 and 0.08 for the incremental, sprint and combined trials, respectively). A large correlation was reported between the difference in power output (PO) between the two systems and the level of pedaling asymmetry (r=0.58, p […] system according to the level of pedaling asymmetry provided only marginal improvements in PO measures. The reliability of the Stages power meter at the sub-maximal intensities was similar to the SRM Professional model (coefficient of variation: 2.1 and 1.3% for Stages and SRM, respectively). The Stages system is a suitable device for PO measurements, except when a typical error of measurement […] power ranges of 100-1250W is expected.
Long, Kim Chenming
Real-world engineering optimization problems often require the consideration of multiple conflicting and noncommensurate objectives, subject to nonconvex constraint regions in a high-dimensional decision space. Further challenges occur for combinatorial multiobjective problems in which the decision variables are not continuous. Traditional multiobjective optimization methods of operations research, such as weighting and epsilon constraint methods, are ill-suited to solving these complex, multiobjective problems. This has given rise to the application of a wide range of metaheuristic optimization algorithms, such as evolutionary, particle swarm, simulated annealing, and ant colony methods, to multiobjective optimization. Several multiobjective evolutionary algorithms have been developed, including the strength Pareto evolutionary algorithm (SPEA) and the non-dominated sorting genetic algorithm (NSGA), for determining the Pareto-optimal set of non-dominated solutions. Although numerous researchers have developed a wide range of multiobjective optimization algorithms, there is a continuing need to construct computationally efficient algorithms with an improved ability to converge to globally non-dominated solutions along the Pareto-optimal front for complex, large-scale, multiobjective engineering optimization problems. This is particularly important when the multiple objective functions and constraints of the real-world system cannot be expressed in explicit mathematical representations. This research presents a novel metaheuristic evolutionary algorithm for complex multiobjective optimization problems, which combines the metaheuristic tabu search algorithm with the evolutionary algorithm (TSEA), as embodied in genetic algorithms. TSEA is successfully applied to bicriteria (i.e., structural reliability and retrofit cost) optimization of the aircraft tail structure fatigue life, which increases its reliability by prolonging fatigue life. A comparison for this
Renteria, Laura; Li, Susan Tinsley; Pliskin, Neil H
The utility of the Spanish WAIS-III was investigated by examining its reliability and validity among 100 Spanish-speaking participants. Results indicated that the internal consistency of the subtests was satisfactory, but inadequate for Letter Number Sequencing. Criterion validity was adequate. Convergent and discriminant validity results were generally similar to the North American normative sample. Paired sample t-tests suggested that the WAIS-III may underestimate ability when compared to the criterion measures that were utilized to assess validity. This study provides support for the use of the Spanish WAIS-III in urban Hispanic populations, but also suggests that caution be used when administering specific subtests, due to the nature of the Latin America alphabet and potential test bias.
This study describes the development and evaluation of the Nursery Teacher's Stress Scale (NTSS), which explores the relation between daily hassles at work and work-related stress. In Analysis 1, 29 items were chosen to construct the NTSS. Six factors were identified: I. Stress relating to child care; II. Stress from human relations at work; III. Stress from staff-parent relations; IV. Stress from lack of time; V. Stress relating to compensation; and VI. Stress from the difference between individual beliefs and school policy. All these factors had high degrees of internal consistency. In Analysis 2, the concurrent validity of the NTSS was examined. The results showed that the NTSS total scores were significantly correlated with the Job Stress Scale-Revised Version (job stressor scale, r = .68), the Pre-school Teacher-efficacy Scale (r = -.21), and the WHO-five Well-Being Index Japanese Version (r = -.40). Work stresses are affected by several daily hassles at work. The NTSS has acceptable reliability and validity, and can be used to improve nursery teachers' mental health.
Soraia M. Silva
BACKGROUND: Handgrip strength is currently considered a predictor of overall muscle strength and functional capacity. Therefore, it is important to find reliable and affordable instruments for this analysis, such as the modified sphygmomanometer test (MST). OBJECTIVES: To assess the concurrent criterion validity of the MST, to compare the MST with the Jamar dynamometer, and to analyze the reproducibility (i.e. reliability and agreement) of the MST in individuals with Parkinson's disease (PD). METHOD: The authors recruited 50 subjects, 24 with PD (65.5±6.2 years of age) and 26 healthy elderly subjects (63.4±7.2 years of age). The handgrip strength was measured using the Jamar dynamometer and modified sphygmomanometer. The concurrent criterion validity was analyzed using Pearson's correlation coefficient and a simple linear regression test. The reproducibility of the MST was evaluated with the coefficient of intra-class correlation (ICC2,1), the standard error of measurement (SEM), the minimal detectable change (MDC), and the Bland-Altman plot. For all of the analyses, α≤0.05 was considered a risk. RESULTS: There was a significant correlation of moderate magnitude (r≥0.45) between the MST and the Jamar dynamometer. The MST had excellent reliability (ICC2,1≥0.7). The SEM and the MDC were adequate; however, the Bland-Altman plot indicated an unsatisfactory interrater agreement. CONCLUSIONS: The MST exhibited adequate validity and excellent reliability and is, therefore, suitable for monitoring the handgrip strength in PD. However, if the goal is to compare the measurements between examiners, the authors recommend that the data be interpreted with caution.
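The reproducibility statistics listed in the abstract above (SEM, MDC, Bland-Altman agreement) are commonly derived as sketched below. The formulas are the conventional ones and are an assumption here, since the abstract does not state them: SEM = SD × √(1 − ICC), MDC = z × √2 × SEM with z = 1.96 for a 95% level, and Bland-Altman limits of agreement = mean difference ± 1.96 × SD of the differences. All function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error_of_measurement(sd, icc):
    """SEM from the sample SD and a reliability coefficient (assumed formula)."""
    return sd * sqrt(1 - icc)


def minimal_detectable_change(sem_value, z=1.96):
    """MDC at the confidence level implied by z (1.96 ~ 95%, 1.645 ~ 90%)."""
    return z * sqrt(2) * sem_value


def bland_altman_limits(rater_a, rater_b):
    """Return (bias, lower limit, upper limit) of agreement between two raters."""
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Made-up handgrip readings from two examiners on the same five subjects.
    examiner_1 = [20.0, 25.0, 31.0, 18.0, 27.0]
    examiner_2 = [19.0, 26.0, 29.0, 18.5, 25.0]
    sem_value = standard_error_of_measurement(sd=stdev(examiner_1), icc=0.90)
    print(f"SEM = {sem_value:.2f}, MDC95 = {minimal_detectable_change(sem_value):.2f}")
    bias, lower, upper = bland_altman_limits(examiner_1, examiner_2)
    print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

Wide limits of agreement can coexist with a high ICC, which is consistent with the abstract's caution about comparing measurements between examiners.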
Curlette, William L.; Stallings, William M.
The 10 criticisms of criterion-referenced tests addressed in this paper are: the domains tested; pedagogical influence; difficulty of items; cumbersome reports; reliability; arbitrary criteria; local objectives; labeling; predictive validity; and repeated testing. (SJL)
PURPOSE: The aim of this study was to examine the reliability, validity and usefulness of the 30-15IFT in competitive female soccer players. METHODS: Seventeen elite female soccer players participated in the study. A within-subject test-retest study design was utilized to assess the reliability of the 30-15 intermittent fitness test (IFT). Seven days prior to the 30-15IFT, subjects performed a continuous aerobic running test (CT) under laboratory conditions to assess the criterion validity of the 30-15IFT. End running velocity (VCT and VIFT), peak heart rate (HRpeak) and maximal oxygen consumption (VO2max) were collected and/or estimated for both tests. RESULTS: VIFT (ICC = 0.91; CV = 1.8%), HRpeak (ICC = 0.94; CV = 1.2%), and VO2max (ICC = 0.94; CV = 1.6%) obtained from the 30-15IFT were all deemed highly reliable (p>0.05). Pearson product moment correlations between the CT and 30-15IFT for VO2max, HRpeak and end running velocity were large (r = 0.67, p=0.013), very large (r = 0.77, p=0.02) and large (r = 0.57, p=0.042), respectively. CONCLUSION: Current findings suggest that the 30-15IFT is a valid and reliable intermittent aerobic fitness test of elite female soccer players. The findings have also provided practitioners with evidence to support the accurate detection of meaningful individual changes in VIFT of 0.5 km/h (1 stage) and HRpeak of 2 bpm. This information may assist coaches in monitoring ‘real’ aerobic fitness changes to better inform training of female intermittent team sport athletes. Lastly, coaches could use the 30-15IFT as a practical alternative to laboratory-based assessments to assess and monitor intermittent aerobic fitness changes in their athletes. Keywords: 30-15 intermittent fitness test, aerobic, cardiorespiratory fitness, intermittent activity, soccer, high intensity interval training.
Harro, Cathy C; Garascia, Chelsea
Postural control declines with aging and is an independent risk factor for falls in older adults. Objective examination of balance function is warranted to direct fall prevention strategies. Force platform (FP) systems provide quantitative measures of postural control and analysis of different aspects of balance. The purpose of this study was to examine the reliability and validity of FP measures in healthy older adults. This study enrolled 46 healthy elderly adults, mean age 67.67 (5.1) years, who had no history of falls. They were assessed on 3 standardized tests on the NeuroCom Equitest FP system: limits of stability (LOS), motor control test (MCT), and sensory organization test (SOT). The test battery was administered twice within a 10-day period for test-retest reliability; intraclass correlation coefficients (ICCs), standard error of measurement (SEM), and minimal detectable change based on a 95% confidence interval (MDC95) were calculated. FP measures were compared with criterion clinical balance (Mini-BESTest and Functional Gait Assessment) and gait (10-m walk and 6-minute walk) measures to examine concurrent validity using Pearson correlation coefficients. Multiple linear regression analysis examined whether age and activity level were associated with FP performance. The α level was set at P […] point excursion measures all demonstrated excellent test-retest reliability (ICC = 0.90, 0.85, and 0.77, respectively), whereas moderate to good reliability was found for SOT vestibular ratio score (ICC = 0.71). There was large variability in performance in this healthy elderly cohort, resulting in relatively large MDC95 for these measures, especially for the LOS test. Fair correlations were found between LOS end point excursion and clinical balance and gait measures (r = 0.31-0.49), and between MCT average latency and gait measures only (r = -0.32). No correlations were found between SOT measures and clinical balance and gait measures. Age was only marginally
Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J
Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
Lee, Justin W Y; Cai, Ming-Jing; Yung, Patrick S H; Chan, Kai-Ming
To evaluate the test-retest reliability, sensitivity, and concurrent validity of a smartphone-based method for assessing eccentric hamstring strength among male professional football players. A total of 25 healthy male professional football players performed the Chinese University of Hong Kong (CUHK) Nordic break-point test, hamstring fatigue protocol, and isokinetic hamstring strength test. The CUHK Nordic break-point test is based on a Nordic hamstring exercise. The Nordic break-point angle was defined as the maximum point where the participant could no longer support the weight of his body against gravity. The criterion for the sensitivity test was the presprinting and postsprinting difference of the Nordic break-point angle with a hamstring fatigue protocol. The hamstring fatigue protocol consists of 12 repetitions of the 30-m sprint with 30-s recoveries between sprints. Hamstring peak torque of the isokinetic hamstring strength test was used as the criterion for validity. A high test-retest reliability (intraclass correlation coefficient = .94; 95% confidence interval, .82-.98) was found in the Nordic break-point angle measurements. The Nordic break-point angle significantly correlated with isokinetic hamstring peak torques at eccentric action of 30°/s (r = .88, r 2 = .77, P hamstring strength measures among male professional football players.
Orange, Samuel T; Metcalfe, James W; Liefeith, Andreas; Marshall, Phil; Madden, Leigh A; Fewster, Connor R; Vince, Rebecca V
Orange, ST, Metcalfe, JW, Liefeith, A, Marshall, P, Madden, LA, Fewster, CR, and Vince, RV. Validity and reliability of a wearable inertial sensor to measure velocity and power in the back squat and bench press. J Strength Cond Res XX(X): 000-000, 2018-This study examined the validity and reliability of a wearable inertial sensor to measure velocity and power in the free-weight back squat and bench press. Twenty-nine youth rugby league players (18 ± 1 years) completed 2 test-retest sessions for the back squat followed by 2 test-retest sessions for the bench press. Repetitions were performed at 20, 40, 60, 80, and 90% of 1 repetition maximum (1RM) with mean velocity, peak velocity, mean power (MP), and peak power (PP) simultaneously measured using an inertial sensor (PUSH) and a linear position transducer (GymAware PowerTool). The PUSH demonstrated good validity (Pearson's product-moment correlation coefficient [r]) and reliability (intraclass correlation coefficient [ICC]) only for measurements of MP (r = 0.91; ICC = 0.83) and PP (r = 0.90; ICC = 0.80) at 20% of 1RM in the back squat. However, it may be more appropriate for athletes to jump off the ground with this load to optimize power output. Further research should therefore evaluate the usability of inertial sensors in the jump squat exercise. In the bench press, good validity and reliability were evident only for the measurement of MP at 40% of 1RM (r = 0.89; ICC = 0.83). The PUSH was unable to provide a valid and reliable estimate of any other criterion variable in either exercise. Practitioners must be cognizant of the measurement error when using inertial sensor technology to quantify velocity and power during resistance training, particularly with loads other than 20% of 1RM in the back squat and 40% of 1RM in the bench press.
Shou, Juan; Ren, Limin; Wang, Haitang; Yan, Fei; Cao, Xiaoyun; Wang, Hui; Wang, Zhiliang; Zhu, Shanzhu; Liu, Yao
The 12-item Short-Form Health Survey (SF-12) is the abridged practical version of the SF-36. This cross-sectional study aimed to assess the reliability and validity of the SF-12 for measuring the health status of the Chinese community elderly population. Community-dwelling elderly people in the Xujiahui district of Shanghai were investigated. Internal consistency reliability was assessed using Cronbach's alpha and split-half reliability coefficients. Construct validity was analyzed using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Spearman's correlation coefficient (ρ) was used to evaluate criterion, convergent, and discriminant validity, with Spearman's ρ ≥ 0.4 considered satisfactory. Comparisons of the SF-12 summary scores among populations that differed in demographics were performed for discriminant validity. Total 1343 individuals aged ≥60 and reliability coefficient (0.812) reflected satisfactory internal consistency reliability of SF-12. EFA extracted a two-factor model (physical and mental health). About 60.7% of the total variance was explained by the two factors. CFA showed that the two-factor solution provided a good fit to the data. Good convergent validity and discriminant validity of the SF-12 were shown by the correlation analyses (Spearman's ρ > 0.4) and the comparisons of the SF-12 summary scores among populations (P 0.4, P reliability and validity in measuring health status of Chinese community elderly population in Xujiahui district of Shanghai.
Nicholas A. Petrunoff
Background. The purpose of this study was to assess the (previously untested) reliability and validity of survey questions commonly used to assess travel mode and travel time. Methods. Sixty-five respondents from a staff survey of travel behaviour conducted in a south-western Sydney hospital agreed to complete a travel diary for a week, wear an accelerometer over the same period, and twice complete an online travel survey an average of 21 days apart. The agreement in travel modes between the self-reported online survey and travel diary was examined with the kappa statistic. Spearman’s correlation coefficient was used to examine agreement of travel time from home to workplace measured between the self-reported online survey and four-day travel diary. Moderate-to-vigorous physical activity (MVPA) time of active and nonactive travellers was compared by t-test. Results. There was substantial agreement between travel modes (K=0.62, P<0.0001) and a moderate correlation for travel time (ρ=0.75, P<0.0001) reported in the travel diary and online survey. There was a high level of agreement for travel mode (K=0.82, P<0.0001) and travel time (ρ=0.83, P<0.0001) between the two travel surveys. Accelerometer data indicated that for active travellers, 16% of the journey-to-work time is MVPA, compared with 6% for car drivers. Active travellers were significantly more active across the whole workday. Conclusions. The survey question “How did you travel to work this week? If you used more than one transport mode specify the one you used for the longest (distance) portion of your journey” is reliable over 21 days and agrees well with a travel diary.
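The agreement in travel modes above is summarized with the kappa statistic. As a rough illustration of what that statistic measures, here is a minimal sketch of unweighted Cohen's kappa for two sets of categorical reports; the function name and the example labels are hypothetical, and the study's actual software and any weighting are not stated in the abstract.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical survey vs. diary travel modes (illustration only).
survey = ["car", "car", "bus", "walk", "car", "bus"]
diary = ["car", "car", "bus", "walk", "bus", "bus"]
print(f"kappa = {cohens_kappa(survey, diary):.2f}")
```

Kappa discounts the agreement that two sources would reach by chance given how often each mode is reported, which is why it is preferred over raw percent agreement for categorical answers.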
Voss, Christine; Dean, Paige H.; Gardner, Ross F.; Duncombe, Stephanie L.; Harris, Kevin C.
Objective To assess the criterion validity, internal consistency, reliability and cut-point for the Physical Activity Questionnaire for Children (PAQ-C) and Adolescents (PAQ-A) in children and adolescents with congenital heart disease, a special population at high cardiovascular risk in whom physical activity has not been extensively evaluated. Methods We included 84 participants (13.6 ± 2.9 yrs, 50% female) with simple (37%), moderate (31%), or severe congenital heart disease (27%), as well as ...
Background and design. Internalized stigma involves endorsing negative feelings and beliefs such as insignificance, shame and withdrawal, triggered by applying negative stereotypes to oneself. The Internalized Stigma Scale has not been applied to psoriasis patients. We aimed to evaluate the reliability and validity of the Internalized Stigma Scale in psoriasis patients. Materials and Methods. 100 consecutive, volunteer psoriasis patients (48 female, 52 male; aged 40.59±15.44 years) were enrolled in the study. PASI and BSA were evaluated by a physician (A.B.). Patients responded contemporaneously to the Psoriasis Internalized Stigma Scale (PISS), DQoL, and Perceived Health Status (PHS), a single-item self-rated general health question, of which Likert scores 1, 2, and 3 were classified as “from fair to very poor”, and 4 and 5 as “good”. Results. Cronbach's alpha coefficient of the PISS subscales was 0.83 for alienation, 0.70 for stereotype endorsement, 0.70 for perceived discrimination, 0.84 for social withdrawal and 0.68 for stigma resistance. The same value was 0.89 for the total scale. Mean PISS and DQoL scores were 58.8±12.6 and 10.0±9.4, respectively. PISS was significantly correlated with the patients' DQoL scores (r=0.726, p=0.001). PISS was also significantly correlated with disease duration (r=0.209, p=0.047). There was no significant relationship between PASI or BSA and PISS. Mean DQoL scores in patients reporting their PHS as “from fair to very poor” and “good” were 12.1±7.3 and 5.0±4.3, respectively. The mean PISS value in patients reporting their PHS as “from fair to very poor” was significantly higher than in patients reporting their PHS as “good” (p=0.001). Conclusion. PISS can be used as a reliable and valid tool in assessing internalized stigmatization in psoriasis patients. Our results indicate a high level of stigmatization in psoriasis patients. Low DQoL scores show a correlation with increased levels of
Ivan Radman, Lana Ruzic, Viktoria Padovan, Vjekoslav Cigrovski, Hrvoje Podnar
This study aimed to examine the reliability and validity of the inline skating skill test. Based on previous skating experience, forty-two skaters (26 female and 16 male) were randomized into two groups (competitive level vs. recreational level). They performed the test four times, with a recovery time of 45 minutes between sessions. Prior to testing, the participants rated their skating skill using a scale from 1 to 10. The protocol included performance time measurement through a course, combining different skating techniques. Trivial changes in performance time between the repeated sessions were determined in both competitive females/males and recreational females/males (-1.7% [95% CI: -5.8–2.6%] to 2.2% [95% CI: 0.0–4.5%]). In all four subgroups, the skill test had a low mean within-individual variation (1.6% [95% CI: 1.2–2.4%] to 2.7% [95% CI: 2.1–4.0%]) and high mean inter-session correlation (ICC = 0.97 [95% CI: 0.92–0.99] to 0.99 [95% CI: 0.98–1.00]). The comparison of detected typical errors and smallest worthwhile changes (calculated as standard deviations × 0.2) revealed that the skill test was able to track changes in skaters’ performances. Competitive-level skaters needed shorter time (24.4–26.4%, all p < 0.01) to complete the test in comparison to recreational-level skaters. Moreover, moderate correlation (ρ = 0.80–0.82; all p < 0.01) was observed between the participant’s self-rating and achieved performance times. In conclusion, the proposed test is a reliable and valid method to evaluate inline skating skills in amateur competitive and recreational level skaters. Further studies are needed to evaluate the reproducibility of this skill test in different populations including elite inline skaters.
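The abstract above compares typical errors with smallest worthwhile changes, the latter calculated as 0.2 × the standard deviation. The sketch below illustrates that comparison. The typical error is computed as the SD of between-session differences divided by sqrt(2), which is a common convention assumed here rather than confirmed by the abstract, and the times are hypothetical.

```python
import math
import statistics


def smallest_worthwhile_change(scores):
    """SWC as 0.2 x the between-subject standard deviation."""
    return 0.2 * statistics.stdev(scores)


def typical_error(session_1, session_2):
    """Typical error as SD of difference scores / sqrt(2) (assumed convention)."""
    diffs = [b - a for a, b in zip(session_1, session_2)]
    return statistics.stdev(diffs) / math.sqrt(2.0)


# Hypothetical course times in seconds for five skaters (illustration only).
session_1 = [41.0, 45.5, 50.2, 55.1, 60.3]
session_2 = [41.3, 45.2, 50.6, 54.8, 60.5]

swc = smallest_worthwhile_change(session_1)
te = typical_error(session_1, session_2)
print(f"SWC = {swc:.2f} s, typical error = {te:.2f} s")
```

A typical error smaller than the smallest worthwhile change is usually read as a sign that the test can detect meaningful changes in performance.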
Background: The agreement of new instruments or clinical tests with other instruments or tests defines the possibility of these being used interchangeably. Aim: To investigate the validity and reliability of the SW-100 autokeratometer using a Bausch & Lomb (B&L) keratometer as the ‘gold standard’. Methods: Eighty subjects (80 right eyes) aged between 21 and 38 years were recruited. For intra-test repeatability, two measurements of the corneal radius of curvature were taken with the SW-100 and B&L keratometers. Forty of the 80 subjects participated in the inter-test repeatability measurement. Results: Corneal radius of curvature was found to be statistically different between the two instruments (p < 0.001), with the SW-100 providing slightly flatter values of 0.11 mm and 0.05 mm for the horizontal and vertical meridians, respectively, than the B&L keratometer. The average corneal curvature was 0.07 mm flatter with the SW-100 autokeratometer than with the B&L device. Agreement between the SW-100 and B&L keratometers’ axes was 45% within ± 5°, 60.3% within ± 10°, 78.8% within ± 15°, 80.3% within ± 20°, and 88.7% within ± 40°. Intertest repeatability was better for the B&L device than the SW-100 and showed no significant difference between the two sessions. Both instruments demonstrated comparable intrasession repeatability. As such, both instruments were comparatively reliable (per coefficients of repeatability). The range of limits of agreement of ± 0.14 mm (horizontal meridian) and ± 0.17 mm (vertical meridian) between the SW-100 and B&L devices showed good agreement. Conclusion: The results suggest that the SW-100 autokeratometer is a reliable and objective instrument that, however, provides flatter radii of curvature measurements than the B&L keratometer. A compensating factor incorporated into the instrument could reduce the difference between the two instruments and make them more interchangeable.
Hoppe, Matthias W; Baumgart, Christian; Polglaze, Ted; Freiwald, Jürgen
This study aimed to investigate the validity and reliability of global (GPS) and local (LPS) positioning systems for measuring distances covered and sprint mechanical properties in team sports. Here, we evaluated two recently released 18 Hz GPS and 20 Hz LPS technologies together with one established 10 Hz GPS technology. Six male athletes (age: 27±2 years; VO2max: 48.8±4.7 ml/min/kg) performed outdoors on 10 trials of a team sport-specific circuit that was equipped with double-light timing gates. The circuit included various walking, jogging, and sprinting sections that were performed either in straight-lines or with changes of direction. During the circuit, athletes wore two devices of each positioning system. From the reported and filtered velocity data, the distances covered and sprint mechanical properties (i.e., the theoretical maximal horizontal velocity, force, and power output) were computed. The sprint mechanical properties were modeled via an inverse dynamic approach applied to the center of mass. The validity was determined by comparing the measured and criterion data via the typical error of estimate (TEE), whereas the reliability was examined by comparing the two devices of each technology (i.e., the between-device reliability) via the coefficient of variation (CV). Outliers due to measurement errors were statistically identified and excluded from validity and reliability analyses. The 18 Hz GPS showed better validity and reliability for determining the distances covered (TEE: 1.6-8.0%; CV: 1.1-5.1%) and sprint mechanical properties (TEE: 4.5-14.3%; CV: 3.1-7.5%) than the 10 Hz GPS (TEE: 3.0-12.9%; CV: 2.5-13.0% and TEE: 4.1-23.1%; CV: 3.3-20.0%). However, the 20 Hz LPS demonstrated superior validity and reliability overall (TEE: 1.0-6.0%; CV: 0.7-5.0% and TEE: 2.1-9.2%; CV: 1.6-7.3%). For the 10 Hz GPS, 18 Hz GPS, and 20 Hz LPS, the relative loss of data sets due to measurement errors was 10.0%, 20.0%, and 15.8%, respectively. This study shows that
Paulus, David C; Reynolds, Michael C; Schilling, Brian K
During the concentric portion of the free-weight squat exercise, accelerating the mass from rest results in a fluctuation in ground reaction force. It is characterized by an initial period of force greater than the load while accelerating from rest, followed by a period of force lower than the external load during negative acceleration. During the deceleration phase, less force is exerted and muscles are loaded sub-optimally. Thus, using a reduced-inertia form of resistance such as pneumatics can minimize these inertial effects and allow the force to be controlled in real time to maximize the force exerted over the exercise cycle. To improve the system response of a preliminary design, a squat device was designed with a reduced-mass barbell and two smaller pneumatic cylinders. The resistance was controlled by regulating cylinder pressure, so that force can be adjusted within a repetition to maximize the force exerted during the lift. The resistance force production of the machine was statically validated, with R² = 0.9997 between input voltage and output force at four increments of the range of motion, and the intraclass correlation coefficient (ICC) between trials at the different heights equaled 0.999. The slew rate at three forces was 749.3 ± 252.3 N/s. Dynamic human subject testing showed the desired input force correlated with average and peak ground reaction force with R² = 0.9981 and R² = 0.9315, respectively. The ICC between desired force and average and peak ground reaction force was 0.963. Thus, the system is able to deliver constant levels of static and dynamic force with validity and reliability. Future work will be required to develop the control strategy required for real-time control, and performance testing is required to determine its efficacy.
Objective: To translate the Perceived Stress Scale (versions PSS-4, -10 and -14) and to assess its psychometric properties in a sample of the general Greek population. Methods: 941 individuals anonymously completed questionnaires comprising the PSS, the Depression Anxiety and Stress Scale (DASS-21 version), and a list of stress-related symptoms. Psychometric properties of the PSS were investigated by confirmatory factor analysis (construct validity), Cronbach’s alpha (reliability), and by investigating relations with the DASS-21 scores and the number of symptoms, across individuals’ characteristics. The two-factor structure of PSS-10 and PSS-14 was confirmed in our analysis. We found satisfactory Cronbach’s alpha values (0.82 for the full scale) for PSS-14 and PSS-10 and marginally satisfactory values for PSS-4 (0.69). PSS score exhibited high correlation coefficients with DASS-21 subscale scores, namely stress (r = 0.64), depression (r = 0.61), and anxiety (r = 0.54). Women reported significantly more stress than men, and divorced or widowed individuals more than married or single ones. A strong significant (p < 0.001) positive correlation between the stress score and the number of self-reported symptoms was also noted. Conclusions: The Greek versions of the PSS-14 and PSS-10 exhibited satisfactory psychometric properties and their use for research and health care practice is warranted.
Andreou, Eleni; Alexopoulos, Evangelos C; Lionis, Christos; Varvogli, Liza; Gnardellis, Charalambos; Chrousos, George P; Darviri, Christina
To translate the Perceived Stress Scale (versions PSS-4, -10 and -14) and to assess its psychometric properties in a sample of the general Greek population. 941 individuals anonymously completed questionnaires comprising the PSS, the Depression Anxiety and Stress Scale (DASS-21 version), and a list of stress-related symptoms. Psychometric properties of the PSS were investigated by confirmatory factor analysis (construct validity), Cronbach's alpha (reliability), and by investigating relations with the DASS-21 scores and the number of symptoms, across individuals' characteristics. The two-factor structure of PSS-10 and PSS-14 was confirmed in our analysis. We found satisfactory Cronbach's alpha values (0.82 for the full scale) for PSS-14 and PSS-10 and marginally satisfactory values for PSS-4 (0.69). PSS score exhibited high correlation coefficients with DASS-21 subscale scores, namely stress (r = 0.64), depression (r = 0.61), and anxiety (r = 0.54). Women reported significantly more stress than men, and divorced or widowed individuals more than married or single ones. A strong significant (p < 0.001) positive correlation between the stress score and the number of self-reported symptoms was also noted. The Greek versions of the PSS-14 and PSS-10 exhibited satisfactory psychometric properties and their use for research and health care practice is warranted.
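Several abstracts in this listing, including the one above, report Cronbach's alpha as the reliability index. A minimal sketch of the standard formula, alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores), is given below; the function name and the response data are hypothetical and unrelated to the PSS data.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of per-item score lists,
    each holding one score per respondent in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("all items need the same number of respondents")
    # Total score of each respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1.0 - item_variance_sum / total_variance)


# Hypothetical responses of four people to three items (illustration only).
responses = [
    [1, 2, 3, 4],
    [2, 2, 4, 4],
    [1, 3, 3, 5],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Alpha rises when items covary strongly relative to their individual variances, and it also tends to rise with the number of items, which may partly explain why a four-item scale such as PSS-4 scores lower than the longer versions.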
Frank, Guido K W; Favaro, Angela; Marsh, Rachel; Ehrlich, Stefan; Lawson, Elizabeth A
Human brain imaging can help improve our understanding of mechanisms underlying brain function and how they drive behavior in health and disease. Such knowledge may eventually help us to devise better treatments for psychiatric disorders. However, the brain imaging literature in psychiatry and especially eating disorders has been inconsistent, and studies are often difficult to replicate. The extent or severity of extremes of eating and state of illness, which are often associated with differences in, for instance hormonal status, comorbidity, and medication use, commonly differ between studies and likely add to variation across study results. Those effects are in addition to the well-described problems arising from differences in task designs, data quality control procedures, image data preprocessing and analysis or statistical thresholds applied across studies. Which of those factors are most relevant to improve reproducibility is still a question for debate and further research. Here we propose guidelines for brain imaging research in eating disorders to acquire valid results that are more reliable and clinically useful. © 2018 Wiley Periodicals, Inc.
Lillo-Bevia, José R; Pallarés, Jesús G
To validate the new drive indoor trainer Hammer designed by Cycleops®. Eleven cyclists performed 44 randomized and counterbalanced graded exercise tests (100-500W), at 70, 85 and 100 rev.min -1 cadences, in seated and standing positions, on 3 different Hammer units, while a scientific SRM system continuously recorded cadence and power output data. No significant differences were detected between the three Hammer devices and the SRM for any workload, cadence, or pedalling condition (P value between 1.00 and 0.350), except for some minor differences (P 0.03 and 0.04) found in the Hammer 1 at low workloads, and for Hammer 2 and 3 at high workloads, all in seated position. Strong ICCs were found between the power output values recorded by the Hammers and the SRM (≥0.996; P=0.001), independently from the cadence condition and seated position. Bland-Altman analysis revealed low Bias (-5.5-3.8) and low SD of Bias (2.5-5.3) for all testing conditions, except marginal values found for the Hammer 1 at high cadences and seated position (9.6±6.6). High absolute reliability values were detected for the 3 Hammers (150-500W; CVreliable device to drive and measure power output in cyclists, providing an alternative to larger and more expensive laboratory ergometers, and allowing cyclists to use their own bicycle.
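The Bland-Altman analysis mentioned above summarizes agreement between a device and a criterion by the mean difference (bias) and the SD of the differences. A minimal sketch follows, with the conventional 95% limits of agreement (bias ± 1.96 × SD) added as an assumption; the power values are hypothetical and not taken from the study.

```python
import statistics


def bland_altman(test_values, reference_values):
    """Return (bias, sd_of_differences, (lower_loa, upper_loa))."""
    if len(test_values) != len(reference_values) or len(test_values) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    # Conventional 95% limits of agreement (assumed, not stated in the abstract).
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical power outputs in watts (illustration only).
trainer = [101.0, 203.0, 298.0, 402.0]
criterion = [100.0, 200.0, 300.0, 400.0]
bias, sd, (lower, upper) = bland_altman(trainer, criterion)
print(f"bias = {bias:.1f} W, SD = {sd:.1f} W, LoA = [{lower:.1f}, {upper:.1f}] W")
```

Unlike a correlation or an ICC, this summary stays in the measurement's own units, so a reader can judge directly whether a few watts of bias matters for training prescription.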
Marbach, G.; Beche, M.; Pajot, J.
The excellent behavior of PHENIX driver fuel and the burnup values currently reached suggest that the first SUPERPHENIX fuel load will meet the design lifetime. However, to ensure the reliability of the entire load, all the parameters affecting fuel behavior in reactor must be analyzed. For that purpose, we have taken into account all the results of the examination and verifications during the fabrication process of the first load subassemblies. These data concern geometrical parameters or oxide composition as well as the cladding tube and plug weld soundness tests. The objective is to determine the actual dispersion of all the parameters to ensure the absence of failure due to fabrication defects with very high statistical confidence limits. The influence of all the parameters has been investigated for the situations which can occur during power-up, steady-state operation and transients. The fabrication quality allows us to demonstrate that in all cases good behavior criteria for fuel and structure will be maintained. This demonstration is based on calculation code results as well as on validation by specific experiments
Salacinski, Amanda J; Alford, Micah; Drevets, Kathryn; Hart, Sarah; Hunt, Brian E
As an appealing alternative to reference glucose analyzers, portable glucometers are recommended for self-monitoring at home, in the field, and in research settings. The purpose was to characterize the accuracy, precision, and bias of glucometers in biomedical research. Fifteen young (20-36 years; mean = 24.5), moderately to highly active men (n = 10) and women (n = 5), defined by exercising 2 to 3 times a week for the past 6 months, were given an oral glucose tolerance test (OGTT) after an overnight fast. Participants ingested 50, 75, or 150 grams of glucose over a 5-minute period. The glucometer was compared to a reference instrument. The glucometer had 39% of values within 15% of measurements made using the reference instrument, which ranged from 45.05 to 169.37 mg/dl. There was both a proportional (-0.45 to -0.39) and small fixed (5.06 and 0.90 mg/dl) bias. Results of the present study suggest that the glucometer provided poor validity and reliability results compared to the results provided by the reference laboratory analyzer. The portable glucometers should be used for patient management, but not for diagnosis, treatment, or research purposes. © 2014 Diabetes Technology Society.
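The accuracy figure above is the share of glucometer readings falling within 15% of the reference value. A minimal sketch of that calculation is shown below, assuming the tolerance is taken relative to the reference reading; the function name and the glucose values are hypothetical.

```python
def fraction_within_tolerance(test_values, reference_values, tolerance=0.15):
    """Share of test readings within `tolerance` (relative) of the reference."""
    if len(test_values) != len(reference_values) or not test_values:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(t - r) <= tolerance * r
        for t, r in zip(test_values, reference_values)
    )
    return hits / len(test_values)


# Hypothetical glucose readings in mg/dl (illustration only).
glucometer = [100.0, 120.0, 80.0, 150.0]
reference = [100.0, 100.0, 100.0, 100.0]
share = fraction_within_tolerance(glucometer, reference)
print(f"{share:.0%} of readings within 15% of the reference")
```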
Leticia de Matos Malavasi
The lack of adherence to physical activity has prompted several researchers to seek answers to this problem. Among other questions, these studies investigate how or what motivates people to perform any type of physical activity. In addition, environmental conditions are an important factor in establishing a healthier lifestyle among individuals. In Brazil, the number of validated scales on environmental barriers to physical activity in communities is limited. The validation and cultural adaptation of these instruments are important not only for comparison with studies from other countries, but mainly for planning public policies to improve adherence to physical activity. Thus, the present research aimed to analyze the validity and reliability of the Brazilian version of the Neighborhood Environment Walkability Scale (NEWS). The methodological procedures were structured in three stages. The first stage comprised translation of the NEWS and back-translation by bilingual specialists. The second stage was the adaptation of the NEWS to the Brazilian reality through a pilot study, together with reliability testing. In the third stage, a panel of urban professionals indicated which neighborhoods had better or worse mobility, and the NEWS questionnaire was administered to assess construct validity. The sample was divided into two parts: 75 persons for the reliability analysis, and 200 residents from the four neighborhoods indicated by the specialists in the city of Florianópolis (SC) for the validity of the questionnaire. Through the NEWS, the subjects answered questions about their neighborhoods regarding: type of residences, proximity of stores and trade, perception of access to these places, street characteristics, facilities for walking and cycling, and safety related to traffic and crime. The statistical analysis was made in SPSS version 11.0 for the intra-class correlation and reliability for the
Shields, Ann; Cicchetti, Dante
Two studies examined psychometric properties of a new criterion Q-sort for children's emotion regulation and autonomy. Multitrait-multimethod matrix and factor analyses indicated impressive convergence among the emotion regulation Q-scale and established affect regulation measures. The new scale was not discriminable from measures of related…
de Paula, Jonas Jardim; Costa, Mônica Vieira; Bocardi, Matheus Bortolosso; Cortezzi, Mariana; De Moraes, Edgar Nunes; Malloy-Diniz, Leandro Fernandes
The assessment of visuospatial abilities is usually performed by drawing tasks. In patients with very low formal education, the use of these tasks might be biased by their cultural background. The Stick Design Test was developed for the assessment of this population. We aim to expand the test's psychometric properties by assessing its construct, criterion-related and ecological validity in older adults with low formal education. Healthy older adults (n = 63) and Alzheimer's disease patients (n = 92) performed the Stick Design Test, Mini-Mental State Examination, Digit Span Forward and the Clock Drawing Test. Their caregivers answered questions on Personal Care and Instrumental Activities of Daily Living (ADL). Construct validity was assessed by factor analysis, convergent correlations (with the Clock Drawing Test), and divergent correlations (with Digit Span Forward); criterion-related validity by receiver operating characteristic curve analysis and binary logistic regression; and ecological validity by correlations with ADL. The test's factor structure was composed of one component (R² = 64%). Significant correlations with the Clock Drawing Test and Digit Span Forward were found, and the relationship was stronger with the first measure. The test was less associated with formal education than the Clock Drawing Test. It classified about 76% of the participants correctly and had an additive effect with the Mini-Mental State Examination (84% of correct classification). The test also correlated significantly with measures of ADL, suggesting ecological validity. The Stick Design Test shows evidence of construct, criterion-related and ecological validity. It is an interesting alternative to drawing tasks for the assessment of visuospatial abilities.
Yusof Zamros YM
Background: The study aimed to develop and test a Malay version of the Child-OIDP index, evaluate its psychometric properties and report on the prevalence of oral impacts on eight daily performances in a sample of 11–12 year old Malaysian schoolchildren. Methods: The Child-OIDP index was translated from English into Malay. The Malay version was tested for reliability and validity on a non-random sample of 132 schoolchildren aged 11–12 years from two urban schools in Kuala Lumpur. Psychometric analysis of the Malay Child-OIDP involved face, content, criterion and construct validity tests as well as internal and test-retest reliability. Non-parametric statistical methods were used to assess relationships between Child-OIDP scores and other subjective outcome measures. Results: The standardised Cronbach’s alpha was 0.80 and the weighted Kappa was 0.84 (intraclass correlation = 0.79). The index showed significant associations with different subjective measures, viz. perceived satisfaction with mouth, perceived needs for dental treatment, perceived oral health status and toothache experience in the previous 3 months (p Conclusion: This study indicated that the Malay Child-OIDP index is a valid and reliable instrument to measure the oral impacts of daily performances in 11–12 year old urban schoolchildren in Malaysia.
Wang, Xiao; Sun, Zhenghai; Xiong, Lingchuan; Semrau, Maya; He, Jianhua; Li, Yang; Zhu, Jianzhong; Zhang, Nan; Wang, Aimin; Jiang, Qinpu; Mu, Nan; Zhao, Yuping; Chen, Wei; Wu, Donghui; Zheng, Zhanjie; Sun, Yongan; Zhang, Jing; Xu, Jun; Meng, Xue; Zhao, Mei; Zhang, Haifeng; Lv, Xiaozhen; Sartorius, Norman; Li, Tao; Yu, Xin; Wang, Huali
Clinical and social services both are important for dementia care. The International Dementia Alliance (IDEAL) Schedule for the Assessment and Staging of Care was developed to guide clinical and social care for dementia. Our study aimed to assess the validity and reliability of the IDEAL schedule in China. Two hundred eighty-two dementia patients and their caregivers were recruited from 15 hospitals in China. Each patient-caregiver dyad was assessed with the IDEAL schedule by a rater and an observer simultaneously. The Clinical Dementia Rating (CDR), Mini-Mental Status Examination (MMSE), and Caregiver Burden Inventory (CBI) were assessed for criterion validity. IDEAL repeated assessment was conducted 7-10 days after the initial interview for 62 dyads. Two hundred seventy-seven patient-caregiver dyads completed the IDEAL assessment. Inter-rater reliability for the total score of the IDEAL schedule was 0.93 (95%CI = 0.92-0.95). The inter-class coefficient for the total score of IDEAL was 0.95 for the interviewers and 0.93 for the silent raters. The IDEAL total score correlated with the global CDR score (ρ = 0.72, p valid and reliable tool for the staging of care for dementia in the Chinese population.
Negahban, Hossein; Mazaheri, Masood; Salavati, Mahyar; Sohani, Soheil Mansour; Askari, Marjan; Fanian, Hossein; Parnianpour, Mohamad
The aims of this study were to culturally adapt and validate the Persian version of Foot and Ankle Outcome Score (FAOS) and present data on its psychometric properties for patients with different foot and ankle problems. The Persian version of FAOS was developed after a standard forward-backward translation and cultural adaptation process. The sample included 93 patients with foot and ankle disorders who were asked to complete two questionnaires: FAOS and Short-Form 36 Health Survey (SF-36). To determine test-retest reliability, 60 randomly chosen patients completed the FAOS again 2 to 6 days after the first administration. Test-retest reliability and internal consistency were assessed using intraclass correlation coefficient (ICC) and Cronbach's alpha, respectively. To evaluate convergent and divergent validity of FAOS compared to similar and dissimilar concepts of SF-36, the Spearman's rank correlation was used. Dimensionality was determined by assessing item-subscale correlation corrected for overlap. The results of test-retest reliability show that all the FAOS subscales have a very high ICC, ranging from 0.92 to 0.96. The minimum Cronbach's alpha level of 0.70 was exceeded by most subscales. The Spearman's correlation coefficient for convergent construct validity fell within 0.32 to 0.58 for the main hypotheses presented a priori between FAOS and SF-36 subscales. For dimensionality, the minimum Spearman's correlation coefficient of 0.40 was exceeded by most items. In conclusion, the results of our study show that the Persian version of FAOS seems to be suitable for Iranian patients with various foot and ankle problems especially lateral ankle sprain. Future studies are needed to establish stronger psychometric properties for patients with different foot and ankle problems.
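The internal-consistency check described above can be illustrated with a minimal sketch of Cronbach's alpha, using the usual formula alpha = k/(k-1) × (1 - sum of item variances / variance of total scores). The function name and the toy scores are hypothetical and are not taken from the FAOS study; sample variances are an assumed convention.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one inner list per item; each inner list contains the
    scores of the same respondents in the same order.
    """
    k = len(items)                      # number of items
    n = len(items[0])                   # number of respondents
    # Total score per respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical toy data: three items answered by four respondents.
scores = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha = cronbach_alpha(scores)
meets_threshold = alpha >= 0.70         # the 0.70 minimum cited in the abstract
```

Identical items give an alpha of 1, while items that do not covary give an alpha near 0, which is why 0.70 is often used as a lower bound for a usable subscale.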
Ausserhofer, Dietmar; Anderson, Ruth A; Colón-Emeric, Cathleen; Schwendimann, René
The Safety Organizing Scale is a valid and reliable measure of safety behaviors and practices in hospitals. This study aimed to explore the psychometric properties of the Safety Organizing Scale-Nursing Home version (SOS-NH). In a cross-sectional analysis of staff survey data, we examined validity and reliability of the 9-item Safety SOS-NH using American Educational Research Association guidelines. This substudy of a larger trial used baseline survey data collected from staff members (n = 627) in a variety of work roles in 13 nursing homes (NHs) in North Carolina and Virginia. Psychometric evaluation of the SOS-NH revealed good response patterns with a low average of missing values across all items (3.05%). Analyses of the SOS-NH's internal structure (eg, comparative fit indices = 0.929, standardized root mean square error of approximation = 0.045) and consistency (composite reliability = 0.94) suggested its 1-dimensionality. Significant between-facility variability, intraclass correlations, within-group agreement, and design effect confirmed appropriateness of the SOS-NH for measurement at the NH level, justifying data aggregation. The SOS-NH showed discriminant validity from one related concept: communication openness. Initial evidence regarding validity and reliability of the SOS-NH supports its utility in measuring safety behaviors and practices among a wide range of NH staff members, including those with low literacy. Further psychometric evaluation should focus on testing concurrent and criterion validity, using resident outcome measures (eg, patient fall rates). Copyright © 2013 American Medical Directors Association, Inc. All rights reserved.
Brugha, T S; Cragg, D
During the 23 years since the original work of Holmes & Rahe, research into stressful life events on human subjects has tended towards the development of longer and more complex inventories. The List of Threatening Experiences (LTE) of Brugha et al., by virtue of its brevity, overcomes difficulties of clinical application. In a study of 50 psychiatric patients and informants, the questionnaire version of the list (LTE-Q) was shown to have high test-retest reliability, and good agreement with informant information. Concurrent validity, based on the criterion of independently rated adversity derived from a semistructured life events interview, making use of the Life Events and Difficulties Scales (LEDS) method developed by Brown & Harris, showed both high specificity and sensitivity. The LTE-Q is particularly recommended for use in psychiatric, psychological and social studies in which other intervening variables such as social support, coping, and cognitive variables are of interest, and resources do not allow for the use of extensive interview measures of stress.
Myles Benjamin Walton
To test the 'Liverpool Osteoarthritis in Dogs' (LOAD) questionnaire for construct and criterion validity, and to similarly test the Helsinki Chronic Pain Index (HCPI) and the Canine Brief Pain Inventory (CBPI). Prospective Study. 222 dogs with osteoarthritis. Osteoarthritis was diagnosed in a cohort of dogs on the basis of clinical history and orthopedic examination. Force-platform analysis was performed and a "symmetry index" for peak vertical force (PVF) was calculated. Owners completed LOAD, CBPI and HCPI instruments. As a test of construct validity, inter-instrument correlations were calculated. As a test of criterion validity, the correlations between instrument scores and PVF symmetry scores were calculated. Additionally, internal consistency of all instruments was calculated and compared to those previously reported. Factor analysis is reported for the first time for LOAD, and is compared to that previously reported for CBPI and HCPI. Significant moderate correlations were found between all instruments, implying construct validity for all instruments. Significant weak correlations were found between LOAD scores and PVF symmetry index, and between CBPI scores and PVF symmetry index. LOAD is an owner-completed clinical metrology instrument that can be recommended for the measurement of canine osteoarthritis. It is convenient to use, validated and, as demonstrated here for the first time, has a correlation with force-platform data.
Carreon, Leah Y; Sanders, James O; Polly, David W; Sucato, Daniel J; Parent, Stefan; Roy-Beaudry, Marjolaine; Hopkins, Jeffrey; McClung, Anna; Bratcher, Kelly R; Diamond, Beverly E
Cross sectional. This study presents the factor analysis of the Spinal Appearance Questionnaire (SAQ) and its psychometric properties. Although the SAQ has been administered to a large sample of patients with adolescent idiopathic scoliosis (AIS) treated surgically, its psychometric properties have not been fully evaluated. This study presents the factor analysis and scoring of the SAQ and evaluates its psychometric properties. The SAQ and the Scoliosis Research Society-22 (SRS-22) were administered to AIS patients who were being observed, braced or scheduled for surgery. Standard demographic data and radiographic measures including Lenke type and curve magnitude were also collected. Of the 1802 patients, 83% were female; with a mean age of 14.8 years and mean initial Cobb angle of 55.8° (range, 0°-123°). From the 32 items of the SAQ, 15 loaded on two factors with consistent and significant correlations across all Lenke types. There is an Appearance (items 1-10) and an Expectations factor (items 12-15). Responses are summed giving a range of 5 to 50 for the Appearance domain and 5 to 20 for the Expectations domain. The Cronbach's α was 0.88 for both domains and Total score with a test-retest reliability of 0.81 for Appearance and 0.91 for Expectations. Correlations with major curve magnitude were higher for the SAQ Appearance and SAQ Total scores compared to correlations between the SRS Appearance and SRS Total scores. The SAQ and SRS-22 Scores were statistically significantly different in patients who were scheduled for surgery compared to those who were observed or braced. The SAQ is a valid measure of self-image in patients with AIS with greater correlation to curve magnitude than SRS Appearance and Total score. It also discriminates between patients who require surgery from those who do not.
Radman, Ivan; Ruzic, Lana; Padovan, Viktoria; Cigrovski, Vjekoslav; Podnar, Hrvoje
This study aimed to examine the reliability and validity of the inline skating skill test. Based on previous skating experience forty-two skaters (26 female and 16 male) were randomized into two groups (competitive level vs. recreational level). They performed the test four times, with a recovery time of 45 minutes between sessions. Prior to testing, the participants rated their skating skill using a scale from 1 to 10. The protocol included performance time measurement through a course, combining different skating techniques. Trivial changes in performance time between the repeated sessions were determined in both competitive females/males and recreational females/males (-1.7% [95% CI: -5.8-2.6%] - 2.2% [95% CI: 0.0-4.5%]). In all four subgroups, the skill test had a low mean within-individual variation (1.6% [95% CI: 1.2-2.4%] - 2.7% [95% CI: 2.1-4.0%]) and high mean inter-session correlation (ICC = 0.97 [95% CI: 0.92-0.99] - 0.99 [95% CI: 0.98-1.00]). The comparison of detected typical errors and smallest worthwhile changes (calculated as standard deviations × 0.2) revealed that the skill test was able to track changes in skaters' performances. Competitive-level skaters needed shorter time (24.4-26.4%, all p skating skills in amateur competitive and recreational level skaters. Further studies are needed to evaluate the reproducibility of this skill test in different populations including elite inline skaters.
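The comparison of typical error with the smallest worthwhile change can be sketched as follows. The abstract only states that the smallest worthwhile change was computed as standard deviation × 0.2; the typical-error formula used here (standard deviation of the between-session differences divided by √2) is a common convention and an assumption, not something the abstract specifies. Function names and times are hypothetical.

```python
from math import sqrt
from statistics import stdev

def typical_error(session1, session2):
    """Typical error: SD of the paired differences divided by sqrt(2) (assumed convention)."""
    diffs = [b - a for a, b in zip(session1, session2)]
    return stdev(diffs) / sqrt(2)

def smallest_worthwhile_change(scores, factor=0.2):
    """Smallest worthwhile change: between-subject SD multiplied by 0.2."""
    return factor * stdev(scores)

# Hypothetical course times (seconds) for four skaters in two sessions.
session1 = [10.0, 12.0, 14.0, 16.0]
session2 = [11.0, 12.0, 15.0, 16.0]

te = typical_error(session1, session2)
swc = smallest_worthwhile_change(session1)
# A test is usually considered able to track changes when the typical
# error does not exceed the smallest worthwhile change.
can_track_changes = te <= swc
```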
Unver, Vesile; Basak, Tulay; Watts, Penni; Gaioso, Vanessa; Moss, Jacqueline; Tastan, Sevinc; Iyigun, Emine; Tosun, Nuran
The purpose of this study was to adapt the "Student Satisfaction and Self-Confidence in Learning Scale" (SCLS), "Simulation Design Scale" (SDS), and "Educational Practices Questionnaire" (EPQ) developed by Jeffries and Rizzolo into Turkish and establish the reliability and the validity of these translated scales. A sample of 87 nursing students participated in this study. These scales were cross-culturally adapted through a process including translation, comparison with original version, back translation, and pretesting. Construct validity was evaluated by factor analysis, and criterion validity was evaluated using the Perceived Learning Scale, Patient Intervention Self-confidence/Competency Scale, and Educational Belief Scale. Cronbach's alpha values were found as 0.77-0.85 for SCLS, 0.73-0.86 for SDS, and 0.61-0.86 for EPQ. The results of this study show that the Turkish versions of all scales are validated and reliable measurement tools.
Anderson-Butcher, Dawn; Iachini, Aidyn L.; Amorose, Anthony J.
Objective: This study describes the development and validation of a perceived social competence scale that social workers can easily use to assess children's and youth's social competence. Method: Exploratory and confirmatory factor analyses were conducted on a calibration and a cross-validation sample of youth. Predictive validity was also…
Tierney, M; Fraser, A; Kennedy, N
The International Physical Activity Questionnaire Short Form (IPAQ-SF) is a self-report questionnaire commonly used in patients with rheumatoid arthritis (RA) to measure physical activity. However, despite its frequent use in patients with RA, its validity has not been ascertained in this population. The aim of this study was to examine the criterion validity of energy expenditure from physical activity recorded with the IPAQ-SF in patients with RA compared with the objective criterion measure, the SenseWear Armband (SWA) which has been validated previously in this population. Cross-sectional criterion validation study. Regional hospital outpatient setting. Twenty-two patients with RA attending outpatient rheumatology clinics. Subjects wore an SWA for 7 full consecutive days and completed the IPAQ-SF. Energy expenditure from physical activity recorded by the SWA and the IPAQ-SF. Energy expenditure from physical activity recorded by the IPAQ-SF and the SWA showed a small, non-significant correlation (r=0.407, P=0.60). The IPAQ-SF underestimated energy expenditure from physical activity by 41% compared with the SWA. This was corroborated using Bland and Altman plots, as the IPAQ-SF was found to overestimate energy expenditure from physical activity in nine of the 22 individuals, and underestimate energy expenditure from physical activity in the remaining 13 individuals. The IPAQ-SF has limited use as an accurate and absolute measure for estimating energy expenditure from physical activity in patients with RA. Copyright © 2014 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
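A Bland and Altman analysis of the kind mentioned above reduces to the mean difference between the two methods (the bias) and the 95% limits of agreement around it. The sketch below assumes the conventional bias ± 1.96 × SD limits; the function name and the energy-expenditure values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)                  # systematic over- or underestimation
    sd = stdev(diffs)                   # spread of the individual differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical energy expenditure (kcal/day): questionnaire vs. armband.
questionnaire = [100.0, 200.0, 300.0, 400.0]
armband = [110.0, 190.0, 320.0, 380.0]

bias, lower, upper = bland_altman(questionnaire, armband)
# Count of individuals the questionnaire overestimates relative to the armband.
overestimated = sum(q > a for q, a in zip(questionnaire, armband))
```

A bias near zero can still hide wide limits of agreement, which is how a questionnaire can overestimate for some individuals and underestimate for others.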
Background: nowadays, oral health in people with disabilities is an important topic. The psychological and behavioural problems of these people, their difficulties with environmental adaptations and the absence of any traditional communication determine the compliance needed for treatment. The aim of this work was to test the validity and reliability of an original questionnaire that could become an instrument for assessing the individual features of people with mental retardation and other developmental disabilities at the time of dental treatment. Methods: a questionnaire was created with standardised answers regarding four specific areas: neuropsychology, emotional-affect, autonomy and environmental resources. The questionnaire was completed by 63 patients from three different institutes (two rehabilitation institutes and an Institute of Dentistry for patients with special needs). To analyse the answers, each item was transformed into a numeric value. A value of 1 represented the minimum, while 4 represented full possession of the considered skills. A total of 17 variables were analysed with descriptive statistics and multivariate analysis. Internal consistency reliability was measured using Cronbach’s alpha. Furthermore, an analysis of convergent/discriminant validity was provided. Results: all variables were positively correlated. The most significant were “guidance”, “communication”, “sociability”, “view”, “hearing” and “feeding”. Items like “self-control”, “equanimity”, “problematic behaviour”, “extroversion” and “autonomy” offered vague and less significant information in identifying the patient’s collaboration level. Variables like “evaluation by the compiler about the patient’s collaboration”, “previous dental experiences” and “attendant” were confirmed. Cronbach’s alpha was 0.77 (standardized result), which meets the a priori criterion of 0.90≥alpha≥0.70. Conclusions
The purpose of this study is to develop a scale unique to our culture concerning the individual instrument performance anxiety of students receiving instrument training in Departments of Music Education. The descriptive research model is used and qualitative research techniques are utilized. The study population consists of the students attending the 23 universities that have a Music Education Department. The sample consists of 750 students (438 girls and 312 boys) studying in the Music Education Departments of 10 randomly selected universities. As a result of the exploratory and confirmatory factor analyses that were performed, a one-dimensional structure consisting of 14 items was obtained. To assess the distinguishing power of the items, t-scores for the difference between the lower and upper 27% groups and corrected item-total correlation coefficients were calculated, and both analyses showed that the items are distinguishing. The scale's Cronbach's alpha coefficient of internal consistency was .94, and its test-retest reliability coefficient was .93. As a result, a valid and reliable assessment and evaluation instrument that measures the exam performance anxiety of students studying in Departments of Music Education has been developed. Extended Abstract. Introduction. Anxiety is a universal phenomenon which people experience once or a few times during their lives. It has been described as concern for the future or as an unpleasant emotional experience regarding probable hitches in events (Di Tomasso & Gosch, 2002). In general, the occasions on which negative feelings are experienced cause anxiety to arise (Baltaş and Baltaş, 2000). People also feel anxious in dangerous situations. Anxiety may lead a person to be creative, while it may also have hindering characteristics. Anxiety is that an individual considers him
Yin, Liqin; Tang, Changfa; Tao, Xia
To study the criterion-related validity of simple muscle strength test (SMST) indicators and assess whole-body muscle strength in Chinese children aged 10 to 12 years old. Two hundred and forty children were equally divided into four groups by gender and residence. The SMST indicators (hand-grip, knee bent push-up, back muscle strength, sit-up, leg muscle strength, and standing long jump) were tested. We established the total level of whole-body muscle strength (F total) by testing the isokinetic muscle strength of the six joints' flexion and extension movements. Pearson correlation analyses were used to analyze the correlation between the SMST indicators and F total. (1) Leg muscle strength and back muscle strength demonstrated the highest validity scores. Sit-ups, hand grip, and standing long jump demonstrated the lowest validity scores. (2) Leg muscle strength had the highest validity for males, but back muscle strength had the highest validity for females. Back muscle strength and leg muscle strength give the highest validity for assessing whole-body muscle strength, and also have higher validity in both urban and rural children. For urban children, but not rural children, the knee bent push-up is also a high-validity indicator.
Erkin, Özüm; Göl, İlknur
This study aims to measure the validity and reliability of the Turkish male breast self-examination (MBSE) instrument. The methodological study was performed in 2016 at Ege University, Faculty of Nursing, İzmir, Turkey. The MBSE includes ten steps. For the validity studies, face validity, content validity, and construct validity (exploratory factor analysis) were assessed. For the reliability study, the Kuder-Richardson coefficient was calculated. The content validity index was found to be 0.94. Kendall's W coefficient was 0.80 (p=0.551). The total variance explained by the two factors was 63.24%. The Kuder-Richardson 21 coefficient was 0.97 for the instrument. The final instrument included 10 steps and two stages. The Turkish version of the MBSE is a valid and reliable instrument for early diagnosis. The MBSE can be used in Turkish-speaking countries and cultures with two stages and 10 steps.
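For readers unfamiliar with the Kuder-Richardson 21 coefficient reported above, a minimal sketch follows. KR-21 applies to dichotomously scored items and assumes equal item difficulty; it needs only the number of items k, the mean total score M and the variance of total scores s²: KR-21 = k/(k-1) × (1 - M(k-M)/(k·s²)). The use of the population variance, the function name and the toy totals are assumptions for illustration, not details from the study.

```python
from statistics import mean, pvariance

def kr21(total_scores, n_items):
    """Kuder-Richardson 21 from respondents' total scores on n_items 0/1 items."""
    m = mean(total_scores)
    var = pvariance(total_scores)       # population variance (assumed convention)
    k = n_items
    return k / (k - 1) * (1 - m * (k - m) / (k * var))

# Hypothetical total scores of four respondents on a 10-step checklist.
totals = [2, 4, 6, 8]
reliability = kr21(totals, n_items=10)
```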
Objectives: Reliable and valid instruments are essential for understanding fatigue in occupational settings. This study analyzed the psychometric properties of the Portuguese version of the Swedish Occupational Fatigue Inventory (SOFI). Material and Methods: A cross-sectional study was conducted with 218 workers from an automotive industry involved in assembly tasks for fabrication of mechanical cables. Convergent and discriminant validity, internal consistency reliability and confirmatory factor analysis were performed. Results: Results showed adequate fit to data, yielding a 20-item, 5-factor structure (all intercorrelated): Chi2/df (ratio of Chi2 to degrees of freedom) = 2.530, comparative fit index (CFI) = 0.919, goodness of fit index (GFI) = 0.845, root mean square error of approximation (RMSEA) = 0.084. The SOFI presented an adequate internal consistency, with the sub-scales and total scale presenting good reliability values (Cronbach’s α values from 0.742 to 0.903 and 0.943, respectively). Conclusions: Findings suggest that the Portuguese version of the SOFI may be a useful tool to assess fatigue and prevent work-related injuries. In future research, other instruments should be used as an external criterion to correlate with the SOFI dimensions. Int J Occup Med Environ Health 2017;30(3):407–417
Ohaeri, Jude U; Awadallab, Abdel W
There is rising interest in quality of life (QOL) research in Arabian countries. The aim of this study was to assess in a nationwide sample of Kuwaiti subjects the reliability and validity of the World Health Organization Quality of Life (WHOQOL-BREF), a shorter version of the widely used QOL assessment instrument that comprises 26 items in the domains of physical health, psychological health, social relationships, and the environment. A one-in-three systematic random proportionate sample of consenting Kuwaiti nationals attending large cooperative stores and municipal government offices in the six governorates completed the Arabic translation of the questionnaire. The indices assessed included test-retest reliability, internal consistency, item internal consistency (IIC), item discriminant validity (IDV), known-groups and construct validity. There were 3303 participants (44.8% males, 55.2% females, mean age 35.4 years, range 16 to 87 years). The intra-class correlation for the test-retest statistic and the internal consistency values (Cronbach's alpha) for the full questionnaire and the domains were ≥ 0.7. Of the 24 items that constitute the domains, 21 met the IIC requirement of a correlation ≥ 0.4 with the corresponding domain, while 16 met the IDV criterion of having a higher correlation with their corresponding domain than with other domains. Domain scores discriminated significantly between well and sick groups. In the factor analysis, four strong factors emerged with the same construct as in the WHO report. The Arabic translation of the WHOQOL-BREF has impressive reliability and validity indices. The poor IDV findings are due to the multidimensional nature of the questionnaire. The highly significant validity indices should reassure researchers that the questionnaire represents the same constructs across cultures. Negatively worded items possibly need refinement. (author)
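The item internal consistency and item discriminant validity criteria described above can be sketched as two correlation checks per item: its correlation with the rest of its own domain should reach 0.4, and that correlation should exceed its correlation with any other domain. Correcting the own-domain correlation for overlap (leaving the item out of its own domain sum) is an assumption here, as are the function names and the toy responses.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def domain_sum(items, exclude=None):
    """Per-respondent sum of the items, optionally leaving one item out."""
    n = len(items[0])
    return [sum(it[i] for it in items if it is not exclude) for i in range(n)]

# Hypothetical responses of five people: three items in the item's own
# domain and two items in another domain.
own = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
other = [[3, 1, 4, 2, 3], [2, 2, 3, 1, 4]]

item = own[0]
r_own = pearson(item, domain_sum(own, exclude=item))   # corrected for overlap
r_other = pearson(item, domain_sum(other))

meets_iic = r_own >= 0.4          # item internal consistency criterion
meets_idv = r_own > r_other       # item discriminant validity criterion
```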
Arce-Ferrer, Alvaro J.; Castillo, Irene Borges
The use of face-to-face interviews is controversial for college admissions decisions in light of the lack of availability of validity and reliability evidence for most college admission processes. This study investigated reliability and incremental predictive validity of a face-to-face postgraduate college admission interview with a sample of…
de Groot, Sonja; Balvers, Inge J.M.; Kouwenhoven, Sanne M.; Janssen, Thomas W.J.
The purpose of this study was to investigate the reliability and validity of wheelchair basketball field tests. Nineteen wheelchair basketball players performed 10 test items twice to determine the reliability. The validity of the tests was assessed by relating the scores to the players'
Nederhof, Esther; Brink, Michel S.; Lemmink, Koen A. P. M.
The purpose of the present study was to investigate the cross-cultural validity of the Recovery Stress Questionnaire for Athletes (RESTQ-sport) by analysing reliability and validity of a Dutch translation. Two studies were performed to assess test-retest reliability with a one week interval,
The present study aims to determine the validity and reliability of the academic resilience scale in Turkish high schools. The participants of the study include 378 high school students in total (192 female and 186 male). A set of analyses was conducted in order to determine the validity and reliability of the scale. Firstly, both exploratory…
This study presents the processes of developing a reading test and establishing its reliability and validity through an integrative approach, as conventional reliability and validity measures only superficially reveal the difficulty of a reading test. In this respect, analysing the vocabulary frequency of the test is regarded as a more eligible way…
Bhat, Mehraj A.
This paper is based on the construction of a reasoning ability test for secondary school students and the evaluation of its reliability and validity. An attempt was made to evaluate validity and reliability and to determine appropriate standards for interpreting the results of the reasoning ability test. The test includes 45 items to measure six types…
Markon, Kristian E.; Chmielewski, Michael; Miller, Christopher J.
In 2 meta-analyses involving 58 studies and 59,575 participants, we quantitatively summarized the relative reliability and validity of continuous (i.e., dimensional) and discrete (i.e., categorical) measures of psychopathology. Overall, results suggest an expected 15% increase in reliability and 37% increase in validity through adoption of a…
Worrell, Frank C.; Mello, Zena R.
In this study, the authors examined the reliability, structural validity, and concurrent validity of Zimbardo Time Perspective Inventory (ZTPI) scores in a group of 815 academically talented adolescents. Reliability estimates of the purported factors' scores were in the low to moderate range. Exploratory factor analysis supported a five-factor…
Smith, Jack E.; Hakel, Milton D.
Examined are questions pertinent to the use of the Position Analysis Questionnaire: Who can use the PAQ reliably and validly? Must one rely on trained job analysts? Can people having no direct contact with the job use the PAQ reliably and validly? Do response biases influence PAQ responses? (Author/KC)
Boonstra, Anne M.; Schiphorst Preuper, Henrica R.; Reneman, Michiel F.; Posthumus, Jitze B.; Stewart, Roy E.
The objective of the study was to determine the reliability and concurrent validity of a visual analogue scale (VAS) for disability as a single-item instrument measuring disability in chronic pain patients. For the reliability study a test-retest design and for the validity study a cross-sectional
Boonstra, Anne M.; Reneman, Michiel F.; Stewart, Roy E.; Balk, Gerlof A.
The aim of this study was to determine the reliability and discriminant validity of the Dutch version of the life satisfaction questionnaire (Lisat-9 DV) to assess patients with an acquired brain injury. The reliability study used a test-retest design, and the validity study used a cross-sectional design. The setting was the general rehabilitation…
Mehmet Emrah Karadere
Conclusion: The preliminary data obtained from the reliability and validity study of the scale show that the Reasoning with Inductive Argument Test is reliable and valid in the Turkish population. [JCBPR 2013; 2(3): 156-161]
Barrett, Eva; McCreesh, Karen; Lewis, Jeremy
A wide array of instruments is available for non-invasive thoracic kyphosis measurement. Guidelines for selecting outcome measures for use in clinical and research practice recommend that properties such as validity and reliability are considered. This systematic review reports on the reliability and validity of non-invasive methods for measuring thoracic kyphosis. A systematic search of 11 electronic databases located studies assessing reliability and/or validity of non-invasive thoracic kyphosis measurement techniques. Two independent reviewers used a critical appraisal tool to assess the quality of retrieved studies. Data were extracted by the primary reviewer. The results were synthesized qualitatively using a level of evidence approach. 27 studies satisfied the eligibility criteria and were included in the review. Reliability alone, validity alone, and both reliability and validity were investigated by sixteen, two and nine studies respectively. 17/27 studies were deemed to be of high quality. In total, 15 methods of thoracic kyphosis measurement were evaluated in the retrieved studies. All investigated methods showed high (ICC ≥ .7) to very high (ICC ≥ .9) levels of reliability. The validity of the methods ranged from low to very high. The strongest levels of evidence for reliability exist in support of the Debrunner kyphometer, Spinal Mouse and Flexicurve index, and for validity in support of the arcometer and Flexicurve index. Further reliability and validity studies are required to strengthen the level of evidence for the remaining methods of measurement. This should be addressed by future research. Copyright © 2013 Elsevier Ltd. All rights reserved.
Prevention strategies are effective only when there are epidemiological data for the targeted populations. The collection of such .... Proquest, Sport discuss and Cochrane as these are ... 0.74, test retest reliability 0.70; Diet: internal consistency:.
Travel time reliability (TTR) has been proposed as a better measure of a facility's performance than a statistical measure like peak hour demand. TTR is based on more information about average traffic flows and longer time periods, thus inc...
The EuReDatA Working Group produced a basic document that addressed many of the problems associated with the design of a suitable data collection scheme to achieve pre-defined objectives. The book that resulted from this work describes the need for reliability data, data sources and collection procedures, component description and classification, form design, data management, updating and checking procedures, the estimation of failure rates, availability and utilisation factors, and uncertainties in reliability parameters. (DG)
McCrae, Robert R.; Kurtz, John E.; Yamagata, Shinji; Terracciano, Antonio
We examined data (N = 34,108) on the differential reliability and validity of facet scales from the NEO Inventories. We evaluated the extent to which (a) psychometric properties of facet scales are generalizable across ages, cultures, and methods of measurement; and (b) validity criteria are associated with different forms of reliability. Composite estimates of facet scale stability, heritability, and cross-observer validity were broadly generalizable. Two estimates of retest reliability were independent predictors of the three validity criteria; none of three estimates of internal consistency was. Available evidence suggests the same pattern of results for other personality inventories. Internal consistency of scales can be useful as a check on data quality, but appears to be of limited utility for evaluating the potential validity of developed scales, and it should not be used as a substitute for retest reliability. Further research on the nature and determinants of retest reliability is needed. PMID:20435807
seyed abolfazl zakerian; Roya Azizi; Mehdi Rahgozar
The term usability refers to a special index of the success of an operating system. This study aimed to determine the reliability and validity of the Software Usability Measurements Inventory (SUMI) questionnaire, one of the valid and commonly used questionnaires for usability evaluation. The back-translation method was used to translate the questionnaire from English to Persian and back to English. Moreover, repeatability (test-retest reliability) was used to determine the reliability of ...
El-Housseiny, Azza A; Alsadat, Farah A; Alamoudi, Najlaa M; El Derwi, Douaa A; Farsi, Najat M; Attar, Moaz H; Andijani, Basil M
Early recognition of dental fear is essential for the effective delivery of dental care. This study aimed to test the reliability and validity of the Arabic version of the Children's Fear Survey Schedule-Dental Subscale (CFSS-DS). A school-based sample of 1546 children was randomly recruited. The Arabic version of the CFSS-DS was completed by children during class time. The scale was tested for internal consistency and test-retest reliability. To test criterion validity, children's behavior was assessed using the Frankl scale during dental examination, and results were compared with children's CFSS-DS scores. To test the scale's construct validity, scores on "fear of going to the dentist soon" were correlated with CFSS-DS scores. Factor analysis was also used. The Arabic version of the CFSS-DS showed high reliability regarding both test-retest reliability (intraclass correlation = 0.83, p children with negative behavior had significantly higher fear scores (t = 13.67, p fear of invasive dental procedures," "fear of less invasive dental procedures" and "fear of strangers." The Arabic version of the CFSS-DS is a reliable and valid measure of dental fear in Arabic-speaking children. Pediatric dentists and researchers may use this validated version of the CFSS-DS to measure dental fear in Arabic-speaking children.
Randall Simpson, Janis; Gumbley, Jillian; Whyte, Kylie; Lac, Jane; Morra, Crystal; Rysdale, Lee; Turfryer, Mary; McGibbon, Kim; Beyers, Joanne; Keller, Heather
Nutrition is vital for optimal growth and development of young children. Nutrition risk screening can facilitate early intervention when followed by nutritional assessment and treatment. NutriSTEP (Nutrition Screening Tool for Every Preschooler) is a valid and reliable nutrition risk screening questionnaire for preschoolers (aged 3-5 years). A need was identified for a similar questionnaire for toddlers (aged 18-35 months). The purpose was to develop a reliable and valid Toddler NutriSTEP. Toddler NutriSTEP was developed in 4 phases. Content and face validity were determined with a literature review, parent focus groups (n = 6; 48 participants), and experts (n = 13) (phase A). A draft questionnaire was refined with key intercept interviews of 107 parents/caregivers (phase B). Test-retest reliability (phase C), based on intra-class correlations (ICC), Kappa (κ) statistics, and Wilcoxon tests was assessed with 133 parents/caregivers. Criterion validity (phase D) was assessed using Receiver Operating Characteristic (ROC) curves by comparing scores on the Toddler NutriSTEP to a comprehensive nutritional assessment of 200 toddlers with a registered dietitian (RD). The Toddler NutriSTEP was reliable between 2 administrations (ICC = 0.951, F = 20.53, p Toddler NutriSTEP were correlated (r = 0.67, p Toddler NutriSTEP questionnaire is both reliable and valid for screening for nutritional risk in toddlers.
Shioda, Ai; Tadaka, Etsuko; Okochi, Ayako
Community integration is an essential right for people with schizophrenia that affects their well-being and quality of life, but no valid instrument exists to measure it in Japan. The aim of the present study is to develop and evaluate the reliability and validity of the Japanese version of the Community Integration Measure (CIM) for people with schizophrenia. The Japanese version of the CIM was developed as a self-administered questionnaire based on the original version of the CIM, which was developed by McColl et al. This study of the Japanese CIM had a cross-sectional design. Construct validity was determined using a confirmatory factor analysis (CFA) and data from 291 community-dwelling people with schizophrenia in Japan. Internal consistency was calculated using Cronbach's alpha. The Lubben Social Network Scale (LSNS-6), the Rosenberg Self-Esteem Scale (RSE) and the UCLA Loneliness Scale, version 3 (UCLALS) were administered to assess the criterion-related validity of the Japanese version of the CIM. The participants were 263 people with schizophrenia who provided valid responses. The Cronbach's alpha was 0.87, and CFA identified one domain with ten items that demonstrated the following values: goodness of fit index = 0.924, adjusted goodness of fit index = 0.881, comparative fit index = 0.925, and root mean square error of approximation = 0.085. The correlation coefficients were 0.43 (p reliability and validity for assessing community integration for people with schizophrenia in Japan.
... (job learning difficulty and cross-AFS differences in aptitude requirements), (b) XJRThs exhibited some postdictive validity when evaluated against Airman Retraining Program Survey retraining ease criteria, (c...
Zhang, Tingting; Yin, Anchun; Sun, Xiaohong; Liu, Qigui; Song, Guirong; Li, Lianhong
To develop a psychosocial adaptation scale for Parkinson's disease (PD) in the Chinese population and evaluate its reliability and validity. The items were designed by literature review, expert consultation and semi-structured interview. Corrected item-total correlation, discrimination analysis and exploratory factor analysis were used for item selection. A total of 427 valid scales from PD patients were collected to test reliability and validity. The scale comprised 32 items in six dimensions: anxiety, self-esteem, attitude, self-acceptance, self-efficacy and social support. The scale possessed good internal consistency. The test-retest correlation coefficient was 0.99 and the average content validation rate was 0.97. The Hoehn and Yahr stage was correlated with the total score of the scale. The psychosocial adaptation scale showed good reliability and validity; it can be used as a reliable and valid instrument to evaluate the psychosocial adaptation of PD patients objectively and effectively.
Mehrnoosh Pazargadi; Tahereh Ashktorab; Sharareh Khosravi; Hamid Alavi majd
Background: The need for a valid and reliable assessment tool is one of the most frequently raised issues in the clinical evaluation of nursing students. However, present tools are believed to be largely invalid and unable to assess students' performance properly. Objectives: This study was conducted to design a valid and reliable assessment tool for evaluating nursing students' performance in clinical education. Methods: In this methodological study considering nursing students' performance definition; th...
Ko, Jupil; Rosen, Adam B; Brown, Cathleen N
To cross-culturally adapt the Identification of Functional Ankle Instability (IdFAI) for use with Korean-speaking participants. The English version of the IdFAI was cross-culturally adapted into Korean based on the guidelines. The psychometric properties of the Korean version of the IdFAI were measured for test-retest reliability, internal consistency, criterion-related validity, discriminative validity, and measurement error in 181 native Korean speakers. The intra-class correlation coefficient (ICC 2,1) between the English and Korean versions of the IdFAI for test-retest reliability was 0.98 (standard error of measurement = 1.41). The Cronbach's alpha coefficient was 0.89 for the Korean version of the IdFAI. The Korean version of the IdFAI had a strong correlation with the SF-36 (r s = -0.69, p 10 was the optimal cutoff score to distinguish between the group memberships. The minimally detectable change of the Korean version of the IdFAI score was 3.91. The Korean version of the IdFAI has been shown to be an excellent, reliable, and valid instrument. The Korean version of the IdFAI can be utilized to assess the presence of Chronic Ankle Instability by researchers and clinicians working among Korean-speaking populations. Implications for rehabilitation The high recurrence rate of sprains may result in Chronic Ankle Instability (CAI). The Identification of Functional Ankle Instability Tool (IdFAI) has been validated and recommended to identify patients with Chronic Ankle Instability (CAI). The Korean version of the Identification of Functional Ankle Instability Tool (IdFAI) may also be recommended to researchers and clinicians for assessing the presence of Chronic Ankle Instability (CAI) in Korean-speaking populations.
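The abstract does not say how the minimally detectable change was derived. As a rough check, and assuming the conventional 95% formula MDC95 = 1.96 × √2 × SEM (an assumption, not something the abstract states), the reported standard error of measurement of 1.41 gives a value that rounds to the reported 3.91. The function name below is illustrative only.

```python
import math


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """Minimal detectable change from the standard error of measurement.

    Assumes the conventional formula MDC = z * sqrt(2) * SEM, where
    sqrt(2) accounts for measurement error on both test and retest.
    """
    return z * math.sqrt(2.0) * sem


# SEM = 1.41 is the value reported in the abstract above.
mdc95 = minimal_detectable_change(1.41)
print(round(mdc95, 2))  # 3.91
```

Under this assumption, a change in an individual's IdFAI score smaller than about 4 points could not be distinguished from measurement error.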
Objectives. The optimal tool for identifying poststroke depression (PSD) is yet to be identified. In the present study, we rely on the depression subscale of the Hospital Anxiety and Depression Scale (HADS-D) as a meaningful criterion to investigate the psychometric properties of the HRQOLISP-E, a new context-specific screening tool for PSD developed from a large cross-cultural sample. Methods. We assessed baseline data being collected as part of an intervention to improve one-year blood pressure control among recent (≤ one month) stroke survivors. Depression was measured using the HADS-D and the HRQOLISP-E. We determined sensitivity, specificity, likelihood ratios, and posttest probability. The area under a receiver operator curve (AUC) and the most appropriate HRQOLISP-E cut-off were also determined using standard procedures. Results. Using data derived from 387 recent stroke survivors, the HRQOLISP-E showed high agreement with the HADS-D, sensitivity = 73.7%, specificity = 79.3%, and posterior test probability = 88% (95% CI = 84%–91%). The AUC was 0.81 (95% CI = 0.76–0.86). The HRQOLISP-E cut-off, corresponding to HADS-D score ≥ 8, was 20/21 (out of a total score of 30). Conclusions. Within limitations of using the HADS-D as a referent criterion, the present results provide justification for further development of the HRQOLISP-E as the first stroke-specific screening tool for depression.
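The likelihood ratios mentioned in the Methods are not reported in the abstract, but they follow directly from the reported sensitivity and specificity using the standard definitions. The sketch below is illustrative; the function names and the pre-test probability of 0.5 are hypothetical and are not taken from the study, so the post-test value it produces should not be compared with the 88% reported above.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Sensitivity and specificity as reported in the abstract above.
lr_pos, lr_neg = likelihood_ratios(0.737, 0.793)
print(round(lr_pos, 2), round(lr_neg, 2))  # 3.56 0.33

# Hypothetical pre-test probability, chosen only to show the calculation.
print(post_test_probability(0.5, lr_pos))
```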
Aguiar, Larissa T; Lara, Eliza M; Martins, Julia C; Teixeira-Salmela, Luci F; Quintino, Ludmylla F; Christo, Paulo P; DE Morais Fairaa, Christina
Limitations in activities have been related to weakness of the upper limbs (UL), lower limbs (LL) and trunk muscles after stroke. Therefore, the measurement of strength after stroke becomes essential. The Modified Sphygmomanometer Test (MST) is an alternative method for the measurement of strength, since it is cheap and provides objective values. However, no studies have investigated the measurement properties of the MST in sub-acute stroke. To investigate the test-retest and inter-rater reliabilities and criterion-related validity of the MST for the measurement of strength of the UL, LL, and trunk muscles in subjects with sub-acute stroke, and verify whether the number of trials would affect the results. Diagnostic accuracy. Local community, out-patient clinics, and university laboratory. Sixty-five subjects with sub-acute stroke (62±14 years) participated in the present study. The strength of 36 muscular groups was measured with the MST and dynamometers (criterion standard). To investigate whether the number of trials would affect the results, analysis of variance was applied. For the test-retest and inter-rater reliabilities and criterion-related validity of the MST, intra-class correlation coefficients (ICC), Pearson correlation coefficients, and coefficients of determination were calculated. Similar results were found for all muscular groups and number of trials (0.01≤F≤0.14; 0.87≤p≤0.99) with significant and adequate values of test-retest (0.57≤ICC≤0.98) (exception: first trial of the non-paretic ankle dorsiflexors) and inter-rater (0.50≤ICC≤0.99) (exception: non-paretic ankle plantar flexors) reliabilities and validity (0.70≤r≤0.95; p≤0.001). The values obtained with the MST were good predictors of those obtained with the dynamometers (0.54≤r²≤0.90). In general, the MST showed adequate reliabilities and criterion-related validity for measuring strength of subjects with sub-acute stroke, and only one trial, after familiarization
1 University of Northern Iowa, Division of Athletic Training, 003C Human. Performance Center, Cedar ... concurrent validity of the fingertip-to-floor distance test (FFD) ... in these protocols are spinal and extremity range of motion, pelvic control ...
This article discusses the use of assessment by teachers to replace external marking. It shows how professional participation and moderation can provide reliability in summative assessment, even in public examinations for older students. It draws on historical experiences of assessment for A-level English literature.
None of these three instruments involves high costs or requires high technical skills; all are mobile, save time, and are suitable for use in large populations. Because all three instruments can estimate the percentage of body fat, it is important to identify the most appropriate instrument with high reliability. Hence, this ...
Kern, Jeffrey M.; MacDonald, Marian L.
The reliability and meaning of assertiveness tests were explored using 120 female undergraduates. Several self-report inventories (the College Self-Expression Scale, Conflict Resolution Inventory, and a global rating from one to seven) were administered, as were three anxiety measures (Timed Behavior Checklist, response latency, and response…
Konge, Lars; Larsen, Klaus Richter; Clementsen, Paul
The interrater reliability was high, with Cronbach's α = 0.86. Assessment of 3 bronchoscopies by a single rater had a generalizability coefficient of 0.84. The correlation between experience and performance was good (Pearson correlation = 0.76). There were significant differences between the groups for all...
Cetin, Bayram; Yaman, Erkan; Peker, Adem
The purpose of this study is to develop a reliable and valid scale that measures the cyber victimization and bullying behaviors of high school students. The research group consisted of 404 students (250 male, 154 male) in Sakarya in the 2009-2010 academic year. The mean age of the study sample was 16.68. Content validity and face validity of the scale was…
Shek, Daniel T. L.; Lai, Kelly Y. C.
Reliability and validity of Chinese Self-Report Family Inventory (C-SFI) were examined in three studies. Study 1 showed C-SFI was temporally stable and internally consistent. Study 2 indicated C-SFI could discriminate between clinical and nonclinical groups. Study 3 gave support for internal consistency, concurrent validity and construct validity.…
Dobbin, Nick; Hunwicks, Richard; Jones, Ben; Till, Kevin; Highton, Jamie; Twist, Craig
To examine the criterion and construct validity of an isometric midthigh-pull dynamometer to assess whole-body strength in professional rugby league players. Fifty-six male rugby league players (33 senior and 23 youth players) performed 4 isometric midthigh-pull efforts (ie, 2 on the dynamometer and 2 on the force platform) in a randomized and counterbalanced order. Isometric peak force was underestimated (P .05) between the predicted and peak force from the force platform and an adjusted R 2 (79.6%) that represented shrinkage of 0.4% relative to the cross-validation model (80%). Peak force was greater for the senior than the youth professionals using the dynamometer (2261.2 ± 222 cf 1725.1 ± 298.0 N, respectively; P isometric midthigh pull assessed using a dynamometer underestimates criterion peak force but is capable of distinguishing muscle-function characteristics between professional rugby league players of different standards.
Atik Altınok, Yasemin; Özgür, Suriye; Meseri, Reci; Özen, Samim; Darcan, Şükran; Gökşen, Damla
The aim of this study was to show the reliability and validity of a Turkish version of Diabetes Eating Problem Survey-Revised (DEPS-R) in children and adolescents with type 1 diabetes mellitus. A total of 200 children and adolescents with type 1 diabetes, ages 9-18 years, completed the DEPS-R Turkish version. In addition to tests of validity, confirmatory factor analysis was conducted to investigate the factor structure of the 16-item Turkish version of DEPS-R. The Turkish version of DEPS-R demonstrated satisfactory Cronbach's α (0.847) and was significantly correlated with age (r=0.194; p1), hemoglobin A1c levels (r=0.303; p1), and body mass index-standard deviation score (r=0.412; p1) indicating criterion validity. Median DEPS-R scores of the Turkish version for the total sample, females, and males were 11.0, 11.5, and 10.5, respectively. Disturbed eating behaviors and insulin restriction were associated with poor metabolic control. A short, self-administered diabetes-specific screening tool for disordered eating behavior can be used routinely in the clinical care of adolescents with type 1 diabetes. The Turkish version of DEPS-R is a valid screening tool for disordered eating behaviors in type 1 diabetes and is potentially important for the early detection of disordered eating behaviors.
Uysal, Hilal; Ozcan, Şeyda
In recent years, many new measuring instruments have been developed to allow broader psychometric measurement in coronary artery disease, disease-specific health status measurement, and assessment of broader quality of life. This study was intended to determine whether, and to what extent, MIDAS is a valid and reliable measure for patients in Turkey suffering from myocardial infarction for the first time. The research was conducted with patients hospitalized and treated for myocardial infarction in the cardiology departments of 2 hospitals in Istanbul, Turkey, between 2007 and 2008. For the validity studies of the TR-MIDAS, language validity, content validity, and construct validity were examined. For the reliability studies, the tool's internal consistency (Cronbach's alpha reliability coefficient) and test-retest reliability were assessed. The instrument's content validity index was 0.95. Principal component analysis revealed six factors with an eigenvalue >1.5. Cronbach's alpha for the total scale was 0.89, an acceptable value. The test-retest reliability of the total scale was 0.51 (p<0.01). The data obtained support the Turkish Myocardial Infarction Dimensional Assessment Scale as a valid and reliable disease-specific instrument for assessing the quality of life of patients suffering from myocardial infarction in Turkey. Copyright © 2010 European Society of Cardiology. Published by Elsevier B.V. All rights reserved.
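Several abstracts in this listing report Cronbach's alpha as the measure of internal consistency. A minimal sketch of the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of total scores), is given below. The response matrix is hypothetical and is not data from any of the studies listed; the function name is likewise illustrative.

```python
import statistics


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    Each inner list holds one respondent's scores on the k items.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    item_variances = [
        statistics.variance([row[i] for row in scores]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents, 3 items.
responses = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(alpha)
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of stability over time.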
Watt, Torquil; Hegedüs, Laszlo; Groenvold, Mogens
Background Appropriate scale validity and internal consistency reliability have recently been documented for the new thyroid-specific quality of life (QoL) patient-reported outcome (PRO) measure for benign thyroid disorders, the ThyPRO. However, before clinical use, clinical validity and test-retest reliability should be evaluated. Aim To investigate clinical ('known-groups') validity and test-retest reliability of the Danish version of the ThyPRO. Methods For each of the 13 ThyPRO scales, we defined groups expected to have high versus low scores ('known-groups'). The clinical validity (known-groups validity) was evaluated by whether the ThyPRO scales could detect expected differences in a cross-sectional study of 907 thyroid patients. Test-retest reliability was evaluated by intra-class correlations of two responses to the ThyPRO 2 weeks apart in a subsample of 87 stable patients. Results On all 13...
Skinner, T. C.; Howells, L.; Greene, S.
Aims: This article reports on the development and validity of a Diabetes-specific Illness Representations Questionnaire (DIRQ) to assess all five dimensions of an individual's perception of diabetes, for adolescents with Type 1 diabetes mellitus. Methods: There were two development studies. Study 1...... with a diabetes self-efficacy and barriers to adherence questionnaire. Subsequently there were two validation studies. Study 3: participants (n = 44 adolescents and 28 parents) completed the DIRQ and questionnaires assessing their self-care and psychological well-being. Glycaemic control was assessed through...... consist of two subscales, perceived threat and perceived impact, and provide further support for the distinction between treatment effectiveness to control diabetes and treatment effectiveness to prevent complications. Along with the validation studies, the results indicate that the questionnaire scales...
O'Sullivan, Elizabeth J; Rasmussen, Kathleen M
, and mode of infant HM consumption and duration of maternal HM production that is reliable within 19 to 35 months postpartum. Criterion-validity testing of these questions will improve the utility of the Questionnaire on Infant Feeding as a surveillance tool. Copyright © 2017 Academy of Nutrition and Dietetics. Published by Elsevier Inc. All rights reserved.
Chao, Kuo-Yu; Wang, Huei-Shyong; Chang, Hsueh-Ling; Wang, Yi-Wen; See, Lai-Chu
The aim of this study was to evaluate the validity and reliability of the stress index for 10-18-years-old children or adolescents with Tourette syndrome. Tourette syndrome is a chronic tic disorder that begins in childhood. Children with Tourette syndrome exhibit sudden and unexpected vocalizations or movements that may affect their daily activities and create barriers to interaction. Therefore, a self-report stress index is necessary for children with Tourette syndrome to quickly measure the stress they experience. Eight experts rated appropriateness, comprehensiveness and relevance of the questionnaire to establish content validity. A total of 116 paediatric patients filled out the stress index for 10-18-years-old children or adolescents with Tourette syndrome to evaluate its construct validity using exploratory factor analysis and internal consistency. Data from 90 pairs of paediatric patients and their caregivers were used to evaluate the inter-rater reliability. The criterion validity index ranged from 80-98%. One item was deleted because of a small item-to-total correlation. Therefore, 26 items made up the final stress index for 10-18-years-old children or adolescents with Tourette syndrome. In exploratory factor analysis, four factors (unfairly treated, psychological, symptom control and future concern) were identified and accounted for 52.3% of the total variance. Cronbach's alpha of the stress index for 10-18-years-old children or adolescents with Tourette syndrome was 0.89. The inter-rater reliability of the stress index for 10-18-years-old children or adolescents with Tourette syndrome (Pearson correlation coefficient between patients and their caregivers) was 0.56. The stress index for 10-18-years-old children or adolescents with Tourette syndrome is a self-administered tool to assess the stress of children or adolescents with Tourette syndrome. Validity (content and construct) and reliability (internal consistency and inter
Bech, B; Lönn, L; Falkenberg, M
Objectives To study the construct validity and reliability of a novel endovascular global rating scale, Structured Assessment of endoVascular Expertise (SAVE). Design A Clinical, experimental study. Materials Twenty physicians with endovascular experiences ranging from complete novices to highly....... Validity was analysed by correlating experience with performance results. Reliability was analysed according to generalisability theory. Results The mean score on the 29 items of the SAVE scale correlated well with clinical experience (R = 0.84, P ... with clinical experience (R = -0.53, P validity and reliability of assessment with the SAVE scale was high when applied to performances in a simulation setting with advanced realism. No ceiling effect...
Sharma, Sonia; Crow, Heidi C; McCall, W D; Gonzalez, Yoly M
To conduct a systematic review of papers reporting the reliability and diagnostic validity of the joint vibration analysis (JVA) for diagnosis of temporomandibular disorders (TMD). A search of Pubmed identified English-language publications of the reliability and diagnostic validity of the JVA. Guidelines were adapted from applied STAndards for the Reporting of Diagnostic accuracy studies (STARD) to evaluate the publications. Fifteen publications were included in this review, each of which presented methodological limitations. This literature is unable to provide evidence to support the reliability and diagnostic validity of the JVA for diagnosis of TMD.
Turel, Yalin Kilic
The interactive whiteboard (IWB) has become a popular technology for instructors over the last decade. Though research asserts that the IWBs facilitate learning in different ways, there is a lack of studies examining actual IWB use in classroom settings based on learners' perspectives by means of valid instruments. The purpose of this study is to…
They completed this 15-item self-rated instrument that assesses patient satisfaction with services using a 5-point response format. Results: The internal consistency for the scale was high (α = 0.91), and item-total correlations ranged from 0.33 to 0.70. Its convergent validity was supported by significant correlations of all ...
Ilker, Gokce Erturan; Arslan, Yunus; Demirhan, Giyasettin
The Trichotomous Achievement Goal Scale was developed by Agbuga and Xiang (2008) by including selected items from the scales of Duda and Nicholls (1992), Elliot (1999), and Elliot and Church (1997) and adapting them into Turkish. The scale consists of 18 items, and students rated each item on a 7-point Likert scale. To ascertain the validity and…
Objective assessment methods to monitor residual limb volume following lower-limb amputation are required to enhance practitioner-led prosthetic fitting. Computer-aided systems, including 3D scanners, present numerous advantages, and the recent Artec Eva scanner, based on laser-free technology, could potentially be an effective solution for monitoring residual limb volumes. The aim of this study was to assess the validity and reliability of the Artec Eva scanner (practical measurement) against a high-precision laser 3D scanner (criterion measurement) for the determination of residual limb model shape and volume. Three observers completed three repeat assessments of ten residual limb models, using both scanners. Validity of the Artec Eva scanner was assessed (mean percentage error <2%) and Bland-Altman statistics were adopted to assess the agreement between the two scanners. Intra- and inter-rater reliability (repeatability coefficient <5%) of the Artec Eva scanner was calculated for measuring indices of residual limb model volume and shape (i.e. residual limb cross-sectional areas and perimeters). Residual limb model volumes ranged from 885 to 4399 ml. Mean percentage error of the Artec Eva scanner (validity) was 1.4% of the criterion volumes. Correlation coefficients between the Artec Eva and the Romer determined variables were higher than 0.9. Volume intra-rater and inter-rater reliability coefficients were 0.5% and 0.7%, respectively. Shape percentage maximal error was 2% at the distal end of the residual limb, with intra-rater reliability coefficients presenting the lowest errors (0.2%), both for cross-sectional areas and perimeters of the residual limb models. The Artec Eva scanner is a valid and reliable method for assessing residual limb model shapes and volumes. While the method needs to be tested on human residual limbs and the results compared with the current system used in clinical practice, it has the potential to quantify shape and volume
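The two agreement statistics named in this abstract, mean percentage error and Bland-Altman limits of agreement, can be sketched as follows. The volumes are hypothetical placeholders rather than the study's data, the function names are illustrative, and the use of the absolute error relative to the criterion volume is an assumption, since the abstract does not define its percentage error.

```python
import statistics


def bland_altman(practical: list[float], criterion: list[float]) -> tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement.

    Bias is the mean of the paired differences; the limits are
    bias +/- 1.96 sample standard deviations of those differences.
    """
    differences = [p - c for p, c in zip(practical, criterion)]
    bias = statistics.mean(differences)
    spread = 1.96 * statistics.stdev(differences)
    return bias, bias - spread, bias + spread


def mean_percentage_error(practical: list[float], criterion: list[float]) -> float:
    """Mean absolute error as a percentage of the criterion value (assumed definition)."""
    return statistics.mean(
        abs(p - c) / c * 100.0 for p, c in zip(practical, criterion)
    )


# Hypothetical model volumes in ml (not the study's measurements).
criterion_ml = [900.0, 1500.0, 2200.0, 3100.0, 4300.0]
practical_ml = [910.0, 1485.0, 2230.0, 3070.0, 4350.0]

bias, lower, upper = bland_altman(practical_ml, criterion_ml)
print(bias, lower, upper)
print(mean_percentage_error(practical_ml, criterion_ml))
```

The Bland-Altman limits describe how far a single practical measurement may plausibly fall from the criterion, which a correlation coefficient alone does not show.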
Hadadi, Mohammad; Ebrahimi Takamjani, Ismail; Ebrahim Mosavi, Mohammad; Aminian, Gholamreza; Fardipour, Shima; Abbasi, Faeze
The purpose of the present study was to translate and cross-culturally adapt the Cumberland Ankle Instability Tool (CAIT) into the Persian language and to evaluate its psychometric properties. The International Quality of Life Assessment process was followed to translate the CAIT into Persian. Two groups of Persian-speaking individuals, 105 participants with a history of ankle sprain and 30 participants with no history of ankle sprain, were asked to fill out the Persian version of the CAIT (CAIT-P), the Foot and Ankle Ability Measure (FAAM), and a Visual Analog Scale (VAS). Data obtained from the first administration of the CAIT were used to evaluate floor and ceiling effects, internal consistency, dimensionality, and criterion validity. To determine the test-retest reliability, 45 individuals completed the CAIT again 5-7 days after the first session. Cronbach's alpha was over the cutoff point of 0.70 for both ankles and in both groups. The intra-class correlation coefficient was high for right (0.95) and left (0.91) ankles. There was a strong correlation between each item and the total score of the CAIT-P. Although the CAIT-P had a strong correlation with the VAS, its correlation with both subscales of the FAAM was moderate. The CAIT-P has good validity and reliability and can be used by clinicians and researchers for identification and investigation of functional ankle instability. Implications for Rehabilitation Chronic ankle instability is one of the most common consequences of acute ankle sprain. The Cumberland Ankle Instability Tool is an acceptable measure to determine functional ankle instability and its severity. The Persian version of the Cumberland Ankle Instability Tool is a valid and reliable tool for clinical and research purposes in Persian-speaking individuals.
Yehya, Arij; Ghuloum, Suhaila; Mahfoud, Ziyad; Opler, Mark; Khan, Anzalee; Hammoudeh, Samer; Abdulhakam, Abdulmoneim; Al-Mujalli, Azza; Hani, Yahya; Elsherbiny, Reem; Al-Amin, Hassen
The Positive and Negative Syndrome Scale (PANSS) is widely used for patients with schizophrenia. This scale is reliable and valid. The PANSS was translated and validated in several languages. The aim of this study was to translate and validate the PANSS in the Arab population. The PANSS was translated into formal Arabic language using the back-translation method. 101 Arab patients with schizophrenia and 98 Arabs with no diagnosis of any mental disorder were recruited. The Arabic version of the Mini International Neuropsychiatric Interview (MINI-6) was used as a diagnostic tool to confirm the diagnosis of schizophrenia or rule out any diagnosis for the healthy control group. Reliability of the scale was assessed by calculating internal consistency, interrater reliability and test-retest reliability. Construct validity was assessed using the Arabic version of the MINI-6. PANSS total scores were correlated with the Clinical Global Impression-Severity scale. Our findings showed that the internal consistency was good (0.92). Scores on the PANSS of the patients were much higher than those of the healthy controls. The PANSS showed good interrater reliability and test-retest reliability (0.92 and 0.75, respectively). In comparison with the MINI-6, the PANSS showed good sensitivity and specificity, which implies good construct validity of this version. In conclusion, the Arabic version of the PANSS is a reliable and valid instrument for the assessment of patients with schizophrenia in the Arab population. © 2016 S. Karger AG, Basel.
Marshall, Skye; Young, Adrienne; Bauer, Judith; Isenring, Elizabeth
Accurate identification and management of malnutrition is essential so that patient outcomes can be improved and resources used efficaciously. In malnourished older adults admitted to rehabilitation: 1) report the prevalence, health and aged care use, and mortality of malnourished older adults; 2) determine and compare the criterion (concurrent and predictive) validity of the Scored Patient-Generated Subjective Global Assessment (PG-SGA) and the Mini Nutritional Assessment (MNA) in diagnosing malnutrition; and 3) identify the Scored PG-SGA score cut-off value associated with malnutrition. Observational, prospective cohort. Participants were 57 older adults (65 years and older; mean±standard deviation age=79.1±7.3 years) from two rural rehabilitation units in New South Wales, Australia. Scored PG-SGA; MNA; and the International Statistical Classification of Diseases and Health Related Problems, 10th revision, Australian Modification (ICD-10-AM) classification of malnutrition were compared to establish concurrent validity and report malnutrition prevalence. Length of stay, discharge location, rehospitalization, admission to a residential aged care facility, and mortality were measured to report health-related outcomes and to establish predictive validity. Malnutrition prevalence varied according to assessment tool (ICD-10-AM: 46%; Scored PG-SGA: 53%; MNA: 28%). Using the ICD-10-AM as the reference standard, the Scored PG-SGA ratings (sensitivity 100%, specificity 87%) and score (sensitivity 92%, specificity 84%, ROC AUC [receiver operating characteristics area under the curve]=0.910±0.038) showed strong concurrent validity, and the MNA had moderate concurrent validity (sensitivity 58%, specificity 97%, receiver operating characteristics area under the curve=0.854±0.052). The Scored PG-SGA rating, Scored PG-SGA score, and MNA showed good predictive validity. Malnutrition can increase the risk of longer rehospitalization length of stay, admission to a residential
Oikonomidi, Theodora; Vikelis, Michail; Artemiadis, Artemios; Chrousos, George P; Darviri, Christina
The Migraine Disability Assessment (MIDAS) Questionnaire is a reliable and valid instrument for migraine-related disability. Such a tool is needed to quantify migraine-related disability in the Greek population. This validation study aims to assess the test-retest reliability, internal consistency, item discriminant and convergent validity of the Greek translation of the MIDAS. Adults diagnosed with migraine completed the MIDAS Questionnaire on two occasions 3 weeks apart to assess reliability, and completed the RAND-36 to assess validity. Participants (n = 152) had a median MIDAS score of 24 and mostly severe disability (58% were grade IV). The test-retest reliability analysis (N = 59) revealed excellent reliability for the total score. Internal consistency was α = 0.71 for initial and α = 0.82 for retest completion. For item discriminant validity, the correlations between each question and the total score were significant, with high correlations for questions 2-5 (range 0.67 ≤ r ≤ 0.79; p MIDAS score tended to have better wellbeing. Psychometric properties are comparable with those of other published validation studies of the MIDAS and the original. Findings on question 1 show that missing work/school days may be closely related with increased affect issues. The Greek version of the MIDAS Questionnaire has good reliability and validity. This study allowed for cross-cultural comparability of research findings.
René Butter; Marise Born
In this paper the concept of "ecological personality scales" is introduced. These are contextualized inventories with a high ecological validity. They are developed in a bottom-up or qualitative way and combine a relatively high trait specificity with a relatively high situational specificity. An
Anderson, Daniel; Rowley, Brock; Alonzo, Julie; Tindal, Gerald
The easyCBM© CCSS Math tests were developed to help inform teachers' instructional decisions by providing relevant information on students' mathematical skills, relative to the Common Core State Standards (CCSS). This technical report describes a study to explore the validity of the easyCBM© CCSS Math tests by evaluating the relation between…
McGill, Ryan J.
The current study examined the incremental validity of the clinical clusters from the Woodcock-Johnson III Tests of Cognitive Abilities (WJ-III COG) for predicting scores on the Woodcock-Johnson III Tests of Achievement (WJ-III ACH). All participants were children and adolescents (N = 4,722) drawn from the nationally representative WJ-III…
Scholes, Shaun; Coombs, Ngaire; Pedisic, Zeljko; Mindell, Jennifer S.; Bauman, Adrian; Rowlands, Alex V.; Stamatakis, Emmanuel
The criterion validity of the 2008 Physical Activity and Sedentary Behavior Assessment Questionnaire (PASBAQ) was examined in a nationally representative sample of 2,175 persons aged ≥16 years in England using accelerometry. Using accelerometer minutes/day greater than or equal to 200 counts as a criterion, Spearman's correlation coefficient (ρ) for PASBAQ-assessed total activity was 0.30 (95% confidence interval (CI): 0.25, 0.35) in women and 0.20 (95% CI: 0.15, 0.26) in men. Correlations between accelerometer counts/minute of wear time and questionnaire-assessed relative energy expenditure (metabolic equivalent-minutes/day) were higher in women (ρ = 0.41, 95% CI: 0.36, 0.46) than in men (ρ = 0.32, 95% CI: 0.26, 0.38). Similar correlations were observed for minutes/day spent in vigorous activity (women: ρ = 0.39, 95% CI: 0.33, 0.46; men: ρ = 0.31, 95% CI: 0.26, 0.36) and moderate-to-vigorous activity (women: ρ = 0.42, 95% CI: 0.36, 0.48; men: ρ = 0.38, 95% CI: 0.32, 0.45). Correlations for time spent being sedentary (physical activity was higher in older age groups, but validity was higher in younger persons for vigorous-intensity activity. The PASBAQ is a useful and valid instrument for ranking individuals according to levels of physical activity and sedentary behavior. PMID:24863551
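For readers who want to reproduce the kind of rank correlation reported above, here is a minimal sketch of Spearman's ρ with an approximate 95% confidence interval. It assumes the Fisher z-transformation with standard error 1/sqrt(n - 3), which is only an approximation for rank correlations; the abstract does not say how its intervals were derived. All function names and data are hypothetical.

```python
import math


def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_with_ci(x, y, z_crit=1.96):
    """Spearman's rho and an approximate CI (undefined when |rho| == 1)."""
    rho = pearson(ranks(x), ranks(y))
    se = 1 / math.sqrt(len(x) - 3)          # approximation, see lead-in
    z = math.atanh(rho)
    return rho, math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical questionnaire minutes/day vs. accelerometer minutes/day.
questionnaire = [30, 45, 60, 90, 120]
accelerometer = [25, 20, 70, 65, 110]
rho, low, high = spearman_with_ci(questionnaire, accelerometer)
print(f"rho = {rho:.2f} (95% CI {low:.2f}, {high:.2f})")
```

With only five hypothetical pairs the interval is very wide; samples of the size reported in the abstract would give much narrower intervals.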
Ringsted, C; Lippert, F; Hesselfeldt, R
Cardiac Arrest Simulation Test (CASTest) scenarios for the assessments according to guidelines 2005. AIMS: To analyse the reliability and validity of the individual sub-tests provided by ERC and to find a combination of MCQ and CASTest that provides a reliable and valid single effect measure of ALS competence. METHODS: Two groups of participants were included in this randomised, controlled experimental study: a group of newly graduated doctors, who had not taken the ALS course (N=17) and a group of students, who had passed the ALS course 9 months before the study (N=16). Reliability in terms of inter... that possessed high reliability, equality of test sets, and ability to discriminate between the two groups of supposedly different ALS competence. CONCLUSIONS: ERC sub-tests of ALS competence possess sufficient reliability and validity. A combined ALS score with equal weighting of one MCQ and one CASTest can...
Rikkert, Marcel G M Olde; Tona, Klodiana Daphne; Janssen, Lieneke
New staging systems of dementia require adaptation of disease management programs and adequate staging instruments. Therefore, we systematically reviewed the literature on validity and reliability of clinically applicable, multidomain, and dementia staging instruments. A total of 23 articles...
Chorong Park, MSN, RN
Conclusion: The K-HES had acceptable validity and reliability. The brevity and ease of administration of the K-HES makes it a suitable tool for evaluating empowerment-based education programs targeted towards older populations.
Betül Tosun, RN, PhD
Conclusions: The findings of this study reveal that the ICQ is a valid and reliable tool for assessing the comfort of patients in Turkey who are immobilized because of lower extremity orthopedic problems.
Letafatkar, Amir; Amirsasan, Ramin; Abdolvahabi, Zahra; Hadadnezhad, Malihe
The aim of this study was to determine the reliability and validity of the AutoCAD software method in lumbar lordosis measurement. Fifty healthy volunteers with a mean age of 23 ± 1.80 years were enrolled. A lateral lumbar radiograph was taken of all participants, and the lordosis was measured according to the Cobb method. Afterward, the degree of lumbar lordosis was measured with the AutoCAD software and flexible ruler methods. The study was carried out in 2 parts: intratester and intertester evaluations of reliability, and evaluation of the validity of the flexible ruler and software methods. Based on the intraclass correlation coefficient, AutoCAD's reliability and validity in measuring lumbar lordosis were 0.984 and 0.962, respectively. AutoCAD proved to be a reliable and valid method for measuring lordosis. It is suggested that this method may replace those that are costly and involve health risks, such as radiography, in evaluating lumbar lordosis.
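The abstract above reports intraclass correlation coefficients without naming the ICC form. As a rough sketch, and assuming the two-way random-effects, absolute-agreement, single-measurement form often written ICC(2,1), the coefficient can be computed from a subjects-by-raters table as follows. The function name and the lordosis angles are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of subjects, each a list with one score per rater.
    """
    n = len(ratings)            # subjects
    k = len(ratings[0])         # raters (or repeated sessions)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Sums of squares from the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical lordosis angles (degrees) from two testers, five participants.
angles = [
    [42.1, 41.8],
    [35.4, 36.0],
    [50.2, 49.7],
    [38.9, 39.3],
    [45.0, 44.6],
]
print(f"ICC(2,1) = {icc_2_1(angles):.3f}")
```

Other ICC forms (one-way, consistency, or average-measure) use the same mean squares with a different final ratio, so the choice of form should be stated alongside any reported value.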
Boer, Y.A. de; Ende, C.H.M. van den; Eygendaal, D.; Jolie, I.M.M.; Hazes, J.M.W.; Rozing, P.M.
OBJECTIVES: (1) To investigate the measurement characteristics of the Hospital for Special Surgery (HSS) and Mayo Clinic elbow assessment instruments, utilizing methodological criteria including feasibility, reliability, validity, and discriminative ability; and (2) to develop an efficient and
knowledge-dietary behaviour relationship require use of valid and reliable knowledge .... Which of the following beverages has the lowest energy content per cup (250 ml)?b .... Diploma (ND): Consumer Science: Food and Nutrition together.
Gamze Sarikoc, PhD, RN
Conclusion: Results showed that the SNSI had a satisfactory level of reliability and validity in nursing students in Turkey. Multicenter studies including nursing students from different nursing schools are recommended for the SNSI to be generalized.
McMullen, Tara; Resnick, Barbara
To establish the reliability and validity of the Rosenberg Self-Esteem Scale (RSES) when used with nursing assistants (NAs). Testing the RSES used baseline data from a randomized controlled trial testing the Res-Care Intervention. Female NAs were recruited from nursing homes (n = 508). Validity testing for the positive and negative subscales of the RSES was based on confirmatory factor analysis (CFA) using structural equation modeling and Rasch analysis. Estimates of reliability were based on Rasch analysis and the person separation index. Evidence supports the reliability and validity of the RSES in NAs although we recommend minor revisions to the measure for subsequent use. Establishing reliable and valid measures of self-esteem in NAs will facilitate testing of interventions to strengthen workplace self-esteem, job satisfaction, and retention.
Nikolaidis, Pantelis T; Clemente, Filipe M; van der Linden, Cornelis M I; Rosemann, Thomas; Knechtle, Beat
The objectives of the present study were to examine the validity and reliability of the 10 Hz Johan GPS unit in assessing in-line movement and change of direction. The validity was tested against the criterion measure of 200 m track-and-field (track-and-field athletes, n = 8) and 20 m shuttle run endurance test (female soccer players, n = 20). Intra-unit and inter-unit reliability was tested by intra-class correlation coefficient (ICC) and coefficient of variation (CV), respectively. An analysis of variance examined differences between the GPS measurement and five laps of 200 m at 15 km/h, and a t-test examined differences between the GPS measurement and the 20 m shuttle run endurance test. The difference between the GPS measurement and 200 m distance ranged from -0.13 ± 3.94 m (95% CI -3.42; 3.17) in the first lap to 2.13 ± 2.64 m (95% CI -0.08; 4.33) in the fifth lap. Good intra-unit reliability was observed in 200 m (ICC = 0.833, 95% CI 0.535; 0.962). Inter-unit CV ranged from 1.31% (fifth lap) to 2.20% (third lap). The difference between the GPS measurement and the 20 m shuttle run endurance test ranged from 0.33 ± 4.16 m (95% CI -10.01; 10.68) at 11.5 km/h to 9.00 ± 5.30 m (95% CI 6.44; 11.56) at 8.0 km/h. Moderate intra-unit reliability was shown in the second and third stages of the 20 m shuttle run endurance test (ICC = 0.718, 95% CI 0.222; 0.898) and good reliability in the fifth, sixth, seventh and eighth stages (ICC = 0.831, 95% CI -0.229; 0.996). Inter-unit CV ranged from 2.08% (11.5 km/h) to 3.92% (8.5 km/h). Based on these findings, it was concluded that the 10 Hz Johan system offers an affordable, valid, and reliable tool for coaches and fitness trainers to monitor training and performance.
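The inter-unit coefficient of variation reported above can be illustrated with a short sketch. It assumes the CV is the sample standard deviation of simultaneous readings from several units divided by their mean, expressed as a percentage; the abstract does not give the exact computation, and the readings below are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(readings):
    """CV in percent: sample standard deviation relative to the mean."""
    return stdev(readings) / mean(readings) * 100


# Hypothetical distances (m) recorded by three units over the same 200 m lap.
lap_readings = [198.0, 200.0, 202.0]
print(f"inter-unit CV = {coefficient_of_variation(lap_readings):.2f}%")
```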
Sidor, Anna; Cierpka, Manfred
A standardized assessment of a family system plays a crucial role in family therapy research and diagnostics, as well as in family therapy itself. A 14-item short version of the General Family Questionnaire (FB-K) was designed as a time-efficient tool for assessing family functionality. The short version was developed by factor analysis from the long version FA-A. The quality criteria of the family questionnaire were verified in a control sample of 208 high-risk families four months after the birth of their child. The new family questionnaire demonstrates very good reliability and satisfactory 8-month stability. Concurrent validity with the FACES scale "cohesion" is confirmed. Regarding construct validity, a positive correlation with the feeling of coherence was found. The family questionnaire correlates negatively with maternal postnatal depressive symptoms, the degree of maternal stress burden, the dysfunctionality of the mother-child relationship, and impaired bonding. The values taken from a norm sample with infants tend to be higher, and those in the sample with children under 18 do not deviate from the values of the risk sample. The FB-K covers two aspects of family functioning: the bond between family members and their willingness to communicate. The internal consistency of the FB-K is excellent, and the criterion and construct validity are good.
Mazaheri, Maryam Amidi; Karbasi, Mojtaba
Background: Given the large number of mobile phone users, especially among college students in Iran, mobile phone addiction is attracting increasing concern. There is an urgent need for a reliable and valid instrument to measure this phenomenon. This study examines the validity and reliability of the Persian version of the mobile phone addiction scale (MPAIS) in college students. Materials and Methods: This methodological study was done at Isfahan University of Medical Sciences. One thousand one hundr...
Reijman, Max; Hazes, Mieke; Pols, Huib; Bernsen, Roos; Koes, Bart; Bierma-Zeinstra, Sita
OBJECTIVES: To compare the reliability and validity in a large open population of three frequently used radiological definitions of hip osteoarthritis (OA): Kellgren and Lawrence grade, minimal joint space (MJS), and Croft grade; and to investigate whether the validity of the three definitions of hip OA is sex dependent. METHODS: Subjects from the Rotterdam study (aged ≥ 55 years, n = 3585) were evaluated. The inter-rater reliability was tested in a random set of 148 x-rays. ...
Li, Fengzhi; Li, Changji; Long, Yunfang; Zhan, Chenglie; Hennessy, Dwight
The present research was designed to examine the psychometric properties of Chinese versions of the Self Report Driver Behavior Aggression and Assertiveness subscales, the Driving Vengeance Questionnaire, and the Violent Driving Questionnaire. Study 1 found that all scales demonstrated good internal consistency, with alphas ranging from .76 to .87, and that assertive driving was related to demerit points received over the past 12 months, while driver aggression and violence were linked to collisions over the past 12 months. Study 2 found that the scales exhibited reasonable test-retest reliability, with correlations ranging from .82 to .89. Finally, Study 3 showed that each scale was predicted by other dangerous driving attitudes and behaviors, similar to the original versions. The consistency between the translated and original scales, the implications for use in a Chinese sample, and the uniformity of actions in the traffic environment across cultures are discussed.
Conclusion: Given that the validity and reliability of the questionnaire were appropriate, the NIOSH Generic Job Stress Questionnaire (GJSQ) can be recommended as a valid and reliable questionnaire for job stress evaluation in Iran.
Schoppen, Tanneke; Boonstra, Antje; Groothoff, JW; de Vries, J; Goeken, LNH; Eisma, Willem
Objective: To determine the intrarater and interrater reliability and the validity of the Timed "up and go" test as a measure for physical mobility in elderly patients with an amputation of the lower extremity. Design: To test interrater reliability, the test was performed for two observers at
Michailov, Michail Lubomirov; Baláš, Jirí; Tanev, Stoyan Kolev; Andonov, Hristo Stoyanov; Kodejška, Jan; Brown, Lee
Purpose: An advanced system for the assessment of climbing-specific performance was developed and used to: (a) investigate the effect of arm fixation (AF) on construct validity evidence and reliability of climbing-specific finger-strength measurement; (b) assess reliability of finger-strength and endurance measurements; and (c) evaluate the…
The Rey Visual Design Learning Test (Rey, 1964, in Spreen & Strauss, 1991) assesses immediate memory span, new learning and recognition for non-verbal material. Three studies are presented that focused on the reliability and validity of the RVDLT in primary school children. Test-retest reliability
Rae, James R.; Olson, Kristina R.
The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many…
Brody, Michelle L.; And Others
Examined reliability and validity of binge eating disorder (BED), proposed for inclusion in Diagnostic and Statistical Manual of Mental Disorders (DSM), fourth edition. Interrater reliability of BED diagnosis compared favorably with that of most diagnoses in DSM revised third edition. Study comparing obese individuals with and without BED and…
Conclusion: The tool designed to assess bag-mask ventilation and tracheal intubation skills in anesthesia trainees demonstrated excellent inter-rater reliability, fair test-retest reliability, and good construct validity. The authors recommend its use for formative and summative assessment of junior anesthesia trainees.
Bartlett, Susan J; Barbic, Skye P; Bykerk, Vivian P
-FQ), and the voting results at OMERACT 2016. METHODS: Classic and modern psychometric methods were used to assess reliability, validity, sensitivity, factor structure, scoring, and thresholds. Interviews with patients and clinicians also assessed content validity, utility, and meaningfulness of RA-FQ scores. RESULTS......: People with RA in observational trials in Canada (n = 896) and France (n = 138), and an RCT in the Netherlands (n = 178) completed 5 items (11-point numerical rating scale) representing RA Flare core domains. There was moderate to high evidence of reliability, content and construct validity...... to identify and measure RA flares. Its review through OMERACT Filter 2.0 shows evidence of reliability, content and construct validity, and responsiveness. These properties merit its further validation as an outcome for clinical trials....
Livesey, Alexandra; Dodd, Karen; Pote, Helen; Marlow, Elizabeth
The aim of the study was to explore the validity of the social-moral awareness test (SMAT) a measure designed for assessing socio-moral rule knowledge and reasoning in people with learning disabilities. Comparisons between Theory of Mind and socio-moral reasoning allowed the exploration of construct validity of the tool. Factor structure, reliability and discriminant validity were also assessed. Seventy-one participants with mild-moderate learning disabilities completed the two scales of the SMAT and two False Belief Tasks for Theory of Mind. Reliability of the SMAT was very good, and the scales were shown to be uni-dimensional in factor structure. There was a significant positive relationship between Theory of Mind and both SMAT scales. There is early evidence of the construct validity and reliability of the SMAT. Further assessment of the validity of the SMAT will be required. © 2012 Blackwell Publishing Ltd.
Patterson, P Daniel; Weaver, Matthew D; Fabio, Anthony; Teasley, Ellen M; Renn, Megan L; Curtis, Brett R; Matthews, Margaret E; Kroemer, Andrew J; Xun, Xiaoshuang; Bizhanova, Zhadyra; Weiss, Patricia M; Sequeira, Denisse J; Coppler, Patrick J; Lang, Eddy S; Higgins, J Stephen
This study sought to systematically search the literature to identify reliable and valid survey instruments for fatigue measurement in the Emergency Medical Services (EMS) occupational setting. A systematic review study design was used, and six databases, including one website, were searched. The research question guiding the search was developed a priori and registered with the PROSPERO database of systematic reviews: "Are there reliable and valid instruments for measuring fatigue among EMS personnel?" (2016:CRD42016040097). The primary outcome of interest was criterion-related validity. Important outcomes of interest included reliability (e.g., internal consistency) and indicators of sensitivity and specificity. Members of the research team independently screened records from the databases. Full-text articles were evaluated by adapting the Bolster and Rourke system for categorizing findings of systematic reviews, and data abstracted from the body of literature were rated as favorable, unfavorable, mixed/inconclusive, or no impact. The Grading of Recommendations, Assessment, Development and Evaluation (GRADE) methodology was used to evaluate the quality of evidence. The search strategy yielded 1,257 unique records. Thirty-four unique experimental and non-experimental studies were determined relevant following full-text review. Nineteen studies reported on the reliability and/or validity of ten different fatigue survey instruments. Eighteen different studies evaluated the reliability and/or validity of four different sleepiness survey instruments. None of the retained studies reported sensitivity or specificity. Evidence quality was rated as very low across all outcomes. In this systematic review, limited evidence of the reliability and validity of 14 different survey instruments to assess the fatigue and/or sleepiness status of EMS personnel and related shift worker groups was identified.
ABSTRACT OBJECTIVE: To validate a Spanish version of the Test of Gross Motor Development (TGMD-2) for the Chilean population. METHODS: Descriptive, cross-sectional, non-experimental validity and reliability study. Four translators, three experts, and 92 Chilean children aged five to 10 years, students at a primary school in Santiago, Chile, participated. The Committee of Experts carried out translation, back-translation, and revision processes to determine the translinguistic equivalence and content validity of the test, using the content validity index, in 2013. In addition, a pilot implementation was carried out to determine the reliability of the test in Spanish, using the intraclass correlation coefficient and the Bland-Altman method. We evaluated whether the results differed significantly when the bat was replaced with a racket, using a t-test. RESULTS: We obtained a content validity index higher than 0.80 for language clarity and relevance of the TGMD-2 for children. There were significant differences in the object control subtest when comparing the results with bat and racket. The intraclass correlation coefficient for inter-rater, intra-rater, and test-retest reliability was greater than 0.80 in all cases. CONCLUSIONS: The TGMD-2 has appropriate content validity to be applied in the Chilean population. The reliability of this test is within the appropriate parameters, and its use could be recommended in this population after the establishment of normative data, setting a further precedent for validation in other Latin American countries.
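The Bland-Altman method mentioned in the abstract above summarizes agreement between two sets of paired scores by the mean difference (bias) and its 95% limits of agreement. The following is a minimal sketch under the usual assumption of roughly normally distributed differences; the function name and scores are hypothetical and not taken from the TGMD-2 study.

```python
from statistics import mean, stdev


def bland_altman(first, second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    half_width = z * stdev(diffs)       # sample SD of the differences
    return bias, bias - half_width, bias + half_width


# Hypothetical test and retest scores for four children.
test_scores = [10, 12, 14, 16]
retest_scores = [9, 12, 15, 16]
bias, lower, upper = bland_altman(test_scores, retest_scores)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```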
Wickramasinghe, Nuwan Darshana; Dissanayake, Devani Sakunthala; Abeywardena, Gihan Sajiwa
The present study was aimed at assessing the validity and the reliability of the Sinhala version of the Utrecht Work Engagement Scale-Student Version (UWES-S) among collegiate cycle students in Sri Lanka. The 17-item UWES-S was translated to Sinhala and the judgmental validity was assessed by a multi-disciplinary panel of experts. Construct validity of the UWES-S was appraised by using multi-trait scaling analysis and exploratory factor analysis (EFA) on data obtained from a sample of 194 grade thirteen students in the Kurunegala district, Sri Lanka. Reliability of the UWES-S was assessed by using internal consistency and test-retest reliability. Except for item 13, all other items showed good psychometric properties in judgemental validity, item-convergent validity and item-discriminant validity. EFA using principal component analysis with Oblimin rotation, suggested a three-factor solution (including vigor, dedication and absorption subscales) explaining 65.4% of the total variance for the 16-item UWES-S (with item 13 deleted). All three subscales show high internal consistency with Cronbach's α coefficient values of 0.867, 0.819, and 0.903 and test-retest reliability was high (p valid and a reliable instrument to assess work engagement among collegiate cycle students in Sri Lanka.
Braun, Tobias; Marks, Detlef; Thiel, Christian; Grüneberg, Christian
To establish the validity and reliability of the de Morton Mobility Index (DEMMI) in patients with sub-acute stroke. This cross-sectional study was performed in a neurological rehabilitation hospital. We assessed unidimensionality, construct validity, internal consistency reliability, inter-rater reliability, minimal detectable change and possible floor and ceiling effects of the DEMMI in adult patients with sub-acute stroke. The study included a total sample of 121 patients with sub-acute stroke. We analysed validity (n = 109) and reliability (n = 51) in two sub-samples. Rasch analysis indicated unidimensionality with an overall fit to the model (chi-square = 12.37, p = 0.577). All hypotheses on construct validity were confirmed. Internal consistency reliability (Cronbach's alpha = 0.94) and inter-rater reliability (intraclass correlation coefficient = 0.95; 95% confidence interval: 0.92-0.97) were excellent. The minimal detectable change with 90% confidence was 13 points. No floor or ceiling effects were evident. These results indicate unidimensionality, sufficient internal consistency reliability, inter-rater reliability, and construct validity of the DEMMI in patients with a sub-acute stroke. Advantages of the DEMMI in clinical application are the short administration time, no need for special equipment and interval level data. The de Morton Mobility Index, therefore, may be a useful performance-based bedside test to measure mobility in individuals with a sub-acute stroke across the whole mobility spectrum. Implications for Rehabilitation The de Morton Mobility Index (DEMMI) is a unidimensional measurement instrument of mobility in individuals with sub-acute stroke. The DEMMI has excellent internal consistency and inter-rater reliability, and sufficient construct validity. The minimal detectable change of the DEMMI with 90% confidence in stroke rehabilitation is 13 points. The lack of any floor or ceiling effects on hospital admission indicates
Miller, Joshua D; McCain, Jessica; Lynam, Donald R; Few, Lauren R; Gentile, Brittany; MacKillop, James; Campbell, W Keith
The growing interest in the study of narcissism has resulted in the development of a number of assessment instruments that manifest only modest to moderate convergence. The present studies adjudicate among these measures with regard to criterion validity. In the 1st study, we compared multiple narcissism measures to expert consensus ratings of the personality traits associated with narcissistic personality disorder (NPD; Study 1; N = 98 community participants receiving psychological/psychiatric treatment) according to the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; American Psychiatric Association, 2000) using 5-factor model traits as well as the traits associated with the pathological trait model according to the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; American Psychiatric Association, 2013). In Study 2 (N = 274 undergraduates), we tested the criterion validity of an even larger set of narcissism instruments by examining their relations with measures of general and pathological personality, as well as psychopathology, and compared the resultant correlations to the correlations expected by experts for measures of grandiose and vulnerable narcissism. Across studies, the grandiose dimensions from the Five-Factor Narcissism Inventory (FFNI; Glover, Miller, Lynam, Crego, & Widiger, 2012) and the Narcissistic Personality Inventory (Raskin & Terry, 1988) provided the strongest match to expert ratings of DSM-IV-TR NPD and grandiose narcissism, whereas the vulnerable dimensions of the FFNI and the Pathological Narcissism Inventory (Pincus et al., 2009), as well as the Hypersensitive Narcissism Scale (Hendin & Cheek, 1997), provided the best match to expert ratings of vulnerable narcissism. These results should help guide researchers toward the selection of narcissism instruments that are most well suited to capturing different aspects of narcissism. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Lieberman, Lynne; Liu, Huiting; Huggins, Ashley A; Katz, Andrea C; Zvolensky, Michael J; Shankman, Stewart A
Personality traits relate to risk for psychopathology and can inform predictions about treatment outcome. In an effort to obtain a comprehensive index of personality, informant reports of personality are sometimes obtained in addition to self-reports of personality. However, there is limited research comparing the validity of self- and informant reports of personality, particularly among those with internalizing psychopathology. This is important given that informants may provide an additional (and perhaps different) perspective on individuals' personality. The present study therefore compared how both reports of positive affectivity (PA) and negative affectivity (NA) relate to psychophysiological and subjective measures of emotional responding to positive and negative stimuli. Given that our sample (n = 117) included individuals with no history of psychopathology, as well as individuals with major depressive disorder (MDD) and/or panic disorder (PD), we were also able to explore whether these internalizing diagnoses moderated the association between personality reports and measures of emotional responding. Informant-reported PA predicted physiological responses to positive stimuli (but not negative). Informant-reported NA predicted physiological responses to negative stimuli (but not positive). Self-reported personality did not predict physiological responding, but did predict subjectively measured emotional responding (NA for negative responding, PA for positive responding). Diagnoses of internalizing psychopathology (PD or MDD) did not moderate these associations. Results suggest self- and informant reports of personality may each provide valid indices of an individual's emotional response tendencies, but predict different aspects of those tendencies. © 2016 Society for Psychophysiological Research.
Gruen, Margaret E; Griffith, Emily H; Thomson, Andrea E; Simpson, Wendy; Lascelles, B Duncan X
Degenerative joint disease and associated pain are common in cats, particularly in older cats. There is a need for treatment options; however, evaluation of putative therapies is limited by a lack of suitable, validated outcome measures that can be used in the target population of client-owned cats. The objectives of this study were to evaluate low-dose daily meloxicam for the treatment of pain associated with degenerative joint disease in cats, and to further validate two clinical metrology instruments, the Feline Musculoskeletal Pain Index (FMPI) and the Client Specific Outcome Measures (CSOM). Sixty-six client-owned cats with degenerative joint disease and owner-reported impairments in mobility were screened and enrolled into a double-masked, placebo-controlled, randomized clinical trial. Following a run-in baseline period, cats were given either placebo or meloxicam for 21 days; then, in a masked washout, all cats were given placebo for 21 days. Subsequently, cats were given the opposite treatment, placebo or meloxicam, for 21 days. Cats wore activity monitors throughout the study, and owners completed clinical metrology instruments following each period. Activity counts were increased in cats during treatment with daily meloxicam (p ... degenerative joint disease.
Chen, Y-W; HajGhanbari, B; Road, J D; Coxson, H O; Camp, P G; Reid, W D
Pain is prevalent in chronic obstructive pulmonary disease (COPD) and the Brief Pain Inventory (BPI) appears to be a feasible questionnaire to assess this symptom. However, the reliability and validity of the BPI have not been determined in individuals with COPD. This study aimed to determine the internal consistency, test-retest reliability and validity (construct, convergent, divergent and discriminant) of the BPI in individuals with COPD. In order to examine the test-retest reliability, individuals with COPD were recruited from pulmonary rehabilitation programmes to complete the BPI twice 1 week apart. In order to investigate validity, de-identified data was retrieved from two previous studies, including forced expiratory volume in 1-s, age, sex and data from four questionnaires: the BPI, short-form McGill Pain Questionnaire (SF-MPQ), 36-Item Short Form Survey (SF-36) and Community Health Activities Model Program for Seniors (CHAMPS) questionnaire. In total, 123 participants were included in the analyses (eligible data were retrieved from 86 participants and additional 37 participants were recruited). The BPI demonstrated excellent internal consistency and test-retest reliability. It also showed convergent validity with the SF-MPQ and divergent validity with the SF-36. The factor analysis yielded two factors of the BPI, which demonstrated that the two domains of the BPI measure the intended constructs. The BPI can also discriminate pain levels among COPD patients with varied levels of quality of life (SF-36) and physical activity (CHAMPS). The BPI is a reliable and valid pain questionnaire that can be used to evaluate pain in COPD. This study formally established the reliability and validity of the BPI in individuals with COPD, which have not been determined in this patient group. The results of this study provide strong evidence that assessment results from this pain questionnaire are reliable and valid. © 2018 European Pain Federation - EFIC®.
Muhamad, Zailani; Ramli, Ayiesah; Amat, Salleh
The aim of this study was to determine the content validity, internal consistency, test-retest reliability and inter-rater reliability of the Clinical Competency Evaluation Instrument (CCEVI) in assessing the clinical performance of physiotherapy students. This study was carried out between June and September 2013 at Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur, Malaysia. A panel of 10 experts was identified to establish content validity by evaluating and rating each of the items used in the CCEVI with regard to their relevance in measuring students' clinical competency. A total of 50 UKM undergraduate physiotherapy students were assessed throughout their clinical placement to determine the construct validity of these items. The instrument's reliability was determined through a cross-sectional study involving a clinical performance assessment of 14 final-year undergraduate physiotherapy students. The content validity index of the entire CCEVI was 0.91, while the proportion of agreement on the content validity indices ranged from 0.83 to 1.00. The CCEVI construct validity was established with factor loadings of ≥0.6, while overall internal consistency (Cronbach's alpha) was 0.97. Test-retest reliability of the CCEVI was confirmed with a Pearson's correlation range of 0.91-0.97 and an intraclass correlation coefficient range of 0.95-0.98. Inter-rater reliability of the CCEVI domains ranged from 0.59 to 0.97 on initial and subsequent assessments. This pilot study confirmed the content validity of the CCEVI. The instrument showed high internal consistency and moderate to excellent inter-rater reliability. However, additional refinement in the wording of the CCEVI items, particularly in the domains of safety and documentation, is recommended to further improve the validity and reliability of the instrument.
Suzuki, Eiko; Kanoya, Yuka; Katsuki, Takeshi; Sato, Chifumi
This study aimed to verify the reliability and validity of a Japanese version of the Rathus Assertiveness Schedule in novice nurses, to contribute to nursing management. An adequate scale is needed to measure assertiveness and the effect of assertion training in Japanese nurses and to compare them with those in other countries. The Rathus Assertiveness Schedule was adapted to Japanese with back-translation, and its validity was examined in 989 novice nurses. The Japanese version showed a high coefficient of reliability in a split-half reliability test (r=0.76; P ... Assertiveness Schedule. The Japanese version of the Rathus Assertiveness Schedule was verified.
So, Hyang Sook; Chae, Myeong Jeong; Kim, Hye Young
In this study, the reliability and validity of the Korean version of the Cancer Stigma Scale (KCSS) were evaluated. The KCSS was formed through translation and modification of the Cataldo Lung Cancer Stigma Scale. The KCSS, Psychological Symptom Inventory (PSI), and European Organization for Research and Treatment of Cancer Quality of Life Questionnaire - Core 30 (EORTC QLQ-C30) were administered to 247 men and women diagnosed with one of the five major cancers. Construct validity, item convergent and discriminant validity, concurrent validity, known-group validity, and internal consistency reliability of the KCSS were evaluated. Exploratory factor analysis supported the construct validity with a six-factor solution that explained 65.7% of the total variance. The six-factor model was validated by confirmatory factor analysis (Q (χ²/df)=2.28, GFI=.84, AGFI=.81, NFI=.80, TLI=.86, RMR=.03, and RMSEA=.07). Concurrent validity was demonstrated with the QLQ-C30 (global: r=-.44; functional: r=-.19; symptom: r=.42). The KCSS had known-group validity. Cronbach's alpha coefficient for the 24 items was .89. The results of this study suggest that the 24-item KCSS has relatively acceptable reliability and validity and can be used in clinical research to assess cancer stigma and its impacts on health-related quality of life in Korean cancer patients. © 2017 Korean Society of Nursing Science
López-Villalobos, José A; Andrés-De Llano, Jesús; López-Sánchez, María V; Rodríguez-Molinero, Luis; Garrido-Redondo, Mercedes; Sacristán-Martín, Ana M; Martínez-Rivera, María T; Alberola-López, Susana
The aim of this research is to analyze the criterion validity of the Attention Deficit Hyperactivity Disorder Rating Scales IV (ADHD RS-IV) and its clinical usefulness for the assessment of Attention Deficit Hyperactivity Disorder (ADHD) as a function of assessment method and age. A sample was obtained from an epidemiological study (n = 1095, 6-16 years). Clinical cases of ADHD (ADHD-CL) were selected by dimensional ADHD RS-IV and later by clinical interview (DSM-IV). ADHD-CL cases were compared with four categorical results of ADHD RS-IV provided by parents (CATPA), teachers (CATPR), either parents or teachers (CATPAOPR) and both parents and teachers (CATPA&PR). Criterion validity and clinical usefulness of the answer modalities to ADHD RS-IV were studied. The ADHD-CL rate was 6.9% in childhood, 6.2% in preadolescence and 6.9% in adolescence. Alternative methods to the clinical interview led to increased numbers of ADHD cases in all age groups analyzed, in the following sequence: CATPAOPR > CATPRO > CATPA > CATPA&PR > ADHD-CL. CATPA&PR was the procedure with the greatest validity, specificity and clinical usefulness in all three age groups, particularly in childhood. Isolated use of ADHD RS-IV leads to an increase in ADHD cases compared to clinical interview, and the increase varies depending on the procedure used.
Dere, Zeynep; Ömeroglu, Esra
In this study, the Creative Behavior Observation Form was developed to assess the creativity of children. For the reliability and validity study of the Creative Behavior Observation Form, a total of 257 children aged 5-6 were selected as the sample using a stratified sampling method. Content Validity Index (CVI) and…
Yirci, Ramazan; Karakose, Turgut; Uygun, Harun; Ozdemir, Tuncay Yavuz
The purpose of this study is to adapt the Mentoring Relationship Effectiveness Scale to Turkish, and to conduct validity and reliability tests regarding the scale. The study group consisted of 156 university science students receiving graduate education. Construct validity and factor structure of the scale were analyzed first through exploratory…
Shrestha, Bidhan; Niraula, Surya Raj; Parajuli, Prakash K; Suwal, Pramita; Singh, Raj Kumar
To assess the reliability and to validate the translated Nepalese version of the Oral Health Impact Profile (OHIP-EDENT-N) in Nepalese edentulous subjects. The international guidelines for translation and cross-cultural adaptation of OHIP-EDENT were followed, and a Nepalese version of the questionnaire was adapted for this study. Eighty-eight completely edentulous subjects were then selected for the study and completed the questionnaire. The reliability of the OHIP-EDENT-N was evaluated using internal consistency. Validity was assessed as construct and convergent validity. Construct validity was determined using exploratory factor analysis (EFA). The correlation between OHIP-EDENT-N subscale scores and the global question was investigated to test the convergent validity. Cronbach's alpha for the total score of OHIP-EDENT-N was 0.78. Construct validity was assessed by factor analysis: 70.196% of the variance was accounted for by five factors extracted from the factor analysis. Factor loadings above 0.40 were noted for all items. In terms of convergent validity, significant correlations could be established between OHIP-EDENT-N and global questions. This study has been able to establish the reliability and validity of the OHIP-EDENT-N, and OHIP-EDENT-N can be considered a reliable tool to assess the oral health related quality of life in the Nepalese edentulous population. © 2016 by the American College of Prosthodontists.
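Several of the abstracts in this list report internal consistency as Cronbach's alpha. As a minimal sketch of how that coefficient is usually computed, assuming the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), the function below works on a small table of item responses. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five respondents answering three items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Sample variances (n - 1 denominator) are used for both the items and the totals; using population variances for both gives the same alpha, because the ratio is unchanged.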
Ng, Petrus; Su, Xiqing Susan; Chan, Vivien; Leung, Heidi; Cheung, Wendy; Tsun, Angela
This study validated a Perceived Campus Caring Scale with 1,520 university students. Using factor analysis, seven factors namely, Faculty Support, Nonfaculty Support, Peer Relationship, Sense of Detachment, Sense of Belonging, Caring Attitude, and Campus Involvement, are identified with high reliability, validity, and close correlation with the…
Biasutti, Michele; Frate, Sara
This article describes the development and validation of the Attitudes toward Sustainable Development scale, a quantitative 20-item scale that measures Italian university students' attitudes toward sustainable development. A total of 484 undergraduate students completed the questionnaire. The validity and reliability of the scale was statistically…
Vanbellingen, Tim; Nyffeler, Thomas; Nef, Tobias; Kwakkel, Gert; Bohlhalter, Stephan; van Wegen, Erwin E.H.
Background Patients with Parkinson's disease exhibit disturbed dexterity. Validated self-reported outcomes for dexterity in Parkinson's disease are lacking. The aim of this study was to investigate the reliability, content and construct validity of a new Dexterity Questionnaire 24. Methods One
M. Reijman (Max); J.M.W. Hazes (Mieke); H.A.P. Pols (Huib); R.M.D. Bernsen (Roos); B.W. Koes (Bart); S.M. Bierma-Zeinstra (Sita)
Objectives: To compare the reliability and validity in a large open population of three frequently used radiological definitions of hip osteoarthritis (OA): Kellgren and Lawrence grade, minimal joint space (MJS), and Croft grade; and to investigate whether the validity of the three
Objective. We sought to determine the validity and reliability of a self-report physical activity questionnaire (PAQ) measuring physical activity/inactivity in South African schoolgirls of different ethnic origins. Methods. Construct validity of the PAQ was tested against physical activity energy expenditure estimated from an ...
Watt, Torquil; Hegedus, Laszlo; Grønvold, Mogens
Appropriate scale validity and internal consistency reliability have recently been documented for the new thyroid-specific quality of life (QoL) patient-reported outcome (PRO) measure for benign thyroid disorders, the ThyPRO. However, before clinical use, clinical validity and test...
Zhang, C; Yang, G P; Li, Z; Li, X N; Li, Y; Hu, J; Zhang, F Y; Zhang, X J
Objective: To assess the reliability and validity of the Chinese version of the Alcohol Use Disorders Identification Test (AUDIT) among medical students in China and to provide guidance on the correct application of the recommended scales. Methods: An e-questionnaire was developed and sent to medical students in five different colleges. All students volunteered to take part in the testing. Cronbach's α and split-half reliability were calculated to evaluate the reliability of AUDIT, while content, construct, discriminant and convergent validity were assessed to measure the validity of the scales. Results: The overall Cronbach's α of AUDIT was 0.782 and the split-half reliability was 0.711. The domain Cronbach's α and split-half reliability were 0.796 and 0.794 for hazardous alcohol use, 0.561 and 0.623 for dependence symptoms, and 0.647 and 0.640 for harmful alcohol use. The item-level content validity indices (I-CVI) ranged from 0.83 to 1.00, the scale-level content validity index by universal agreement (S-CVI/UA) was 0.90, the scale-level content validity index by averaging (S-CVI/Ave) was 0.99, and the content validity ratios (CVR) ranged from 0.80 to 1.00. Exploratory factor analysis showed that the simplified version of AUDIT supported a presupposed three-factor structure that explained 61.175% of the total variance. AUDIT seemed to have good convergent and discriminant validity, with a 100% success rate in the calibration experiment. Conclusion: AUDIT showed good reliability and validity among medical students in China and is therefore worth promoting for use in this population.
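The content validity indices named in this abstract (I-CVI, S-CVI/Ave, S-CVI/UA) follow simple arithmetic. The sketch below assumes the commonly used definitions: each expert rates each item on a 4-point relevance scale, I-CVI is the share of experts giving a rating of 3 or 4, S-CVI/Ave is the mean of the I-CVIs, and S-CVI/UA is the share of items on which all experts agree the item is relevant. The ratings and function names are hypothetical and do not reproduce the study's data.

```python
def item_cvi(ratings, relevant=(3, 4)):
    """I-CVI: proportion of experts who rate the item as relevant (3 or 4)."""
    return sum(r in relevant for r in ratings) / len(ratings)


def scale_cvi(rating_matrix):
    """Return (list of I-CVIs, S-CVI/Ave, S-CVI/UA) for items x experts ratings."""
    i_cvis = [item_cvi(item_ratings) for item_ratings in rating_matrix]
    s_cvi_ave = sum(i_cvis) / len(i_cvis)
    # Universal agreement: every expert rated the item as relevant.
    s_cvi_ua = sum(v == 1.0 for v in i_cvis) / len(i_cvis)
    return i_cvis, s_cvi_ave, s_cvi_ua


# Hypothetical ratings: four items, each rated by six experts on a 1-4 scale.
ratings = [
    [4, 4, 3, 4, 4, 3],
    [4, 3, 4, 4, 2, 4],  # one expert rates this item as not relevant
    [3, 4, 4, 4, 4, 4],
    [4, 4, 4, 3, 3, 4],
]
i_cvis, s_cvi_ave, s_cvi_ua = scale_cvi(ratings)
print([round(v, 2) for v in i_cvis], round(s_cvi_ave, 2), s_cvi_ua)
```

With six experts, a single dissenting rating gives an I-CVI of 5/6, or about 0.83.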
Sureshkumar, Premala; Cumming, Robert G; Craig, Jonathan C
We describe the validity and reliability of a questionnaire designed to determine frequency, severity and risk factors of urinary tract infection and daytime urinary incontinence in primary school-age children. Based on published validated questionnaires and advice from content experts, a questionnaire was developed and piloted in children attending outpatient clinics. Construct validity for parent report of frequency and severity of daytime urinary incontinence was tested by comparison with a daily accident diary in 52 primary school children, and criterion validity of parent report for UTI was verified by comparison with the reference standard (urine culture) in 100 primary school children. Test-retest reliability of the questionnaire was assessed in 106 children from primary schools. There was excellent agreement between the questionnaire and accident diary in severity (weighted kappa 0.94, 95% confidence intervals 0.85 to 1.03) and frequency of daytime urinary incontinence (0.88, 0.7 to 1.0). Parents reported urinary tract infection in 15% of children, compared to a positive urine culture in 8% (sensitivity 100% and specificity 68.5%). Test-retest reliability of the questionnaire was excellent (mean k 0.78, range 0.61 to 1.00). Parents overreport UTI by about 2-fold but can recall frequency and severity of daytime urinary incontinence well during a 3-month period. The developed questionnaire is a valid tool to estimate frequency, severity and risk factors of daytime urinary incontinence and UTI in primary school children.
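The sensitivity and specificity figures in this abstract come from comparing parent report against urine culture in a 2x2 table. As a sketch of that arithmetic, the counts below are one reconstruction that is consistent with the reported figures for the 100 children (8% culture-positive, sensitivity 100%, specificity 68.5%). They are an assumption for illustration, not counts quoted from the paper, and the function name is hypothetical.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # share of reference-positive cases detected
    specificity = tn / (tn + fp)  # share of reference-negative cases ruled out
    return sensitivity, specificity


# Assumed counts (not from the paper): 8 culture-positive children, all
# reported by parents; 92 culture-negative children, 29 of them reported.
sensitivity, specificity = diagnostic_accuracy(tp=8, fp=29, fn=0, tn=63)
print(f"sensitivity={sensitivity:.1%}, specificity={specificity:.1%}")
```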
Pedersen, Scott J; Kitic, Cecilia M; Bird, Marie-Louise; Mainsbridge, Casey P; Cooley, P Dean
With the advent of workplace health and wellbeing programs designed to address prolonged occupational sitting, tools to measure behaviour change within this environment should derive from empirical evidence. In this study we measured aspects of validity and reliability for the Occupational Sitting and Physical Activity Questionnaire that asks employees to recount the percentage of work time they spend in the seated, standing, and walking postures during a typical workday. Three separate cohort samples (N = 236) were drawn from a population of government desk-based employees across several departmental agencies. These volunteers were part of a larger state-wide intervention study. Workplace sitting and physical activity behaviour was measured both subjectively against the International Physical Activity Questionnaire, and objectively against ActivPal accelerometers before the intervention began. Criterion validity and concurrent validity for each of the three posture categories were assessed using Spearman's rank correlation coefficients, and a bias comparison with 95 % limits of agreement. Test-retest reliability of the survey was reported with intraclass correlation coefficients. Criterion validity for this survey was strong for sitting and standing estimates, but weak for walking. Participants significantly overestimated the amount of walking they did at work. Concurrent validity was moderate for sitting and standing, but low for walking. Test-retest reliability of this survey proved to be questionable for our sample. Based on our findings we must caution occupational health and safety professionals about the use of employee self-report data to estimate workplace physical activity. While the survey produced accurate measurements for time spent sitting at work it was more difficult for employees to estimate their workplace physical activity.
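The "bias comparison with 95 % limits of agreement" mentioned in this abstract is usually a Bland-Altman style calculation; that reading is an assumption here, since the abstract does not spell out the method. A minimal sketch follows: the bias is the mean difference between self-report and device, and the limits are the bias plus or minus 1.96 standard deviations of the differences. The data and function name are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(self_report, device):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [s - d for s, d in zip(self_report, device)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # 95 % limits assume roughly normal differences
    return bias, bias - spread, bias + spread


# Hypothetical percentages of work time spent sitting for five employees.
questionnaire = [70, 65, 80, 75, 60]
accelerometer = [68, 70, 78, 71, 63]
bias, lower, upper = limits_of_agreement(questionnaire, accelerometer)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

A bias near zero with wide limits would indicate that the questionnaire is unbiased on average but imprecise for any individual employee.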
Køster, B; Søndergaard, J; Nielsen, J B; Olsen, A; Bentzen, J
An important feature of questionnaire validation is reliability. To be able to measure a given concept validly by questionnaire, the reliability needs to be high. The objectives of this study were to examine the reliability of attitude and knowledge measures and the behavioral consistency of sunburn reporting in a questionnaire developed for monitoring and evaluating population sun-related behavior. Sun-related behavior, attitude and knowledge were measured weekly by questionnaire in the summer of 2013 among 664 Danes. Reliability was tested in a test-retest design. Consistency of behavioral information was tested similarly in a questionnaire adapted to measure behavior throughout the summer. The response rates for questionnaires 1, 2 and 3 were high, and the dropout was not dependent on demographic characteristics. There was at least 73% agreement between sunburns in the measurement week and the entire summer, and a possible sunburn underestimation in questionnaires summarizing the entire summer. The participants underestimated their outdoor exposure in the evaluation covering the entire summer as compared to the measurement week. The reliability of scales measuring attitude and knowledge was high for the majority of scales, while consistency in protection behavior was low. To our knowledge, this is the first study to report reliability for a completely validated questionnaire on sun-related behavior in a national random population-based sample. Further, we show that attitude and knowledge questions confirmed their validity with good reliability, while consistency of protection behavior in general and in a week's measurement was low.
López-de-Uralde-Villanueva, I; Gil-Martínez, A; Candelas-Fernández, P; de Andrés-Ares, J; Beltrán-Alacreu, H; La Touche, R
The self-administered Leeds Assessment of Neuropathic Symptoms and Signs (S-LANSS) scale is a tool designed to identify patients with pain with neuropathic features. To assess the validity and reliability of the Spanish-language version of the S-LANSS scale. Our study included a total of 182 patients with chronic pain to assess the convergent and discriminant validity of the S-LANSS; the sample was increased to 321 patients to evaluate construct validity and reliability. The validated Spanish-language version of the ID-Pain questionnaire was used as the criterion variable. All participants completed the ID-Pain, the S-LANSS, and the Numerical Rating Scale for pain. Discriminant validity was evaluated by analysing sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). Construct validity was assessed with factor analysis and by comparing the odds ratio of each S-LANSS item to the total score. Convergent validity and reliability were evaluated with Pearson's r and Cronbach's alpha, respectively. The optimal cut-off point for S-LANSS was ≥12 points (AUC=.89; sensitivity=88.7; specificity=76.6). Factor analysis yielded one factor; furthermore, all items contributed significantly to the positive total score on the S-LANSS (P<.05). The S-LANSS showed a significant correlation with ID-Pain (r=.734, α=.71). The Spanish-language version of the S-LANSS is valid and reliable for identifying patients with chronic pain with neuropathic features. Copyright © 2016 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
Sejbæk, Tobias; Blaabjerg, Morten; Sprogøe, Pippi
The Multiple Sclerosis Neuropsychological Screening Questionnaire (MSNQ) has previously shown good validity in American, Argentinean, and Dutch MS cohorts. We sought to test reliability and validity of a Danish translation of the MSNQ compared with formal neuropsychological testing, and measures of depression...... the Expanded Disability Status Scale and MS Impairment Scale. Results: The test-retest reliability of the MSNQ-P was significant (R2 = 0.79, P ... that the MSNQ-P measures these items more than the cognitive abilities of the patients. Conclusions: This study does not support use of the MSNQ as a sensitive or valid screening tool for cognitive impairment in Danish patients with MS....
Chipi, Elena; Frattini, Giulia; Eusebi, Paolo; Mollica, Anita; D'Andrea, Katia; Russo, Mirella; Bernardelli, Alice; Montanucci, Chiara; Luchetti, Elisa; Calabresi, Paolo; Parnetti, Lucilla
The Alzheimer's Disease Cooperative Study (ADCS)-Cognitive Function Instrument (CFI) is a 14-item questionnaire administered to the subject and the referent, aimed at detecting early changes in cognitive and functional abilities in individuals without clinical impairment. It is used for monitoring annual variations in cognitive functioning in prevention trials. The aim of the present study was to validate the Italian version of the CFI. A consecutive series of 257 functionally independent subjects was recruited among relatives of patients or as volunteers. They were administered the CFI and global cognition measurements: Mini-Mental Status Examination (MMSE) and Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). The reliability and criterion validity were comparable to the original in both self- and partner-report. Similar to what was reported for the original version, we found a corrected item-total correlation ranging between 0.38 and 0.54 in self-report and between 0.33 and 0.64 in partner-report. Cronbach's α was 0.77 (95% CI 0.72-0.83) in self-report and 0.78 (95% CI 0.73-0.84) in partner-report. Total partner- and self-report scores were significantly correlated (rS = 0.31, p ... reliability and validity of the Italian version of CFI. In order to definitely propose the use of CFI for tracking longitudinal changes of cognitive and functional abilities in subjects without clinical impairment, data from the follow-up of this cohort are needed.
Rikkert, Marcel G M Olde; Tona, Klodiana Daphne; Janssen, Lieneke
New staging systems of dementia require adaptation of disease management programs and adequate staging instruments. Therefore, we systematically reviewed the literature on validity and reliability of clinically applicable, multidomain, and dementia staging instruments. A total of 23 articles...... describing 12 staging instruments were identified (N = 6109 participants, age 65-87). Reliability was studied in most (91%) of the articles and was judged moderate to good. Approximately 78% of the articles evaluated concurrent validity, which was good to very good, while discriminant validity was assessed...... in only 25%. The scales can be applied in ±15 minutes. Clinical Dementia Rating (CDR), Global Deterioration scale (GDS), and Functional Assessment Staging (FAST) have been monitored on reliability and validity, and the CDR currently is the best-evidenced scale, also studied in international perspective...
Gleason, Philip M; Harris, Jeffrey; Sheean, Patricia M; Boushey, Carol J; Bruemmer, Barbara
This is the sixth in a series of monographs on research design and analysis. The purpose of this article is to describe and discuss several concepts related to the measurement of nutrition-related characteristics and outcomes, including validity, reliability, and diagnostic tests. The article reviews the methodologic issues related to capturing the various aspects of a given nutrition measure's reliability, including test-retest, inter-item, and interobserver or inter-rater reliability. Similarly, it covers content validity, indicators of absolute vs relative validity, and internal vs external validity. With respect to diagnostic assessment, the article summarizes the concepts of sensitivity and specificity. The hope is that dietetics practitioners will be able to both use high-quality measures of nutrition concepts in their research and recognize these measures in research completed by others. Copyright 2010 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Jillian E. Frideres
The purpose of this study was to design and test the validity and reliability of an instrument to evaluate coaches' knowledge about the female athlete triad syndrome and their confidence in this knowledge. The instrument collects information regarding: knowledge of the syndrome, components, prevention and intervention; confidence of the coaches in their answers; and coach's characteristics (gender, degree held, years of experience in coaching females, continuing education participation specific to the syndrome and its components, and sport coached). The process of designing the questionnaire and testing its validity and reliability was done in four phases: (a) design and development of the instrument, (b) content validity, (c) instrument reliability, and (d) concurrent validity. The results show that the instrument is suitable for measuring coaches' female athlete triad knowledge. The instrument can contribute to assessing the coaches' knowledge level in relation to this topic.
Shim, In Hee; Bae, Dong Sik; Bahk, Won-Myong
The diagnostic validity of mixed features, excluding anxiety or psychomotor agitation in mood disorders, has not yet been fully examined. PubMed and relevant English-language literature (regardless of year) were searched. Keywords were mixed or mixed state or mixed features or mixed episode and anxious or anxiety or agitation and bipolar disorder or depressive disorder or mood disorder or affective disorder. Most studies on anxiety or psychomotor agitation have included a significant correlation relevant to the "with mixed features" specifier, although it is common in both poles of mood episodes regardless of the predominant polarity. There is some confusion between the characteristic of classical mixed states and the definition of the mixed features specifier with the newly added anxious distress specifier in DSM-5, specifically, whether to include anxiety and agitation as significant characteristics. This change is of concern because a large proportion of patients with mixed features are now unspecified, and this may influence treatment planning and prognosis. The findings of our review suggest that anxiety and psychomotor agitation can be core symptoms in mood episodes with mixed features and important clinical clues for prediction of treatment effects and disease course.
Chu, Anne H. Y.; Ng, Sheryl H. X.; Koh, David; Müller-Riemenschneider, Falk
Objective: The Global Physical Activity Questionnaire (GPAQ) was originally designed to be interviewer-administered by the World Health Organization in assessing physical activity. The main aim of this study was to compare the psychometric properties of a self-administered GPAQ with the original interviewer-administered approach. Additionally, this study explored whether using different accelerometry-based physical activity bout definitions might affect the questionnaire’s validity. Methods: A total of 110 participants were recruited and randomly allocated to an interviewer- (n = 56) or a self-administered (n = 54) group for test-retest reliability, of which 108 participants who met the wear time criteria were included in the validity study. Reliability was assessed by administration of questionnaires twice with a one-week interval. Criterion validity was assessed by comparing against seven-day accelerometer measures. Two definitions for accelerometry-data scoring were employed: (1) total-min of activity, and (2) 10-min bout. Results: Participants had similar baseline characteristics in both administration groups and no significant difference was found between the two formats in terms of validity (correlations between the GPAQ and accelerometer). For validity, the GPAQ demonstrated fair-to-moderate correlations for moderate-to-vigorous physical activity (MVPA) for self-administration (rs = 0.30) and interviewer-administration (rs = 0.46). Findings were similar when considering 10-min activity bouts in the accelerometer analysis for MVPA (rs = 0.29 vs. 0.42 for self vs. interviewer). Within each mode of administration, the strongest correlations were observed for vigorous-intensity activity. However, Bland-Altman plots illustrated bias toward overestimation for higher levels of MVPA, vigorous- and moderate-intensity activities, and underestimation for lower levels of these measures. Reliability for MVPA revealed moderate correlations (rs = 0.61 vs. 0.63 for self
Eric Swanson, MD
Summary: This report examines the meaning of validity and reliability and the role of psychometrics in plastic surgery. Study titles increasingly include the word “valid” to support the authors’ claims. Studies by other investigators may be labeled “not validated.” Validity simply refers to the ability of a device to measure what it intends to measure. Validity is not an intrinsic test property. It is a relative term most credibly assigned by the independent user. Similarly, the word “reliable” is subject to interpretation. In psychometrics, its meaning is synonymous with “reproducible.” The definitions of valid and reliable are analogous to accuracy and precision. Reliability (both the reliability of the data and the consistency of measurements) is a prerequisite for validity. Outcome measures in plastic surgery are intended to be surveys, not tests. The role of psychometric modeling in plastic surgery is unclear, and this discipline introduces difficult jargon that can discourage investigators. Standard statistical tests suffice. The unambiguous term “reproducible” is preferred when discussing data consistency. Study design and methodology are essential considerations when assessing a study’s validity. PMID:25289354
Charalambous, Charalambos; Koulori, Agoritsa; Vasilopoulos, Aristidis; Roupa, Zoe
Introduction: Prevention is the ideal strategy to tackle the problem of pressure ulcers. Pressure ulcer risk assessment scales are one of the most pivotal measures applied to tackle the problem, but much criticism has been raised regarding the validity and reliability of these scales. Objective: To investigate the validity and reliability of the Waterlow pressure ulcer risk assessment scale. Method: The methodology used is a narrative literature review. The bibliography was reviewed through Cinahl, Pubmed, EBSCO, Medline and Google Scholar, and 26 scientific articles were identified. The articles were chosen for their direct relevance to the objective under study and their scientific relevance. Results: The construct and face validity of the Waterlow appear adequate, but with regard to content validity, changes in the categories age and gender could be beneficial. The concurrent validity cannot be assessed. The predictive validity of the Waterlow is characterized by high specificity and low sensitivity. The inter-rater reliability has been demonstrated to be inadequate; this may be due to a lack of clear definitions within the categories and differing levels of knowledge between the users. Conclusion: Due to the limitations presented regarding the validity and reliability of the Waterlow pressure ulcer risk assessment scale, the scale should be used in conjunction with clinical assessment to provide optimum results. PMID:29736104
Marshall, Skye; Craven, Dana; Kelly, Jaimon; Isenring, Elizabeth
Malnutrition is a significant barrier to healthy and independent ageing in older adults who live in their own homes, and accurate diagnosis is a key step in managing the condition. However, there has not been sufficient systematic review or pooling of existing data regarding malnutrition diagnosis in the geriatric community setting. The current paper was conducted as part of the MACRo (Malnutrition in the Ageing Community Review) Study and seeks to determine the criterion (concurrent and predictive) validity and reliability of nutrition assessment tools in making a diagnosis of protein-energy malnutrition in the general older adult community. A systematic literature review was undertaken using six electronic databases in September 2016. Studies in any language were included which measured malnutrition via a nutrition assessment tool in adults ≥65 years living in their own homes. Data relating to the predictive validity of tools were analysed via meta-analyses. GRADE was used to evaluate the body of evidence. There were 6412 records identified, of which 104 potentially eligible records were screened via full text. Eight papers were included; two which evaluated the concurrent validity of the Mini Nutritional Assessment (MNA) and Subjective Global Assessment (SGA) and six which evaluated the predictive validity of the MNA. The quality of the body of evidence for the concurrent validity of both the MNA and SGA was very low. The quality of the body of evidence for the predictive validity of the MNA in detecting risk of death was moderate (RR: 1.92 [95% CI: 1.55-2.39]; P < 0.00001; n = 2013 participants; n = 4 studies; I² = 0%). The quality of the body of evidence for the predictive validity of the MNA in detecting risk of poor physical function was very low (SMD: 1.02 [95% CI: 0.24-1.80]; P = 0.01; n = 4046 participants; n = 3 studies; I² = 89%). Due to the small number of studies identified and no evaluation of the predictive validity of tools other than
Rokotonarivo, Sarobidy; Schaafsma, Marije; Hockley, Neal
reliability measures. DCE results were generally consistent with those of other stated preference techniques (convergent validity), but hypothetical bias was common. Evidence supporting theoretical validity (consistency with assumptions of rational choice theory) was limited. In content validity tests, 2...
Background: The aims of this study were to evaluate the construct validity (known group), concurrent validity (criterion based) and test-retest (intra-rater) reliability of manual goniometers to measure passive hip range of motion (ROM) in femoroacetabular impingement patients and healthy controls. Methods: Passive hip flexion, abduction, adduction, internal and external rotation ROMs were simultaneously measured with a conventional goniometer and an electromagnetic tracking system (ETS) on two different testing sessions. A total of 15 patients and 15 sex- and age-matched healthy controls participated in the study. Results: The goniometer provided greater hip ROM values compared to the ETS (range 2.0-18.9 degrees; P ... Conclusions: The present study suggests that goniometer-based assessments considerably overestimate hip joint ROM by measuring intersegmental angles (e.g., thigh flexion on trunk for hip flexion) rather than true hip ROM. It is likely that uncontrolled pelvic rotation and tilt due to difficulties in placing the goniometer properly and in performing the anatomically correct ROM contribute to the overrating of the arc of these motions. Nevertheless, conventional manual goniometers can be used with confidence for longitudinal assessments in the clinic.
Ghaemi, Hamide; Khoddami, Seyyedeh Maryam; Soleymani, Zahra; Zandieh, Fariborz; Jalaie, Shohreh; Ahanchian, Hamid; Khadivi, Ehsan
The aim of this study was to develop, validate, and assess the reliability of the Persian version of the Vocal Cord Dysfunction Questionnaire (VCDQ P). The study design was cross-sectional or cultural survey. Forty-four patients with vocal fold dysfunction (VFD) and 40 healthy volunteers were recruited for the study. To assess the content validity, the prefinal questions were given to 15 experts to comment on their essentiality. Ten patients with VFD rated the importance of the VCDQ P items to assess face validity. Eighteen of the patients with VFD completed the VCDQ again 1 week later for test-retest reliability. To assess absolute reliability, the standard error of measurement and smallest detected change were calculated. Concurrent validity was assessed by having 34 patients with VFD complete the Persian Chronic Obstructive Pulmonary Disease (COPD) Assessment Test (CAT). Discriminant validity was measured from 34 participants. The VCDQ was further validated by administering the questionnaire to 40 healthy volunteers. Validation of the VCDQ as a treatment outcome tool was conducted in 18 patients with VFD using pre- and posttreatment scores. The internal consistency was confirmed (Cronbach α = 0.78). The test-retest reliability was excellent (intraclass correlation coefficient = 0.97). The standard error of measurement and smallest detected change values were acceptable (0.39 and 1.08, respectively). There was a significant correlation between the VCDQ P and the CAT total scores (P ... validity was significantly different. The VCDQ scores in patients with VFD before and after treatment were significantly different (P ... valid and reliable self-administered questionnaire in Persian-speaking population. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
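The relationship between the standard error of measurement (SEM) and the smallest detectable change (SDC) reported above can be sketched as follows. This assumes the common conventions SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM; the abstract does not state which formulas the authors used, and the function names and the SD value below are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability (ICC)."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """Smallest change exceeding measurement error at ~95% confidence.

    The sqrt(2) factor reflects that a change score combines the error
    of two measurements (test and retest).
    """
    return z * math.sqrt(2.0) * sem


# Under the assumed convention, an SEM of 0.39 gives an SDC of about 1.08.
sdc = smallest_detectable_change(0.39)
print(round(sdc, 2))

# Hypothetical SD of 2.0 combined with an ICC of 0.97.
sem_example = sem_from_icc(sd=2.0, icc=0.97)
print(round(sem_example, 3))
```

Under this assumed convention the SEM of 0.39 maps to the reported SDC of 1.08, which is consistent with, but does not confirm, the authors having used it.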
Szucs, Kimberly A; Brown, Elena V Donoso
[Purpose] Measurement of posture is important for those with a clinical diagnosis as well as researchers aiming to understand the impact of faulty postures on the development of musculoskeletal disorders. A reliable, cost-effective and low tech posture measure may be beneficial for research and clinical applications. The purpose of this study was to determine rater reliability and construct validity of a posture screening mobile application in healthy young adults. [Subjects and Methods] Pictures of subjects were taken in three standing positions. Two raters independently digitized the static standing posture image twice. The app calculated posture variables, including sagittal and coronal plane translations and angulations. Intra- and inter-rater reliability were calculated using the appropriate ICC models for complete agreement. Construct validity was determined through comparison of known groups using repeated measures ANOVA. [Results] Intra-rater reliability ranged from 0.71 to 0.99. Inter-rater reliability was good to excellent for all translations. ICCs were stronger for translations versus angulations. The construct validity analysis found that the app was able to detect the change in the four variables selected. [Conclusion] The posture mobile application has demonstrated strong rater reliability and preliminary evidence of construct validity. This application may have utility in clinical and research settings.
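An intraclass correlation for absolute agreement between raters, as referred to in the abstract above, can be computed from a two-way ANOVA decomposition. The sketch below implements the single-rater, two-way random-effects form usually written ICC(2,1); the abstract does not specify which ICC model was used, so this choice is an assumption, and the function name is hypothetical. The example data are the six-subject, four-rater illustration widely reproduced from Shrout and Fleiss (1979), not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject, one column per rater.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    bms = ss_rows / (n - 1)
    jms = ss_cols / (k - 1)
    ems = ss_error / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Textbook illustration: 6 subjects (rows) rated by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
print(round(icc, 2))
```

Because the rater mean square enters the denominator, systematic differences between raters lower this coefficient, which is why it is the form associated with absolute agreement rather than mere consistency.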
Ruhinda, E; Byanyima, R K; Mugerwa, H
Reliability and validity studies of different lumbar curvature analysis and measurement techniques have been documented; however, there is limited literature on the reliability and validity of subjective visual analysis. Radiological assessment of the lumbar lordotic curve aids in early diagnosis of conditions even before neurologic changes set in. The aim was to ascertain the level of reliability and validity of subjective assessment of lumbar lordosis in conventional radiography. A blinded, repeated-measures diagnostic test was carried out on lumbar spine x-ray radiographs at the Radiology Department of the Joint Clinical Research Centre (JCRC), Mengo-Kampala-Uganda. Seventy (70) lateral lumbar x-ray films were used for this study and were obtained from the archive of the JCRC radiology department at Butikiro house, Mengo-Kampala. Poor observer agreement, both inter- and intra-observer, with kappa values of 0.16 was found. Inter-observer agreement was poorer than intra-observer agreement. Kappa values rose significantly when the lumbar lordosis was clustered into four categories without grading each abnormality. The results confirm that subjective assessment of lumbar lordosis has low reliability and validity. Film quality has limited influence on the observer reliability. This study further shows that fewer scale categories of lordosis abnormalities produce better observer reliability.
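The kappa statistic used above to quantify observer agreement corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of unweighted Cohen's kappa for two raters; the function name and the ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    # Observed agreement: share of items given the same category.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    # Undefined (division by zero) if both raters use one identical category.
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical binary ratings (1 = abnormal lordosis, 0 = normal).
a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
b = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
kappa = cohen_kappa(a, b)
print(round(kappa, 2))
```

In this illustration the raters agree on 60% of items, but half of that agreement is expected by chance, so kappa is low. Merging categories, as the study did, changes both the observed and the chance agreement terms.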
An important feature of questionnaire validation is reliability. To be able to measure a given concept by questionnaire validly, the reliability needs to be high. The objectives of this study were to examine reliability of attitude and knowledge and behavioral consistency of sunburn in a developed questionnaire for monitoring and evaluating population sun-related behavior. Sun-related behavior, attitude and knowledge were measured weekly by a questionnaire in the summer of 2013 among 664 Danes. Reliability was tested in a test-retest design. Consistency of behavioral information was tested similarly in a questionnaire adapted to measure behavior throughout the summer. The response rates for questionnaires 1, 2 and 3 were high and the dropout was not dependent on demographic characteristics. There was at least 73% agreement between sunburns in the measurement week and the entire summer, and a possible sunburn underestimation in questionnaires summarizing the entire summer. The participants underestimated their outdoor exposure in the evaluation covering the entire summer as compared to the measurement week. The reliability of scales measuring attitude and knowledge was high for the majority of scales, while consistency in protection behavior was low. To our knowledge, this is the first study to report reliability for a completely validated questionnaire on sun-related behavior in a national random population-based sample. Further, we show that attitude and knowledge questions confirmed their validity with good reliability, while consistency of protection behavior in general and in a week's measurement was low. Keywords: Questionnaire, Validation, Reliability, Skin cancer, Prevention, Ultraviolet radiation
Li, Z; Yang, Y M; Zhang, C; Li, Y; Hu, J; Gao, L W; Zhou, Y X; Zhang, X J
Objective: To assess the reliability and validity of the Chinese version of the Driving Anger Scale (DAS) in professional drivers in China and provide a scientific basis for the application of the scale in drivers in China. Methods: Professional drivers, including taxi drivers, bus drivers, truck drivers and school bus drivers, were selected to complete the questionnaire. Cronbach's α and split-half reliability were calculated to evaluate the reliability of the DAS, and content, construct, discriminant and convergent validity were assessed to measure the validity of the scale. Results: The overall Cronbach's α of the DAS was 0.934 and the split-half reliability was 0.874. The correlation coefficient of each subscale with the total scale was 0.639-0.922. The simplified version of the DAS supported a presupposed six-factor structure, explaining 56.371% of the total variance revealed by exploratory factor analysis. The DAS had good convergent and discriminant validity, with a success rate of the calibration experiment of 100%. Conclusion: The DAS has good reliability and validity in professional drivers in China, and the use of the DAS is worth promoting in drivers.
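The two reliability statistics reported above, Cronbach's α and split-half reliability, can be sketched as follows. The item scores are hypothetical, the function names are illustrative, and the odd/even item split with a Spearman-Brown correction is one common convention; the abstract does not state how the authors split the scale.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a list of respondents, each a list of item scores."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_reliability(scores):
    """Odd/even split-half correlation, stepped up with Spearman-Brown."""
    half_a = [sum(row[0::2]) for row in scores]  # items 1, 3, 5, ...
    half_b = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r = pearson_r(half_a, half_b)
    return 2 * r / (1 + r)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [2, 3, 3, 4],
    [4, 4, 5, 5],
    [1, 2, 2, 1],
    [3, 3, 4, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
split_half = split_half_reliability(responses)
print(round(alpha, 3), round(split_half, 3))
```

The Spearman-Brown step is needed because the raw correlation between two half-length tests understates the reliability of the full-length scale.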
Background: The purpose of this study was to evaluate the validity and reliability of the Persian translation of the Modifiable Activity Questionnaire (MAQ) in a sample of Tehranian adolescents. Methods: Of a total of 52 subjects, a sub-sample of 40 participants (55.0% boys) was used to assess the reliability and the validity of the physical activity questionnaire. The reliability of the two MAQs was calculated by intraclass correlation coefficients, and validation was evaluated using Pearson correlation coefficients to compare data between the mean of the two MAQs and the mean of four physical activity records. Results: The intraclass correlation coefficient calculated to assess the reliability between the two MAQs for leisure time physical activity over the past year was 0.97. The Pearson correlation coefficient between the mean of the two MAQs and the mean of four physical activity records was 0.49 (P < 0.001) for leisure time physical activities. Conclusions: High reliability and relatively moderate validity were found for the Persian translation of the MAQ in a Tehranian adolescent population. Further studies with larger sample sizes are suggested to assess the validity more precisely.
Ling, Samuel K K; Chan, Vincent; Ho, Karen; Ling, Fona; Lui, T H
Develop the first reliable and validated open-source outcome scoring system in the Chinese language for foot and ankle problems. Translation of the English FAOS into Chinese following regular protocols. First, two forward-translations were created separately, these were then combined into a preliminary version by an expert committee, and was subsequently back-translated into English. The process was repeated until the original and back translations were congruent. This version was then field tested on actual patients who provided feedback for modification. The final Chinese FAOS version was then tested for reliability and validity. Reliability analysis was performed on 20 subjects while validity analysis was performed on 50 subjects. Tools used to validate the Chinese FAOS were the SF36 and Pain Numeric Rating Scale (NRS). Internal consistency between the FAOS subgroups was measured using Cronbach's alpha. Spearman's correlation was calculated between each subgroup in the FAOS, SF36 and NRS. The Chinese FAOS passed both reliability and validity testing; meaning it is reliable, internally consistent and correlates positively with the SF36 and the NRS. The Chinese FAOS is a free, open-source scoring system that can be used to provide a relatively standardised outcome measure for foot and ankle studies. Copyright © 2017 Elsevier Ltd. All rights reserved.
Manios, Y; Androutsos, O; Moschonis, G; Birbilis, M; Maragkopoulou, K; Giannopoulou, A; Argyri, E; Kourlaba, G
The aim of this paper was to evaluate the criterion validity of the Physical Activity Questionnaire for Schoolchildren (PAQ-S). The current study is a subcohort of the Healthy Growth Study, a large-scale cross-sectional study. 202 schoolchildren aged 9-13 years from Greece completed the PAQ-S and wore an accelerometer for 4 consecutive days. Time spent in moderate (MPA), moderate to vigorous (MVPA) and vigorous (VPA) physical activity was calculated based on PAQ-S and accelerometer data. The average times spent on MPA and MVPA as derived from the PAQ-S and from accelerometers were significantly and moderately correlated (r=0.462, P ... PAQ-S and accelerometer-measured time spent performing VPA (rho=0.150, P=0.057). The Intraclass Correlation Coefficient (ICC) indicated a moderate agreement between the PAQ-S and the accelerometer in estimating MPA (ICC=0.592, P ... PAQ-S, indicate a systematic overestimation of physical activity time with increasing physical activity for the PAQ-S. The validity of the PAQ-S for the estimation of MPA and MVPA was found to be comparable to that of similar self-reported measures for schoolchildren. Therefore, this questionnaire could be used as a tool for physical activity assessment in large population studies.
Marco Aurelio Lumertz Saffi
Using a sample of patients with coronary artery disease, this methodological study aimed to conduct a cross-cultural adaptation and validation of a questionnaire on knowledge of cardiovascular risk factors (Q-FARCS), lifestyle changes, and treatment adherence for use in Brazil. The questionnaire has three scales: general knowledge of risk factors (RFs); specific knowledge of these RFs; and lifestyle changes achieved. Cross-cultural adaptation included translation, synthesis, back-translation, expert committee review, and pretesting. Face and content validity, reliability, and construct validity were measured. Cronbach’s alpha for the total sample (n = 240) was 0.75. Assessment of psychometric properties revealed adequate face and content validity, and the construct revealed seven components. It was concluded that the Brazilian version of Q-FARCS had adequate reliability and validity for the assessment of knowledge of cardiovascular RFs.
Development of Reliable and Validated Tools to Evaluate Technical Resuscitation Skills in a Pediatric Simulation Setting: Resuscitation and Emergency Simulation Checklist for Assessment in Pediatrics.
Faudeux, Camille; Tran, Antoine; Dupont, Audrey; Desmontils, Jonathan; Montaudié, Isabelle; Bréaud, Jean; Braun, Marc; Fournier, Jean-Paul; Bérard, Etienne; Berlengi, Noémie; Schweitzer, Cyril; Haas, Hervé; Caci, Hervé; Gatin, Amélie; Giovannini-Chami, Lisa
To develop a reliable and validated tool to evaluate technical resuscitation skills in a pediatric simulation setting. Four Resuscitation and Emergency Simulation Checklist for Assessment in Pediatrics (RESCAPE) evaluation tools were created, following international guidelines: intraosseous needle insertion, bag mask ventilation, endotracheal intubation, and cardiac massage. We applied a modified Delphi methodology evaluation to binary rating items. Reliability was assessed comparing the ratings of 2 observers (1 in real time and 1 after a video-recorded review). The tools were assessed for content, construct, and criterion validity, and for sensitivity to change. Inter-rater reliability, evaluated with Cohen kappa coefficients, was perfect or near-perfect (>0.8) for 92.5% of items and each Cronbach alpha coefficient was ≥0.91. Principal component analyses showed that all 4 tools were unidimensional. Significant increases in median scores with increasing levels of medical expertise were demonstrated for RESCAPE-intraosseous needle insertion (P = .0002), RESCAPE-bag mask ventilation (P = .0002), RESCAPE-endotracheal intubation (P = .0001), and RESCAPE-cardiac massage (P = .0037). Significantly increased median scores over time were also demonstrated during a simulation-based educational program. RESCAPE tools are reliable and validated tools for the evaluation of technical resuscitation skills in pediatric settings during simulation-based educational programs. They might also be used for medical practice performance evaluations. Copyright © 2017 Elsevier Inc. All rights reserved.
Hayes, Corey J.; Bhandari, Naleen Raj; Kathe, Niranjan; Payakachat, Nalin
Limited evidence exists on how non-cancer pain (NCP) affects an individual’s health-related quality of life (HRQoL). This study aimed to validate the Medical Outcomes Study Short Form-12 Version 2 (SF-12v2), a generic measure of HRQoL, in a NCP cohort using the Medical Expenditure Panel Survey Longitudinal Files. The SF Mental Component Summary (MCS12) and SF Physical Component Summary (PCS12) were tested for reliability (internal consistency and test-retest reliability) and validity (construct: convergent and discriminant; criterion: concurrent and predictive). A total of 15,716 patients with NCP were included in the final analysis. The MCS12 and PCS12 demonstrated high internal consistency (Cronbach’s alpha and Mosier’s alpha > 0.8), and moderate and high test-retest reliability, respectively (MCS12 intraclass correlation coefficient (ICC): 0.64; PCS12 ICC: 0.73). Both scales were significantly associated with a number of chronic conditions (p ... reliable and valid measure of HRQoL for patients with NCP. PMID:28445438
Andrew J. Butler
Mental imagery can improve motor performance in stroke populations when combined with physical therapy. Valid and reliable instruments to evaluate the imagery ability of stroke survivors are needed to maximize the benefits of mental imagery therapy. The purposes of this study were to: examine and compare the test-retest intra-rater reliability of the Movement Imagery Questionnaire-Revised, Second Edition (MIQ-RS) in stroke survivors and able-bodied controls, examine internal consistency of the visual and kinesthetic items of the MIQ-RS, determine if the MIQ-RS includes both the visual and kinesthetic dimensions of mental imagery, correlate impairment and motor imagery scores, and investigate the criterion validity of the MIQ-RS in stroke survivors by comparing the results to the KVIQ-10. Test-retest analysis indicated good levels of reliability (ICC range: .83–.99) and internal consistency (Cronbach α: .95–.98) of the visual and kinesthetic subscales in both groups. The two-factor structure of the MIQ-RS was supported by factor analysis, with the visual and kinesthetic components accounting for 88.6% and 83.4% of the total variance in the able-bodied and stroke groups, respectively. The MIQ-RS is a valid and reliable instrument in the stroke and able-bodied populations examined and is therefore useful as an outcome measure for motor imagery ability.
Xiao, Lin; Gao, Yulin; Zhang, Lili; Chen, Peiyun; Sun, Xiaojia; Tang, Siyuan
Previous literature on quality of life (QoL) in bipolar disorder (BD) has strongly suggested that a disease-specific QoL measure for patients with BD should be developed to evaluate QoL more specifically and reliably. To our knowledge, "Quality of Life in Bipolar Disorder" (QoL.BD) is the first and only questionnaire produced to specifically measure QoL in people with BD. In China, there is no disease-targeted measure available to specifically measure QoL in Chinese patients with BD. The aim of the study is to revise and validate a Chinese version of the brief version of the QoL.BD (Bref QoL.BD). All the items of the Bref QoL.BD were translated into Chinese using the Brislin translation model. The questionnaire was administered to a total sample of 231 subjects, including 101 BD patients and 130 healthy controls, to test the psychometric properties of the Bref QoL.BD (e.g. internal consistency, retest reliability, content validity, item analysis, confirmatory factor analysis, criterion validity, convergent validity, discriminative validity and feasibility). The Chinese version of the Bref QoL.BD had very high internal consistency (Cronbach's alpha = 0.815) and retest reliability (intraclass correlation coefficient (ICC) = 0.808). Confirmatory factor analysis (CFA) validated the original one-factor structure. The direction and magnitude of correlations with the 36-item Short-Form Health Survey (SF-36; rs = 0.313, P ... size from only one tertiary care center. The BD patients enrolled were euthymic; acute BD patients were excluded. The Chinese version of the Bref QoL.BD is a feasible, reliable and valid tool for the assessment of QoL for Chinese BD patients. Copyright © 2015 Elsevier B.V. All rights reserved.
Campo-Arias, Adalberto; Lafaurie, María Mercedes; Gaitán-Duarte, Hernando G
There are several scales to quantify homophobia in different populations. However, the reliability and validity of these instruments among Colombian students are unknown. Consequently, this work is intended to assess the reliability (internal consistency) as well as the validity of the Scale for Homophobia in medical students from a private university in Bogotá (Colombia). This was a methodological study with 199 medical students from the 1st to 5th semester who completed the Homophobia Scale form, the general well-being questionnaire, the Attitude Towards Gays and Lesbians Scale (ATGL), the WHO-5 (divergent validity) and the Francis Scale of Attitude Toward Christianity (nomological validity). Pearson's correlations, Cronbach's alpha coefficient, the omega coefficient (construct reliability) and confirmatory factor analysis were computed. The Scale for Homophobia showed a Cronbach's alpha coefficient of 0.785, an omega coefficient of 0.790 and a Pearson correlation of 0.844 with the ATGL, -0.059 with the WHO-5, and 0.187 with the Francis Scale of Attitude Toward Christianity. The Scale for Homophobia exhibited one relevant factor accounting for 44.7% of the total variance. The Scale for Homophobia showed acceptable reliability and validity. New studies should investigate the stability of the scale and the nomological validity regarding other constructs. Copyright © 2012 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.
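Many of the abstracts in this listing report Cronbach's alpha as their internal-consistency statistic. As a rough illustration only, the sketch below computes alpha from a respondents-by-items score table using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes at least two items, at least two respondents, and no missing values.
    """
    k = len(scores[0])
    # Transpose so that each tuple holds one item's scores across respondents.
    items = list(zip(*scores))
    item_variance_sum = sum(variance(item) for item in items)
    # Variance of each respondent's total (summed) score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy data: 4 respondents answering 2 items.
toy_scores = [[2, 3], [4, 5], [3, 3], [5, 4]]
alpha = cronbach_alpha(toy_scores)
```

Sample variances are used for both the items and the totals; using population variances throughout would give the same ratio and therefore the same alpha.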
Loudon, Kirsty; Zwarenstein, Merrick; Sullivan, Frank M; Donnan, Peter T; Gágyor, Ildikó; Hobbelen, Hans J S M; Althabe, Fernando; Krishnan, Jerry A; Treweek, Shaun
PRagmatic Explanatory Continuum Indicator Summary (PRECIS)-2 is a tool that could improve design insight for trialists. Our aim was to validate the PRECIS-2 tool, unlike its predecessor, testing the discriminant validity and interrater reliability. Over 80 international trialists, methodologists, clinicians, and policymakers created PRECIS-2 helping to ensure face validity and content validity. The interrater reliability of PRECIS-2 was measured using 19 experienced trialists who used PRECIS-2 to score a diverse sample of 15 randomized controlled trial protocols. Discriminant validity was tested with two raters to independently determine if the trial protocols were more pragmatic or more explanatory, with scores from the 19 raters for the 15 trials as predictors of pragmatism. Interrater reliability was generally good, with seven of nine domains having an intraclass correlation coefficient over 0.65. Flexibility (adherence) and recruitment had wide confidence intervals, but raters found these difficult to rate and wanted more information. Each of the nine PRECIS-2 domains could be used to differentiate between trials taking more pragmatic or more explanatory approaches with better than chance discrimination for all domains. We have assessed the validity and reliability of PRECIS-2. An elaboration study and web site provide guidance to help future users of the tool which is continuing to be tested by trial teams, systematic reviewers, and funders. Copyright © 2017 Elsevier Inc. All rights reserved.
Dawson, Andreas; Raphael, Karen G; Glaros, Alan; Axelsson, Susanna; Arima, Taro; Ernberg, Malin; Farella, Mauro; Lobbezoo, Frank; Manfredini, Daniele; Michelotti, Ambra; Svensson, Peter; List, Thomas
To combine empirical evidence and expert opinion in a formal consensus method in order to develop a quality-assessment tool for experimental bruxism studies in systematic reviews. Tool development comprised five steps: (1) preliminary decisions, (2) item generation, (3) face-validity assessment, (4) reliability and discriminative validity assessment, and (5) instrument refinement. The kappa value and phi-coefficient were calculated to assess inter-observer reliability and discriminative ability, respectively. Following preliminary decisions and a literature review, a list of 52 items to be considered for inclusion in the tool was compiled. Eleven experts were invited to join a Delphi panel and 10 accepted. Four Delphi rounds reduced the preliminary tool, the Quality-Assessment Tool for Experimental Bruxism Studies (Qu-ATEBS), to 8 items: study aim, study sample, control condition or group, study design, experimental bruxism task, statistics, interpretation of results, and conflict of interest statement. Consensus among the Delphi panelists yielded good face validity. Inter-observer reliability was acceptable (k = 0.77). Discriminative validity was excellent (phi coefficient 1.0; P reviews of experimental bruxism studies, exhibits face validity, excellent discriminative validity, and acceptable inter-observer reliability. Development of quality assessment tools for many other topics in the orofacial pain literature is needed and may follow the described procedure.
This study aimed to translate the Perception of Organizational Politics Scale (POPS) into Turkish and to examine the validity and reliability of the translated scale. The validity of the POPS was tested in terms of face, content and construct validity. The application was designed as a two-stage process. In the first stage, face and content validity were tested. In the second stage, evidence for the construct validity of the scale was sought by applying exploratory factor analysis (EFA) and then confirmatory factor analysis (CFA) to the data obtained. Item-total score correlations and the Cronbach alpha coefficient were used to determine the reliability of the scale. The validity and reliability study was conducted on data collected from 277 faculty members working in universities' faculties of education. Simple random sampling was used to select these faculty members. The psychometric properties of the Turkish version of the Perception of Organizational Politics Scale showed that the scale has a satisfactory level of reliability and validity for the Turkish employee sample.
Aertssen, W F M; Steenbergen, B; Smits-Engelsman, B C M
There is a lack of valid and reliable field-based tests for assessing functional strength in young children with mild intellectual disabilities (IDs). The aim of this study was to investigate the test-retest reliability and construct validity of the Functional Strength Measurement in children with ID (FSM-ID). Fifty-two children with mild ID (40 boys and 12 girls, mean age 8.48 years, SD = 1.48) were tested with the FSM. Test-retest reliability (n = 32) was examined by a two-way intraclass correlation coefficient for agreement (ICC 2.1A). Standard error of measurement and smallest detectable change were calculated. Construct validity was determined by calculating correlations between the FSM-ID and handheld dynamometry (HHD) (convergent validity), the FSM-ID and the strength subtest of the Bruininks-Oseretsky test of motor proficiency - second edition (BOT-2) (convergent validity) and the FSM-ID and the balance subtest of the BOT-2 (discriminant validity). Test-retest reliability ICC ranged 0.89-0.98. Correlation between the items of the FSM-ID and HHD ranged 0.39-0.79 and between FSM-ID and BOT-2 (strength items) 0.41-0.80. Correlation between items of the FSM-ID and BOT-2 (balance items) ranged 0.41-0.70. The FSM-ID showed good test-retest reliability and good convergent validity with the HHD and BOT-2 subtest strength. The correlations assessing discriminant validity were higher than expected. Poor levels of postural control and core stability in children with mild IDs may be the underlying factor of those higher correlations. © 2018 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
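The abstract above mentions the standard error of measurement (SEM) and the smallest detectable change (SDC) without stating how they were obtained. A minimal sketch of one commonly used convention follows, assuming SEM = SD * sqrt(1 - ICC) and SDC = 1.96 * sqrt(2) * SEM; the study itself may have used a different variant, and the function names and numeric inputs below are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability (ICC).

    Assumed convention: SEM = SD * sqrt(1 - ICC).
    """
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """Smallest detectable change at the confidence level implied by z.

    Assumed convention: SDC = z * sqrt(2) * SEM, where sqrt(2) reflects that a
    change score combines the measurement error of two test occasions.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs, not values from the study above.
sem = sem_from_icc(sd=10.0, icc=0.91)
sdc = smallest_detectable_change(sem)
```

Under these assumptions, a difference between two test occasions smaller than the SDC cannot be distinguished from measurement error at roughly the 95% level.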
André, Helô-Isa; Carnide, Filomena; Borja, Edgar; Ramalho, Fátima; Santos-Rocha, Rita; Veloso, António P
This study aimed to develop a new field test protocol with a standardized measurement of strength and power in plantar flexor muscles targeted to functionally independent older adults, the calf-raise senior (CRS) test, and also evaluate its reliability and validity. Forty-one subjects aged 65 years and older of both sexes participated in five different cross-sectional studies: 1) pilot (n=12); 2) inter- and intrarater agreement (n=12); 3) construct (n=41); 4) criterion validity (n=33); and 5) test-retest reliability (n=41). Different motion parameters were compared in order to define a specifically designed protocol for seniors. Two raters evaluated each participant twice, and the results of the same individual were compared between raters and participants to assess the interrater and intrarater agreement. The validity and reliability studies involved three testing sessions that lasted 2 weeks, including a battery of functional fitness tests, the CRS test on two occasions, accelerometry, and strength assessments in an isokinetic dynamometer. The CRS test presented excellent test-retest reliability (intraclass correlation coefficient [ICC] = 0.90, standard error of measurement = 2.0) and interrater reliability (ICC = 0.93-0.96), as well as good intrarater agreement (ICC = 0.79-0.84). Participants with better results in the CRS test were younger and presented higher levels of physical activity and functional fitness. A significant association between test results and all strength parameters (isometric, r = 0.87, r² = 0.75; isokinetic, r = 0.86, r² = 0.74; and rate of force development, r = 0.77, r² = 0.59) was shown. This study was successful in demonstrating that the CRS test can meet the scientific criteria of validity and reliability. The test can be a good indicator of ankle strength in older adults and proved to discriminate significantly between individuals with improved functionality and levels of physical activity.
Ana Lúcia Araújo Gomes
Objective: To evaluate the psychometric properties in terms of validity and reliability of the scale Self-efficacy and their child's level of asthma control: Brazilian version. Method: Methodological study in which 216 parents/guardians of children with asthma participated. A construct validation (factor analysis and test of hypothesis by comparison of contrasted groups) and an analysis of reliability in terms of homogeneity (Cronbach's alpha) and stability (test-retest) were carried out. Results: Exploratory factor analysis proved suitable for the Brazilian version of the scale (Kaiser-Meyer-Olkin index of 0.879 and Bartlett's sphericity with p < 0.001). The correlation matrix in factor analysis suggested the removal of item 7 from the scale. Cronbach's alpha of the final scale, with 16 items, was 0.92. Conclusion: The Brazilian version of Self-efficacy and their child's level of asthma control presented psychometric properties that confirmed its validity and reliability.
Nygren, Björn; Randström, Kerstin Björkman; Lejonklou, Anna K; Lundman, Beril
The purpose of this study was to test the reliability and validity of the Swedish language version of the Resilience Scale (RS). Participants were 142 adults between 19-85 years of age. Internal consistency reliability, stability over time, and construct validity were evaluated using Cronbach's alpha, principal components analysis with varimax rotation and correlations with scores on the Sense of Coherence Scale (SOC) and the Rosenberg Self-Esteem Scale (RSE). The mean score on the RS was 142 (SD = 15). The possible scores on the RS range from 25 to 175, and scores higher than 146 are considered high. The test-retest correlation was .78. Correlations with the SOC and the RSE were .41 (p Self and Life emerged as components from the principal components analysis. These findings provide evidence for the reliability and validity of the Swedish language version of the RS.
O’CONNOR, MELISSA; DAVITT, JOAN K.
The Outcome and Assessment Information Set (OASIS) is the patient-specific, standardized assessment used in Medicare home health care to plan care, determine reimbursement, and measure quality. Since its inception in 1999, there has been debate over the reliability and validity of the OASIS as a research tool and outcome measure. A systematic literature review of English-language articles identified 12 studies published in the last 10 years examining the validity and reliability of the OASIS. Empirical findings indicate the validity and reliability of the OASIS range from low to moderate but vary depending on the item studied. Limitations in the existing research include: nonrepresentative samples; inconsistencies in methods used, items tested, measurement, and statistical procedures; and the changes to the OASIS itself over time. The inconsistencies suggest that these results are tentative at best; additional research is needed to confirm the value of the OASIS for measuring patient outcomes, research, and quality improvement. PMID:23216513
Pakpour, Amir H.; Nourozi, Saeedeh; Mølsted, Stig
INTRODUCTION: The aim of the study was to assess the validity and reliability of the SF-12 questionnaire in a sample of Iranian patients undergoing hemodialysis. MATERIALS AND METHODS: One hundred and forty-four hemodialysis patients were included from dialysis centers in Zanjan, Iran, and were...... asked to complete the SF-12 and SF-36 questionnaires. An initial test-retest reliability evaluation was performed on a sample of 70 patients from the total group, with a retest interval of 14 days. Reliability was estimated by internal consistency and validity was assessed using known-group comparisons...... and construct validity on the patient group as a whole. A linear regression analysis was used to assess any variation in the physical component summary and mental component summary scores of the SF-36 with the respective component summary scores of the SF-12. In addition, the factor structure...
Strøyer, Jesper; Essendrop, Morten; Jensen, Lone Donbaek
To test the validity and reliability of self-assessed physical fitness samples included healthcare assistants working at a hospital (women=170, men=17), persons working with physically and mentally handicapped patients (women=530, men= 123), and two separate groups of healthcare students (a) women...... except for flexibility among men. The reliability was moderate to good (ICC = .62 - .80). Self-assessed aerobic fitness, muscle strength, and flexibility showed moderate construct validity and moderate to good reliability using visual analogues.......=91 and men=5 and (b) women=159 and men=10. Five components of physical fitness were self-assessed by Visual Analogue Scales with illustrations and verbal anchors for the extremes: aerobic fitness, muscle strength, endurance, flexibility, and balance. Convergent and divergent validity were evaluated...
Eshghi, Mohammad Ali; Kordi, Ramin; Memari, Amir Hossein; Ghaziasgar, Ahmad; Mansournia, Mohammad-Ali; Zamani Sani, Seyed Hojjat
The Youth Sport Environment Questionnaire (YSEQ) had been developed from Group Environment Questionnaire, a well-known measure of team cohesion. The aim of this study was to adapt and examine the reliability and validity of the Farsi version of the YSEQ. This version was completed by 455 athletes aged 13-17 years. Results of confirmatory factor analysis indicated that two-factor solution showed a good fit to the data. The results also revealed that the Farsi YSEQ showed high internal consistency, test-retest reliability, and good concurrent validity. This study indicated that the Farsi version of the YSEQ is a valid and reliable measure to assess team cohesion in sport setting.
The main objective of this study is to develop a valid and reliable scale for identifying the digital citizenship perceptions of young people in the age group that uses the internet most. The study was conducted as a survey study. The study group is composed of 438 people in Turkey aged 16-24, the age group with the highest rate of internet use in Turkey. An exploratory factor analysis was performed to determine the validity of the scale, and the item discrimination powers were calculated. The scale was found to have an 8-factor structure explaining 49.70% of the total variance. The internal consistency level was also calculated to determine the reliability of the scale. As a result, it can be said that this scale is a valid and reliable scale that can be used to determine the digital citizenship perceptions of young people.
Background: Insufficient participation in physical activity and excessive screen time have been observed among Chinese children. The role of social and environmental factors in shaping physical activity and sedentary behaviors among Chinese children is under-investigated. The purpose of the present study was to assess the reliability and validity of a questionnaire to measure child- and parent-reported psychosocial and environmental correlates of physical activity and screen-based behaviors among Chinese children in Hong Kong. Methods: A total of 303 schoolchildren aged 9-14 years and their parents volunteered to participate in this study and 160 of them completed the questionnaire twice within an interval of 10 days. Intraclass correlation coefficients (ICCs) were used to evaluate test-retest reliability of the continuous variables, and kappa statistics and percent agreement were used for the categorical variables. Exploratory factor analyses (EFAs) were conducted to assess convergent validity of the emergent scales. Cronbach's alpha and ICCs were used to assess internal and test-retest reliability of the emergent scales. Criterion validity was assessed by correlating psychosocial and environmental measures with self-reported physical activity and screen-based behaviors, measured by a validated questionnaire. Results: Reliability statistics for both child- and parent-reported continuous variables showed acceptable consistency, with all of the ICC values greater than 0.70. Kappa statistics showed fair to perfect test-retest reliability for the categorical items. Adequate internal consistency and test-retest reliability were observed in most of the emergent scales. Criterion validity assessed by correlating psychosocial and environmental measures with child-reported physical activity found associations with physical activity in the self-efficacy scale (r = 0.25, P r = 0.25, P r = 0.14, P r = -0.22, P r = 0.12, P = 0.053. Conclusions: The findings
Prowse, Ashleigh; Aslaksen, Berit; Kierkegaard, Marie; Furness, James; Gerdhem, Paul; Abbott, Allan
To investigate the reliability and concurrent validity of the Baseline® Body Level/Scoliosis meter for adolescent idiopathic scoliosis postural assessment in three anatomical planes. This is an observational reliability and concurrent validity study of adolescent referrals to the Orthopaedic department for scoliosis screening at Karolinska University Hospital, Stockholm, Sweden between March-May 2012. A total of 31 adolescents with idiopathic scoliosis (13.6 ± 0.6 years old) of mild-moderate curvatures (25° ± 12°) were consecutively recruited. Measurements of cervical, thoracic and lumbar curvatures, pelvic and shoulder tilt, and axial thoracic rotation (ATR) were performed by two trained physiotherapists in one day. The intraclass correlation coefficient (ICC) was used to determine the inter-examiner reliability (ICC2,1) and the intra-rater reliability (ICC3,3) of the Baseline® Body Level/Scoliosis meter. Spearman's correlation analyses were used to estimate concurrent validity between the Baseline® Body Level/Scoliosis meter and Gold Standard Cobb angles from radiographs and the Orthopaedic Systems Inc. Scoliometer. There was excellent reliability between examiners for thoracic kyphosis (ICC2,1 = 0.94), ATR (ICC2,1 = 0.92) and lumbar lordosis (ICC2,1 = 0.79). There was adequate reliability between examiners for cervical lordosis (ICC2,1 = 0.51), but poor reliability for pelvic and shoulder tilt. Both devices were reproducible in the measurement of ATR when repeated by one examiner (ICC3,3 = 0.98-1.00). The device had a good correlation with the Scoliometer (rho = 0.78). When compared with Cobb angle from radiographs, there was a moderate correlation for ATR (rho = 0.627). The Baseline® Body Level/Scoliosis meter provides reliable transverse and sagittal cervical, thoracic and lumbar measurements and valid transverse plane measurements of mild-moderate scoliosis deformity.
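The inter-examiner statistic named above, ICC(2,1), is the two-way random-effects, absolute-agreement, single-rater form in the Shrout and Fleiss notation. The sketch below shows how it can be computed from a subjects-by-raters table via two-way ANOVA mean squares; it is illustrative only, does not cover the ICC(3,3) form also reported in the abstract, and the function name `icc_2_1` and the example ratings are hypothetical rather than data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table of n subjects (rows) rated by k raters (columns).

    Uses the Shrout and Fleiss formula:
    (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    Assumes a complete table with no missing ratings.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical example: 6 subjects each rated by 4 raters.
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example_ratings)
```

Because ICC(2,1) measures absolute agreement, systematic offsets between raters (the column effect) lower the coefficient, which is why raters who rank subjects similarly can still show a modest value.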
Uno, Yota; Mizukami, Hitomi; Ando, Masahiko; Yukihiro, Ryoji; Iwasaki, Yoko; Ozaki, Norio
OBJECTIVE: The present study evaluated the reliability and concurrent validity of the new Tanaka B Intelligence Scale, which is an intelligence test that can be administered to groups within a short period of time. METHODS: The new Tanaka B Intelligence Scale and Wechsler Intelligence Scale for Children-Third Edition were administered to 81 subjects (mean age ± SD 15.2 ± 0.7 years) residing in a juvenile detention home; reliability was assessed using Cronbach's alpha coefficient, and concurre...
Abay, Halime; Kaplan, Sena
There are a limited number of menopause-specific quality-of-life scales for the Turkish population. This study was conducted to evaluate the validity and reliability of the Turkish Utian Quality-of-Life Scale in postmenopausal women. The study group comprised 250 postmenopausal women who applied to a training and research hospital's menopause clinic in Turkey. A survey form and the Turkish Utian Quality-of-Life Scale were used to collect data, and the Turkish version of Short Form-36 was used to evaluate reliability with an equivalent form. Language-validity, content-validity, and construct-validity methods were used to assess the validity of the scale, and Cronbach's α coefficient calculation and the equivalent-form reliability methods were used to assess the reliability of the scale. The Turkish Utian Quality-of-Life Scale was determined to be a valid and reliable instrument for measuring the quality of life of postmenopausal women. Confirmatory factor analysis demonstrated that the instrument fits well with 23 items and a four-factor model. The Cronbach's α coefficients for the quality-of-life domains were as follows: 0.88 overall, 0.79 health, 0.78 emotional, 0.76 sexual, and 0.75 occupational. Reliability of the instrument was confirmed through significant correlations between scores on the Turkish version of the Utian Quality-of-Life Scale and the Turkish version of the Short Form-36 (r = 0.745, P measuring quality of life during menopause.
Cruz, Jonas P; Baldacchino, Donia R; Alquwez, Nahed
Patients often resort to religious and spiritual activities to cope with physical and mental challenges. The effect of spiritual coping on overall health, adaptation and health-related quality of life among patients undergoing haemodialysis (HD) is well documented. Thus, it is essential to establish a valid and reliable instrument that can assess both the religious and non-religious coping methods in patients undergoing HD. This study aimed to assess the validity and reliability of the Spiritual Coping Strategies Scale Arabic version (SCS-A) in Saudi patients undergoing HD. A convenience sample of 60 Saudi patients undergoing HD was recruited for this descriptive, cross-sectional study. Data were collected between May and June 2015. Forward-backward translation was used to formulate the SCS-A. The SCS-A, Muslim Religiosity Scale and the Quality of Life Index Dialysis Version III were used to procure the data. Internal consistency reliability, stability reliability, factor analysis and construct validity tests were performed. Analyses were set at the 0.05 level of significance. The SCS-A showed an acceptable internal consistency and strong stability reliability over time. The EFA produced two factors (non-religious and religious coping). Satisfactory construct validity was established by the convergent and divergent validity and known-groups method. The SCS-A is a reliable and valid tool that can be used to measure the religious and non-religious coping strategies of patients undergoing HD in Saudi Arabia and other Muslim and Arabic-speaking countries. © 2016 European Dialysis and Transplant Nurses Association/European Renal Care Association.
Boonstra, Anne M; Schiphorst Preuper, Henrica R; Reneman, Michiel F; Posthumus, Jitze B; Stewart, Roy E
The objective of the study was to determine the reliability and concurrent validity of a visual analogue scale (VAS) for disability as a single-item instrument measuring disability in chronic pain patients. A test-retest design was used for the reliability study and a cross-sectional design for the validity study. The setting was a general rehabilitation centre and a university rehabilitation centre. The study population consisted of patients over 18 years of age suffering from chronic musculoskeletal pain: 52 patients in the reliability study and 344 patients in the validity study. The main outcome measures were as follows. Reliability study: Spearman's correlation coefficients (rho values) of the test and retest data of the VAS for disability; validity study: rho values of the VAS disability scores with the scores on four domains of the Short-Form Health Survey (SF-36) and VAS pain scores, and with Roland-Morris Disability Questionnaire scores in chronic low back pain patients. In the reliability study, rho values varied from 0.60 to 0.77. In the validity study, rho values of VAS disability scores with SF-36 domain scores varied from 0.16 to 0.51, with Roland-Morris Disability Questionnaire scores from 0.38 to 0.43, and with VAS pain scores from 0.76 to 0.84. The study concluded that the reliability of the VAS for disability is moderate to good. Because of a weak correlation with other disability instruments and a strong correlation with the VAS for pain, however, its validity is questionable.
Evenson Kelly R
Background: The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Methods: Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous day's travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. Results: A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n = 112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62 - 0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31 - 0.76). Conclusions: The in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school. Parent survey design should be improved so that responses clearly indicate
McDonald, Noreen C; Dwelley, Amanda E; Combs, Tabitha S; Evenson, Kelly R; Winters, Richard H
The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous days' travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n=112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62-0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31-0.76). The student in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school. Parent survey design should be improved so that responses clearly indicate issues that influence parental decision making in regards to their
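The travel-tally study above summarizes agreement between paired categorical reports with kappa statistics. As a hedged illustration, the sketch below computes unweighted Cohen's kappa for two lists of paired labels, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. Whether the study used this exact variant is an assumption, and the function name `cohens_kappa` and the travel-mode labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Assumes the two sources do not agree purely by construction
    (expected agreement must be below 1, otherwise the ratio is undefined).
    """
    n = len(labels_a)
    # Observed proportion of exact agreement between the paired labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the product of each source's marginal proportions.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical paired reports of travel mode (e.g. student recall vs. parent report).
student_reports = ["walk", "walk", "car", "car"]
parent_reports = ["walk", "car", "car", "car"]
kappa = cohens_kappa(student_reports, parent_reports)
```

In this toy case three of four pairs agree (p_o = 0.75) while chance agreement is 0.5, so raw percent agreement overstates how much the two sources agree beyond chance.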
Sahin, Fusun; Yilmaz, Figen; Ozmaden, Asli; Kotevolu, Nurdan; Sahin, Tulay; Kuran, Banu
The purpose of this study was to develop a Turkish version of the Berg Balance Scale (BBS) and assess its reliability and validity. Sixty healthy volunteers older than 65 years were included in the study. Subjects who had a lower extremity amputation, or were armchair-bound or bedridden, were excluded. After the translation process, the Turkish version of the scale was administered to each participant twice with an interval of 2 weeks. The intraclass correlation coefficient (ICC) was calculated to assess intra- and inter-observer reliability. Cronbach's alpha was calculated to evaluate internal consistency of the total BBS score. The intraclass correlation coefficient was calculated to examine test-retest reliability. Convergent validity was assessed by correlating the scale with the Modified Barthel Index (MBI) and Timed Up and Go Test (TUG). Construct validity was assessed with factor analysis. The mean age of the participants was 77.00 ± 5.67 years (range: 67-92 years). The ICC for intra- and inter-observer reliability was 0.98 (pr=0.67 pr=-0.75 p<0.0001, respectively). The Turkish version of the BBS is a reliable and valid scale to be used in balance assessment of Turkish older adults.
Sahin Cankurtaran, Eylem; Danişman, Mustafa; Tutar, Hasan; Ulusoy Kaymak, Semra
The Neuropsychiatric Inventory-Clinician (NPI-C) scale is one of the best-known scales for evaluating the behavioral and psychological symptoms of dementia. This study aimed to assess the reliability and validity of the Turkish version of the NPI-C scale in patients with Alzheimer disease (AD). The NPI-C scale was administered to 125 patients with AD. For reliability, both Cronbach's α and interrater reliability were analyzed. The Behavioral Pathology in Alzheimer's Disease (BEHAVE-AD) scale was applied for validity and, in addition, the Mini Mental State Examination (MMSE), Instrumental Activities of Daily Living (IADL) scale, and Disability Assessment of Dementia (DAD) scale were completed. The Turkish version of the NPI-C scale showed high internal consistency (Cronbach's α = 0.75) and mostly good interrater reliability. Assessments of validity showed that the NPI-C and corresponding BEHAVE-AD domains were found to be significantly correlated, between 0.925 and 0.195. Moreover, the correlations between NPI-C and MMSE were significant for all domains except the dysphoria, anxiety, and elation/euphoria domains. When we conducted a correlation analysis of NPI-C with IADL, all domains were statistically significantly correlated except aggression, anxiety, elation/euphoria, and dysphoria. The Turkish version of the NPI-C scale was found to be a reliable and valid instrument to assess neuropsychiatric symptoms in Turkish elderly subjects with AD.
This study aims to develop a valid and reliable instrument for measuring students' social studies achievement goals. The research was conducted with a study group of 374 middle school students studying in the central district of Diyarbakır in the fall semester of the 2014-2015 school year. Expert opinion was consulted regarding the scale's content and face validity. Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA) were performed to assess the scale's construct validity. The EFA yielded a 29-item, six-factor structure that explains 50.82% of the total variance. The emerging factors were named self-approach, task-approach, other-approach, task-avoidance, other-avoidance and self-avoidance, respectively. The CFA findings indicated that the 29-item, six-factor structure of the social studies achievement goal scale has acceptable goodness-of-fit indices. The scale's reliability coefficients were calculated by the internal consistency method and were found to be within admissible limits. The item correlations and the comparisons of the upper and lower 27% groups demonstrated that all of the items in the scale should remain. In light of these results, it can be argued that the scale is a reliable and valid instrument and can be used to measure students' social studies achievement goals.
The Gülhane Aphasia Test-2 (GAT-2) was developed to show the presence of a language disorder, ‘aphasia’, and to give the clinician indications of accompanying speech disorders such as apraxia and dysarthria. OBJECTIVE: The aim of the study was to report the standardization, validity and reliability study of GAT-2. METHODS: Ten healthy individuals were tested initially for the pilot study. A total of 134 healthy individuals were included in the standardization study, and 30 individuals with aphasia and 11 individuals with right brain injury were included in the validation study. Between-group differences in GAT-2 scores and the effects of age, years of education, and sex were examined. GAT-2 cut-off scores were calculated from the scores of healthy individuals. GAT-2 test-retest reliability and inter-observer reliability were calculated. RESULTS: Healthy individuals’ GAT-2 scores were significantly different from those of aphasic patients, but not from those of right brain injured patients. Healthy individuals’ GAT-2 scores were not affected by sex or age but were affected by years of education, so cut-off scores were calculated according to this variable. GAT-2 scores of aphasic patients were not affected by age, sex, or years of education. Test-retest reliability, inter-observer reliability, and internal consistency results showed that GAT-2 is a highly reliable aphasia screening test. CONCLUSION: GAT-2 was found to be a standardized, highly reliable, and valid aphasia test for Turkish stroke patients with aphasia.
Baker, Nancy A; Cook, James R; Redfern, Mark S
This paper describes the inter-rater and intra-rater reliability, and the concurrent validity of an observational instrument, the Keyboard Personal Computer Style instrument (K-PeCS), which assesses stereotypical postures and movements associated with computer keyboard use. Three trained raters independently rated the video clips of 45 computer keyboard users to ascertain inter-rater reliability, and then re-rated a sub-sample of 15 video clips to ascertain intra-rater reliability. Concurrent validity was assessed by comparing the ratings obtained using the K-PeCS to scores developed from a 3D motion analysis system. The overall K-PeCS had excellent reliability [inter-rater: intra-class correlation coefficients (ICC)=.90; intra-rater: ICC=.92]. Most individual items on the K-PeCS had from good to excellent reliability, although six items fell below ICC=.75. Those K-PeCS items that were assessed for concurrent validity compared favorably to the motion analysis data for all but two items. These results suggest that most items on the K-PeCS can be used to reliably document computer keyboarding style.
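As a rough illustration of how an intraclass correlation coefficient is obtained from a subjects-by-raters table, the sketch below computes the single-rater Shrout and Fleiss forms ICC(2,1) and ICC(3,1) from two-way ANOVA mean squares. The abstract does not state which ICC form was used for the K-PeCS, so the choice of forms, the function name `icc_single_rater`, and the ratings are assumptions made for illustration only.

```python
def icc_single_rater(scores):
    """Return (ICC(2,1), ICC(3,1)) for a table of n subjects x k raters."""
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, raters, and residual.
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_raters
    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # ICC(2,1): raters treated as random, absolute agreement.
    icc_2_1 = (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )
    # ICC(3,1): raters treated as fixed, consistency.
    icc_3_1 = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)
    return icc_2_1, icc_3_1


# Hypothetical ratings: six video clips scored by four raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_2_1, icc_3_1 = icc_single_rater(ratings)
print(round(icc_2_1, 2), round(icc_3_1, 2))
```

The two forms can differ considerably when raters differ systematically in their mean scores, because ICC(2,1) penalizes such offsets while ICC(3,1) does not.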
Zhao, M.; McDonald, A.; Dick, P.
The test rig for Validation and Reliability Testing of shutdown system software has been upgraded from the AECL Windows-based test rig previously used for CANDU6 stations. It includes a Virtual Trip Computer, which is a software simulation of the functional specification of the trip computer, and a real-time trip computer simulator in a separate chassis, which is used during the preparation of trip computer test cases before the actual trip computers are available. This allows preparation work for Validation and Reliability Testing to be performed in advance of delivery of actual trip computers to maintain a project schedule. (author)
Bayani, Ali Asghar
The internal consistency, test-retest reliability, and construct validity of the Farsi version of the Depression Anxiety Stress Scales were examined, with a sample of 306 undergraduate students (123 men, 183 women) ranging from 18 to 51 years of age (M age = 25.4, SD = 6.1). Participants completed the Satisfaction with Life Scale, Rosenberg Self-esteem Scale, and the Depression Anxiety Stress Scales. The findings confirmed the preliminary reliabilities and preliminary construct validity of the Farsi translation of the Depression Anxiety Stress Scales.
Jacobsen, Stine Lindahl
The paper will present a PhD study concerning the reliability and validity of the music therapy assessment model “Assessment of Parenting Competences” (APC) in the area of families with emotionally neglected children. This study had a multiple strategy design with a philosophical base of critical realism and pragmatism. The fixed design for this study was a between and within groups design for testing the APC's reliability and validity. The two different groups were parents with neglected children and parents with non-neglected children. The flexible design had a multiple case study strategy specifically...
Jingying Liu; Jipeng Yang; Yanhui Liu; Yang Yang; Hongfu Zhang
Purpose: To test the validity and reliability of a modified Career Growth Scale (CGS) to assess nurse career growth. Method: A cross-sectional design was used to analyze the use of the CGS to survey 600 full-time registered nurses from Grade A hospitals in Tianjin. Results: A modified scale we called Career Growth of Nurse Scale (CGNS) is acceptable, valid, and reliable for the evaluation of nurse career growth in Chinese hospitals. This scale measured three main factors (career goal, c...
Erkoc, Sultan Baliz; Isikli, Burhanettin; Metintas, Selma; Kalyoncu, Cemalettin
This study was conducted to develop a scale to measure knowledge about hypertension among Turkish adults. The Hypertension Knowledge-Level Scale (HK-LS) was generated based on content, face, and construct validity, internal consistency, test re-test reliability, and discriminative validity procedures. The final scale had 22 items with six sub-dimensions. The scale was applied to 457 individuals aged ≥18 years, and 414 of them were re-evaluated for test-retest reliability. The six sub-dimensio...
Iyigun, Gozde; Kirmizigil, Berkiye; Angin, Ender; Oksuz, Sevim; Can, Filiz; Eker, Levent; Rose, Debra J
The aim of this study was to evaluate the reliability and validity of the Turkish version of the FAB (FAB-T) scale in older Turkish adults. The reliability and validity of the scale were tested on 200 community-dwelling older adults. The FAB-T scale was scored by different physiotherapists on different days to evaluate inter-rater and intra-rater reliability. The Berg Balance Scale (BBS) was used for the evaluation of convergent validity, and the content validity of the FAB-T scale was investigated. The FAB-T scale showed very high inter- and intra-rater reliability. For inter-rater agreement, ICC values for the individual test items and the total score were 0.92 (95% CI: 0.90-0.94) and 0.96 (95% CI: 0.95-0.97), respectively. For intra-rater agreement, ICC values for the individual test items and the total score were 0.93 (95% CI: 0.91-0.95) and 0.96 (95% CI: 0.95-0.97), respectively. There was good agreement between the FAB-T and BBS scales. A high correlation was found between the BBS and FAB-T scales [rho = 0.70 (95% CI: 0.62-0.76)], indicating good convergent validity. Considering the content validity of the FAB-T scale, no floor (floor score: 0%) or ceiling (ceiling score: 6.5%) effect was detected. The FAB-T scale was successfully translated from the original English version (FAB) and demonstrated strong psychometric features. It was found that the FAB-T scale has very high inter-rater and intra-rater reliability. Considering the convergent validity, the scale has a high correlation with the BBS. The FAB-T has no floor or ceiling effect. Copyright © 2018 Elsevier B.V. All rights reserved.
Apolzan, John W; Myers, Candice A; Cowley, Amanda D; Brady, Heather; Hsia, Daniel S; Stewart, Tiffany M; Redman, Leanne M; Martin, Corby K
Mindfulness is theorized to affect the eating behavior and weight of pregnant women, yet no measure has been validated during pregnancy. This study qualitatively and quantitatively evaluated the reliability and validity of the Mindful Eating Questionnaire (MEQ) in overweight and obese pregnant women. Participants completed focus groups and cognitive interviews. The MEQ was administered twice to measure test-retest reliability. The Eating Inventory (EI) and Mindful Attention Awareness Scale (MAAS) were administered to assess convergent validity, and the Neighborhood Environment Walkability Scale (NEWS) assessed discriminant validity. Participants were 20 ± 8 weeks gestation (mean ± SD), 30 ± 2 years old, and 55% were obese. The MEQ total score had good test-retest reliability (r = .85). The total score internal consistency reliability was poor (Cronbach's α = .56). The external cues subscale (ECS) was not internally consistent (α = .31). Other subscales ranged from α = .59-.68. When the ECS was excluded, the MEQ total score internal consistency was acceptable (α = .62). Convergent validity was supported by the MEQ total score (with and without ECS) correlating significantly with the MAAS and the EI disinhibition and hunger subscales. Discriminant validity of the MEQ was supported by the MEQ and NEWS total scores and subscales not being significantly correlated. The quantitative results were supported by the qualitative context and content analysis. With the exception of the ECS, the MEQ's reliability and validity was supported in pregnant women, and most of the subscales were more robust in pregnant women than in the original sample of healthy adults. The MEQ's use with overweight and obese pregnant women is supported. Copyright © 2016 Elsevier Ltd. All rights reserved.
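For readers unfamiliar with the internal consistency figures reported above, the following is a minimal sketch of Cronbach's alpha computed from a respondents-by-items table. The function name `cronbach_alpha` and the response data are hypothetical; they are not MEQ data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per respondent."""
    n_items = len(responses[0])
    if len(responses) < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item (column) and of the summed total score.
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a four-item subscale.
responses = [
    [4, 3, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 4, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is consistent with the total-score alpha improving once a poorly fitting subscale is excluded.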
The objective of this study was to examine the content validity of commonly used muscle performance tests in military personnel and to investigate the reliability of a proposed test battery. For the content validity investigation, the thirty selected tests were those described in the literature and/or commonly used in the Nordic and North Atlantic Treaty Organization (NATO) countries. Nine selected experts rated, on a four-point Likert scale, the relevance of these tests in relation to five different work tasks: lifting, carrying equipment on the body or in the hands, climbing, and digging. Thereafter, a content validity index (CVI) was calculated for each work task. The results showed excellent CVI (≥0.78) for sixteen tests, which covered one or more of the military work tasks. Three of the tests, the functional lower-limb loading test (the Ranger test), dead-lift with kettlebells, and back extension, showed excellent content validity for four of the work tasks. For the development of a new muscle strength/endurance test battery, these three tests were supplemented with two other tests, namely the chins and side-bridge tests. The inter-rater reliability was high (intraclass correlation coefficient, ICC2,1 0.99) for all five tests. The intra-rater reliability was good to high (ICC3,1 0.82-0.96) with an acceptable standard error of mean (SEM), except for the side-bridge test (SEM% > 15). Thus, the final suggested test battery for a valid and reliable evaluation of soldiers' muscle performance comprised the following four tests: the Ranger test, dead-lift with kettlebells, chins, and back extension test. The criterion-related validity of the test battery should be further evaluated for soldiers exposed to varying physical workload.
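A content validity index of the kind described above is commonly computed as the proportion of experts who rate an item as relevant. The sketch below assumes that ratings of 3 or 4 on the four-point scale count as relevant, which the abstract does not spell out; the function name `content_validity_index` and the ratings are hypothetical, while the 0.78 cut-off is the one reported in the abstract.

```python
EXCELLENT_CVI = 0.78  # cut-off for "excellent" reported in the abstract


def content_validity_index(ratings, relevant_from=3):
    """Share of experts whose rating is at least `relevant_from` (assumed rule)."""
    if not ratings:
        raise ValueError("need at least one expert rating")
    return sum(rating >= relevant_from for rating in ratings) / len(ratings)


# Hypothetical ratings by nine experts for one test and one work task.
expert_ratings = [4, 4, 3, 4, 3, 4, 2, 4, 3]
cvi = content_validity_index(expert_ratings)
is_excellent = cvi >= EXCELLENT_CVI
print(round(cvi, 2), is_excellent)
```

With nine experts, this rule means at least eight must rate a test as relevant to reach 0.78, since seven of nine gives roughly 0.778.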
Isabelle Ottenvall Hammar
In research and healthcare it is important to measure older persons’ self-determination in order to improve their possibilities to decide for themselves in daily life. The questionnaire Impact on Participation and Autonomy (IPA) assesses self-determination, but is not constructed for older persons. The aim of this study was to examine the validity and reliability of the IPA-S questionnaire for persons aged 70 years and older. The study was performed in two steps: first a validity test of the Swedish version of the questionnaire, IPA-S, followed by a test-retest reliability study of an adjusted version. The validity was tested with focus groups and individual interviews with persons aged 77-88 years, and the reliability with persons aged 70-99 years. The validity test showed that IPA-S is valid for older persons, but it was too extensive and the phrasing of the items needed adjustments. The test-retest reliability of the adjusted questionnaire, IPA-Older persons (IPA-O), showed that 15 of 22 items had high agreement. IPA-O can be used to measure older persons’ self-determination in their care and rehabilitation.
Danielle Fabiana Cucolo
Objectives: to verify the reliability and construct validity estimates of the "Assessment of nursing care product" scale (APROCENF) and its applicability. Methods: this validation study included a sample of 40 (inter-rater reliability) and 172 (construct validity) assessments performed by nurses at the end of the work shift at nine inpatient services of a teaching hospital in the Brazilian Southeast. The data were collected between February and September 2014, with interruptions. Cronbach's alpha and Spearman's correlation coefficients were calculated, as well as the intraclass correlation and the weighted kappa index (inter-rater reliability). Exploratory factor analysis was used with principal component extraction and varimax rotation (construct validity). Results: the internal consistency revealed an alpha coefficient of 0.85, item-item correlation ranging between 0.13 and 0.61 and item-total correlation between 0.43 and 0.69. Inter-rater equivalence was obtained and all items evidenced significant factor loadings. Conclusion: this research evidenced the reliability and construct validity of the scale to assess the nursing care product. Its application in nursing practice permits identifying improvements needed in the production process, contributing to management and care decisions.
Gadbury-Amyot, Cynthia C.
This study examined the validity and reliability of portfolio assessment using Messick's (1996, 1995) unified framework of construct validity. Theoretical and empirical evidence was sought for six aspects of construct validity. The sample included twenty student portfolios. Each portfolio was evaluated by seven faculty raters using a primary trait analysis scoring rubric. There was a significant relationship (r = .81--.95; p Dental Hygiene Board Examination (r = .60; p Dental Testing Service examination was both weak and nonsignificant (r = .19; p > .05). An open-ended survey was used to elicit student feedback on portfolio development. A majority of the students (76%) perceived value in the development of programmatic portfolios. In conclusion, the pattern of findings from this study suggests that portfolios can serve as a valid and reliable measure for assessing student competency.
Bornstein, P H; Hamilton, S B; Miller, R K; Quevillon, R P; Spitzform, M
This study investigated the effects of reliability and validity "enhancers" on fidelity of self-report data in an analogue therapy situation. Under the guise of a Concentration Skills Training Program, 57 Ss were assigned randomly to one of the following conditions: (a) Reliability Enhancement; (b) Truth Talk; (c) No Comment Control. Results indicated significant differences among groups (p less than .05). In addition, tests of multiple comparisons revealed that Reliability Enhancement was significantly different from Truth Talk in occurrences of unreliability (p less than .05). These findings are discussed in light of the increased reliance on self-report data in behavioral intervention, and recommendations are made for future research.
Background: In recent years, there has been increasing development of physical activity trackers (Wearables). For recreational users, testing of these devices under walking or light jogging conditions might be sufficient. For (elite) athletes, however, scientific trustworthiness needs to be given for a broad spectrum of velocities or even fast changes in velocities reflecting the demands of the sport. Therefore, the aim was to evaluate the validity of eleven Wearables for monitoring step count, covered distance and energy expenditure (EE) under laboratory conditions with different constant and varying velocities. Methods: Twenty healthy sport students (10 men, 10 women) performed a running protocol consisting of four 5 min stages of different constant velocities (4.3; 7.2; 10.1; 13.0 km·h−1), a 5 min period of intermittent velocity, and a 2.4 km outdoor run (10.1 km·h−1) while wearing eleven different Wearables (Bodymedia Sensewear, Beurer AS 80, Polar Loop, Garmin Vivofit, Garmin Vivosmart, Garmin Vivoactive, Garmin Forerunner 920XT, Fitbit Charge, Fitbit Charge HR, Xiaomi MiBand, Withings Pulse Ox). Step count, covered distance, and EE were evaluated by comparing each Wearable with a criterion method (Optogait system and manual counting for step count, treadmill for covered distance and indirect calorimetry for EE). Results: All Wearables, except Bodymedia Sensewear, Polar Loop, and Beurer AS80, revealed good validity (small MAPE, good ICC) for all constant and varying velocities for monitoring step count. For covered distance, all Wearables showed a very low ICC (<0.1) and high MAPE (up to 50%), revealing no good validity. The measurement of EE was acceptable for the Garmin, Fitbit and Withings Wearables (small to moderate MAPE), while Bodymedia Sensewear, Polar Loop, and Beurer AS80 showed a high MAPE up to 56% for all test conditions. Conclusion: In our study, most Wearables provide an acceptable level of validity for step counts at different
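The MAPE values quoted above refer to the mean absolute percentage error of a device against the criterion measurement. A minimal sketch follows, assuming the usual definition with the criterion value in the denominator; the function name and the step counts are hypothetical.

```python
def mean_absolute_percentage_error(device_values, criterion_values):
    """MAPE in percent of device readings relative to criterion readings."""
    if len(device_values) != len(criterion_values) or not device_values:
        raise ValueError("need two non-empty sequences of equal length")
    if any(criterion == 0 for criterion in criterion_values):
        raise ValueError("criterion values must be non-zero")
    relative_errors = [
        abs(device - criterion) / abs(criterion)
        for device, criterion in zip(device_values, criterion_values)
    ]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical step counts for four 5 min stages: wearable vs. criterion count.
device_steps = [480, 612, 701, 790]
criterion_steps = [500, 600, 700, 800]
mape = mean_absolute_percentage_error(device_steps, criterion_steps)
print(round(mape, 2))
```

Because the errors are taken as absolute values, over- and under-counting do not cancel out, so a device can show a small mean bias and still have a large MAPE.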
Jørgensen, René; Ris Hansen, Inge; Falla, Deborah
BACKGROUND: The reliability of clinical tests for the cervical spine has not been adequately evaluated. Six cervical clinical tests, which are low cost and easy to perform in clinical settings, were tested for intra- and inter-examiner reliability, and two performance tests were assessed for test-retest reliability in people with and without chronic neck pain. Moreover, construct and between-group discriminative validity of the tests were examined. METHODS: Twenty-one participants with chronic neck pain and 21 asymptomatic participants were included. Intra- and inter-reliability were evaluated for the Cranio-Cervical Flexion Test (CCFT), Range of Movement (ROM), Joint Position Error (JPE), Gaze Stability (GS), Smooth Pursuit Neck Torsion Test (SPNTT), and neuromuscular control of the Deep Cervical Extensors (DCE). Test-retest reliability was assessed for Postural Control (SWAY) and Pressure Pain Threshold (PPT) over...
The purpose of this study is to adapt into Turkish the Writing Attitude Scale (WAS), developed by Marcia et al. (1984) to measure writing anxiety. To this end, the scale was validated and its reliability and validity were examined. The WAS was first translated into Turkish, and the equivalence of the English and Turkish forms of the scale was analyzed through readings by three English teachers/lecturers. The...
Nualnong Wongtongkam; Paul Russell Ward; Andrew Day; Anthony Harold Winefield
In Thailand physical violence among male adolescents is considered a significant public health issue, although there has been little published research into the aetiology and functions of violence in Thai youth. Research in this area has been hampered by a lack of psychometrically sound tools that have been validated to assess problem behaviours in Asian youth. The purpose of this paper is to provide validity and reliability data on an instrument to measure violence in Thai youth. In this stu...
Burcu Ersöz Hüseyinsinoğlu
OBJECTIVE: The aim of this study was to adapt the Motor Activity Log-28 (MAL-28) into Turkish and probe the reliability and validity of this questionnaire in stroke patients. METHODS: Following the translation of the MAL-28 into Turkish, its reliability and construct validity were examined in 30 stroke patients. For the reliability study, patients were interviewed twice within a three-day period, during which no rehabilitative activities were undertaken. The test-retest reliability was determined by using the intra-class correlation coefficient (ICC) and Spearman correlation coefficient (r); internal consistency was determined by Cronbach's alpha (α). The construct validity was examined by comparing the MAL-28 Quality Of Movement (QOM) scale and Amount Of Use (AOU) scale with Wolf Motor Function Test (WMFT) Performance Time (PT) and Functional Ability (FA) scores. Furthermore, item-to-scale correlations of the AOU and QOM scales were determined and the correlation between the total scores of the two scales was examined. RESULTS: The Turkish versions of the MAL-28 AOU and QOM scales were reliable (ICC scores were 0.97 and 0.96, respectively) and internally consistent (Cronbach’s α value was 0.96 for both scales). Test-retest reliability was supported (AOU, r=0.94; QOM, r=0.93). WMFT FA scores were correlated with both scales (r=0.63). Correlations between WMFT PT and the AOU and QOM scales were -0.56 and -0.55, respectively. The AOU and QOM scales were highly correlated (r=0.95). CONCLUSION: The findings indicate that the Turkish version of the MAL-28 is reliable and valid in individuals with stroke. Further investigation of its responsiveness is needed before using this version as a primary measurement in clinical trials.
Murray, Nicholas; Salvatore, Anthony; Powell, Douglas; Reed-Jones, Rebecca
An estimated 300 000 sport-related concussion injuries occur in the United States annually. Approximately 30% of individuals with concussions experience balance disturbances. Common methods of balance assessment include the Clinical Test of Sensory Organization and Balance (CTSIB), the Sensory Organization Test (SOT), the Balance Error Scoring System (BESS), and the Romberg test; however, the National Collegiate Athletic Association recommended the Wii Fit as an alternative measure of balance in athletes with a concussion. A central concern regarding the implementation of the Wii Fit is whether it is reliable and valid for measuring balance disturbance in athletes with concussion. To examine the reliability and validity evidence for the CTSIB, SOT, BESS, Romberg test, and Wii Fit for detecting balance disturbance in athletes with a concussion. Literature considered for review included publications with reliability and validity data for the assessments of balance (CTSIB, SOT, BESS, Romberg test, and Wii Fit) from PubMed, PsycINFO, and CINAHL. We identified 63 relevant articles for consideration in the review. Of the 63 articles, 28 were considered appropriate for inclusion and 35 were excluded. No current reliability or validity information supports the use of the CTSIB, SOT, Romberg test, or Wii Fit for balance assessment in athletes with a concussion. The BESS demonstrated moderate to high reliability (interclass correlation coefficient = 0.87) and low to moderate validity (sensitivity = 34%, specificity = 87%). However, the Romberg test and Wii Fit have been shown to be reliable tools in the assessment of balance in Parkinson patients. The BESS can evaluate balance problems after a concussion. However, it lacks the ability to detect balance problems after the third day of recovery. Further investigation is needed to establish the use of the CTSIB, SOT, Romberg test, and Wii Fit for assessing balance in athletes with concussions.
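The sensitivity and specificity figures quoted for the BESS are proportions taken from a two-by-two classification table. The sketch below shows the arithmetic; the function name is hypothetical, and the counts are invented so that they reproduce the reported 34% and 87%. They are not the underlying study data.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-table counts."""
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one affected and one unaffected case")
    # Sensitivity: share of affected cases the test flags.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of unaffected cases the test clears.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Invented counts for 100 concussed and 100 non-concussed athletes.
sens, spec = sensitivity_specificity(true_pos=34, false_neg=66, true_neg=87, false_pos=13)
print(sens, spec)
```

A sensitivity of 34% means roughly two out of three athletes with a balance disturbance would be missed, which helps explain why the review rates validity as low to moderate despite the high specificity.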
Mills, Tamara L; Holm, Margo B; Schmeler, Mark
The purpose of this study was to establish the test-retest reliability and content validity of an outcomes tool designed to measure the effectiveness of seating-mobility interventions on the functional performance of individuals who use wheelchairs or scooters as their primary seating-mobility device. The instrument, Functioning Everyday With a Wheelchair (FEW), is a questionnaire designed to measure perceived user function related to wheelchair/scooter use. Using consumer-generated items, FEW Beta Version 1.0 was developed and test-retest reliability was established. Cross-validation of FEW Beta Version 1.0 was then carried out with five samples of seating-mobility users to establish content validity. Based on the content validity study, FEW Version 2.0 was developed and administered to seating-mobility consumers to examine its test-retest reliability. FEW Beta Version 1.0 yielded an intraclass correlation coefficient (ICC) Model (3,k) of .92, p content validity results revealed that FEW Beta Version 1.0 captured 55% of seating-mobility goals reported by consumers across five samples. FEW Version 2.0 yielded ICC(3,k) = .86, p content validity of FEW Version 2.0 was confirmed. FEW Beta Version 1.0 and FEW Version 2.0 were highly stable in their measurement of participants' seating-mobility goals over a 1-week interval.
This study aimed to translate the MIDAS questionnaire from English into Persian and determine its content validity and reliability. MIDAS was translated and validated on a sample (N = 110) of the Iranian adult population. The participants were both male and female, with an age range of 17-57. They were at different educational levels and from different ethnic groups in Iran. A translating team of five members, bilingual in English and Persian and familiar with multiple intelligences (MI) theory and practice, was involved in translating and determining content validity, which included the processes of forward translation, back-translation, review, final proofreading, and testing. The statistical analyses of inter-scale correlation were performed using the Cronbach's alpha coefficient. In an intra-class correlation, the Cronbach's alpha was high for all of the questions. Translation and content validation of the MIDAS questionnaire were completed by a proper process, leading to high reliability and validity. The results suggest that the Persian MIDAS (P-MIDAS) could serve as a valid and reliable instrument for measuring Iranian adults' MIs.
Valente, Ana Rita S; Hall, Andreia; Alvelos, Helena; Leahy, Margaret; Jesus, Luis M T
The appropriate use of language in context depends on the speaker's pragmatic language competencies. A coding system was used to develop a specific and adult-focused self-administered questionnaire to adults who stutter and adults who do not stutter, The Assessment of Language Use in Social Contexts for Adults, with three categories: precursors, basic exchanges, and extended literal/non-literal discourse. This paper presents the content validity, item analysis, reliability coefficients and evidences of construct validity of the instrument. Content validity analysis was based on a two-stage process: first, 11 pragmatic questionnaires were assessed to identify items that probe each pragmatic competency and to create the first version of the instrument; second, items were assessed qualitatively by an expert panel composed by adults who stutter and controls, and quantitatively and qualitatively by an expert panel composed by clinicians. A pilot study was conducted with five adults who stutter and five controls to analyse items and calculate reliability. Construct validity evidences were obtained using the hypothesized relationships method and factor analysis with 28 adults who stutter and 28 controls. Concerning content validity, the questionnaires assessed up to 13 pragmatic competencies. Qualitative and quantitative analysis revealed ambiguities in items construction. Disagreement between experts was solved through item modification. The pilot study showed that the instrument presented internal consistency and temporal stability. Significant differences between adults who stutter and controls and different response profiles revealed the instrument's underlying construct. The instrument is reliable and presented evidences of construct validity.
Guspatni, G.; Kurniawati, Y.
The aim of this paper is to examine the validity and reliability of a questionnaire used to evaluate e-learning implementation in chemistry instruction. Forty-eight questionnaires were filled in by students who had studied chemistry through an e-learning system. The questionnaire consisted of 20 indicators evaluating students’ perceptions of using e-learning. Parametric testing was done, as the data were assumed to follow a normal distribution. Item validity of the questionnaire was examined through item-total correlation using Pearson’s formula, while its reliability was assessed with Cronbach’s alpha formula. Moreover, convergent validity was assessed to see whether the indicators building a factor theoretically shared the same underlying construct. The validity testing revealed 19 valid indicators, while the reliability testing yielded a Cronbach’s alpha value of .886. The factor analysis showed that the questionnaire consisted of five factors, each of which had indicators building the same construct. This article shows the importance of factor analysis in obtaining a construct-valid questionnaire before it is used as a research instrument.
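The two statistics named in this abstract, item-total correlation with Pearson's formula and Cronbach's alpha, can be illustrated with a minimal Python sketch. It is not the authors' analysis: the function names and the small `demo` response matrix are hypothetical, and the item-total correlation is computed in its uncorrected form (each item against the total that includes it), which is an assumption because the abstract does not say which variant was used.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # sample variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def item_total_correlations(scores):
    """Uncorrected Pearson correlation of each item with the total score."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    return np.array([np.corrcoef(scores[:, j], total)[0, 1]
                     for j in range(scores.shape[1])])

# Hypothetical data: 5 respondents answering 3 Likert-type items.
demo = [[4, 5, 4],
        [3, 3, 2],
        [5, 5, 5],
        [2, 3, 2],
        [4, 4, 5]]

alpha = cronbach_alpha(demo)
r_item_total = item_total_correlations(demo)
print(round(alpha, 3), np.round(r_item_total, 3))
```

In practice an item is usually flagged as weak when its item-total correlation falls below a chosen critical value; that threshold is a study-specific choice and is not reported in the abstract.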
Moriguchi, Eri; Ito, Mikiko; Nagai, Toshisaburo
A Japanese version of the Quality of Life in Childhood Epilepsy Questionnaire (QOLCE-J) was developed using international guidelines as a QOL scale for childhood epilepsy; its reliability and validity were examined, focusing on its applicability to Japanese pediatric epilepsy patients. A pilot questionnaire survey was conducted, involving parents of pediatric epilepsy patients aged 4-15 undergoing outpatient treatment. In total, 278 responses were obtained and analyzed. Internal consistency for the 16 QOLCE-J subscales, except for , was sufficient, and a high overall coefficient α was obtained. The intraclass correlation coefficient was also high, supporting the test-retest reliability of this version. Regarding associations among the subscales, high correlations of r>0.7 were observed among , , and , representing cognitive and behavioral aspects, and among these and . In contrast, correlations among the others were moderate or weaker. Furthermore, correlations of r>0.35 were observed between the subscales of the SDQ (Strengths and Difficulties Questionnaire), used as an external criterion, and the QOLCE-J, confirming the criterion validity of the study version. Analysis of associations between the total QOLCE-J score and the pathology of epilepsy found significant correlations with age of onset, frequency of seizures, ADL, and symptoms of antiepileptic side effects. The QOLCE has mostly been used in treatment-resistant pediatric patients; the influence of interictal factors observed here, such as symptoms of antiepileptic side effects, suggests its usefulness for pediatric patients whose seizures are under control. The QOLCE-J, with sufficient reliability and validity, may be applicable as a QOL scale for Japanese children with epilepsy. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.
Tokudome, Yuko; Okumura, Keiko; Kumagai, Yoshiko; Hirano, Hirohiko; Kim, Hunkyung; Morishita, Shiho; Watanabe, Yutaka
Because few Japanese questionnaires assess the elderly's appetite, there is an urgent need to develop an appetite questionnaire with verified reliability, validity, and reproducibility. We translated and back-translated the Council on Nutrition Appetite Questionnaire (CNAQ), which has eight items, into Japanese (CNAQ-J), as well as the Simplified Nutritional Appetite Questionnaire (SNAQ-J), which includes four CNAQ-J-derived items. Using structural equation modeling, we examined the CNAQ-J structure based on data of 649 Japanese elderly people in 2013, including individuals having a certain degree of cognitive impairment, and we developed the SNAQ for the Japanese elderly (SNAQ-JE) according to an exploratory factor analysis. Confirmatory factor analyses on the appetite questionnaires were conducted to probe fitting to the model. We computed Cronbach's α coefficients and criterion-referenced/-related validity figures examining associations of the three appetite battery scores with body mass index (BMI) values and with nutrition-related questionnaire values. Test-retest reproducibility of appetite tools was scrutinized over an approximately 2-week interval. An exploratory factor analysis demonstrated that the CNAQ-J was constructed of one factor (appetite), yielding the SNAQ-JE, which includes four questions derived from the CNAQ-J. The three appetite instruments showed almost equivalent fitting to the model and reproducibility. The CNAQ-J and SNAQ-JE demonstrated satisfactory reliability and significant criterion-referenced/-related validity values, including BMIs, but the SNAQ-J included a low factor-loading item, exhibited less satisfactory reliability and had a non-significant relationship to BMI. The CNAQ-J and SNAQ-JE may be applied to assess the appetite of Japanese elderly, including persons with some cognitive impairment. Copyright © 2017 The Authors. Production and hosting by Elsevier B.V. All rights reserved.
...] Food and Drug Administration/American Glaucoma Society Workshop on the Validity, Reliability, and... entitled ``FDA/American Glaucoma Society (AGS) Workshop on the Validity,