single imputation methods: Topics by WorldWideScience.org

Sample records for single imputation methods

Missing data imputation: focusing on single imputation.

Science.gov (United States)

Zhang, Zhongheng

2016-01-01

Complete case analysis is widely used for handling missing data, and it is the default method in many statistical packages. However, this method may introduce bias, and some useful information will be omitted from the analysis. Therefore, many imputation methods have been developed to close this gap. The present article focuses on single imputation. Imputations with mean, median and mode are simple but, like complete case analysis, can introduce bias in the mean and deviation. Furthermore, they ignore relationships with other variables. Regression imputation can preserve the relationship between missing values and other variables. Many sophisticated methods exist to handle missing values in longitudinal data. This article focuses primarily on how to implement R code to perform single imputation, while avoiding complex mathematical calculations.
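
The contrast between mean imputation and regression imputation described in this abstract can be sketched in a few lines. The article itself works in R; the Python below is only an illustrative sketch with made-up data, and the function names mean_impute and regression_impute are hypothetical rather than taken from the article.

```python
import statistics

def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]

def regression_impute(x, y):
    """Replace each None in y with the prediction of a least-squares
    line fitted on the complete (x, y) pairs."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mx = statistics.mean(p[0] for p in pairs)
    my = statistics.mean(p[1] for p in pairs)
    slope = (sum((a - mx) * (b - my) for a, b in pairs)
             / sum((a - mx) ** 2 for a, _ in pairs))
    intercept = my - slope * mx
    return [intercept + slope * xi if yi is None else yi
            for xi, yi in zip(x, y)]

# Illustrative data (not from the article): y rises with x, two values missing.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, None, 6.2, 7.9, None, 12.1]

y_mean = mean_impute(y)
y_reg = regression_impute(x, y)

# Mean imputation shrinks the spread; regression imputation follows the trend.
observed_sd = statistics.stdev([v for v in y if v is not None])
print("observed sd:", round(observed_sd, 3))
print("sd after mean imputation:", round(statistics.stdev(y_mean), 3))
print("regression-imputed values:", round(y_reg[1], 3), round(y_reg[4], 3))
```

In this toy example the mean-imputed series has a smaller standard deviation than the observed values, while the regression-imputed values keep the ordering implied by x.
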
R package imputeTestbench to compare imputations methods for univariate time series

OpenAIRE

Bokde, Neeraj; Kulat, Kishore; Beck, Marcus W; Asencio-Cortés, Gualberto

2016-01-01

This paper describes the R package imputeTestbench that provides a testbench for comparing imputation methods for missing data in univariate time series. The imputeTestbench package can be used to simulate the amount and type of missing data in a complete dataset and compare filled data using different imputation methods. The user has the option to simulate missing data by removing observations completely at random or in blocks of different sizes. Several default imputation methods are includ...
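
The testbench idea described above (remove values from a complete series either completely at random or in a block, fill them, and score each method) can be sketched as follows. This is not the imputeTestbench API; it is a plain Python illustration, and every function name, the sine-wave series, and the choice of RMSE as the score are assumptions made for the example.

```python
import math
import random

def remove_mcar(series, fraction, rng):
    """Set a random fraction of interior points to None (completely at random)."""
    n_missing = int(round(fraction * len(series)))
    idx = set(rng.sample(range(1, len(series) - 1), n_missing))
    return [None if i in idx else v for i, v in enumerate(series)]

def remove_block(series, block_size, rng):
    """Set one contiguous interior block of points to None."""
    start = rng.randrange(1, len(series) - block_size)
    return [None if start <= i < start + block_size else v
            for i, v in enumerate(series)]

def fill_mean(series):
    """Fill gaps with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    m = sum(observed) / len(observed)
    return [m if v is None else v for v in series]

def fill_linear(series):
    """Fill gaps by linear interpolation (first and last points must be observed)."""
    filled = list(series)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            left, right = i - 1, i
            while filled[right] is None:
                right += 1
            for j in range(i, right):
                w = (j - left) / (right - left)
                filled[j] = filled[left] + w * (filled[right] - filled[left])
            i = right
        else:
            i += 1
    return filled

def rmse(truth, filled, masked):
    """Root mean squared error over the positions that were removed."""
    errs = [(t - f) ** 2 for t, f, m in zip(truth, filled, masked) if m is None]
    return math.sqrt(sum(errs) / len(errs))

rng = random.Random(0)
truth = [math.sin(2 * math.pi * t / 24) for t in range(120)]  # toy daily cycle
scenarios = [("mcar", remove_mcar(truth, 0.1, rng)),
             ("block", remove_block(truth, 10, rng))]
scores = {}
for name, masked in scenarios:
    scores[name] = {"mean": rmse(truth, fill_mean(masked), masked),
                    "linear": rmse(truth, fill_linear(masked), masked)}
print(scores)
```

Keeping the first and last points observed is a simplification that avoids extrapolation in fill_linear; a fuller testbench would also repeat each scenario many times and average the scores.
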
Dealing with missing data in a multi-question depression scale: a comparison of imputation methods

Directory of Open Access Journals (Sweden)

Stuart Heather

2006-12-01

Abstract. Background: Missing data present a challenge to many research projects. The problem is often pronounced in studies utilizing self-report scales, and literature addressing different strategies for dealing with missing data in such circumstances is scarce. The objective of this study was to compare six different imputation techniques for dealing with missing data in the Zung Self-reported Depression scale (SDS). Methods: 1580 participants from a surgical outcomes study completed the SDS. The SDS is a 20 question scale that respondents complete by circling a value of 1 to 4 for each question. The sum of the responses is calculated and respondents are classified as exhibiting depressive symptoms when their total score is over 40. Missing values were simulated by randomly selecting questions whose values were then deleted (a missing completely at random simulation). Additionally, a missing at random and missing not at random simulation were completed. Six imputation methods were then considered: (1) multiple imputation, (2) single regression, (3) individual mean, (4) overall mean, (5) participant's preceding response, and (6) random selection of a value from 1 to 4. For each method, the imputed mean SDS score and standard deviation were compared to the population statistics. The Spearman correlation coefficient, percent misclassified and the Kappa statistic were also calculated. Results: When 10% of values are missing, all the imputation methods except random selection produce Kappa statistics greater than 0.80, indicating 'near perfect' agreement. MI produces the most valid imputed values with a high Kappa statistic (0.89), although both single regression and individual mean imputation also produced favorable results. As the percent of missing information increased to 30%, or when unbalanced missing data were introduced, MI maintained a high Kappa statistic. The individual mean and single regression method produced Kappas in the 'substantial agreement' range
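
As a small illustration of the "individual mean" method and the scoring rule described in this abstract (20 items scored 1 to 4, depressive symptoms when the total exceeds 40), the sketch below fills a respondent's missing items with that respondent's own mean. The response vector is invented for the example and the function names are hypothetical; this is not the study's code.

```python
def individual_mean_impute(responses):
    """Fill missing items (None) with the respondent's mean over answered items."""
    answered = [r for r in responses if r is not None]
    person_mean = sum(answered) / len(answered)
    return [person_mean if r is None else r for r in responses]

def has_depressive_symptoms(responses, cutoff=40):
    """Classify using the rule from the abstract: total score over 40."""
    return sum(responses) > cutoff

# Invented respondent: 18 answered items summing to 38, plus 2 missing items.
respondent = [2] * 16 + [3, 3] + [None, None]

filled = individual_mean_impute(respondent)
total = sum(filled)
print("imputed total:", round(total, 2))
print("classified as symptomatic:", has_depressive_symptoms(filled))
```

Summing only the 18 answered items would give 38 and a negative classification, which shows why the handling of missing items can change how a respondent is classified.
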
The multiple imputation method: a case study involving secondary data analysis.

Science.gov (United States)

Walani, Salimah R; Cleland, Charles M

2015-05-01

This paper illustrates, with the example of a secondary data analysis study, the use of the multiple imputation method to replace missing data. Most large public datasets have missing data, which need to be handled by researchers conducting secondary data analysis studies. Multiple imputation is a technique widely used to replace missing values while preserving the sample size and sampling variability of the data. The data source was the 2004 National Sample Survey of Registered Nurses. The authors created a model to impute missing values using the chained equation method. They used imputation diagnostics procedures and conducted regression analysis of imputed data to determine the differences between the log hourly wages of internationally educated and US-educated registered nurses. The authors used multiple imputation procedures to replace missing values in a large dataset with 29,059 observations. Five multiple imputed datasets were created. Imputation diagnostics using time series and density plots showed that imputation was successful. The authors also present an example of the use of multiple imputed datasets to conduct regression analysis to answer a substantive research question. Multiple imputation is a powerful technique for imputing missing values in large datasets while preserving the sample size and variance of the data. Even though the chained equation method involves complex statistical computations, recent innovations in software and computation have made it possible for researchers to apply this technique to large datasets. The authors recommend that nurse researchers use multiple imputation methods for handling missing data to improve the statistical power and external validity of their studies.
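
When a regression is run on each of several imputed datasets, the per-dataset results have to be combined into one estimate. The abstract does not say how the authors pooled their five analyses; the sketch below assumes the standard combining rules usually attributed to Rubin (pooled estimate = mean of the estimates; total variance = within-imputation variance plus (1 + 1/m) times between-imputation variance). The coefficient and variance values are invented for illustration and are not results from the study.

```python
import statistics

def pool_rubin(estimates, variances):
    """Combine m per-dataset estimates and their squared standard errors.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)       # average sampling variance
    between = statistics.variance(estimates)  # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total

# Invented coefficients from five imputed datasets, each with variance 0.0004.
coefs = [0.10, 0.12, 0.11, 0.09, 0.13]
variances = [0.0004] * 5

pooled, total_var = pool_rubin(coefs, variances)
print("pooled coefficient:", round(pooled, 4))
print("pooled standard error:", round(total_var ** 0.5, 4))
```

The between-imputation term is what lets the pooled standard error reflect uncertainty about the missing values, which a single imputed dataset cannot show.
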
Assessing and comparison of different machine learning methods in parent-offspring trios for genotype imputation.

Science.gov (United States)

Mikhchi, Abbas; Honarvar, Mahmood; Kashan, Nasser Emam Jomeh; Aminafshar, Mehdi

2016-06-21

Genotype imputation is an important tool for prediction of unknown genotypes for both unrelated individuals and parent-offspring trios. Several imputation methods are available and can either employ universal machine learning methods, or deploy algorithms dedicated to inferring missing genotypes. In this research, the performance of eight machine learning methods (Support Vector Machine, K-Nearest Neighbors, Extreme Learning Machine, Radial Basis Function, Random Forest, AdaBoost, LogitBoost, and TotalBoost) was compared in terms of imputation accuracy, computation time and the factors affecting imputation accuracy. The methods were applied to real and simulated datasets to impute the un-typed SNPs in parent-offspring trios. The tested methods show that imputation of parent-offspring trios can be accurate. The Random Forest and Support Vector Machine were more accurate than the other machine learning methods. The TotalBoost performed slightly worse than the other methods. The running times differed between methods. The ELM was always the fastest algorithm. As the sample size increases, the RBF requires a long imputation time. The tested methods in this research can be an alternative for imputation of un-typed SNPs when the missing rate of the data is low. However, it is recommended that other machine learning methods be used for imputation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Assessment of imputation methods using varying ecological information to fill the gaps in a tree functional trait database

Science.gov (United States)

Poyatos, Rafael; Sus, Oliver; Vilà-Cabrera, Albert; Vayreda, Jordi; Badiella, Llorenç; Mencuccini, Maurizio; Martínez-Vilalta, Jordi

2016-04-01

Plant functional traits are increasingly being used in ecosystem ecology thanks to the growing availability of large ecological databases. However, these databases usually contain a large fraction of missing data because measuring plant functional traits systematically is labour-intensive and because most databases are compilations of datasets with different sampling designs. As a result, within a given database, there is an inevitable variability in the number of traits available for each data entry and/or the species coverage in a given geographical area. The presence of missing data may severely bias trait-based analyses, such as the quantification of trait covariation or trait-environment relationships and may hamper efforts towards trait-based modelling of ecosystem biogeochemical cycles. Several data imputation (i.e. gap-filling) methods have been recently tested on compiled functional trait databases, but the performance of imputation methods applied to a functional trait database with a regular spatial sampling has not been thoroughly studied. Here, we assess the effects of data imputation on five tree functional traits (leaf biomass to sapwood area ratio, foliar nitrogen, maximum height, specific leaf area and wood density) in the Ecological and Forest Inventory of Catalonia, an extensive spatial database (covering 31,900 km²). We tested the performance of species mean imputation, single imputation by the k-nearest neighbors algorithm (kNN) and a multiple imputation method, Multivariate Imputation with Chained Equations (MICE) at different levels of missing data (10%, 30%, 50%, and 80%). We also assessed the changes in imputation performance when additional predictors (species identity, climate, forest structure, spatial structure) were added in kNN and MICE imputations. We evaluated the imputed datasets using a battery of indexes describing departure from the complete dataset in trait distribution, in the mean prediction error, in the correlation matrix
The Ability of Different Imputation Methods to Preserve the Significant Genes and Pathways in Cancer

Directory of Open Access Journals (Sweden)

Rosa Aghdam

2017-12-01

Deciphering important genes and pathways from incomplete gene expression data could facilitate a better understanding of cancer. Different imputation methods can be applied to estimate the missing values. In our study, we evaluated various imputation methods for their performance in preserving significant genes and pathways. In the first step, 5% of genes are chosen at random for two types of ignorable and non-ignorable missingness mechanisms with various missing rates. Next, 10 well-known imputation methods were applied to the complete datasets. The significance analysis of microarrays (SAM) method was applied to detect the significant genes in rectal and lung cancers to showcase the utility of imputation approaches in preserving significant genes. To determine the impact of different imputation methods on the identification of important genes, the chi-squared test was used to compare the proportions of overlaps between significant genes detected from original data and those detected from the imputed datasets. Additionally, the significant genes are tested for their enrichment in important pathways, using the ConsensusPathDB. Our results showed that almost all the significant genes and pathways of the original dataset can be detected in all imputed datasets, indicating that there is no significant difference in the performance of various imputation methods tested. The source code and selected datasets are available on http://profiles.bs.ipm.ir/softwares/imputation_methods/.
Improving accuracy of rare variant imputation with a two-step imputation approach

DEFF Research Database (Denmark)

Kreiner-Møller, Eskil; Medina-Gomez, Carolina; Uitterlinden, André G

2015-01-01

Genotype imputation has been the pillar of the success of genome-wide association studies (GWAS) for identifying common variants associated with common diseases. However, most GWAS have been run using only 60 HapMap samples as reference for imputation, meaning that less frequent and rare variants were not comprehensively scrutinized. Next-generation arrays ensuring sufficient coverage together with new reference panels, such as the 1000 Genomes panel, are emerging to facilitate imputation of low-frequency single-nucleotide polymorphisms (minor allele frequency (MAF) ... reference sample genotyped on a dense array and hereafter to the 1000 Genomes reference panel. We show that mean imputation quality, measured by the r² using this approach, increases by 28% for variants with a MAF between 1 and 5% as compared with direct imputation to 1000 Genomes reference. Similarly...
Missing value imputation in DNA microarrays based on conjugate gradient method.

Science.gov (United States)

Dorri, Fatemeh; Azmi, Paeiz; Dorri, Faezeh

2012-02-01

Analysis of gene expression profiles needs a complete matrix of gene array values; consequently, imputation methods have been suggested. In this paper, an algorithm based on the conjugate gradient (CG) method is proposed to estimate missing values. The k nearest neighbors of the missing entry are first selected based on the absolute values of their Pearson correlation coefficients. Then a subset of genes among the k nearest neighbors is labeled as the most similar ones. The CG algorithm, with this subset as its input, is then used to estimate the missing values. Our proposed CG-based algorithm (CGimpute) is evaluated on different data sets. The results are compared with sequential local least squares (SLLSimpute), Bayesian principal component analysis (BPCAimpute), local least squares imputation (LLSimpute), iterated local least squares imputation (ILLSimpute) and adaptive k-nearest neighbors imputation (KNNKimpute) methods. The average normalized root mean square error (NRMSE) and relative NRMSE in different data sets with various missing rates show that CGimpute outperforms the other methods. Copyright © 2011 Elsevier Ltd. All rights reserved.
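
The first step described above, choosing the k nearest neighbor genes by the absolute value of their Pearson correlation with the target gene, can be sketched as follows. This covers only the neighbor-selection step, not the conjugate gradient estimation; the gene names, expression values and function names are invented, and computing the correlation over the target's observed positions is an assumption about how the missing entry is handled.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def k_nearest_by_abs_correlation(target, candidates, k):
    """Rank complete candidate genes by |r| with the target gene.

    target: expression profile with None at the missing position(s).
    candidates: dict mapping gene name to a complete profile.
    """
    observed = [i for i, v in enumerate(target) if v is not None]
    t = [target[i] for i in observed]
    scored = []
    for name, profile in candidates.items():
        r = pearson(t, [profile[i] for i in observed])
        scored.append((abs(r), name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Invented profiles: g1 is positively correlated with the target,
# g2 is negatively correlated, g3 is only weakly related.
target = [1.0, 2.0, None, 4.0]
candidates = {
    "g1": [2.0, 4.0, 6.0, 8.0],
    "g2": [8.0, 6.0, 4.0, 2.0],
    "g3": [1.0, 5.0, 2.0, 1.5],
}
neighbors = k_nearest_by_abs_correlation(target, candidates, k=2)
print(neighbors)
```

Using the absolute value means that a strongly anti-correlated gene such as g2 counts as a close neighbor, since it is just as informative about the target as a positively correlated one.
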
The Ability of Different Imputation Methods to Preserve the Significant Genes and Pathways in Cancer.

Science.gov (United States)

Aghdam, Rosa; Baghfalaki, Taban; Khosravi, Pegah; Saberi Ansari, Elnaz

2017-12-01

Deciphering important genes and pathways from incomplete gene expression data could facilitate a better understanding of cancer. Different imputation methods can be applied to estimate the missing values. In our study, we evaluated various imputation methods for their performance in preserving significant genes and pathways. In the first step, 5% of genes are chosen at random for two types of ignorable and non-ignorable missingness mechanisms with various missing rates. Next, 10 well-known imputation methods were applied to the complete datasets. The significance analysis of microarrays (SAM) method was applied to detect the significant genes in rectal and lung cancers to showcase the utility of imputation approaches in preserving significant genes. To determine the impact of different imputation methods on the identification of important genes, the chi-squared test was used to compare the proportions of overlaps between significant genes detected from original data and those detected from the imputed datasets. Additionally, the significant genes are tested for their enrichment in important pathways, using the ConsensusPathDB. Our results showed that almost all the significant genes and pathways of the original dataset can be detected in all imputed datasets, indicating that there is no significant difference in the performance of various imputation methods tested. The source code and selected datasets are available on http://profiles.bs.ipm.ir/softwares/imputation_methods/. Copyright © 2017. Production and hosting by Elsevier B.V.
Missing data imputation using statistical and machine learning methods in a real breast cancer problem.

Science.gov (United States)

Jerez, José M; Molina, Ignacio; García-Laencina, Pedro J; Alba, Emilio; Ribelles, Nuria; Martín, Miguel; Franco, Leonardo

2010-10-01

Missing data imputation is an important task in cases where it is crucial to use all available data and not discard records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods that were used to predict recurrence in patients in an extensive real breast cancer data set. Imputation methods based on statistical techniques, e.g., mean, hot-deck and multiple imputation, and machine learning techniques, e.g., multi-layer perceptron (MLP), self-organisation maps (SOM) and k-nearest neighbour (KNN), were applied to data collected through the "El Álamo-I" project, and the results were then compared to those obtained from the listwise deletion (LD) imputation method. The database includes demographic, therapeutic and recurrence-survival information from 3679 women with operable invasive breast cancer diagnosed in 32 different hospitals belonging to the Spanish Breast Cancer Research Group (GEICAM). The accuracies of predictions on early cancer relapse were measured using artificial neural networks (ANNs), in which different ANNs were estimated using the data sets with imputed missing values. The imputation methods based on machine learning algorithms outperformed imputation statistical methods in the prediction of patient outcome. Friedman's test revealed a significant difference (p=0.0091) in the observed area under the ROC curve (AUC) values, and the pairwise comparison test showed that the AUCs for MLP, KNN and SOM were significantly higher (p=0.0053, p=0.0048 and p=0.0071, respectively) than the AUC from the LD-based prognosis model. The methods based on machine learning techniques were the most suited for the imputation of missing values and led to a significant enhancement of prognosis accuracy compared to imputation methods based on statistical procedures. Copyright © 2010 Elsevier B.V. All rights reserved.
An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data.

Science.gov (United States)

Liu, Yuzhe; Gopalakrishnan, Vanathi

2017-03-01

Many clinical research datasets have a large percentage of missing values that directly impacts their usefulness in yielding high accuracy classifiers when used for training in supervised machine learning. While missing value imputation methods have been shown to work well with smaller percentages of missing values, their ability to impute sparse clinical research data can be problem specific. We previously attempted to learn quantitative guidelines for ordering cardiac magnetic resonance imaging during the evaluation for pediatric cardiomyopathy, but missing data significantly reduced our usable sample size. In this work, we sought to determine if increasing the usable sample size through imputation would allow us to learn better guidelines. We first review several machine learning methods for estimating missing data. Then, we apply four popular methods (mean imputation, decision tree, k-nearest neighbors, and self-organizing maps) to a clinical research dataset of pediatric patients undergoing evaluation for cardiomyopathy. Using Bayesian Rule Learning (BRL) to learn ruleset models, we compared the performance of imputation-augmented models versus unaugmented models. We found that all four imputation-augmented models performed similarly to unaugmented models. While imputation did not improve performance, it did provide evidence for the robustness of our learned models.
Comparison of three boosting methods in parent-offspring trios for genotype imputation using simulation study

Directory of Open Access Journals (Sweden)

Abbas Mikhchi

2016-01-01

Abstract. Background: Genotype imputation is an important process of predicting unknown genotypes, which uses a reference population with dense genotypes to predict missing genotypes for both human and animal genetic variations at a low cost. Machine learning methods, especially boosting methods, have been used in genetic studies to explore the underlying genetic profile of disease and build models capable of predicting missing values of a marker. Methods: In this study, strategies and factors affecting the imputation accuracy of parent-offspring trios were compared from lower-density SNP panels (5 K) to a high-density (10 K) SNP panel using three different boosting methods, namely TotalBoost (TB), LogitBoost (LB) and AdaBoost (AB). The methods were applied to simulated data to impute the un-typed SNPs in parent-offspring trios. Four different datasets, G1 (100 trios with 5 k SNPs), G2 (100 trios with 10 k SNPs), G3 (500 trios with 5 k SNPs), and G4 (500 trios with 10 k SNPs), were simulated. In all four datasets, all parents were genotyped completely, and offspring were genotyped with a lower-density panel. Results: Comparison of the three methods for imputation showed that LB outperformed AB and TB in imputation accuracy. Computation times differed between methods. AB was the fastest algorithm. Higher SNP densities increased the accuracy of imputation. A larger number of trios (i.e. 500) was better for the performance of LB and TB. Conclusions: The three methods do well in terms of imputation accuracy; the dense chip is also recommended for imputation of parent-offspring trios.
Imputation methods for filling missing data in urban air pollution data for Malaysia

Directory of Open Access Journals (Sweden)

Nur Afiqah Zakaria

2018-06-01

The air quality measurement data obtained from continuous ambient air quality monitoring (CAAQM) stations usually contain missing data. The missing observations usually occur due to machine failure, routine maintenance and human error. In this study, the hourly monitoring data of CO, O3, PM10, SO2, NOx, NO2, ambient temperature and humidity were used to evaluate four imputation methods (Mean Top Bottom, Linear Regression, Multiple Imputation and Nearest Neighbour). Missing data were simulated in the air pollutant observations at four percentages, i.e. 5%, 10%, 15% and 20%. Performance measures, namely the Mean Absolute Error, Root Mean Squared Error, Coefficient of Determination and Index of Agreement, were used to describe the goodness of fit of the imputation methods. From the results of the performance measures, the Mean Top Bottom method was selected as the most appropriate imputation method for filling in the missing values in air pollutant data.
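
The four performance measures named in this abstract can be written out in a few lines. The abstract does not give the formulas, so the definitions below are assumptions: the Coefficient of Determination is taken as the squared Pearson correlation between observed and imputed values, and the Index of Agreement as Willmott's d. The sample values are invented.

```python
import math

def mae(obs, pred):
    """Mean Absolute Error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root Mean Squared Error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Coefficient of Determination, assumed here to be squared Pearson r."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = sum((o - mo) ** 2 for o in obs)
    sp = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (so * sp)

def index_of_agreement(obs, pred):
    """Willmott's d: 1 - sum((P-O)^2) / sum((|P-Obar| + |O-Obar|)^2)."""
    mo = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1 - num / den

# Invented example: every imputed value is one unit too high.
observed = [1.0, 2.0, 3.0, 4.0]
imputed = [2.0, 3.0, 4.0, 5.0]
print(mae(observed, imputed), rmse(observed, imputed))
print(r_squared(observed, imputed), index_of_agreement(observed, imputed))
```

A constant offset leaves the correlation-based measure at 1 while the error measures and the Index of Agreement register the bias, which is one reason studies like this report several measures side by side.
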
Multi-generational imputation of single nucleotide polymorphism marker genotypes and accuracy of genomic selection.

Science.gov (United States)

Toghiani, S; Aggrey, S E; Rekaya, R

2016-07-01

Availability of high-density single nucleotide polymorphism (SNP) genotyping platforms provided unprecedented opportunities to enhance breeding programmes in livestock, poultry and plant species, and to better understand the genetic basis of complex traits. Using this genomic information, genomic breeding values (GEBVs), which are more accurate than conventional breeding values, can be estimated. The superiority of genomic selection is possible only when high-density SNP panels are used to track genes and QTLs affecting the trait. Unfortunately, even with the continuous decrease in genotyping costs, only a small fraction of the population has been genotyped with these high-density panels. It is often the case that a larger portion of the population is genotyped with low-density and low-cost SNP panels and then imputed to a higher density. Accuracy of SNP genotype imputation tends to be high when minimum requirements are met. Nevertheless, a certain rate of genotype imputation errors is unavoidable. Thus, it is reasonable to assume that the accuracy of GEBVs will be affected by imputation errors, especially their cumulative effects over time. To evaluate the impact of multi-generational selection on the accuracy of SNP genotype imputation and the reliability of resulting GEBVs, a simulation was carried out under varying updating of the reference population, distance between the reference and testing sets, and the approach used for the estimation of GEBVs. Using fixed reference populations, imputation accuracy decayed by about 0.5% per generation. In fact, after 25 generations, the accuracy was only 7% lower than the first generation. When the reference population was updated by either 1% or 5% of the top animals in the previous generations, decay of imputation accuracy was substantially reduced. These results indicate that low-density panels are useful, especially when the generational interval between reference and testing population is small. As the generational interval
Comparison of missing value imputation methods in time series: the case of Turkish meteorological data

Science.gov (United States)

Yozgatligil, Ceylan; Aslan, Sipan; Iyigun, Cem; Batmaz, Inci

2013-04-01

This study aims to compare several imputation methods to complete the missing values of spatio-temporal meteorological time series. To this end, six imputation methods are assessed with respect to various criteria including accuracy, robustness, precision, and efficiency for artificially created missing data in monthly total precipitation and mean temperature series obtained from the Turkish State Meteorological Service. Of these methods, simple arithmetic average, normal ratio (NR), and NR weighted with correlations comprise the simple ones, whereas multilayer perceptron type neural network and multiple imputation strategy adopted by Monte Carlo Markov Chain based on expectation-maximization (EM-MCMC) are computationally intensive ones. In addition, we propose a modification on the EM-MCMC method. Besides using a conventional accuracy measure based on squared errors, we also suggest the correlation dimension (CD) technique of nonlinear dynamic time series analysis which takes spatio-temporal dependencies into account for evaluating imputation performances. Depending on the detailed graphical and quantitative analysis, it can be said that although computational methods, particularly EM-MCMC method, are computationally inefficient, they seem favorable for imputation of meteorological time series with respect to different missingness periods considering both measures and both series studied. To conclude, using the EM-MCMC algorithm for imputing missing values before conducting any statistical analyses of meteorological data will definitely decrease the amount of uncertainty and give more robust results. Moreover, the CD measure can be suggested for the performance evaluation of missing data imputation particularly with computational methods since it gives more precise results in meteorological time series.
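
Of the simple methods listed above, the normal ratio (NR) method is easy to show concretely. The abstract does not spell out the formula; the sketch below assumes the textbook form, in which each neighboring station's value is scaled by the ratio of the target station's long-term mean ("normal") to the neighbor's normal, and the scaled values are averaged. The station values are invented and the function name is hypothetical.

```python
def normal_ratio(target_normal, neighbor_normals, neighbor_values):
    """Estimate a missing value at the target station.

    target_normal: long-term mean at the target station.
    neighbor_normals: long-term means at the neighboring stations.
    neighbor_values: values observed at the neighbors for the missing period.
    """
    scaled = [target_normal / normal * value
              for normal, value in zip(neighbor_normals, neighbor_values)]
    return sum(scaled) / len(scaled)

# Invented monthly precipitation (mm): the target station is drier than the
# first neighbor and wetter than the second.
estimate = normal_ratio(target_normal=50.0,
                        neighbor_normals=[100.0, 25.0],
                        neighbor_values=[80.0, 10.0])
print(estimate)
```

The correlation-weighted variant mentioned in the abstract would replace the plain average with a weighted one; the exact weights are not given there, so they are not sketched here.
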
Multiple imputation by chained equations for systematically and sporadically missing multilevel data.

Science.gov (United States)

Resche-Rigon, Matthieu; White, Ian R

2018-06-01

In multilevel settings such as individual participant data meta-analysis, a variable is 'systematically missing' if it is wholly missing in some clusters and 'sporadically missing' if it is partly missing in some clusters. Previously proposed methods to impute incomplete multilevel data handle either systematically or sporadically missing data, but frequently both patterns are observed. We describe a new multiple imputation by chained equations (MICE) algorithm for multilevel data with arbitrary patterns of systematically and sporadically missing variables. The algorithm is described for multilevel normal data but can easily be extended for other variable types. We first propose two methods for imputing a single incomplete variable: an extension of an existing method and a new two-stage method which conveniently allows for heteroscedastic data. We then discuss the difficulties of imputing missing values in several variables in multilevel data using MICE, and show that even the simplest joint multilevel model implies conditional models which involve cluster means and heteroscedasticity. However, a simulation study finds that the proposed methods can be successfully combined in a multilevel MICE procedure, even when cluster means are not included in the imputation models.
Imputation Accuracy from Low to Moderate Density Single Nucleotide Polymorphism Chips in a Thai Multibreed Dairy Cattle Population

Directory of Open Access Journals (Sweden)

Danai Jattawa

2016-04-01

The objective of this study was to investigate the accuracy of imputation from low density (LDC) to moderate density SNP chips (MDC) in a Thai Holstein-Other multibreed dairy cattle population. Dairy cattle with complete pedigree information (n = 1,244) from 145 dairy farms were genotyped with GeneSeek GGP20K (n = 570), GGP26K (n = 540) and GGP80K (n = 134) chips. After checking for single nucleotide polymorphism (SNP) quality, 17,779 SNP markers in common between the GGP20K, GGP26K, and GGP80K were used to represent MDC. Animals were divided into two groups, a reference group (n = 912) and a test group (n = 332). The SNP markers chosen for the test group were those located in positions corresponding to GeneSeek GGP9K (n = 7,652). The LDC to MDC genotype imputation was carried out using three different software packages, namely Beagle 3.3 (population-based algorithm), FImpute 2.2 (combined family- and population-based algorithms) and Findhap 4 (combined family- and population-based algorithms). Imputation accuracies within and across chromosomes were calculated as ratios of correctly imputed SNP markers to overall imputed SNP markers. Imputation accuracy for the three software packages ranged from 76.79% to 93.94%. FImpute had higher imputation accuracy (93.94%) than Findhap (84.64%) and Beagle (76.79%). Imputation accuracies were similar and consistent across chromosomes for FImpute, but not for Findhap and Beagle. Most chromosomes that showed either high (73%) or low (80%) imputation accuracies were the same chromosomes that had above and below average linkage disequilibrium (LD; defined here as the correlation between pairs of adjacent SNP within chromosomes less than or equal to 1 Mb apart). Results indicated that FImpute was more suitable than Findhap and Beagle for genotype imputation in this Thai multibreed population. Perhaps additional increments in imputation accuracy could be achieved by increasing the completeness of pedigree information.
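
The accuracy measure defined in this abstract, the ratio of correctly imputed SNP markers to all imputed SNP markers, amounts to a simple concordance rate. The sketch below assumes genotypes coded as 0, 1 or 2 copies of an allele, which is a common convention but is not stated in the abstract; the genotype vectors are invented.

```python
def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Fraction of imputed markers whose genotype matches the true genotype."""
    correct = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return correct / len(true_genotypes)

# Invented example: 10 masked markers, 8 of them imputed correctly.
true_calls = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
imputed_calls = [0, 1, 2, 1, 0, 2, 1, 0, 0, 1]

accuracy = imputation_accuracy(true_calls, imputed_calls)
print(f"{accuracy:.2%}")
```

Computing the same ratio per chromosome rather than over all markers would give the within-chromosome accuracies that the study compares against linkage disequilibrium.
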
Comparison of different Methods for Univariate Time Series Imputation in R

OpenAIRE

Moritz, Steffen; Sardá, Alexis; Bartz-Beielstein, Thomas; Zaefferer, Martin; Stork, Jörg

2015-01-01

Missing values in datasets are a well-known problem and there are quite a lot of R packages offering imputation functions. But while imputation in general is well covered within R, it is hard to find functions for imputation of univariate time series. The problem is that most standard imputation techniques cannot be applied directly. Most algorithms rely on inter-attribute correlations, while univariate time series imputation needs to employ time dependencies. This paper provides an overview of ...
Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes

Directory of Open Access Journals (Sweden)

Lotz Meredith J

2008-01-01

Abstract. Background: Gene expression data frequently contain missing values; however, most down-stream analyses for microarray experiments require complete data. In the literature many methods have been proposed to estimate missing values via information of the correlation patterns within the gene expression matrix. Each method has its own advantages, but the specific conditions for which each method is preferred remain largely unclear. In this report we describe an extensive evaluation of eight current imputation methods on multiple types of microarray experiments, including time series, multiple exposures, and multiple exposures × time series data. We then introduce two complementary selection schemes for determining the most appropriate imputation method for any given data set. Results: We found that the optimal imputation algorithms (LSA, LLS, and BPCA) are all highly competitive with each other, and that no method is uniformly superior in all the data sets we examined. The success of each method can also depend on the underlying "complexity" of the expression data, where we take complexity to indicate the difficulty in mapping the gene expression matrix to a lower-dimensional subspace. We developed an entropy measure to quantify the complexity of expression matrixes and found that, by incorporating this information, the entropy-based selection (EBS) scheme is useful for selecting an appropriate imputation algorithm. We further propose a simulation-based self-training selection (STS) scheme. This technique has been used previously for microarray data imputation, but for different purposes. The scheme selects the optimal or near-optimal method with high accuracy but at an increased computational cost. Conclusion: Our findings provide insight into the problem of which imputation method is optimal for a given data set. Three top-performing methods (LSA, LLS and BPCA) are competitive with each other. Global-based imputation methods (PLS, SVD, BPCA

Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes.

Science.gov (United States)

Brock, Guy N; Shaffer, John R; Blakesley, Richard E; Lotz, Meredith J; Tseng, George C

2008-01-10

Gene expression data frequently contain missing values, however, most down-stream analyses for microarray experiments require complete data. In the literature many methods have been proposed to estimate missing values via information of the correlation patterns within the gene expression matrix. Each method has its own advantages, but the specific conditions for which each method is preferred remain largely unclear. In this report we describe an extensive evaluation of eight current imputation methods on multiple types of microarray experiments, including time series, multiple exposures, and multiple exposures x time series data. We then introduce two complementary selection schemes for determining the most appropriate imputation method for any given data set. We found that the optimal imputation algorithms (LSA, LLS, and BPCA) are all highly competitive with each other, and that no method is uniformly superior in all the data sets we examined. The success of each method can also depend on the underlying "complexity" of the expression data, where we take complexity to indicate the difficulty in mapping the gene expression matrix to a lower-dimensional subspace. We developed an entropy measure to quantify the complexity of expression matrixes and found that, by incorporating this information, the entropy-based selection (EBS) scheme is useful for selecting an appropriate imputation algorithm. We further propose a simulation-based self-training selection (STS) scheme. This technique has been used previously for microarray data imputation, but for different purposes. The scheme selects the optimal or near-optimal method with high accuracy but at an increased computational cost. Our findings provide insight into the problem of which imputation method is optimal for a given data set. Three top-performing methods (LSA, LLS and BPCA) are competitive with each other. Global-based imputation methods (PLS, SVD, BPCA) performed better on microarray data with lower complexity
Comparison of different methods for imputing genome-wide marker genotypes in Swedish and Finnish Red Cattle

DEFF Research Database (Denmark)

Ma, Peipei; Brøndum, Rasmus Froberg; Qin, Zahng

2013-01-01

This study investigated the imputation accuracy of different methods, considering both the minor allele frequency and relatedness between individuals in the reference and test data sets. Two data sets from the combined population of Swedish and Finnish Red Cattle were used to test the influence...... coefficient was lower when the minor allele frequency was lower. The results indicate that Beagle and IMPUTE2 provide the most robust and accurate imputation accuracies, but considering computing time and memory usage, FImpute is another alternative method....
Missing in space: an evaluation of imputation methods for missing data in spatial analysis of risk factors for type II diabetes.

Science.gov (United States)

Baker, Jannah; White, Nicole; Mengersen, Kerrie

2014-11-20

Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, ranging from missing data for only a few variables, to missing data for many variables. For spatial analyses of health outcomes, selection of an appropriate imputation method is critical in order to produce the most accurate inferences. We present a cross-validation approach to select between three imputation methods for health survey data with correlated lifestyle covariates, using as a case study, type II diabetes mellitus (DM II) risk across 71 Queensland Local Government Areas (LGAs). We compare the accuracy of mean imputation to imputation using multivariate normal and conditional autoregressive prior distributions. Choice of imputation method depends upon the application and is not necessarily the most complex method. Mean imputation was selected as the most accurate method in this application. Selecting an appropriate imputation method for health survey data, after accounting for spatial correlation and correlation between covariates, allows more complete analysis of geographic risk factors for disease with more confidence in the results to inform public policy decision-making.
LinkImputeR: user-guided genotype calling and imputation for non-model organisms.

Science.gov (United States)

Money, Daniel; Migicovsky, Zoë; Gardner, Kyle; Myles, Sean

2017-07-10

Genomic studies such as genome-wide association and genomic selection require genome-wide genotype data. All existing technologies used to create these data result in missing genotypes, which are often then inferred using genotype imputation software. However, existing imputation methods most often make use only of genotypes that are successfully inferred after having passed a certain read depth threshold. Because of this, any read information for genotypes that did not pass the threshold, and were thus set to missing, is ignored. Most genomic studies also choose read depth thresholds and quality filters without investigating their effects on the size and quality of the resulting genotype data. Moreover, almost all genotype imputation methods require ordered markers and are therefore of limited utility in non-model organisms. Here we introduce LinkImputeR, a software program that exploits the read count information that is normally ignored, and makes use of all available DNA sequence information for the purposes of genotype calling and imputation. It is specifically designed for non-model organisms since it requires neither ordered markers nor a reference panel of genotypes. Using next-generation DNA sequence (NGS) data from apple, cannabis and grape, we quantify the effect of varying read count and missingness thresholds on the quantity and quality of genotypes generated from LinkImputeR. We demonstrate that LinkImputeR can increase the number of genotype calls by more than an order of magnitude, can improve genotyping accuracy by several percent and can thus improve the power of downstream analyses. Moreover, we show that the effects of quality and read depth filters can differ substantially between data sets and should therefore be investigated on a per-study basis. By exploiting DNA sequence data that is normally ignored during genotype calling and imputation, LinkImputeR can significantly improve both the quantity and quality of genotype data generated from
Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information

Science.gov (United States)

Poyatos, Rafael; Sus, Oliver; Badiella, Llorenç; Mencuccini, Maurizio; Martínez-Vilalta, Jordi

2018-05-01

The ubiquity of missing data in plant trait databases may hinder trait-based analyses of ecological patterns and processes. Spatially explicit datasets with information on intraspecific trait variability are rare but offer great promise in improving our understanding of functional biogeography. At the same time, they offer specific challenges in terms of data imputation. Here we compare statistical imputation approaches, using varying levels of environmental information, for five plant traits (leaf biomass to sapwood area ratio, leaf nitrogen content, maximum tree height, leaf mass per area and wood density) in a spatially explicit plant trait dataset of temperate and Mediterranean tree species (Ecological and Forest Inventory of Catalonia, IEFC, dataset for Catalonia, north-east Iberian Peninsula, 31 900 km2). We simulated gaps at different missingness levels (10-80 %) in a complete trait matrix, and we used overall trait means, species means, k nearest neighbours (kNN), ordinary and regression kriging, and multivariate imputation using chained equations (MICE) to impute missing trait values. We assessed these methods in terms of their accuracy and of their ability to preserve trait distributions, multi-trait correlation structure and bivariate trait relationships. The relatively good performance of mean and species mean imputations in terms of accuracy masked a poor representation of trait distributions and multivariate trait structure. Species identity improved MICE imputations for all traits, whereas forest structure and topography improved imputations for some traits. No method performed best consistently for the five studied traits, but, considering all traits and performance metrics, MICE informed by relevant ecological variables gave the best results. However, at higher missingness (> 30 %), species mean imputations and regression kriging tended to outperform MICE for some traits. MICE informed by relevant ecological variables allowed us to fill the gaps in
A comparison of genomic selection models across time in interior spruce (Picea engelmannii × glauca) using unordered SNP imputation methods.

Science.gov (United States)

Ratcliffe, B; El-Dien, O G; Klápště, J; Porth, I; Chen, C; Jaquish, B; El-Kassaby, Y A

2015-12-01

Genomic selection (GS) potentially offers an unparalleled advantage over traditional pedigree-based selection (TS) methods by reducing the time commitment required to carry out a single cycle of tree improvement. This quality is particularly appealing to tree breeders, where lengthy improvement cycles are the norm. We explored the prospect of implementing GS for interior spruce (Picea engelmannii × glauca) utilizing a genotyped population of 769 trees belonging to 25 open-pollinated families. A series of repeated tree height measurements through ages 3-40 years permitted the testing of GS methods temporally. The genotyping-by-sequencing (GBS) platform was used for single nucleotide polymorphism (SNP) discovery in conjunction with three unordered imputation methods applied to a data set with 60% missing information. Further, three diverse GS models were evaluated based on predictive accuracy (PA), and their marker effects. Moderate levels of PA (0.31-0.55) were observed and were of sufficient capacity to deliver improved selection response over TS. Additionally, PA varied substantially through time accordingly with spatial competition among trees. As expected, temporal PA was well correlated with age-age genetic correlation (r=0.99), and decreased substantially with increasing difference in age between the training and validation populations (0.04-0.47). Moreover, our imputation comparisons indicate that k-nearest neighbor and singular value decomposition yielded a greater number of SNPs and gave higher predictive accuracies than imputing with the mean. Furthermore, the ridge regression (rrBLUP) and BayesCπ (BCπ) models both yielded equal, and better PA than the generalized ridge regression heteroscedastic effect model for the traits evaluated.
Missing value imputation for epistatic MAPs

LENUS (Irish Health Repository)

Ryan, Colm

2010-04-20

Abstract Background Epistatic miniarray profiling (E-MAPs) is a high-throughput approach capable of quantifying aggravating or alleviating genetic interactions between gene pairs. The datasets resulting from E-MAP experiments typically take the form of a symmetric pairwise matrix of interaction scores. These datasets have a significant number of missing values - up to 35% - that can reduce the effectiveness of some data analysis techniques and prevent the use of others. An effective method for imputing interactions would therefore increase the types of possible analysis, as well as increase the potential to identify novel functional interactions between gene pairs. Several methods have been developed to handle missing values in microarray data, but it is unclear how applicable these methods are to E-MAP data because of their pairwise nature and the significantly larger number of missing values. Here we evaluate four alternative imputation strategies, three local (Nearest neighbor-based) and one global (PCA-based), that have been modified to work with symmetric pairwise data. Results We identify different categories for the missing data based on their underlying cause, and show that values from the largest category can be imputed effectively. We compare local and global imputation approaches across a variety of distinct E-MAP datasets, showing that both are competitive and preferable to filling in with zeros. In addition we show that these methods are effective in an E-MAP from a different species, suggesting that pairwise imputation techniques will be increasingly useful as analogous epistasis mapping techniques are developed in different species. We show that strongly alleviating interactions are significantly more difficult to predict than strongly aggravating interactions. Finally we show that imputed interactions, generated using nearest neighbor methods, are enriched for annotations in the same manner as measured interactions. Therefore our method potentially
Gaussian mixture clustering and imputation of microarray data.

Science.gov (United States)

Ouyang, Ming; Welsh, William J; Georgopoulos, Panos

2004-04-12

In microarray experiments, missing entries arise from blemishes on the chips. In large-scale studies, virtually every chip contains some missing entries and more than 90% of the genes are affected. Many analysis methods require a full set of data. Either those genes with missing entries are excluded, or the missing entries are filled with estimates prior to the analyses. This study compares methods of missing value estimation. Two evaluation metrics of imputation accuracy are employed. First, the root mean squared error measures the difference between the true values and the imputed values. Second, the number of mis-clustered genes measures the difference between clustering with true values and that with imputed values; it examines the bias introduced by imputation to clustering. The Gaussian mixture clustering with model averaging imputation is superior to all other imputation methods, according to both evaluation metrics, on both time-series (correlated) and non-time series (uncorrelated) data sets.
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

Science.gov (United States)

Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

2016-01-01

Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
Imputation of single nucleotide polymorphism genotypes of Hereford cattle: reference panel size, family relationship and population structure

Science.gov (United States)

The objective of this study is to investigate single nucleotide polymorphism (SNP) genotypes imputation of Hereford cattle. Purebred Herefords were from two sources, Line 1 Hereford (N=240) and representatives of Industry Herefords (N=311). Using different reference panels of 62 and 494 males with 1...
Randomly and Non-Randomly Missing Renal Function Data in the Strong Heart Study: A Comparison of Imputation Methods.

Directory of Open Access Journals (Sweden)

Nawar Shara

Full Text Available Kidney and cardiovascular disease are widespread among populations with high prevalence of diabetes, such as American Indians participating in the Strong Heart Study (SHS). Studying these conditions simultaneously in longitudinal studies is challenging, because the morbidity and mortality associated with these diseases result in missing data, and these data are likely not missing at random. When such data are merely excluded, study findings may be compromised. In this article, a subset of 2264 participants with complete renal function data from Strong Heart Exams 1 (1989-1991), 2 (1993-1995), and 3 (1998-1999) was used to examine the performance of five methods used to impute missing data: listwise deletion, mean of serial measures, adjacent value, multiple imputation, and pattern-mixture. Three missing at random models and one non-missing at random model were used to compare the performance of the imputation techniques on randomly and non-randomly missing data. The pattern-mixture method was found to perform best for imputing renal function data that were not missing at random. Determining whether data are missing at random or not can help in choosing the imputation method that will provide the most accurate results.
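Two of the simpler methods named in the record above, mean of serial measures and adjacent value, might look roughly like the sketch below for one participant's repeated exams. The abstract does not define either method, so both definitions here are assumptions: the mean of that participant's observed exams in the first case, and the nearest observed exam (preferring the earlier one on ties) in the second. The function names are hypothetical.

```python
def mean_of_serial_measures(visits):
    """Fill gaps (None) with the mean of the participant's own observed visits."""
    observed = [v for v in visits if v is not None]
    if not observed:
        return list(visits)  # nothing observed, nothing to impute from
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in visits]


def adjacent_value(visits):
    """Fill gaps (None) with the nearest observed visit, earlier visit first on ties."""
    filled = list(visits)
    for i, value in enumerate(visits):
        if value is not None:
            continue
        for distance in range(1, len(visits)):
            before, after = i - distance, i + distance
            if before >= 0 and visits[before] is not None:
                filled[i] = visits[before]
                break
            if after < len(visits) and visits[after] is not None:
                filled[i] = visits[after]
                break
    return filled


# Hypothetical renal function values at exams 1, 2 and 3; exam 2 is missing.
exams = [1.0, None, 2.0]
by_mean = mean_of_serial_measures(exams)
by_neighbour = adjacent_value(exams)
```

Both functions look only at original observed values, never at previously imputed ones, so the result does not depend on the order in which gaps are filled.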
Gap-filling a spatially explicit plant trait database: comparing imputation methods and different levels of environmental information

Directory of Open Access Journals (Sweden)

R. Poyatos

2018-05-01

Full Text Available The ubiquity of missing data in plant trait databases may hinder trait-based analyses of ecological patterns and processes. Spatially explicit datasets with information on intraspecific trait variability are rare but offer great promise in improving our understanding of functional biogeography. At the same time, they offer specific challenges in terms of data imputation. Here we compare statistical imputation approaches, using varying levels of environmental information, for five plant traits (leaf biomass to sapwood area ratio, leaf nitrogen content, maximum tree height, leaf mass per area and wood density) in a spatially explicit plant trait dataset of temperate and Mediterranean tree species (Ecological and Forest Inventory of Catalonia, IEFC, dataset for Catalonia, north-east Iberian Peninsula, 31 900 km2). We simulated gaps at different missingness levels (10–80 %) in a complete trait matrix, and we used overall trait means, species means, k nearest neighbours (kNN), ordinary and regression kriging, and multivariate imputation using chained equations (MICE) to impute missing trait values. We assessed these methods in terms of their accuracy and of their ability to preserve trait distributions, multi-trait correlation structure and bivariate trait relationships. The relatively good performance of mean and species mean imputations in terms of accuracy masked a poor representation of trait distributions and multivariate trait structure. Species identity improved MICE imputations for all traits, whereas forest structure and topography improved imputations for some traits. No method performed best consistently for the five studied traits, but, considering all traits and performance metrics, MICE informed by relevant ecological variables gave the best results. However, at higher missingness (> 30 %), species mean imputations and regression kriging tended to outperform MICE for some traits. MICE informed by relevant ecological variables
Multiply-Imputed Synthetic Data: Advice to the Imputer

Directory of Open Access Journals (Sweden)

Loong Bronwyn

2017-12-01

Full Text Available Several statistical agencies have started to use multiply-imputed synthetic microdata to create public-use data in major surveys. The purpose of doing this is to protect the confidentiality of respondents’ identities and sensitive attributes, while allowing standard complete-data analyses of microdata. A key challenge, faced by advocates of synthetic data, is demonstrating that valid statistical inferences can be obtained from such synthetic data for non-confidential questions. Large discrepancies between observed-data and synthetic-data analytic results for such questions may arise because of uncongeniality; that is, differences in the types of inputs available to the imputer, who has access to the actual data, and to the analyst, who has access only to the synthetic data. Here, we discuss a simple, but possibly canonical, example of uncongeniality when using multiple imputation to create synthetic data, which specifically addresses the choices made by the imputer. An initial, unanticipated but not surprising, conclusion is that non-confidential design information used to impute synthetic data should be released with the confidential synthetic data to allow users of synthetic data to avoid possible grossly conservative inferences.
Multiple imputation and its application

CERN Document Server

Carpenter, James

2013-01-01

A practical guide to analysing partially observed data. Collecting, analysing and drawing inferences from data is central to research in the medical and social sciences. Unfortunately, it is rarely possible to collect all the intended data. The literature on inference from the resulting incomplete data is now huge, and continues to grow both as methods are developed for large and complex data structures, and as increasing computer power and suitable software enable researchers to apply these methods. This book focuses on a particular statistical method for analysing and drawing inferences from incomplete data, called Multiple Imputation (MI). MI is attractive because it is both practical and widely applicable. The authors' aim is to clarify the issues raised by missing data, describing the rationale for MI, the relationship between the various imputation models and associated algorithms and its application to increasingly complex data structures. Multiple Imputation and its Application: Discusses the issues ...
Use of Multiple Imputation Method to Improve Estimation of Missing Baseline Serum Creatinine in Acute Kidney Injury Research

Science.gov (United States)

Peterson, Josh F.; Eden, Svetlana K.; Moons, Karel G.; Ikizler, T. Alp; Matheny, Michael E.

2013-01-01

Summary Background and objectives Baseline creatinine (BCr) is frequently missing in AKI studies. Common surrogate estimates can misclassify AKI and adversely affect the study of related outcomes. This study examined whether multiple imputation improved accuracy of estimating missing BCr beyond current recommendations to apply assumed estimated GFR (eGFR) of 75 ml/min per 1.73 m2 (eGFR 75). Design, setting, participants, & measurements From 41,114 unique adult admissions (13,003 with and 28,111 without BCr data) at Vanderbilt University Hospital between 2006 and 2008, a propensity score model was developed to predict likelihood of missing BCr. Propensity scoring identified 6502 patients with highest likelihood of missing BCr among 13,003 patients with known BCr to simulate a “missing” data scenario while preserving actual reference BCr. Within this cohort (n=6502), the ability of various multiple-imputation approaches to estimate BCr and classify AKI were compared with that of eGFR 75. Results All multiple-imputation methods except the basic one more closely approximated actual BCr than did eGFR 75. Total AKI misclassification was lower with multiple imputation (full multiple imputation + serum creatinine) (9.0%) than with eGFR 75 (12.3%; Pcreatinine) (15.3%) versus eGFR 75 (40.5%; P<0.001). Multiple imputation improved specificity and positive predictive value for detecting AKI at the expense of modestly decreasing sensitivity relative to eGFR 75. Conclusions Multiple imputation can improve accuracy in estimating missing BCr and reduce misclassification of AKI beyond currently proposed methods. PMID:23037980
Cost reduction for web-based data imputation

KAUST Repository

Li, Zhixu

2014-01-01

Web-based Data Imputation enables the completion of incomplete data sets by retrieving absent field values from the Web. In particular, complete fields can be used as keywords in imputation queries for absent fields. However, due to the ambiguity of these keywords and the data complexity on the Web, different queries may retrieve different answers to the same absent field value. To decide the most probable right answer to each absent field value, existing methods issue quite a few available imputation queries for each absent value and then vote to decide the most probable right answer. As a result, a large number of imputation queries must be issued to fill all absent values in an incomplete data set, which brings a large overhead. In this paper, we work on reducing the cost of Web-based Data Imputation in two aspects: First, we propose a query execution scheme which can secure the most probable right answer to an absent field value by issuing as few imputation queries as possible. Second, we recognize and prune queries that probably will fail to return any answers a priori. Our extensive experimental evaluation shows that our proposed techniques substantially reduce the cost of Web-based Imputation without hurting its high imputation accuracy. © 2014 Springer International Publishing Switzerland.
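One generic way to vote while issuing as few queries as possible is to stop as soon as the leading answer can no longer be overtaken by the queries still available. The sketch below illustrates that idea only; it is not the query execution scheme proposed in the paper, and `vote_with_early_stop`, `answers` and `max_queries` are hypothetical names. Each item of `answers` stands for the result of one imputation query (`None` when the query returned nothing).

```python
from collections import Counter


def vote_with_early_stop(answers, max_queries):
    """Consume query results lazily; stop once the leader cannot be caught.

    Returns (winning answer or None, number of queries actually issued).
    """
    counts = Counter()
    issued = 0
    for answer in answers:
        if issued >= max_queries:
            break
        issued += 1
        if answer is not None:
            counts[answer] += 1
        remaining = max_queries - issued
        ranked = counts.most_common(2)
        if ranked:
            lead = ranked[0][1]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Even if every remaining query backed the runner-up (or a new
            # answer), it could not overtake the current leader.
            if lead - runner_up > remaining:
                return ranked[0][0], issued
    winner = counts.most_common(1)[0][0] if counts else None
    return winner, issued


# Hypothetical results of up to five queries for one absent field value.
result = vote_with_early_stop(iter(["Paris", "Paris", "Lyon", "Paris", "Paris"]), 5)
```

Because `answers` is consumed lazily, passing a generator that issues the real queries means the fifth query in this example is never sent.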
Methods for significance testing of categorical covariates in logistic regression models after multiple imputation: power and applicability analysis

NARCIS (Netherlands)

Eekhout, I.; Wiel, M.A. van de; Heymans, M.W.

2017-01-01

Background. Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin’s Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, to consider whether a categorical covariate with more than two levels
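For reference, pooling a single parameter estimate with Rubin's Rules can be sketched as below: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the within-imputation variance plus the between-imputation variance inflated by (1 + 1/m). The function name `pool_rubin` and the example numbers are hypothetical; the sketch covers only the scalar case, not the multi-parameter tests for categorical covariates that the record goes on to discuss.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets (m >= 2).

    estimates: the parameter estimate from each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance, (m - 1) denominator
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log-odds estimates and squared standard errors from m = 3 imputations.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The pooled standard error is the square root of `total`.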
Estimating the accuracy of geographical imputation

Directory of Open Access Journals (Sweden)

Boscoe Francis P

2008-01-01

Full Text Available Abstract Background To reduce the number of non-geocoded cases researchers and organizations sometimes include cases geocoded to postal code centroids along with cases geocoded with the greater precision of a full street address. Some analysts then use the postal code to assign information to the cases from finer-level geographies such as a census tract. Assignment is commonly completed using either a postal centroid or by a geographical imputation method which assigns a location by using both the demographic characteristics of the case and the population characteristics of the postal delivery area. To date no systematic evaluation of geographical imputation methods ("geo-imputation") has been completed. The objective of this study was to determine the accuracy of census tract assignment using geo-imputation. Methods Using a large dataset of breast, prostate and colorectal cancer cases reported to the New Jersey Cancer Registry, we determined how often cases were assigned to the correct census tract using alternate strategies of demographic based geo-imputation, and using assignments obtained from postal code centroids. Assignment accuracy was measured by comparing the tract assigned with the tract originally identified from the full street address. Results Assigning cases to census tracts using the race/ethnicity population distribution within a postal code resulted in more correctly assigned cases than when using postal code centroids. The addition of age characteristics increased the match rates even further. Match rates were highly dependent on both the geographic distribution of race/ethnicity groups and population density. Conclusion Geo-imputation appears to offer some advantages and no serious drawbacks as compared with the alternative of assigning cases to census tracts based on postal code centroids. For a specific analysis, researchers will still need to consider the potential impact of geocoding quality on their results and evaluate
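Demographic geo-imputation of the kind described above could be sketched as a weighted random draw: among the census tracts that overlap the case's postal code, pick one with probability proportional to the tract's population in the case's race/ethnicity group. The weighted draw, the fallback to total population, and the names `geo_impute_tract` and `tract_populations` are assumptions made for illustration; the abstract does not specify the assignment rule.

```python
import random


def geo_impute_tract(case_group, tract_populations, rng):
    """Assign a case to one census tract within its postal code.

    tract_populations: {tract_id: {group: population count}} for the tracts
    overlapping the postal code. Raises ValueError if every tract is empty.
    """
    tracts = sorted(tract_populations)
    weights = [tract_populations[t].get(case_group, 0) for t in tracts]
    if sum(weights) == 0:
        # Assumed fallback: the case's group is absent from every tract,
        # so weight tracts by their total population instead.
        weights = [sum(tract_populations[t].values()) for t in tracts]
    return rng.choices(tracts, weights=weights, k=1)[0]


# Hypothetical postal code covering two tracts with different compositions.
populations = {"tract_A": {"group_1": 100, "group_2": 0},
               "tract_B": {"group_1": 0, "group_2": 50}}
rng = random.Random(0)
assigned = geo_impute_tract("group_2", populations, rng)
```

Adding age, as the record reports, would amount to keying the population counts on (race/ethnicity, age band) pairs rather than on race/ethnicity alone.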
Evaluating Imputation Algorithms for Low-Depth Genotyping-By-Sequencing (GBS) Data.

Directory of Open Access Journals (Sweden)

Ariel W Chan

Full Text Available Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordable, datasets derived from HTS methods suffer from sequencing error, alignment errors, and missing data, all of which introduce noise and uncertainty to variant discovery and genotype calling. Under such circumstances, meaningful analysis of the data is difficult. Our primary interest lies in the issue of how one can accurately infer or impute missing genotypes in HTS-derived datasets. Many of the existing genotype imputation algorithms and software packages were primarily developed by and optimized for the human genetics community, a field where a complete and accurate reference genome has been constructed and SNP arrays have, in large part, been the common genotyping platform. We set out to answer two questions: 1) can we use existing imputation methods developed by the human genetics community to impute missing genotypes in datasets derived from non-human species and 2) are these methods, which were developed and optimized to impute ascertained variants, amenable for imputation of missing genotypes at HTS-derived variants? We selected Beagle v.4, a widely used algorithm within the human genetics community with reportedly high accuracy, to serve as our imputation contender. We performed a series of cross-validation experiments, using GBS data collected from the species Manihot esculenta by the Next Generation (NEXTGEN) Cassava Breeding Project. NEXTGEN currently imputes missing genotypes in their datasets using a LASSO-penalized, linear regression method (denoted 'glmnet'). We selected glmnet to serve as a benchmark imputation method for this reason. We obtained estimates of imputation accuracy by masking a subset of observed genotypes, imputing, and
Evaluating Imputation Algorithms for Low-Depth Genotyping-By-Sequencing (GBS) Data.

Science.gov (United States)

Chan, Ariel W; Hamblin, Martha T; Jannink, Jean-Luc

2016-01-01

Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordable, datasets derived from HTS methods suffer from sequencing error, alignment errors, and missing data, all of which introduce noise and uncertainty to variant discovery and genotype calling. Under such circumstances, meaningful analysis of the data is difficult. Our primary interest lies in the issue of how one can accurately infer or impute missing genotypes in HTS-derived datasets. Many of the existing genotype imputation algorithms and software packages were primarily developed by and optimized for the human genetics community, a field where a complete and accurate reference genome has been constructed and SNP arrays have, in large part, been the common genotyping platform. We set out to answer two questions: 1) can we use existing imputation methods developed by the human genetics community to impute missing genotypes in datasets derived from non-human species and 2) are these methods, which were developed and optimized to impute ascertained variants, amenable for imputation of missing genotypes at HTS-derived variants? We selected Beagle v.4, a widely used algorithm within the human genetics community with reportedly high accuracy, to serve as our imputation contender. We performed a series of cross-validation experiments, using GBS data collected from the species Manihot esculenta by the Next Generation (NEXTGEN) Cassava Breeding Project. NEXTGEN currently imputes missing genotypes in their datasets using a LASSO-penalized, linear regression method (denoted 'glmnet'). We selected glmnet to serve as a benchmark imputation method for this reason. We obtained estimates of imputation accuracy by masking a subset of observed genotypes, imputing, and calculating the

Bootstrap inference when using multiple imputation.

Science.gov (United States)

Schomaker, Michael; Heumann, Christian

2018-04-16

Many modern estimators require bootstrapping to calculate confidence intervals because either no analytic standard error is available or the distribution of the parameter of interest is nonsymmetric. It remains however unclear how to obtain valid bootstrap inference when dealing with multiple imputation to address missing data. We present 4 methods that are intuitively appealing, easy to implement, and combine bootstrap estimation with multiple imputation. We show that 3 of the 4 approaches yield valid inference, but that the performance of the methods varies with respect to the number of imputed data sets and the extent of missingness. Simulation studies reveal the behavior of our approaches in finite samples. A topical analysis from HIV treatment research, which determines the optimal timing of antiretroviral treatment initiation in young children, demonstrates the practical implications of the 4 methods in a sophisticated and realistic setting. This analysis suffers from missing data and uses the g-formula for inference, a method for which no standard errors are available. Copyright © 2018 John Wiley & Sons, Ltd.
Which DTW Method Applied to Marine Univariate Time Series Imputation

OpenAIRE

Phan, Thi-Thu-Hong; Caillault, Émilie; Lefebvre, Alain; Bigand, André

2017-01-01

Missing data are ubiquitous in many domains of applied science. Processing datasets containing missing values can lead to a loss of efficiency and unreliable results, especially for large missing sub-sequence(s). Therefore, the aim of this paper is to build a framework for filling missing values in univariate time series and to perform a comparison of different similarity metrics used for the imputation task. This makes it possible to suggest the most suitable methods for the imp...
Missing value imputation for microarray gene expression data using histone acetylation information

Directory of Open Access Journals (Sweden)

Feng Jihua

2008-05-01

Background: Accurately estimating missing values in microarray data is an important pre-processing step, because complete datasets are required in numerous expression profile analyses in bioinformatics. Although several methods have been suggested, their performance is not satisfactory for datasets with high missing percentages. Results: The paper explores the feasibility of missing value imputation with the help of gene regulatory mechanisms. An imputation framework called the histone acetylation information aided imputation method (HAIimpute method) is presented. It incorporates histone acetylation information into the conventional KNN (k-nearest neighbor) and LLS (local least squares) imputation algorithms for final prediction of the missing values. The experimental results indicated that the use of acetylation information can provide significant improvements in microarray imputation accuracy. The HAIimpute methods consistently improve on widely used methods such as KNN and LLS in terms of normalized root mean squared error (NRMSE). Meanwhile, the genes imputed by HAIimpute methods are more correlated with the original complete genes in terms of Pearson correlation coefficients. Furthermore, the proposed methods also outperform GOimpute, one of the existing related methods that use functional similarity as the external information. Conclusion: We demonstrated that the use of histone acetylation information could greatly improve imputation performance, especially at high missing percentages. This idea can be generalized to various imputation methods to improve their performance. Moreover, as more knowledge accumulates on gene regulatory mechanisms beyond histone acetylation, the performance of our approach can be further improved and verified.
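The conventional KNN step that the abstract above builds on can be sketched as follows. This is only a minimal illustration of plain k-nearest-neighbour imputation, not the HAIimpute method itself; the function name `knn_impute`, the toy matrix, and the choice of a mean-squared distance over shared columns are all assumptions made for the example.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows.

    Distance is the root mean squared difference over the columns that
    both rows have observed (an assumption of this sketch).
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row that observed column j.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                completed[i][j] = sum(nearest) / len(nearest)
    return completed

# Hypothetical expression matrix: rows are genes, columns are arrays.
expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.9],
    [5.0, 6.0, 7.0],
]
filled = knn_impute(expression, k=2)
```

Here the missing entry is filled from the two rows with the most similar observed profile, and the distant fourth row is ignored. Methods that add external information would change how neighbours are chosen or weighted, which this sketch does not attempt.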
A Nonparametric, Multiple Imputation-Based Method for the Retrospective Integration of Data Sets

Science.gov (United States)

Carrig, Madeline M.; Manrique-Vallier, Daniel; Ranby, Krista W.; Reiter, Jerome P.; Hoyle, Rick H.

2015-01-01

Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches. PMID:26257437
Traffic speed data imputation method based on tensor completion.

Science.gov (United States)

Ran, Bin; Tan, Huachun; Feng, Jianshuai; Liu, Ying; Wang, Wuhong

2015-01-01

Traffic speed data plays a key role in Intelligent Transportation Systems (ITS); however, missing traffic data would affect the performance of ITS as well as Advanced Traveler Information Systems (ATIS). In this paper, we handle this issue by a novel tensor-based imputation approach. Specifically, tensor pattern is adopted for modeling traffic speed data and then High accurate Low Rank Tensor Completion (HaLRTC), an efficient tensor completion method, is employed to estimate the missing traffic speed data. This proposed method is able to recover missing entries from given entries, which may be noisy, considering severe fluctuation of traffic speed data compared with traffic volume. The proposed method is evaluated on Performance Measurement System (PeMS) database, and the experimental results show the superiority of the proposed approach over state-of-the-art baseline approaches.
A web-based approach to data imputation

KAUST Repository

Li, Zhixu

2013-10-24

In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. Towards this, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of effectively retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, for improving the accuracy and efficiency of WebPut. Moreover, several optimization techniques are also proposed to reduce the cost of estimating the confidence of imputation queries at both the tuple-level and the database-level. Experiments based on several real-world data collections demonstrate not only the effectiveness of WebPut compared to existing approaches, but also the efficiency of our proposed algorithms and optimization techniques. © 2013 Springer Science+Business Media New York.
Candidate gene analysis using imputed genotypes: cell cycle single-nucleotide polymorphisms and ovarian cancer risk

DEFF Research Database (Denmark)

Goode, Ellen L; Fridley, Brooke L; Vierkant, Robert A

2009-01-01

Polymorphisms in genes critical to cell cycle control are outstanding candidates for association with ovarian cancer risk; numerous genes have been interrogated by multiple research groups using differing tagging single-nucleotide polymorphism (SNP) sets. To maximize information gleaned from......, and rs3212891; CDK2 rs2069391, rs2069414, and rs17528736; and CCNE1 rs3218036. These results exemplify the utility of imputation in candidate gene studies and lend evidence to a role of cell cycle genes in ovarian cancer etiology, suggest a reduced set of SNPs to target in additional cases and controls....
Imputation of microsatellite alleles from dense SNP genotypes for parental verification

Directory of Open Access Journals (Sweden)

Matthew McClure

2012-08-01

Microsatellite (MS) markers have long been used for parental verification and are still the international standard despite higher cost, error rate, and turnaround time compared with Single Nucleotide Polymorphism (SNP)-based assays. Despite domestic and international interest from producers and research communities, no viable means currently exist to verify parentage for an individual unless all familial connections were analyzed using the same DNA marker type (MS or SNP). A simple and cost-effective method was devised to impute MS alleles from SNP haplotypes within breeds. For some MS, imputation results may allow inference across breeds. A total of 347 dairy cattle representing 4 dairy breeds (Brown Swiss, Guernsey, Holstein, and Jersey) were used to generate reference haplotypes. This approach has been verified (>98% accurate) for imputing the International Society of Animal Genetics (ISAG) recommended panel of 12 MS for cattle parentage verification across a validation set of 1,307 dairy animals. Implementation of this method will allow producers and breed associations to transition to SNP-based parentage verification utilizing MS genotypes from historical data on parents where SNP genotypes are missing. This approach may be applicable to additional cattle breeds and other species that wish to migrate from MS- to SNP-based parental verification.
Evaluating geographic imputation approaches for zip code level data: an application to a study of pediatric diabetes

Directory of Open Access Journals (Sweden)

Puett Robin C

2009-10-01

Background: There is increasing interest in the study of place effects on health, facilitated in part by geographic information systems. Incomplete or missing address information reduces geocoding success. Several geographic imputation methods have been suggested to overcome this limitation. Accuracy evaluation of these methods can be focused at the level of individuals and at higher group levels (e.g., spatial distribution). Methods: We evaluated the accuracy of eight geo-imputation methods for address allocation from ZIP codes to census tracts at the individual and group level. The spatial apportioning approaches underlying the imputation methods included four fixed (deterministic) and four random (stochastic) allocation methods using land area, total population, population under age 20, and race/ethnicity as weighting factors. Data included more than 2,000 geocoded cases of diabetes mellitus among youth aged 0-19 in four U.S. regions. The imputed distribution of cases across tracts was compared to the true distribution using a chi-squared statistic. Results: At the individual level, population-weighted (total or under age 20) fixed allocation showed the greatest level of accuracy, with correct census tract assignments averaging 30.01% across all regions, followed by the race/ethnicity-weighted random method (23.83%). In the true distribution of cases across census tracts, 58.2% of tracts exhibited no cases, 26.2% had one case, 9.5% had two cases, and less than 3% had three or more. This distribution was best captured by random allocation methods, with no significant differences (p-value > 0.90). However, significant differences in distributions based on fixed allocation methods were found (p-value …). Conclusion: Fixed imputation methods seemed to yield greatest accuracy at the individual level, suggesting use for studies on area-level environmental exposures. Fixed methods result in artificial clusters in single census tracts. For studies
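The contrast between fixed and random allocation in the abstract above can be illustrated with a short sketch. It assumes that "fixed" means always assigning a case to the highest-weighted tract and that "random" means drawing a tract with probability proportional to its weight; the tract names and population figures are hypothetical, and the study's actual procedures may differ in detail.

```python
import random

def fixed_allocation(tract_weights):
    """Deterministic: always pick the tract with the largest weight."""
    return max(tract_weights, key=tract_weights.get)

def random_allocation(tract_weights, rng):
    """Stochastic: draw a tract with probability proportional to its weight."""
    tracts = list(tract_weights)
    weights = [tract_weights[t] for t in tracts]
    return rng.choices(tracts, weights=weights, k=1)[0]

# Hypothetical ZIP code overlapping three census tracts, weighted by population.
zip_tracts = {"tract_A": 6000, "tract_B": 3000, "tract_C": 1000}

rng = random.Random(0)  # seeded so the sketch is reproducible
fixed = [fixed_allocation(zip_tracts) for _ in range(1000)]
randomised = [random_allocation(zip_tracts, rng) for _ in range(1000)]

fixed_counts = {t: fixed.count(t) for t in zip_tracts}
random_counts = {t: randomised.count(t) for t in zip_tracts}
```

Under these assumptions every case in the ZIP code lands in the same tract with the fixed rule, which is consistent with the artificial clustering the abstract reports, while the random rule spreads cases across tracts roughly in proportion to the weights.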
Traffic Speed Data Imputation Method Based on Tensor Completion

Directory of Open Access Journals (Sweden)

Bin Ran

2015-01-01

Traffic speed data plays a key role in Intelligent Transportation Systems (ITS); however, missing traffic data would affect the performance of ITS as well as Advanced Traveler Information Systems (ATIS). In this paper, we handle this issue by a novel tensor-based imputation approach. Specifically, tensor pattern is adopted for modeling traffic speed data and then High accurate Low Rank Tensor Completion (HaLRTC), an efficient tensor completion method, is employed to estimate the missing traffic speed data. This proposed method is able to recover missing entries from given entries, which may be noisy, considering severe fluctuation of traffic speed data compared with traffic volume. The proposed method is evaluated on Performance Measurement System (PeMS) database, and the experimental results show the superiority of the proposed approach over state-of-the-art baseline approaches.
Data imputation analysis for Cosmic Rays time series

Science.gov (United States)

Fernandes, R. C.; Lucio, P. S.; Fernandez, J. H.

2017-05-01

The occurrence of missing data in Galactic Cosmic Rays (GCR) time series is inevitable, since data are lost through mechanical and human failure, technical problems, and differing periods of operation of GCR stations. The aim of this study was to perform multiple dataset imputation in order to depict the observational dataset. The study used the monthly time series of GCR Climax (CLMX) and Roma (ROME) from 1960 to 2004 to simulate scenarios of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90% of missing data compared to the observed ROME series, with 50 replicates. The CLMX station was then used as a proxy for allocation of these scenarios. Three different methods for monthly dataset imputation were selected: Amelia II, which runs the bootstrap Expectation Maximization algorithm; MICE, which runs an algorithm via Multivariate Imputation by Chained Equations; and MTSDI, an Expectation Maximization algorithm-based method for imputation of missing values in multivariate normal time series. The synthetic time series were also compared with the observed ROME series using several skill measures, such as RMSE, NRMSE, Agreement Index, R, R2, F-test and t-test. The results showed that for CLMX and ROME, the R2 and R statistics were equal to 0.98 and 0.96, respectively. It was observed that increases in the number of gaps degrade the quality of the time series. Data imputation was more efficient with the MTSDI method, with negligible errors and the best skill coefficients. The results suggest that, for monthly averages, imputation should be limited to series with no more than about 60% missing data. It is noteworthy that the CLMX, ROME and KIEL stations present no missing data in the target period. This methodology allowed reconstructing 43 time series.
A Comparison of Joint Model and Fully Conditional Specification Imputation for Multilevel Missing Data

Science.gov (United States)

Mistler, Stephen A.; Enders, Craig K.

2017-01-01

Multiple imputation methods can generally be divided into two broad frameworks: joint model (JM) imputation and fully conditional specification (FCS) imputation. JM draws missing values simultaneously for all incomplete variables using a multivariate distribution, whereas FCS imputes variables one at a time from a series of univariate conditional…
Statistical Analysis of a Class: Monte Carlo and Multiple Imputation Spreadsheet Methods for Estimation and Extrapolation

Science.gov (United States)

Fish, Laurel J.; Halcoussis, Dennis; Phillips, G. Michael

2017-01-01

The Monte Carlo method and related multiple imputation methods are traditionally used in math, physics and science to estimate and analyze data and are now becoming standard tools in analyzing business and financial problems. However, few sources explain the application of the Monte Carlo method for individuals and business professionals who are…
Handling missing data for the identification of charged particles in a multilayer detector: A comparison between different imputation methods

Energy Technology Data Exchange (ETDEWEB)

Riggi, S., E-mail: sriggi@oact.inaf.it [INAF - Osservatorio Astrofisico di Catania (Italy); Riggi, D. [Keras Strategy - Milano (Italy); Riggi, F. [Dipartimento di Fisica e Astronomia - Università di Catania (Italy); INFN, Sezione di Catania (Italy)

2015-04-21

Identification of charged particles in a multilayer detector by the energy loss technique may also be achieved by the use of a neural network. The performance of the network becomes worse when a large fraction of information is missing, for instance due to detector inefficiencies. Algorithms which provide a way to impute missing information have been developed over the past years. Among the various approaches, we focused on normal mixture models in comparison with standard mean imputation and multiple imputation methods. Further, to account for the intrinsic asymmetry of the energy loss data, we considered skew-normal mixture models and provided a closed form implementation in the Expectation-Maximization (EM) algorithm framework to handle missing patterns. The method has been applied to a test case where the energy losses of pions, kaons and protons in a six-layer silicon detector are considered as input neurons to a neural network. Results are given in terms of reconstruction efficiency and purity of the various species in different momentum bins.
Using imputed genotype data in the joint score tests for genetic association and gene-environment interactions in case-control studies.

Science.gov (United States)

Song, Minsun; Wheeler, William; Caporaso, Neil E; Landi, Maria Teresa; Chatterjee, Nilanjan

2018-03-01

Genome-wide association studies (GWAS) are now routinely imputed for untyped single nucleotide polymorphisms (SNPs) based on various powerful statistical algorithms for imputation trained on reference datasets. The use of predicted allele counts for imputed SNPs as the dosage variable is known to produce a valid score test for genetic association. In this paper, we investigate how best to handle imputed SNPs in various modern complex tests for genetic associations incorporating gene-environment interactions. We focus on case-control association studies where inference for an underlying logistic regression model can be performed using alternative methods that rely to varying degrees on an assumption of gene-environment independence in the underlying population. As increasingly large-scale GWAS are being performed through consortium efforts where it is preferable to share only summary-level information across studies, we also describe simple mechanisms for implementing score tests based on standard meta-analysis of "one-step" maximum-likelihood estimates across studies. Applications of the methods in simulation studies and a dataset from a GWAS of lung cancer illustrate the ability of the proposed methods to maintain type-I error rates for the underlying testing procedures. For analysis of imputed SNPs, similar to typed SNPs, the retrospective methods can lead to considerable efficiency gain for modeling of gene-environment interactions under the assumption of gene-environment independence. Methods are made available for public use through the CGEN R software package. © 2017 WILEY PERIODICALS, INC.
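The "predicted allele count" used as the dosage variable in the abstract above is the expected number of copies of the counted allele under the imputed genotype probabilities. A minimal sketch follows; the function name and the example probabilities are hypothetical, and it does not reproduce the CGEN package.

```python
def expected_allele_count(p_ref_hom, p_het, p_alt_hom):
    """Dosage = 0 * P(0 copies) + 1 * P(1 copy) + 2 * P(2 copies)."""
    total = p_ref_hom + p_het + p_alt_hom
    if abs(total - 1.0) > 1e-6:
        raise ValueError("genotype probabilities must sum to 1")
    return p_het + 2.0 * p_alt_hom

# Hypothetical imputed genotype probabilities for one individual at one SNP.
dosage = expected_allele_count(0.10, 0.60, 0.30)
```

The resulting value lies between 0 and 2 and replaces the integer genotype code in the regression model, so uncertainty in the imputed genotype is carried into the analysis as a fractional count.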
Time Series Imputation via L1 Norm-Based Singular Spectrum Analysis

Science.gov (United States)

Kalantari, Mahdi; Yarmohammadi, Masoud; Hassani, Hossein; Silva, Emmanuel Sirimal

Missing values in time series data are a well-known and important problem which many researchers have studied extensively in various fields. In this paper, a new nonparametric approach for missing value imputation in time series is proposed. The main novelty of this research is applying the L1 norm-based version of Singular Spectrum Analysis (SSA), namely L1-SSA, which is robust against outliers. The performance of the new imputation method has been compared with many other established methods. The comparison is done by applying them to various real and simulated time series. The obtained results confirm that the SSA-based methods, especially L1-SSA, can provide better imputation than the other methods.
Synthetic Multiple-Imputation Procedure for Multistage Complex Samples

Directory of Open Access Journals (Sweden)

Zhou Hanzhi

2016-03-01

Multiple imputation (MI) is commonly used when item-level missing data are present. However, MI requires that survey design information be built into the imputation models. For multistage stratified clustered designs, this requires dummy variables to represent strata as well as primary sampling units (PSUs) nested within each stratum in the imputation model. Such a modeling strategy is not only operationally burdensome but also inferentially inefficient when there are many strata in the sample design. Complexity only increases when sampling weights need to be modeled. This article develops a general-purpose analytic strategy for population inference from complex sample designs with item-level missingness. In a simulation study, the proposed procedures demonstrate efficient estimation and good coverage properties. We also consider an application to accommodate missing body mass index (BMI) data in the analysis of BMI percentiles using National Health and Nutrition Examination Survey (NHANES III) data. We argue that the proposed methods offer an easy-to-implement solution to problems that are not well-handled by current MI techniques. Note that, while the proposed method borrows from the MI framework to develop its inferential methods, it is not designed as an alternative strategy to release multiply imputed datasets for complex sample design data, but rather as an analytic strategy in and of itself.
ParaHaplo 3.0: A program package for imputation and a haplotype-based whole-genome association study using hybrid parallel computing

Directory of Open Access Journals (Sweden)

Kamatani Naoyuki

2011-05-01

Background: Missing genotype imputation and haplotype reconstruction are valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required. Results: We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We compared the performance of ParaHaplo 3.0 on the Japanese in Tokyo, Japan, and Han Chinese in Beijing, China, populations in the HapMap dataset. A parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo. Conclusions: ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. Faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.
Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets

Directory of Open Access Journals (Sweden)

Min-Wei Huang

2018-01-01

Many real-world medical datasets contain some proportion of missing (attribute) values. In general, missing value imputation can be performed to solve this problem by providing estimates for the missing values through a reasoning process based on the (complete) observed data. However, if the observed data contain noisy information or outliers, the estimates of the missing values may not be reliable or may even be quite different from the real values. The aim of this paper is to examine whether a combination of instance selection from the observed data and missing value imputation offers better performance than performing missing value imputation alone. In particular, three instance selection algorithms, DROP3, GA, and IB3, and three imputation algorithms, KNNI, MLP, and SVM, are used in order to find the best combination. The experimental results show that performing instance selection can have a positive impact on missing value imputation for numerical medical datasets, and specific combinations of instance selection and imputation methods can improve the imputation results for mixed-type medical datasets. However, instance selection does not have a definitely positive impact on the imputation result for categorical medical datasets.
A nonparametric multiple imputation approach for missing categorical data

Directory of Open Access Journals (Sweden)

Muhan Zhou

2017-06-01

Background: Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities. Methods: We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each category. The donor set for imputation is formed by measuring distances between each missing value and the non-missing values. The distance function is calculated based on a predictive score, which is derived from two working models: one fits a multinomial logistic regression for predicting the missing categorical outcome (the outcome model) and the other fits a logistic regression for predicting missingness probabilities (the missingness model). A weighting scheme is used to accommodate contributions from the two working models when generating the predictive score. A missing value is imputed by randomly selecting one of the non-missing values with the smallest distances. We conduct a simulation to evaluate the performance of the proposed method and compare it with several alternative methods. A real-data application is also presented. Results: The simulation study suggests that the proposed method performs well when missingness probabilities are not extreme under some misspecifications of the working models. However, the calibration estimator, which is also based on two working models, can be highly unstable when missingness probabilities for some observations are extremely high. In this scenario, the proposed method produces more stable and better estimates. In addition, proper weights need to be chosen to balance the contributions from the two working models and achieve optimal results for the proposed method. Conclusions: We conclude that the proposed multiple imputation method is a reasonable approach to dealing with missing categorical outcome data with

3D-MICE: integration of cross-sectional and longitudinal imputation for multi-analyte longitudinal clinical data.

Science.gov (United States)

Luo, Yuan; Szolovits, Peter; Dighe, Anand S; Baron, Jason M

2018-06-01

A key challenge in clinical data mining is that most clinical datasets contain missing data. Since many commonly used machine learning algorithms require complete datasets (no missing data), clinical analytic approaches often entail an imputation procedure to "fill in" missing data. However, although most clinical datasets contain a temporal component, most commonly used imputation methods do not adequately accommodate longitudinal time-based data. We sought to develop a new imputation algorithm, 3-dimensional multiple imputation with chained equations (3D-MICE), that can perform accurate imputation of missing clinical time series data. We extracted clinical laboratory test results for 13 commonly measured analytes (clinical laboratory tests). We imputed missing test results for the 13 analytes using 3 imputation methods: multiple imputation with chained equations (MICE), Gaussian process (GP), and 3D-MICE. 3D-MICE utilizes both MICE and GP imputation to integrate cross-sectional and longitudinal information. To evaluate imputation method performance, we randomly masked selected test results and imputed these masked results alongside results missing from our original data. We compared predicted results to measured results for masked data points. 3D-MICE performed significantly better than MICE and GP-based imputation in a composite of all 13 analytes, predicting missing results with a normalized root-mean-square error of 0.342, compared to 0.373 for MICE alone and 0.358 for GP alone. 3D-MICE offers a novel and practical approach to imputing clinical laboratory time series data. 3D-MICE may provide an additional tool for use as a foundation in clinical predictive analytics and intelligent clinical decision support.
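The evaluation design described above (randomly mask measured results, impute them, then compare predictions with the held-back values) can be sketched as follows. Linear interpolation stands in for the imputer purely for illustration and is not 3D-MICE, MICE, or GP; the series values are hypothetical, and normalising the RMSE by the standard deviation of the true values is an assumption, since the abstract does not state which normalisation was used.

```python
import math
import random

def interpolate_gaps(series):
    """Stand-in imputer: linear interpolation between observed neighbours,
    carrying the nearest observed value forward or backward at the edges."""
    observed = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled

def nrmse(truth, predicted):
    """RMSE divided by the standard deviation of the true values (assumed)."""
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, predicted)) / len(truth))
    mean = sum(truth) / len(truth)
    sd = math.sqrt(sum((t - mean) ** 2 for t in truth) / len(truth))
    return rmse / sd

# Hypothetical repeated measurements of one analyte for one patient.
measured = [4.0, 4.2, 4.1, 4.6, 5.0, 4.8, 4.4, 4.3, 4.9, 5.1]

rng = random.Random(1)  # seeded so the masking is reproducible
masked_idx = sorted(rng.sample(range(len(measured)), 3))
masked = [None if i in masked_idx else v for i, v in enumerate(measured)]

imputed = interpolate_gaps(masked)
score = nrmse([measured[i] for i in masked_idx],
              [imputed[i] for i in masked_idx])
```

Only the masked positions enter the score, so the error reflects how well the imputer recovers values it never saw. Swapping in a different imputer while keeping the mask fixed gives a like-for-like comparison between methods.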
Multiple Improvements of Multiple Imputation Likelihood Ratio Tests

OpenAIRE

Chan, Kin Wai; Meng, Xiao-Li

2017-01-01

Multiple imputation (MI) inference handles missing data by first properly imputing the missing values $m$ times, and then combining the $m$ analysis results from applying a complete-data procedure to each of the completed datasets. However, the existing method for combining likelihood ratio tests has multiple defects: (i) the combined test statistic can be negative in practice when the reference null distribution is a standard $F$ distribution; (ii) it is not invariant to re-parametrization; ...
Imputation of missing data in time series for air pollutants

Science.gov (United States)

Junger, W. L.; Ponce de Leon, A.

2015-02-01

Missing data are a major concern in epidemiological studies of the health effects of environmental air pollutants. This article presents an imputation-based method that is suitable for multivariate time series data, which uses the EM algorithm under the assumption of normal distribution. Different approaches are considered for filtering the temporal component. A simulation study was performed to assess the validity and performance of the proposed method in comparison with some frequently used methods. Simulations showed that when the amount of missing data was as low as 5%, the complete data analysis yielded satisfactory results regardless of the generating mechanism of the missing data, whereas the validity began to degenerate when the proportion of missing values exceeded 10%. The proposed imputation method exhibited good accuracy and precision in different settings with respect to the patterns of missing observations. Most of the imputations obtained valid results, even under missing not at random. The methods proposed in this study are implemented as a package called mtsdi for the statistical software system R.
Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel.

Science.gov (United States)

Mitt, Mario; Kals, Mart; Pärn, Kalle; Gabriel, Stacey B; Lander, Eric S; Palotie, Aarno; Ripatti, Samuli; Morris, Andrew P; Metspalu, Andres; Esko, Tõnu; Mägi, Reedik; Palta, Priit

2017-06-01

Genetic imputation is a cost-efficient way to improve the power and resolution of genome-wide association (GWA) studies. Current publicly accessible imputation reference panels accurately predict genotypes for common variants with minor allele frequency (MAF)≥5% and low-frequency variants (0.5≤MAF<5%) across diverse populations, but the imputation of rare variation (MAF<0.5%) is still rather limited. In the current study, we compare the imputation accuracy achieved with reference panels from diverse populations against that of a population-specific high-coverage (30×) whole-genome sequencing (WGS) based reference panel comprising 2244 Estonian individuals (0.25% of adult Estonians). Although the Estonian-specific panel contains fewer haplotypes and variants, the imputation confidence and accuracy of imputed low-frequency and rare variants were significantly higher. The results indicate the utility of population-specific reference panels for human genetic studies.
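The three frequency classes named in the abstract above can be written as a small helper, which may be useful when stratifying imputation accuracy by variant frequency. The thresholds are taken from the abstract; the function name and the folding of the alternate allele frequency into a minor allele frequency are assumptions of this sketch.

```python
def maf_class(alt_allele_freq):
    """Classify a variant by minor allele frequency (MAF).

    common:        MAF >= 5%
    low-frequency: 0.5% <= MAF < 5%
    rare:          MAF < 0.5%
    """
    if not 0.0 <= alt_allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    # The minor allele is whichever allele is less frequent.
    maf = min(alt_allele_freq, 1.0 - alt_allele_freq)
    if maf >= 0.05:
        return "common"
    if maf >= 0.005:
        return "low-frequency"
    return "rare"

classes = [maf_class(f) for f in (0.20, 0.01, 0.001)]
```

Note that a monomorphic site (frequency 0 or 1) falls into the "rare" class here; a real pipeline would probably filter such sites out before stratifying.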
Data driven estimation of imputation error-a strategy for imputation with a reject option

DEFF Research Database (Denmark)

Bak, Nikolaj; Hansen, Lars Kai

2016-01-01

Missing data is a common problem in many research fields and a challenge that always needs careful consideration. One approach is to impute the missing values, i.e., replace missing values with estimates. When imputation is applied, it is typically applied to all records with missing values i...
Clustering with Missing Values: No Imputation Required

Science.gov (United States)

Wagstaff, Kiri

2004-01-01

Clustering algorithms can identify groups in large data sets, such as star catalogs and hyperspectral images. In general, clustering methods cannot analyze items that have missing data values. Common solutions either fill in the missing values (imputation) or ignore the missing data (marginalization). Imputed values are treated as just as reliable as the truly observed data, but they are only as good as the assumptions used to create them. In contrast, we present a method for encoding partially observed features as a set of supplemental soft constraints and introduce the KSC algorithm, which incorporates constraints into the clustering process. In experiments on artificial data and data from the Sloan Digital Sky Survey, we show that soft constraints are an effective way to enable clustering with missing values.
Comparison of simple and multiple imputation methods using a risk model for surgical mortality as example (Comparação de métodos de imputação única e múltipla usando como exemplo um modelo de risco para mortalidade cirúrgica)

Directory of Open Access Journals (Sweden)

Luciana Neves Nunes

2010-12-01

... sample size was 450 patients. Two single imputation methods and one multiple imputation method were applied, under the MAR (missing at random) assumption. RESULTS: The variable with missing data was serum albumin, with a missing rate of 27.1%. The logistic models fitted after single imputation were similar to each other, but differed from the models obtained by multiple imputation in the variables they included. CONCLUSIONS: The results indicate that it is important to take into account the relationship of albumin to the other observed variables, because different models were obtained with single and multiple imputation. Single imputation underestimates the variability, generating narrower confidence intervals. It is important to consider the use of imputation methods when there is missing data, especially multiple imputation, which takes into account the variability between imputations in the model estimates.
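The point that single imputation underestimates variability can be illustrated with a minimal sketch. The albumin readings below are hypothetical and are not the study's data, and mean imputation stands in for whichever single imputation methods the authors used. Filling every gap with the observed mean leaves the sum of squared deviations unchanged while increasing the sample size, so the standard deviation shrinks.

```python
import statistics


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical serum albumin readings (g/dL); None marks a missing value.
albumin = [3.1, None, 4.2, 3.8, None, 2.9, 4.5, None, 3.6, 4.0]

observed = [v for v in albumin if v is not None]
completed = mean_impute(albumin)

sd_observed = statistics.stdev(observed)
sd_completed = statistics.stdev(completed)

print(f"SD of observed values:  {sd_observed:.3f}")
print(f"SD after mean imputing: {sd_completed:.3f}")
```

The second figure is always smaller than the first when at least one value is imputed and the observed values are not all equal, which is one reason confidence intervals computed from a singly imputed data set tend to be too narrow.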
Missing value imputation: with application to handwriting data

Science.gov (United States)

Xu, Zhen; Srihari, Sargur N.

2015-01-01

Missing values make pattern analysis difficult, particularly with limited available data. In longitudinal research, missing values accumulate, thereby aggravating the problem. Here we consider how to deal with temporal data with missing values in handwriting analysis. In the task of studying development of individuality of handwriting, we encountered the fact that feature values are missing for several individuals at several time instances. Six algorithms, i.e., random imputation, mean imputation, most likely independent value imputation, and three methods based on Bayesian network (static Bayesian network, parameter EM, and structural EM), are compared with children's handwriting data. We evaluate the accuracy and robustness of the algorithms under different ratios of missing data and missing values, and useful conclusions are given. Specifically, static Bayesian network is used for our data which contain around 5% missing data to provide adequate accuracy and low computational cost.
UniFIeD Univariate Frequency-based Imputation for Time Series Data

OpenAIRE

Friese, Martina; Stork, Jörg; Ramos Guerra, Ricardo; Bartz-Beielstein, Thomas; Thaker, Soham; Flasch, Oliver; Zaefferer, Martin

2013-01-01

This paper introduces UniFIeD, a new data preprocessing method for time series. UniFIeD can cope with large intervals of missing data. A scalable test function generator, which allows the simulation of time series with different gap sizes, is presented additionally. An experimental study demonstrates that (i) UniFIeD shows a significantly better performance than simple imputation methods and (ii) UniFIeD is able to handle situations where advanced imputation methods fail. The results are indep...
Learning-Based Adaptive Imputation Method with kNN Algorithm for Missing Power Data

Directory of Open Access Journals (Sweden)

Minkyung Kim

2017-10-01

This paper proposes a learning-based adaptive imputation method (LAI) for imputing missing power data in an energy system. This method estimates the missing power data by using the pattern that appears in the collected data. Here, in order to capture the patterns from past power data, we newly model a feature vector by using past data and its variations. The proposed LAI then learns the optimal length of the feature vector and the optimal historical length, which are significant hyperparameters of the proposed method, by utilizing intentional missing data. Based on a weighted distance between feature vectors representing a missing situation and past situations, missing power data are estimated by referring to the k most similar past situations in the optimal historical length. We further extend the proposed LAI to alleviate the effect of unexpected variation in power data and refer to this new approach as the extended LAI method (eLAI). The eLAI selects a method between linear interpolation (LI) and the proposed LAI to improve accuracy under unexpected variations. Finally, from a simulation under various energy consumption profiles, we verify that the proposed eLAI achieves about a 74% reduction of the average imputation error in an energy system, compared to the existing imputation methods.
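The nearest-neighbour step described in this abstract can be sketched roughly as follows. This is not the authors' LAI implementation: the function names, the uniform weights, the fixed feature length and the plain averaging of the k neighbours are all assumptions made for illustration, and the learning of the hyperparameters is left out entirely.

```python
import math


def feature(series, t, length):
    """Feature vector for time t: the `length` values before t plus their first differences."""
    window = series[t - length:t]
    diffs = [b - a for a, b in zip(window, window[1:])]
    return window + diffs


def weighted_distance(u, v, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))


def knn_impute(series, t, length=3, k=2):
    """Estimate series[t] from the k past situations most similar to the one preceding t.

    Assumes every value before index t is observed.
    """
    target = feature(series, t, length)
    weights = [1.0] * len(target)  # uniform weights: an assumption, not the paper's choice
    candidates = []
    for s in range(length, t):
        distance = weighted_distance(feature(series, s, length), target, weights)
        candidates.append((distance, series[s]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical periodic load profile with the last reading missing.
load = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, None]
estimate = knn_impute(load, t=11)
print(estimate)
```

In this toy profile the two most similar past situations are the two earlier occurrences of the pattern 1, 2, 3, both of which were followed by 4, so the estimate is their average.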
Fully conditional specification in multivariate imputation

NARCIS (Netherlands)

van Buuren, S.; Brand, J. P.L.; Groothuis-Oudshoorn, C. G.M.; Rubin, D. B.

2006-01-01

The use of the Gibbs sampler with fully conditionally specified models, where the distribution of each variable given the other variables is the starting point, has become a popular method to create imputations in incomplete multivariate data. The theoretical weakness of this approach is that the
Estimating cavity tree and snag abundance using negative binomial regression models and nearest neighbor imputation methods

Science.gov (United States)

Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett

2009-01-01

Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
A New Missing Data Imputation Algorithm Applied to Electrical Data Loggers

Directory of Open Access Journals (Sweden)

Concepción Crespo Turrado

2015-12-01

Nowadays, data collection is a key process in the study of electrical power networks when searching for harmonics and a lack of balance among phases. In this context, the lack of data of any of the main electrical variables (phase-to-neutral voltage, phase-to-phase voltage, and current in each phase and power factor) adversely affects any time series study performed. When this occurs, a data imputation process must be accomplished in order to substitute the data that is missing for estimated values. This paper presents a novel missing data imputation method based on multivariate adaptive regression splines (MARS) and compares it with the well-known technique called multivariate imputation by chained equations (MICE). The results obtained demonstrate how the proposed method outperforms the MICE algorithm.
PRIMAL: Fast and accurate pedigree-based imputation from sequence data in a founder population.

Directory of Open Access Journals (Sweden)

Oren E Livne

2015-03-01

Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.
Flexible Modeling of Survival Data with Covariates Subject to Detection Limits via Multiple Imputation.

Science.gov (United States)

Bernhardt, Paul W; Wang, Huixia Judy; Zhang, Daowen

2014-01-01

Models for survival data generally assume that covariates are fully observed. However, in medical studies it is not uncommon for biomarkers to be censored at known detection limits. A computationally-efficient multiple imputation procedure for modeling survival data with covariates subject to detection limits is proposed. This procedure is developed in the context of an accelerated failure time model with a flexible seminonparametric error distribution. The consistency and asymptotic normality of the multiple imputation estimator are established and a consistent variance estimator is provided. An iterative version of the proposed multiple imputation algorithm that approximates the EM algorithm for maximum likelihood is also suggested. Simulation studies demonstrate that the proposed multiple imputation methods work well while alternative methods lead to estimates that are either biased or more variable. The proposed methods are applied to analyze the dataset from a recently-conducted GenIMS study.
A suggested approach for imputation of missing dietary data for young children in daycare.

Science.gov (United States)

Stevens, June; Ou, Fang-Shu; Truesdale, Kimberly P; Zeng, Donglin; Vaughn, Amber E; Pratt, Charlotte; Ward, Dianne S

2015-01-01

Parent-reported 24-h diet recalls are an accepted method of estimating intake in young children. However, many children eat while at childcare making accurate proxy reports by parents difficult. The goal of this study was to demonstrate a method to impute missing weekday lunch and daytime snack nutrient data for daycare children and to explore the concurrent predictive and criterion validity of the method. Data were from children aged 2-5 years in the My Parenting SOS project (n=308; 870 24-h diet recalls). Mixed models were used to simultaneously predict breakfast, dinner, and evening snacks (B+D+ES); lunch; and daytime snacks for all children after adjusting for age, sex, and body mass index (BMI). From these models, we imputed the missing weekday daycare lunches by interpolation using the mean lunch to B+D+ES [L/(B+D+ES)] ratio among non-daycare children on weekdays and the L/(B+D+ES) ratio for all children on weekends. Daytime snack data were used to impute snacks. The reported mean (± standard deviation) weekday intake was lower for daycare children [725 (±324) kcal] compared to non-daycare children [1,048 (±463) kcal]. Weekend intake for all children was 1,173 (±427) kcal. After imputation, weekday caloric intake for daycare children was 1,230 (±409) kcal. Daily intakes that included imputed data were associated with age and sex but not with BMI. This work indicates that imputation is a promising method for improving the precision of daily nutrient data from young children.
Using imputation to provide location information for nongeocoded addresses.

Directory of Open Access Journals (Sweden)

Frank C Curriero

2010-02-01

The importance of geography as a source of variation in health research continues to receive sustained attention in the literature. The inclusion of geographic information in such research often begins by adding data to a map, which presupposes some knowledge of location. A precise level of spatial information is conventionally achieved through geocoding, the geographic information system (GIS) process of translating mailing address information to coordinates on a map. The geocoding process is not without its limitations, though, since there is always a percentage of addresses which cannot be converted successfully (nongeocodable). This raises concerns regarding bias, since traditionally the practice has been to exclude nongeocoded data records from analysis. In this manuscript we develop and evaluate a set of imputation strategies for dealing with missing spatial information from nongeocoded addresses. The strategies are developed assuming a known zip code with increasing use of collateral information, namely the spatial distribution of the population at risk. Strategies are evaluated using prostate cancer data obtained from the Maryland Cancer Registry. We consider total case enumerations at the Census county, tract, and block group level as the outcome of interest when applying and evaluating the methods. Multiple imputation is used to provide estimated total case counts based on complete data (geocodes plus imputed nongeocodes) with a measure of uncertainty. Results indicate that the imputation strategy based on using available population-based age, gender, and race information performed the best overall at the county, tract, and block group levels. The procedure allows for the potentially biased and likely underreported outcome, case enumerations based on only the geocoded records, to be presented with a statistically adjusted count (imputed count) with a measure of uncertainty that are based on all the case data, the geocodes and imputed
Effects of Different Missing Data Imputation Techniques on the Performance of Undiagnosed Diabetes Risk Prediction Models in a Mixed-Ancestry Population of South Africa.

Directory of Open Access Journals (Sweden)

Katya L Masconi

Imputation techniques used to handle missing data are based on the principle of replacement. It is widely advocated that multiple imputation is superior to other imputation methods; however, studies have suggested that simple methods for filling missing data can be just as accurate as complex methods. The objective of this study was to implement a number of simple and more complex imputation methods, and assess the effect of these techniques on the performance of undiagnosed diabetes risk prediction models during external validation. Data from the Cape Town Bellville-South cohort served as the basis for this study. Imputation methods and models were identified via recent systematic reviews. Models' discrimination was assessed and compared using C-statistic and non-parametric methods, before and after recalibration through simple intercept adjustment. The study sample consisted of 1256 individuals, of whom 173 were excluded due to previously diagnosed diabetes. Of the final 1083 individuals, 329 (30.4%) had missing data. Family history had the highest proportion of missing data (25%). Imputation of the outcome, undiagnosed diabetes, was highest in stochastic regression imputation (163 individuals). Overall, deletion resulted in the lowest model performances while simple imputation yielded the highest C-statistic for the Cambridge Diabetes Risk model, Kuwaiti Risk model, Omani Diabetes Risk model and Rotterdam Predictive model. Multiple imputation only yielded the highest C-statistic for the Rotterdam Predictive model, which was matched by simpler imputation methods. Deletion was confirmed as a poor technique for handling missing data. However, despite the emphasized disadvantages of simpler imputation methods, this study showed that implementing these methods results in similar predictive utility for undiagnosed diabetes when compared to multiple imputation.
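Since the comparison above rests on the C-statistic, a small sketch of how that quantity is computed may help. The function below uses the standard pairwise definition (the share of case and non-case pairs in which the case receives the higher predicted risk, with ties counted as one half). The risk scores and outcomes are made up for illustration and have nothing to do with the Bellville-South data.

```python
def c_statistic(risks, outcomes):
    """Pairwise C-statistic: probability that a case is ranked above a non-case.

    `risks` are predicted probabilities, `outcomes` are 1 (case) or 0 (non-case).
    Ties in predicted risk count as half a concordant pair.
    """
    cases = [r for r, o in zip(risks, outcomes) if o == 1]
    controls = [r for r, o in zip(risks, outcomes) if o == 0]
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes for six individuals.
risks = [0.9, 0.8, 0.3, 0.6, 0.2, 0.6]
outcomes = [1, 1, 0, 1, 0, 0]

print(f"C-statistic: {c_statistic(risks, outcomes):.3f}")
```

Computing this value on data sets completed by different imputation methods, with the prediction model held fixed, is one way to set up the kind of comparison the abstract reports.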
A suggested approach for imputation of missing dietary data for young children in daycare

Directory of Open Access Journals (Sweden)

June Stevens

2015-12-01

Background: Parent-reported 24-h diet recalls are an accepted method of estimating intake in young children. However, many children eat while at childcare making accurate proxy reports by parents difficult. Objective: The goal of this study was to demonstrate a method to impute missing weekday lunch and daytime snack nutrient data for daycare children and to explore the concurrent predictive and criterion validity of the method. Design: Data were from children aged 2-5 years in the My Parenting SOS project (n=308; 870 24-h diet recalls). Mixed models were used to simultaneously predict breakfast, dinner, and evening snacks (B+D+ES); lunch; and daytime snacks for all children after adjusting for age, sex, and body mass index (BMI). From these models, we imputed the missing weekday daycare lunches by interpolation using the mean lunch to B+D+ES [L/(B+D+ES)] ratio among non-daycare children on weekdays and the L/(B+D+ES) ratio for all children on weekends. Daytime snack data were used to impute snacks. Results: The reported mean (± standard deviation) weekday intake was lower for daycare children [725 (±324) kcal] compared to non-daycare children [1,048 (±463) kcal]. Weekend intake for all children was 1,173 (±427) kcal. After imputation, weekday caloric intake for daycare children was 1,230 (±409) kcal. Daily intakes that included imputed data were associated with age and sex but not with BMI. Conclusion: This work indicates that imputation is a promising method for improving the precision of daily nutrient data from young children.
Quick, “Imputation-free” meta-analysis with proxy-SNPs

Directory of Open Access Journals (Sweden)

Meesters Christian

2012-09-01

Abstract Background Meta-analysis (MA) is widely used to pool genome-wide association studies (GWASes) in order to (a) increase the power to detect strong or weak genotype effects or (b) as a result verification method. As a consequence of differing SNP panels among genotyping chips, imputation is the method of choice within GWAS consortia to avoid losing too many SNPs in a MA. YAMAS (Yet Another Meta Analysis Software), however, enables cross-GWAS conclusions prior to finished and polished imputation runs, which can be time-consuming. Results Here we present a fast method to avoid forfeiting SNPs present in only a subset of studies, without relying on imputation. This is accomplished by using reference linkage disequilibrium data from 1,000 Genomes/HapMap projects to find proxy-SNPs together with in-phase alleles for SNPs missing in at least one study. MA is conducted by combining association effect estimates of a SNP and those of its proxy-SNPs. Our algorithm is implemented in the MA software YAMAS. Association results from GWAS analysis applications can be used as input files for MA, tremendously speeding up MA compared to the conventional imputation approach. We show that our proxy algorithm is well-powered and yields valuable ad hoc results, possibly providing an incentive for follow-up studies. We propose our method as a quick screening step prior to imputation-based MA, as well as an additional main approach for studies without available reference data matching the ethnicities of study participants. As a proof of principle, we analyzed six dbGaP Type II Diabetes GWAS and found that the proxy algorithm clearly outperforms naïve MA on the p-value level: for 17 out of 23 we observe an improvement on the p-value level by a factor of more than two, and a maximum improvement by a factor of 2127. Conclusions YAMAS is an efficient and fast meta-analysis program which offers various methods, including conventional MA as well as inserting proxy

Evaluation and application of summary statistic imputation to discover new height-associated loci.

Science.gov (United States)

Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán

2018-05-01

As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones: we observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian
Accuracy of genome-wide imputation of untyped markers and impacts on statistical power for association studies

Directory of Open Access Journals (Sweden)

McElwee Joshua

2009-06-01

Abstract Background Although high-throughput genotyping arrays have made whole-genome association studies (WGAS) feasible, only a small proportion of SNPs in the human genome are actually surveyed in such studies. In addition, various SNP arrays assay different sets of SNPs, which leads to challenges in comparing results and merging data for meta-analyses. Genome-wide imputation of untyped markers allows us to address these issues in a direct fashion. Methods 384 Caucasian American liver donors were genotyped using Illumina 650Y (Ilmn650Y) arrays, from which we also derived genotypes from the Ilmn317K array. On these data, we compared two imputation methods: MACH and BEAGLE. We imputed 2.5 million HapMap Release22 SNPs, and conducted GWAS on ~40,000 liver mRNA expression traits (eQTL analysis). In addition, 200 Caucasian American and 200 African American subjects were genotyped using the Affymetrix 500 K array plus a custom 164 K fill-in chip. We then imputed the HapMap SNPs and quantified the accuracy by randomly masking observed SNPs. Results MACH and BEAGLE perform similarly with respect to imputation accuracy. The Ilmn650Y results in excellent imputation performance, and it outperforms Affx500K or Ilmn317K sets. For Caucasian Americans, 90% of the HapMap SNPs were imputed at 98% accuracy. As expected, imputation of poorly tagged SNPs (untyped SNPs in weak LD with typed markers) was not as successful. It was more challenging to impute genotypes in the African American population, given (1) shorter LD blocks and (2) admixture with Caucasian populations in this population. To address issue (2), we pooled HapMap CEU and YRI data as an imputation reference set, which greatly improved overall performance. The approximate 40,000 phenotypes scored in these populations provide a path to determine empirically how the power to detect associations is affected by the imputation procedures. That is, at a fixed false discovery rate, the number of cis
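The evaluation strategy mentioned in the Methods, quantifying accuracy by randomly masking observed SNPs, can be sketched in a few lines. Everything below is illustrative: the genotype matrix is simulated, genotypes are coded 0/1/2, and a per-SNP most-common-genotype fill is used as a deliberately naive placeholder where the study used MACH and BEAGLE.

```python
import random
from collections import Counter


def mask_genotypes(genotypes, fraction, rng):
    """Return a copy with a random fraction of entries set to None, plus the masked positions."""
    positions = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    masked = rng.sample(positions, int(round(fraction * len(positions))))
    copy = [list(row) for row in genotypes]
    for i, j in masked:
        copy[i][j] = None
    return copy, masked


def impute_mode(genotypes):
    """Placeholder imputer: fill each SNP (column) with its most common observed genotype."""
    filled = [list(row) for row in genotypes]
    for j in range(len(genotypes[0])):
        observed = [row[j] for row in genotypes if row[j] is not None]
        mode = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in filled:
            if row[j] is None:
                row[j] = mode
    return filled


def concordance(truth, imputed, masked):
    """Fraction of masked genotypes that the imputer recovered exactly."""
    hits = sum(truth[i][j] == imputed[i][j] for i, j in masked)
    return hits / len(masked)


rng = random.Random(0)
# Simulated genotypes for 50 samples at 20 SNPs (0, 1 or 2 copies of the minor allele).
truth = [[rng.choice([0, 0, 0, 1, 2]) for _ in range(20)] for _ in range(50)]

masked_data, masked_positions = mask_genotypes(truth, 0.1, rng)
imputed = impute_mode(masked_data)
accuracy = concordance(truth, imputed, masked_positions)
print(f"Concordance on masked genotypes: {accuracy:.2%}")
```

Because the true values at the masked positions are known, the same harness can score any imputation method that takes the masked matrix as input, which is what makes masking a convenient empirical accuracy check.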
Comparison of results from different imputation techniques for missing data from an anti-obesity drug trial

DEFF Research Database (Denmark)

Jørgensen, Anders W.; Lundstrøm, Lars H; Wetterslev, Jørn

2014-01-01

BACKGROUND: In randomised trials of medical interventions, the most reliable analysis follows the intention-to-treat (ITT) principle. However, the ITT analysis requires that missing outcome data have to be imputed. Different imputation techniques may give different results and some may lead to bias...... of handling missing data in a 60-week placebo controlled anti-obesity drug trial on topiramate. METHODS: We compared an analysis of complete cases with datasets where missing body weight measurements had been replaced using three different imputation methods: LOCF, baseline carried forward (BOCF) and MI...
Cost reduction for web-based data imputation

KAUST Repository

Li, Zhixu; Shang, Shuo; Xie, Qing; Zhang, Xiangliang

2014-01-01

Web-based Data Imputation enables the completion of incomplete data sets by retrieving absent field values from the Web. In particular, complete fields can be used as keywords in imputation queries for absent fields. However, due to the ambiguity
Multiple imputation in the presence of non-normal data.

Science.gov (United States)

Lee, Katherine J; Carlin, John B

2017-02-20

Multiple imputation (MI) is becoming increasingly popular for handling missing data. Standard approaches for MI assume normality for continuous variables (conditionally on the other variables in the imputation model). However, it is unclear how to impute non-normally distributed continuous variables. Using simulation and a case study, we compared various transformations applied prior to imputation, including a novel non-parametric transformation, to imputation on the raw scale and using predictive mean matching (PMM) when imputing non-normal data. We generated data from a range of non-normal distributions, and set 50% to missing completely at random or missing at random. We then imputed missing values on the raw scale, following a zero-skewness log, Box-Cox or non-parametric transformation and using PMM with both type 1 and 2 matching. We compared inferences regarding the marginal mean of the incomplete variable and the association with a fully observed outcome. We also compared results from these approaches in the analysis of depression and anxiety symptoms in parents of very preterm compared with term-born infants. The results provide novel empirical evidence that the decision regarding how to impute a non-normal variable should be based on the nature of the relationship between the variables of interest. If the relationship is linear in the untransformed scale, transformation can introduce bias irrespective of the transformation used. However, if the relationship is non-linear, it may be important to transform the variable to accurately capture this relationship. A useful alternative is to impute the variable using PMM with type 1 matching. Copyright © 2016 John Wiley & Sons, Ltd.
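To show why predictive mean matching suits non-normal variables, here is a deliberately simplified sketch. It is not the type 1 or type 2 matching evaluated in the paper: both the donors' and the recipients' predicted means come from one least-squares fit, with no draw of the regression parameters, and it produces a single completed data set rather than multiple imputations. The function names and data are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(x, y, k=3, rng=None):
    """Fill None entries of y with the observed value of a randomly chosen close donor.

    Donors are the k observed cases whose predicted mean is nearest to the
    predicted mean of the case being imputed.
    """
    rng = rng or random.Random()
    observed = [i for i, value in enumerate(y) if value is not None]
    intercept, slope = fit_line([x[i] for i in observed], [y[i] for i in observed])

    def predict(value):
        return intercept + slope * value

    completed = list(y)
    for i, value in enumerate(y):
        if value is None:
            target = predict(x[i])
            donors = sorted(observed, key=lambda j: abs(predict(x[j]) - target))[:k]
            completed[i] = y[rng.choice(donors)]
    return completed


# Hypothetical right-skewed outcome with three missing values.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [1.2, 1.9, None, 4.8, 7.4, None, 15.1, 22.0, None, 45.3]

completed = pmm_impute(x, y, k=3, rng=random.Random(1))
print(completed)
```

Every imputed value is borrowed from an observed case, so the completed variable cannot leave the observed range or take an implausible shape, which is the property that makes the approach attractive when normality is doubtful.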
A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation.

Science.gov (United States)

Välikangas, Tommi; Suomi, Tomi; Elo, Laura L

2017-05-31

Label-free mass spectrometry (MS) has developed into an important tool applied in various fields of biological and life sciences. Several software packages exist to process the raw MS data into quantified protein abundances, including open source and commercial solutions. Each package includes a set of unique algorithms for different tasks of the MS data processing workflow. While many of these algorithms have been compared separately, a thorough and systematic evaluation of their overall performance is missing. Moreover, systematic information is lacking about the amount of missing values produced by the different proteomics software and the capabilities of different data imputation methods to account for them. In this study, we evaluated the performance of five popular quantitative label-free proteomics software workflows using four different spike-in data sets. Our extensive testing included the number of proteins quantified and the number of missing values produced by each workflow, the accuracy of detecting differential expression and logarithmic fold change and the effect of different imputation and filtering methods on the differential expression results. We found that the Progenesis software performed consistently well in the differential expression analysis and produced few missing values. The missing values produced by the other software decreased their performance, but this difference could be mitigated using proper data filtering or imputation methods. Among the imputation methods, we found that the local least squares (lls) regression imputation consistently increased the performance of the software in the differential expression analysis, and a combination of both data filtering and local least squares imputation increased performance the most in the tested data sets. © The Author 2017. Published by Oxford University Press.
iVAR: a program for imputing missing data in multivariate time series using vector autoregressive models.

Science.gov (United States)

Liu, Siwei; Molenaar, Peter C M

2014-12-01

This article introduces iVAR, an R program for imputing missing data in multivariate time series on the basis of vector autoregressive (VAR) models. We conducted a simulation study to compare iVAR with three methods for handling missing data: listwise deletion, imputation with sample means and variances, and multiple imputation ignoring time dependency. The results showed that iVAR produces better estimates for the cross-lagged coefficients than do the other three methods. We demonstrate the use of iVAR with an empirical example of time series electrodermal activity data and discuss the advantages and limitations of the program.
Design of a bovine low-density SNP array optimized for imputation.

Directory of Open Access Journals (Sweden)

Didier Boichard

The Illumina BovineLD BeadChip was designed to support imputation to higher density genotypes in dairy and beef breeds by including single-nucleotide polymorphisms (SNPs) that had a high minor allele frequency as well as uniform spacing across the genome except at the ends of the chromosome where densities were increased. The chip also includes SNPs on the Y chromosome and mitochondrial DNA loci that are useful for determining subspecies classification and certain paternal and maternal breed lineages. The total number of SNPs was 6,909. Accuracy of imputation to Illumina BovineSNP50 genotypes using the BovineLD chip was over 97% for most dairy and beef populations. The BovineLD imputations were about 3 percentage points more accurate than those from the Illumina GoldenGate Bovine3K BeadChip across multiple populations. The improvement was greatest when neither parent was genotyped. The minor allele frequencies were similar across taurine beef and dairy breeds as was the proportion of SNPs that were polymorphic. The new BovineLD chip should facilitate low-cost genomic selection in taurine beef and dairy cattle.
Sequence imputation of HPV16 genomes for genetic association studies.

Directory of Open Access Journals (Sweden)

Benjamin Smith

Human Papillomavirus type 16 (HPV16) causes over half of all cervical cancer and some HPV16 variants are more oncogenic than others. The genetic basis for the extraordinary oncogenic properties of HPV16 compared to other HPVs is unknown. In addition, we neither know which nucleotides vary across and within HPV types and lineages, nor which of the single nucleotide polymorphisms (SNPs) determine oncogenicity. A reference set of 62 HPV16 complete genome sequences was established and used to examine patterns of evolutionary relatedness amongst variants using a pairwise identity heatmap and HPV16 phylogeny. A BLAST-based algorithm was developed to impute complete genome data from partial sequence information using the reference database. To interrogate the oncogenic risk of determined and imputed HPV16 SNPs, odds-ratios for each SNP were calculated in a case-control viral genome-wide association study (VWAS) using biopsy confirmed high-grade cervix neoplasia and self-limited HPV16 infections from Guanacaste, Costa Rica. HPV16 variants display evolutionarily stable lineages that contain conserved diagnostic SNPs. The imputation algorithm indicated that an average of 97.5±1.03% of SNPs could be accurately imputed. The VWAS revealed specific HPV16 viral SNPs associated with variant lineages and elevated odds ratios; however, individual causal SNPs could not be distinguished with certainty due to the nature of HPV evolution. Conserved and lineage-specific SNPs can be imputed with a high degree of accuracy from limited viral polymorphic data due to the lack of recombination and the stochastic mechanism of variation accumulation in the HPV genome. However, to determine the role of novel variants or non-lineage-specific SNPs by VWAS will require direct sequence analysis. The investigation of patterns of genetic variation and the identification of diagnostic SNPs for lineages of HPV16 variants provides a valuable resource for future studies of HPV16
The use of multiple imputation for the accurate measurements of individual feed intake by electronic feeders.

Science.gov (United States)

Jiao, S; Tiezzi, F; Huang, Y; Gray, K A; Maltecca, C

2016-02-01

Obtaining accurate individual feed intake records is the key first step in achieving genetic progress toward more efficient nutrient utilization in pigs. Feed intake records collected by electronic feeding systems contain errors (erroneous and abnormal values exceeding certain cutoff criteria), which are due to feeder malfunction or animal-feeder interaction. In this study, we examined the use of a novel data-editing strategy involving multiple imputation to minimize the impact of errors and missing values on the quality of feed intake data collected by an electronic feeding system. Accuracy of feed intake data adjustment obtained from the conventional linear mixed model (LMM) approach was compared with 2 alternative implementations of multiple imputation by chained equation, denoted as MI (multiple imputation) and MICE (multiple imputation by chained equation). The 3 methods were compared under 3 scenarios, where 5, 10, and 20% feed intake error rates were simulated. Each of the scenarios was replicated 5 times. Accuracy of the alternative error adjustment was measured as the correlation between the true daily feed intake (DFI; daily feed intake in the testing period) or true ADFI (the mean DFI across testing period) and the adjusted DFI or adjusted ADFI. In the editing process, error cutoff criteria are used to define if a feed intake visit contains errors. To investigate the possibility that the error cutoff criteria may affect any of the 3 methods, the simulation was repeated with 2 alternative error cutoff values. Multiple imputation methods outperformed the LMM approach in all scenarios with mean accuracies of 96.7, 93.5, and 90.2% obtained with MI and 96.8, 94.4, and 90.1% obtained with MICE compared with 91.0, 82.6, and 68.7% using LMM for DFI. Similar results were obtained for ADFI. Furthermore, multiple imputation methods consistently performed better than LMM regardless of the cutoff criteria applied to define errors. In conclusion, multiple imputation
Missing Data Imputation of Solar Radiation Data under Different Atmospheric Conditions

Science.gov (United States)

Turrado, Concepción Crespo; López, María del Carmen Meizoso; Lasheras, Fernando Sánchez; Gómez, Benigno Antonio Rodríguez; Rollé, José Luis Calvo; de Cos Juez, Francisco Javier

2014-01-01

Global solar broadband irradiance on a planar surface is measured at weather stations by pyranometers. In the case of the present research, solar radiation values from nine meteorological stations of the MeteoGalicia real-time observational network, captured and stored every ten minutes, are considered. In this kind of record, the lack of data and/or the presence of wrong values adversely affects any time series study. Consequently, when this occurs, a data imputation process must be performed in order to replace missing data with estimated values. This paper aims to evaluate the multivariate imputation of ten-minute scale data by means of the chained equations method (MICE). This method allows the network itself to impute the missing or wrong data of a solar radiation sensor, by using either all or just a group of the measurements of the remaining sensors. Very good results have been obtained with the MICE method in comparison with other methods employed in this field such as Inverse Distance Weighting (IDW) and Multiple Linear Regression (MLR). The average RMSE value of the predictions for the MICE algorithm was 13.37%, while that for MLR was 28.19% and that for IDW was 31.68%. PMID:25356644
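As a point of reference for the Inverse Distance Weighting (IDW) baseline named in this abstract, the sketch below estimates a missing reading at one station as a distance-weighted average of readings at other stations. This is a generic form of IDW, not the configuration used in the paper. The station coordinates, readings, exponent `power=2.0` and the function name `idw_impute` are all assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def idw_impute(target_xy: Point,
               neighbours: Dict[str, Tuple[Point, float]],
               power: float = 2.0) -> float:
    """Estimate a missing value as an inverse-distance-weighted average.

    neighbours maps a station name to ((x, y), observed_value).
    """
    if not neighbours:
        raise ValueError("at least one neighbouring station is required")
    num = 0.0
    den = 0.0
    for (x, y), value in neighbours.values():
        d = math.hypot(target_xy[0] - x, target_xy[1] - y)
        if d == 0.0:
            # A co-located station: use its reading directly.
            return value
        w = 1.0 / d ** power  # closer stations get larger weights
        num += w * value
        den += w
    return num / den


# Hypothetical stations (coordinates in arbitrary units) and readings.
stations = {
    "A": ((1.0, 0.0), 100.0),
    "B": ((2.0, 0.0), 200.0),
}
estimate = idw_impute((0.0, 0.0), stations)
```

With these made-up inputs the weights are 1 and 0.25, so the estimate is (100 + 50) / 1.25 = 120.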
Saturated linkage map construction in Rubus idaeus using genotyping by sequencing and genome-independent imputation

Directory of Open Access Journals (Sweden)

Ward Judson A

2013-01-01

Abstract Background Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker density, but result in some genotype errors and a large number of missing genotype values. Imputation can reduce the number of missing values and can correct genotyping errors, but current methods of imputation require a reference genome and thus are not an option for most species. Results Genotyping by Sequencing (GBS) was used to produce highly saturated maps for a R. idaeus pseudo-testcross progeny. While low coverage and high variance in sequencing resulted in a large number of missing values for some individuals, a novel method of imputation based on maximum likelihood marker ordering from initial marker segregation overcame the challenge of missing values, and made map construction computationally tractable. The two resulting parental maps contained 4521 and 2391 molecular markers spanning 462.7 and 376.6 cM respectively over seven linkage groups. Detection of precise genomic regions with segregation distortion was possible because of map saturation. Microsatellites (SSRs) linked these results to published maps for cross-validation and map comparison. Conclusions GBS together with genome-independent imputation provides a rapid method for genetic map construction in any pseudo-testcross progeny. Our method of imputation estimates the correct genotype call of missing values and corrects genotyping errors that lead to inflated map size and reduced precision in marker placement. Comparison of SSRs to published R. idaeus maps showed that the linkage maps constructed with GBS and our method of imputation were robust, and marker positioning reliable. The high marker density allowed identification of genomic regions with segregation
TRANSPOSABLE REGULARIZED COVARIANCE MODELS WITH AN APPLICATION TO MISSING DATA IMPUTATION.

Science.gov (United States)

Allen, Genevera I; Tibshirani, Robert

2010-06-01

Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. By placing additive penalties on the inverse covariance matrices of the rows and columns, these so-called transposable regularized covariance models allow for maximum likelihood estimation of the mean and non-singular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. We present theoretical results exploiting the structure of our transposable models that allow these models and imputation methods to be applied to high-dimensional data. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.
Imputation by the mean score should be avoided when validating a Patient Reported Outcomes questionnaire by a Rasch model in presence of informative missing data

LENUS (Irish Health Repository)

Hardouin, Jean-Benoit

2011-07-14

Abstract Background Nowadays, more and more clinical scales consisting of responses given by patients to a set of items (Patient Reported Outcomes - PRO) are validated with models based on Item Response Theory, and more specifically, with a Rasch model. In the validation sample, missing data are frequent. The aim of this paper is to compare sixteen methods for handling the missing data (mainly based on simple imputation) in the context of psychometric validation of PRO by a Rasch model. The main indexes used for validation by a Rasch model are compared. Methods A simulation study was performed that covers several cases, notably whether or not the missing values are informative and different rates of missing data. Results Several imputation methods produce bias on psychometrical indexes (generally, the imputation methods artificially improve the psychometric qualities of the scale). In particular, this is the case with the method based on the Personal Mean Score (PMS), which is the most commonly used imputation method in practice. Conclusions Several imputation methods should be avoided, in particular PMS imputation. From a general point of view, it is important to use an imputation method that considers both the ability of the patient (measured for example by his/her score), and the difficulty of the item (measured for example by its rate of favourable responses). Another recommendation is to always consider the addition of a random process in the imputation method, because such a process reduces the bias. Last, analysis without imputation of the missing data (available case analysis) is an interesting alternative to simple imputation in this context.
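The contrast between a deterministic Personal Mean Score fill and a fill with an added random process can be sketched as follows. This is only an illustration under stated assumptions: items are taken to be dichotomous (0/1), the deterministic variant thresholds the person's observed mean at 0.5, and the random variant draws from a Bernoulli distribution with that mean. The function name `pms_impute` is hypothetical, and the paper's sixteen methods are not reproduced here.

```python
import random
from typing import List, Optional


def pms_impute(responses: List[Optional[int]],
               rng: Optional[random.Random] = None) -> List[int]:
    """Impute missing 0/1 item responses from the person's own mean score.

    With rng=None the fill is deterministic (mean thresholded at 0.5).
    With an rng, each missing item is drawn as Bernoulli(person mean),
    which adds the kind of random process the abstract recommends.
    """
    observed = [r for r in responses if r is not None]
    if not observed:
        raise ValueError("no observed responses for this person")
    p = sum(observed) / len(observed)  # personal mean score
    filled: List[int] = []
    for r in responses:
        if r is not None:
            filled.append(r)
        elif rng is None:
            filled.append(1 if p >= 0.5 else 0)
        else:
            filled.append(1 if rng.random() < p else 0)
    return filled


# Hypothetical respondent: None marks an unanswered item.
answers = [1, 1, None, 0, 1]
deterministic = pms_impute(answers)
stochastic = pms_impute(answers, rng=random.Random(0))
```

Note that neither variant uses item difficulty, which the abstract identifies as something a good imputation method should also take into account.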
Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies.

Science.gov (United States)

Lazar, Cosmin; Gatto, Laurent; Ferro, Myriam; Bruley, Christophe; Burger, Thomas

2016-04-01

Missing values are a genuine issue in label-free quantitative proteomics. Recent works have surveyed the different statistical methods to conduct imputation and have compared them on real or simulated data sets and recommended a list of missing value imputation methods for proteomics application. Although insightful, these comparisons do not account for two important facts: (i) depending on the proteomics data set, the missingness mechanism may be of different natures and (ii) each imputation method is devoted to a specific type of missingness mechanism. As a result, we believe that the question at stake is not to find the most accurate imputation method in general but instead the most appropriate one. We describe a series of comparisons that support our views: For instance, we show that a supposedly "under-performing" method (i.e., giving baseline average results), if applied at the "appropriate" time in the data-processing pipeline (before or after peptide aggregation) on a data set with the "appropriate" nature of missing values, can outperform a blindly applied, supposedly "better-performing" method (i.e., the reference method from the state-of-the-art). This leads us to formulate a few practical guidelines regarding the choice and the application of an imputation method in a proteomics context.
Factors associated with low birth weight in Nepal using multiple imputation

Directory of Open Access Journals (Sweden)

Usha Singh

2017-02-01

Abstract Background Survey data from low income countries on birth weight usually pose a persistent problem. The studies conducted on birth weight have acknowledged missing data on birth weight, but they are not included in the analysis. Furthermore, other missing data presented on determinants of birth weight are not addressed. Thus, this study tries to identify determinants that are associated with low birth weight (LBW) using multiple imputation to handle missing data on birth weight and its determinants. Methods The child dataset from Nepal Demographic and Health Survey (NDHS, 2011) was utilized in this study. A total of 5,240 children were born between 2006 and 2011, out of which 87% had at least one measured variable missing and 21% had no recorded birth weight. All the analyses were carried out in R version 3.1.3. Transform-then-impute method was applied to check for interaction between explanatory variables and imputed missing data. Survey package was applied to each imputed dataset to account for survey design and sampling method. Survey logistic regression was applied to identify the determinants associated with LBW. Results The prevalence of LBW was 15.4% after imputation. Women with the highest autonomy on their own health compared to those with health decisions involving husband or others (adjusted odds ratio (OR) 1.87, 95% confidence interval (95% CI) = 1.31, 2.67), and husband and women together (adjusted OR 1.57, 95% CI = 1.05, 2.35) were less likely to give birth to LBW infants. Mothers using highly polluting cooking fuels (adjusted OR 1.49, 95% CI = 1.03, 2.22) were more likely to give birth to LBW infants than mothers using non-polluting cooking fuels. Conclusion The findings of this study suggested that obtaining the prevalence of LBW from only the sample of measured birth weight and ignoring missing data results in underestimation.
Imputation and quality control steps for combining multiple genome-wide datasets

Directory of Open Access Journals (Sweden)

Shefali S Verma

2014-12-01

The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 52,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic R2 (estimated correlation between the imputed and true genotypes), and the relationship between allelic R2 and minor allele frequency. Computation time and memory resources required by two different software packages (BEAGLE and IMPUTE2) were also evaluated. A number of challenges were encountered due to the complexity of using two different imputation software packages, multiple ancestral populations, and many different genotyping platforms. We present lessons learned and describe the pipeline implemented here to impute and merge genomic data sets. The eMERGE imputed dataset will serve as a valuable resource for discovery, leveraging the clinical data that can be mined from the EHR.
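The quantity behind the allelic R2 metric mentioned above can be illustrated with a squared Pearson correlation between imputed allele dosages and true genotypes at a single SNP. This sketch assumes the true genotypes are known (for example because they were masked before imputation), whereas imputation software such as BEAGLE reports an estimate of this correlation without access to the truth. The function name `allelic_r2` and the example values are hypothetical.

```python
import math
from typing import Sequence


def allelic_r2(true_genotypes: Sequence[float],
               imputed_dosages: Sequence[float]) -> float:
    """Squared Pearson correlation between true and imputed allele counts.

    Genotypes are coded as counts of one allele (0, 1 or 2); dosages are
    expected allele counts in [0, 2]. Returns NaN for a monomorphic SNP.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0.0 or var_d == 0.0:
        return math.nan  # correlation is undefined without variation
    return cov * cov / (var_t * var_d)


# Hypothetical SNP: five individuals, true genotypes and imputed dosages.
truth = [0, 1, 2, 1, 0]
dosage = [0.1, 0.9, 1.8, 1.2, 0.0]
r2 = allelic_r2(truth, dosage)
perfect = allelic_r2([0, 1, 2], [0, 1, 2])
```

Because the variance of the true genotypes shrinks as the minor allele becomes rarer, this measure is sensitive to minor allele frequency, which is why the abstract evaluates the two together.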
Flexible Imputation of Missing Data

CERN Document Server

van Buuren, Stef

2012-01-01

Missing data form a problem in every scientific discipline, yet the techniques required to handle them are complicated and often lacking. One of the great ideas in statistical science--multiple imputation--fills gaps in the data with plausible values, the uncertainty of which is coded in the data itself. It also solves other problems, many of which are missing data problems in disguise. Flexible Imputation of Missing Data is supported by many examples using real data taken from the author's vast experience of collaborative research, and presents a practical guide for handling missing data unde

Imputation of missing genotypes within LD-blocks relying on the basic coalescent and beyond: consideration of population growth and structure.

Science.gov (United States)

Kabisch, Maria; Hamann, Ute; Lorenzo Bermejo, Justo

2017-10-17

Genotypes not directly measured in genetic studies are often imputed to improve statistical power and to increase mapping resolution. The accuracy of standard imputation techniques strongly depends on the similarity of linkage disequilibrium (LD) patterns in the study and reference populations. Here we develop a novel approach for genotype imputation in low-recombination regions that relies on the coalescent and permits explicit accounting for population demographic factors. To test the new method, study and reference haplotypes were simulated and gene trees were inferred under the basic coalescent and also considering population growth and structure. The reference haplotypes that first coalesced with study haplotypes were used as templates for genotype imputation. Computer simulations were complemented with the analysis of real data. Genotype concordance rates were used to compare the accuracies of coalescent-based and standard (IMPUTE2) imputation. Simulations revealed that, in LD-blocks, imputation accuracy relying on the basic coalescent was higher and less variable than with IMPUTE2. Explicit consideration of population growth and structure, even if present, did not practically improve accuracy. The advantage of coalescent-based over standard imputation increased with the minor allele frequency and it decreased with population stratification. Results based on real data indicated that, even in low-recombination regions, further research is needed to incorporate recombination in coalescence inference, in particular for studies with genetically diverse and admixed individuals. To exploit the full potential of coalescent-based methods for the imputation of missing genotypes in genetic studies, further methodological research is needed to reduce computer time, to take into account recombination, and to implement these methods in user-friendly computer programs. Here we provide reproducible code which takes advantage of publicly available software to facilitate
Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data.

Science.gov (United States)

Rahman, Shah Atiqur; Huang, Yuxiao; Claassen, Jan; Heintzman, Nathaniel; Kleinberg, Samantha

2015-12-01

Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across time as well, but current methods have yet to incorporate these temporal relationships as well as multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and the Fourier transform. This enables imputation of missing values even when all data at a time point is missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring) the proposed method has the highest imputation accuracy. This was true for up to half the data being missing and when consecutive missing values are a significant fraction of the overall time series length. Copyright © 2015 Elsevier Inc. All rights reserved.
Multiple Imputation of Predictor Variables Using Generalized Additive Models

NARCIS (Netherlands)

de Jong, Roel; van Buuren, Stef; Spiess, Martin

2016-01-01

The sensitivity of multiple imputation methods to deviations from their distributional assumptions is investigated using simulations, where the parameters of scientific interest are the coefficients of a linear regression model, and values in predictor variables are missing at random. The
BRITS: Bidirectional Recurrent Imputation for Time Series

OpenAIRE

Cao, Wei; Wang, Dong; Li, Jian; Zhou, Hao; Li, Lei; Li, Yitan

2018-01-01

Time series are widely used as signals in many classification/regression tasks. It is ubiquitous that time series contains many missing values. Given multiple correlated time series data, how to fill in missing values and to predict their class labels? Existing imputation methods often impose strong assumptions of the underlying data generating process, such as linear dynamics in the state space. In this paper, we propose BRITS, a novel method based on recurrent neural networks for missing va...
Effect of imputing markers from a low-density chip on the reliability of genomic breeding values in Holstein populations

DEFF Research Database (Denmark)

Dassonneville, R; Brøndum, Rasmus Froberg; Druet, T

2011-01-01

The purpose of this study was to investigate the imputation error and loss of reliability of direct genomic values (DGV) or genomically enhanced breeding values (GEBV) when using genotypes imputed from a 3,000-marker single nucleotide polymorphism (SNP) panel to a 50,000-marker SNP panel. Data...... of missing markers and prediction of breeding values were performed using 2 different reference populations in each country: either a national reference population or a combined EuroGenomics reference population. Validation for accuracy of imputation and genomic prediction was done based on national test...... with a national reference data set gave an absolute loss of 0.05 in mean reliability of GEBV in the French study, whereas a loss of 0.03 was obtained for reliability of DGV in the Nordic study. When genotypes were imputed using the EuroGenomics reference, a loss of 0.02 in mean reliability of GEBV was detected...
Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data.

Science.gov (United States)

Sehgal, Muhammad Shoaib B; Gondal, Iqbal; Dooley, Laurence S

2005-05-15

Microarray data are used in a range of application areas in biology, although they often contain considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms so there is a strong motivation to estimate these values as accurately as possible before using these algorithms. While many imputation algorithms have been proposed, more robust techniques need to be developed so that further analysis of biological data can be accurately undertaken. In this paper, an innovative missing value imputation algorithm called collateral missing value estimation (CMVE) is presented which uses multiple covariance-based imputation matrices for the final prediction of missing values. The matrices are computed and optimized using least square regression and linear programming methods. The new CMVE algorithm has been compared with existing estimation techniques including Bayesian principal component analysis imputation (BPCA), least square impute (LSImpute) and K-nearest neighbour (KNN). All these methods were rigorously tested to estimate missing values in three separate non-time series (ovarian cancer based) and one time series (yeast sporulation) dataset. Each method was quantitatively analyzed using the normalized root mean square (NRMS) error measure, covering a wide range of randomly introduced missing value probabilities from 0.01 to 0.2. Experiments were also undertaken on the yeast dataset, which comprised 1.7% actual missing values, to test the hypothesis that CMVE performed better not only for randomly occurring but also for a real distribution of missing values. The results confirmed that CMVE consistently demonstrated superior and robust estimation capability of missing values compared with other methods for both series types of data, for the same order of computational complexity. A concise theoretical framework has also been formulated to validate the improved performance of the CMVE
Comparing strategies for selection of low-density SNPs for imputation-mediated genomic prediction in U. S. Holsteins.

Science.gov (United States)

He, Jun; Xu, Jiaqi; Wu, Xiao-Lin; Bauck, Stewart; Lee, Jungjae; Morota, Gota; Kachman, Stephen D; Spangler, Matthew L

2018-04-01

SNP chips are commonly used for genotyping animals in genomic selection but strategies for selecting low-density (LD) SNPs for imputation-mediated genomic selection have not been addressed adequately. The main purpose of the present study was to compare the performance of eight LD (6K) SNP panels, each selected by a different strategy exploiting a combination of three major factors: evenly-spaced SNPs, increased minor allele frequencies, and SNP-trait associations either for single traits independently or for all the three traits jointly. The imputation accuracies from 6K to 80K SNP genotypes were between 96.2 and 98.2%. Genomic prediction accuracies obtained using imputed 80K genotypes were between 0.817 and 0.821 for daughter pregnancy rate, between 0.838 and 0.844 for fat yield, and between 0.850 and 0.863 for milk yield. The two SNP panels optimized on the three major factors had the highest genomic prediction accuracy (0.821-0.863), and these accuracies were very close to those obtained using observed 80K genotypes (0.825-0.868). Further exploration of the underlying relationships showed that genomic prediction accuracies did not respond linearly to imputation accuracies, but were significantly affected by genotype (imputation) errors of SNPs in association with the traits to be predicted. SNPs optimal for map coverage and MAF were favorable for obtaining accurate imputation of genotypes whereas trait-associated SNPs improved genomic prediction accuracies. Thus, optimal LD SNP panels were the ones that combined both strengths. The present results have practical implications on the design of LD SNP chips for imputation-enabled genomic prediction.
Assessing accuracy of genotype imputation in American Indians.

Directory of Open Access Journals (Sweden)

Alka Malhotra

Genotype imputation is commonly used in genetic association studies to test untyped variants using information on linkage disequilibrium (LD) with typed markers. Imputing genotypes requires a suitable reference population in which the LD pattern is known, most often one selected from HapMap. However, some populations, such as American Indians, are not represented in HapMap. In the present study, we assessed accuracy of imputation using HapMap reference populations in a genome-wide association study in Pima Indians. Data from six randomly selected chromosomes were used. Genotypes in the study population were masked (either 1% or 20% of SNPs available for a given chromosome). The masked genotypes were then imputed using the software Markov Chain Haplotyping Algorithm. Using four HapMap reference populations, average genotype error rates ranged from 7.86% for Mexican Americans to 22.30% for Yoruba. In contrast, use of the original Pima Indian data as a reference resulted in an average error rate of 1.73%. Our results suggest that the use of HapMap reference populations results in substantial inaccuracy in the imputation of genotypes in American Indians. A possible solution would be to densely genotype or sequence a reference American Indian population.
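The mask, impute and compare design described in this abstract can be sketched in a few lines of Python. The imputation step here is a deliberately crude stand-in (the most frequent genotype at each SNP in a reference panel) and is not the Markov Chain Haplotyping Algorithm used in the study. All function names and the toy genotypes are hypothetical.

```python
import random
from collections import Counter
from typing import List, Optional, Set, Tuple

Genotypes = List[int]  # allele counts (0, 1 or 2), one entry per SNP


def mask_random(genotypes: Genotypes, fraction: float,
                rng: random.Random) -> Tuple[List[Optional[int]], Set[int]]:
    """Hide a random fraction of SNPs; return masked copy and hidden indices."""
    n_mask = max(1, round(fraction * len(genotypes)))
    hidden = set(rng.sample(range(len(genotypes)), n_mask))
    masked = [None if j in hidden else g for j, g in enumerate(genotypes)]
    return masked, hidden


def impute_from_reference(masked: List[Optional[int]],
                          reference: List[Genotypes]) -> Genotypes:
    """Fill each missing SNP with the most common genotype in the reference."""
    filled: Genotypes = []
    for j, g in enumerate(masked):
        if g is None:
            g = Counter(ind[j] for ind in reference).most_common(1)[0][0]
        filled.append(g)
    return filled


def genotype_error_rate(truth: Genotypes, imputed: Genotypes,
                        hidden: Set[int]) -> float:
    """Fraction of masked genotypes that were imputed incorrectly."""
    wrong = sum(1 for j in hidden if truth[j] != imputed[j])
    return wrong / len(hidden)


# Toy data: three reference individuals and one study individual, 4 SNPs.
reference_panel = [[0, 1, 2, 0], [0, 1, 2, 1], [0, 2, 2, 0]]
study_truth = [0, 1, 1, 0]
hidden_idx = {1, 2}  # fixed here so the example is reproducible
study_masked = [None if j in hidden_idx else g
                for j, g in enumerate(study_truth)]
study_imputed = impute_from_reference(study_masked, reference_panel)
error = genotype_error_rate(study_truth, study_imputed, hidden_idx)
```

In this toy case one of the two masked genotypes is recovered and one is not, because the reference panel carries a different genotype at the third SNP than the study individual does. That is a small-scale version of the mismatch between reference and study populations that the abstract reports.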
Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes

KAUST Repository

Chatterjee, Nilanjan; Chen, Yi-Hau; Luo, Sheng; Carroll, Raymond J.

2009-01-01

Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data. © Institute of Mathematical Statistics, 2009.
Highly accurate sequence imputation enables precise QTL mapping in Brown Swiss cattle.

Science.gov (United States)

Frischknecht, Mirjam; Pausch, Hubert; Bapst, Beat; Signer-Hasler, Heidi; Flury, Christine; Garrick, Dorian; Stricker, Christian; Fries, Ruedi; Gredler-Grandl, Birgit

2017-12-29

Within the last few years a large amount of genomic information has become available in cattle. Densities of genomic information vary from a few thousand variants up to whole genome sequence information. In order to combine genomic information from different sources and infer genotypes for a common set of variants, genotype imputation is required. In this study we evaluated the accuracy of imputation from high density chips to whole genome sequence data in Brown Swiss cattle. Using four popular imputation programs (Beagle, FImpute, Impute2, Minimac) and various compositions of reference panels, the accuracy of the imputed sequence variant genotypes was high and differences between the programs and scenarios were small. We imputed sequence variant genotypes for more than 1600 Brown Swiss bulls and performed genome-wide association studies for milk fat percentage at two stages of lactation. We found one and three quantitative trait loci for early and late lactation fat content, respectively. Known causal variants that were imputed from the sequenced reference panel were among the most significantly associated variants of the genome-wide association study. Our study demonstrates that whole-genome sequence information can be imputed at high accuracy in cattle populations. Using imputed sequence variant genotypes in genome-wide association studies may facilitate causal variant detection.
Data Editing and Imputation in Business Surveys Using “R”

Directory of Open Access Journals (Sweden)

Elena Romascanu

2014-06-01

Purpose – Missing data are a recurring problem that can cause bias or lead to inefficient analyses. The objective of this paper is a direct comparison between two statistical software packages, R and SPSS, in order to take full advantage of the existing automated methods for data editing and imputation in business surveys (with a proper design of consistency rules) as a partial alternative to the manual editing of data. Approach – The paper compares methods for editing survey data in R using the ‘editrules’ and ‘survey’ packages, which contain transformations commonly used in official statistics; visualization of missing-value patterns using the ‘Amelia’ and ‘VIM’ packages; imputation approaches for longitudinal data using ‘VIMGUI’; and the performance of another statistical software package, SPSS, on the same features. Findings – Data on business statistics received by the NIS (National Institute of Statistics) are not ready to be used for direct analysis because of in-record inconsistencies, errors and missing values in the collected data sets. The appropriate automatic methods from R packages offer the ability to identify the erroneous fields in edit-violating records and to verify the results after the imputation of missing values, providing users with a flexible, less time-consuming approach that is easier to automate in R than in SPSS macro syntax, even in situations where macros are very handy.
A Note on the Effect of Data Clustering on the Multiple-Imputation Variance Estimator: A Theoretical Addendum to the Lewis et al. article in JOS 2014

Directory of Open Access Journals (Sweden)

He Yulei

2016-03-01

Multiple imputation is a popular approach to handling missing data. Although it was originally motivated by survey nonresponse problems, it has been readily applied to other data settings. However, its general behavior still remains unclear when applied to survey data with complex sample designs, including clustering. Recently, Lewis et al. (2014) compared single- and multiple-imputation analyses for certain incomplete variables in the 2008 National Ambulatory Medical Care Survey, which has a nationally representative, multistage, and clustered sampling design. Their study results suggested that the increase of the variance estimate due to multiple imputation compared with single imputation largely disappears for estimates with large design effects. We complement their empirical research by providing some theoretical reasoning. We consider data sampled from an equally weighted, single-stage cluster design and characterize the process using a balanced, one-way normal random-effects model. Assuming that the missingness is completely at random, we derive analytic expressions for the within- and between-multiple-imputation variance estimators for the mean estimator, and thus conveniently reveal the impact of design effects on these variance estimators. We propose approximations for the fraction of missing information in clustered samples, extending previous results for simple random samples. We discuss some generalizations of this research and its practical implications for data release by statistical agencies.
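For readers unfamiliar with the within- and between-imputation variance components referred to in this abstract, the standard combining rules for multiply imputed estimates (Rubin's rules) can be sketched as follows. This is the textbook simple-random-sample version, not the clustered-design expressions derived in the paper; the function name and the example numbers are hypothetical.

```python
from statistics import mean, variance

def rubin_combine(estimates, variances):
    """Combine m completed-data estimates and their variances (Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    u_bar = mean(variances)            # within-imputation variance
    b = variance(estimates)            # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    lam = (1 + 1 / m) * b / total      # share of variance attributable to missingness
    return q_bar, u_bar, b, total, lam

# Hypothetical results from m = 3 imputed datasets.
q_bar, u_bar, b, total, lam = rubin_combine([10.0, 11.0, 12.0], [1.0, 1.0, 1.0])
print(q_bar, u_bar, b, total, lam)
```

With single imputation there is no between-imputation term, so the variance estimate reduces to the within component; the abstract's question is how much the extra term matters once design effects are large.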
The utility of imputed matched sets. Analyzing probabilistically linked databases in a low information setting.

Science.gov (United States)

Thomas, A M; Cook, L J; Dean, J M; Olson, L M

2014-01-01

To compare results from high probability matched sets versus imputed matched sets across differing levels of linkage information. A series of linkages with varying amounts of available information were performed on two simulated datasets derived from multiyear motor vehicle crash (MVC) and hospital databases, where true matches were known. Distributions of high probability and imputed matched sets were compared against the true match population for occupant age, MVC county, and MVC hour. Regression models were fit to simulated log hospital charges and hospitalization status. High probability and imputed matched sets were not significantly different from occupant age, MVC county, and MVC hour in high information settings (p > 0.999). In low information settings, high probability matched sets were significantly different from occupant age and MVC county (p < […]); imputed matched sets were not (p > 0.493). High information settings saw no significant differences in inference of simulated log hospital charges and hospitalization status between the two methods. High probability and imputed matched sets were significantly different from the outcomes in low information settings; however, imputed matched sets were more robust. The level of information available to a linkage is an important consideration. High probability matched sets are suitable for high to moderate information settings and for situations involving case-specific analysis. Conversely, imputed matched sets are preferable for low information settings when conducting population-based analyses.
VIGAN: Missing View Imputation with Generative Adversarial Networks.

Science.gov (United States)

Shang, Chao; Palmer, Aaron; Sun, Jiangwen; Chen, Ko-Shin; Lu, Jin; Bi, Jinbo

2017-01-01

In an era when big data are becoming the norm, there is less concern with the quantity but more with the quality and completeness of the data. In many disciplines, data are collected from heterogeneous sources, resulting in multi-view or multi-modal datasets. The missing data problem has been challenging to address in multi-view data analysis. In particular, when certain samples lack an entire view of data, the missing view problem arises. Classic multiple imputation or matrix completion methods are hardly effective here, because the specific view offers no information on which to base imputation for such samples. The commonly used simple method of removing samples with a missing view can dramatically reduce sample size, thus diminishing the statistical power of a subsequent analysis. In this paper, we propose a novel approach for view imputation via generative adversarial networks (GANs), which we name VIGAN. This approach first treats each view as a separate domain and identifies domain-to-domain mappings via a GAN using randomly sampled data from each view, and then employs a multi-modal denoising autoencoder (DAE) to reconstruct the missing view from the GAN outputs based on paired data across the views. Then, by optimizing the GAN and DAE jointly, our model enables the knowledge integration for domain mappings and view correspondences to effectively recover the missing view. Empirical results on benchmark datasets validate the VIGAN approach by comparing against the state of the art. The evaluation of VIGAN in a genetic study of substance use disorders further proves the effectiveness and usability of this approach in life science.
Imputation of variants from the 1000 Genomes Project modestly improves known associations and can identify low-frequency variant-phenotype associations undetected by HapMap based imputation.

Science.gov (United States)

Wood, Andrew R; Perry, John R B; Tanaka, Toshiko; Hernandez, Dena G; Zheng, Hou-Feng; Melzer, David; Gibbs, J Raphael; Nalls, Michael A; Weedon, Michael N; Spector, Tim D; Richards, J Brent; Bandinelli, Stefania; Ferrucci, Luigi; Singleton, Andrew B; Frayling, Timothy M

2013-01-01

Genome-wide association (GWA) studies have been limited by the reliance on common variants present on microarrays or imputable from the HapMap Project data. More recently, the completion of the 1000 Genomes Project has provided variant and haplotype information for several million variants derived from sequencing over 1,000 individuals. To help understand the extent to which more variants (including low frequency (1% ≤ MAF […] 1000 Genomes imputation, respectively, and 9 and 11 that reached a stricter, likely conservative, threshold of P […] 1000 Genomes genotype data modestly improved the strength of known associations. Of 20 associations detected at P […] 1000 Genomes imputed data and one was nominally more strongly associated in HapMap imputed data. We also detected an association between a low frequency variant and phenotype that was previously missed by HapMap based imputation approaches. An association between rs112635299 and alpha-1 globulin near the SERPINA gene represented the known association between rs28929474 (MAF = 0.007) and alpha1-antitrypsin that predisposes to emphysema (P = 2.5 × 10^-12). Our data provide important proof of principle that 1000 Genomes imputation will detect novel, low frequency-large effect associations.
Two-pass imputation algorithm for missing value estimation in gene expression time series.

Science.gov (United States)

Tsiporkova, Elena; Boeva, Veselka

2007-10-01

Gene expression microarray experiments frequently generate datasets with multiple values missing. However, most of the analysis, mining, and classification methods for gene expression data require a complete matrix of gene array values. Therefore, the accurate estimation of missing values in such datasets has been recognized as an important issue, and several imputation algorithms have already been proposed to the biological community. Most of these approaches, however, are not particularly suitable for time series expression profiles. In view of this, we propose a novel imputation algorithm, which is specially suited for the estimation of missing values in gene expression time series data. The algorithm utilizes Dynamic Time Warping (DTW) distance in order to measure the similarity between time expression profiles, and subsequently selects for each gene expression profile with missing values a dedicated set of candidate profiles for estimation. Three different DTW-based imputation (DTWimpute) algorithms have been considered: position-wise, neighborhood-wise, and two-pass imputation. These have initially been prototyped in Perl, and their accuracy has been evaluated on yeast expression time series data using several different parameter settings. The experiments have shown that the two-pass algorithm consistently outperforms, in particular for datasets with a higher level of missing entries, the neighborhood-wise and the position-wise algorithms. The performance of the two-pass DTWimpute algorithm has further been benchmarked against the weighted K-Nearest Neighbors algorithm, which is widely used in the biological community; the former algorithm has appeared superior to the latter one. Motivated by these findings, indicating clearly the added value of the DTW techniques for missing value estimation in time series data, we have built an optimized C++ implementation of the two-pass DTWimpute algorithm. The software also provides for a choice between three different
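The Dynamic Time Warping distance that the abstract above uses to compare expression profiles can be sketched with the classic dynamic-programming recurrence. This is a generic textbook DTW with an absolute-difference local cost, not the authors' DTWimpute code; the function name and the example profiles are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]

# Rank hypothetical candidate profiles by similarity to a target profile.
target = [1.0, 2.0, 3.0]
candidates = {"geneA": [1.0, 2.0, 2.0, 3.0], "geneB": [3.0, 2.0, 1.0]}
ranked = sorted(candidates, key=lambda g: dtw_distance(target, candidates[g]))
print(ranked)
```

Because warping lets one time point align with several, a profile that is merely stretched in time (geneA here) has distance zero to the target, whereas a point-by-point comparison would penalize it.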
Inclusion of Population-specific Reference Panel from India to the 1000 Genomes Phase 3 Panel Improves Imputation Accuracy.

Science.gov (United States)

Ahmad, Meraj; Sinha, Anubhav; Ghosh, Sreya; Kumar, Vikrant; Davila, Sonia; Yajnik, Chittaranjan S; Chandak, Giriraj R

2017-07-27

Imputation is a computational method based on the principle of haplotype sharing allowing enrichment of genome-wide association study datasets. It depends on the haplotype structure of the population and density of the genotype data. The 1000 Genomes Project led to the generation of imputation reference panels which have been used globally. However, recent studies have shown that population-specific panels provide better enrichment of genome-wide variants. We compared the imputation accuracy using the 1000 Genomes phase 3 reference panel and a panel generated from genome-wide data on 407 individuals from Western India (WIP). The concordance of imputed variants was cross-checked with next-generation re-sequencing data on a subset of genomic regions. Further, using the genome-wide data from 1880 individuals, we demonstrate that WIP works better than the 1000 Genomes phase 3 panel and, when merged with it, significantly improves the imputation accuracy throughout the minor allele frequency range. We also show that imputation using only the South Asian component of the 1000 Genomes phase 3 panel works as well as the merged panel, making it a computationally less intensive job. Thus, our study stresses that imputation accuracy using the 1000 Genomes phase 3 panel can be further improved by including population-specific reference panels from South Asia.
Imputing data that are missing at high rates using a boosting algorithm

Energy Technology Data Exchange (ETDEWEB)

Cauthen, Katherine Regina [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Lambert, Gregory [Apple Inc., Cupertino, CA (United States); Ray, Jaideep [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Lefantzi, Sophia [Sandia National Lab. (SNL-CA), Livermore, CA (United States)

2016-09-01

Traditional multiple imputation approaches may perform poorly for datasets with high rates of missingness unless many m imputations are used. This paper implements an alternative machine learning-based approach to imputing data that are missing at high rates. Here, we use boosting to create a strong learner from a weak learner fitted to a dataset missing many observations. This approach may be applied to a variety of types of learners (models). The approach is demonstrated by application to a spatiotemporal dataset for predicting dengue outbreaks in India from meteorological covariates. A Bayesian spatiotemporal CAR model is boosted to produce imputations, and the overall RMSE from a k-fold cross-validation is used to assess imputation accuracy.
Performance of genotype imputation for low frequency and rare variants from the 1000 genomes.

Science.gov (United States)

Zheng, Hou-Feng; Rong, Jing-Jing; Liu, Ming; Han, Fang; Zhang, Xing-Wei; Richards, J Brent; Wang, Li

2015-01-01

Genotype imputation is now routinely applied in genome-wide association studies (GWAS) and meta-analyses. However, most of the imputations have been run using HapMap samples as reference, imputation of low frequency and rare variants (minor allele frequency (MAF) […] 1000 Genomes panel) are available to facilitate imputation of these variants. Therefore, in order to estimate the performance of low frequency and rare variant imputation, we imputed 153 individuals, each of whom had 3 different genotype array data sets including 317k, 610k and 1 million SNPs, to three different reference panels: the 1000 Genomes pilot March 2010 release (1KGpilot), the 1000 Genomes interim August 2010 release (1KGinterim), and the 1000 Genomes phase1 November 2010 and May 2011 release (1KGphase1) by using IMPUTE version 2. The differences between these three releases of the 1000 Genomes data are the sample size, ancestry diversity, number of variants and their frequency spectrum. We found that both reference panel and GWAS chip density affect the imputation of low frequency and rare variants. 1KGphase1 outperformed the other 2 panels, with a higher concordance rate, a higher proportion of well-imputed variants (info > 0.4) and a higher mean info score in each MAF bin. Similarly, the 1M chip array outperformed 610K and 317K. However, for very rare variants (MAF ≤ 0.3%), only 0-1% of the variants were well imputed. We conclude that the imputation of low frequency and rare variants improves with larger reference panels and higher density of genome-wide genotyping arrays. Yet, despite a large reference panel size and dense genotyping density, very rare variants remain difficult to impute.

Public Undertakings and Imputability

DEFF Research Database (Denmark)

Ølykke, Grith Skovgaard

2013-01-01

In this article, the issue of imputability to the State of public undertakings’ decision-making is analysed and discussed in the context of the DSBFirst case. DSBFirst is owned by the independent public undertaking DSB and the private undertaking FirstGroup plc and won the contracts in the 2008...... Oeresund tender for the provision of passenger transport by railway. From the start, the services were provided at a loss, and in the end a part of DSBFirst was wound up. In order to frame the problems illustrated by this case, the jurisprudence-based imputability requirement in the definition of State aid...... in Article 107(1) TFEU is analysed. It is concluded that where the public undertaking transgresses the control system put in place by the State, conditions for imputability are not fulfilled, and it is argued that in the current state of law, there is no conditional link between the level of control...
Partial F-tests with multiply imputed data in the linear regression framework via coefficient of determination.

Science.gov (United States)

Chaurasia, Ashok; Harel, Ofer

2015-02-10

Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
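The link between the coefficient of determination and the F-statistic that this abstract builds on can be written out for complete data. The sketch below shows only the standard complete-data identity for nested linear models; how the paper pools across imputed datasets is not stated in the abstract and is not reproduced here. The function name and example values are hypothetical.

```python
def partial_f(r2_full, r2_reduced, n, p_full, q):
    """Partial F-statistic for q extra predictors, from the R^2 of nested linear models.

    n      : number of observations
    p_full : number of predictors in the full model (intercept not counted)
    q      : number of predictors tested (present in full, absent in reduced)
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - p_full - 1)
    return numerator / denominator

# Hypothetical example: one added predictor raises R^2 from 0.40 to 0.50.
f_partial = partial_f(r2_full=0.50, r2_reduced=0.40, n=103, p_full=2, q=1)

# The global F-test is the special case with an intercept-only reduced model.
f_global = partial_f(r2_full=0.50, r2_reduced=0.0, n=103, p_full=2, q=2)
print(f_partial, f_global)
```

The statistic is referred to an F distribution with q and n - p_full - 1 degrees of freedom; only scalars are needed, which is the computational appeal the abstract highlights.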
Construction and application of a Korean reference panel for imputing classical alleles and amino acids of human leukocyte antigen genes.

Science.gov (United States)

Kim, Kwangwoo; Bang, So-Young; Lee, Hye-Soon; Bae, Sang-Cheol

2014-01-01

Genetic variations of human leukocyte antigen (HLA) genes within the major histocompatibility complex (MHC) locus are strongly associated with disease susceptibility and prognosis for many diseases, including many autoimmune diseases. In this study, we developed a Korean HLA reference panel for imputing classical alleles and amino acid residues of several HLA genes. An HLA reference panel has potential for use in identifying and fine-mapping disease associations with the MHC locus in East Asian populations, including Koreans. A total of 413 unrelated Korean subjects were analyzed for single nucleotide polymorphisms (SNPs) at the MHC locus and six HLA genes, including HLA-A, -B, -C, -DRB1, -DPB1, and -DQB1. The HLA reference panel was constructed by phasing the 5,858 MHC SNPs, 233 classical HLA alleles, and 1,387 amino acid residue markers from 1,025 amino acid positions as binary variables. The imputation accuracy of the HLA reference panel was assessed by measuring concordance rates between imputed and genotyped alleles of the HLA genes from a subset of the study subjects and East Asian HapMap individuals. Average concordance rates were 95.6% and 91.1% at 2-digit and 4-digit allele resolutions, respectively. The imputation accuracy was minimally affected by SNP density of a test dataset for imputation. In conclusion, the Korean HLA reference panel we developed was highly suitable for imputing HLA alleles and amino acids from MHC SNPs in East Asians, including Koreans.
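The concordance rates at 2-digit and 4-digit resolution reported in this abstract can be illustrated with a simplified calculation. The sketch assumes allele names written in colon-separated field notation (for example "A*02:01", where the first field corresponds to 2-digit and the first two fields to 4-digit resolution) and compares one imputed allele with one genotyped allele per position; real evaluations must also handle the two unordered alleles each individual carries. The allele calls below are made up.

```python
def truncate(allele, fields):
    """Reduce an allele name such as 'A*02:01' to its first `fields` fields."""
    gene, _, spec = allele.partition("*")
    return gene + "*" + ":".join(spec.split(":")[:fields])

def concordance(imputed, genotyped, fields):
    """Share of allele calls that agree at the given resolution."""
    pairs = list(zip(imputed, genotyped))
    hits = sum(truncate(a, fields) == truncate(b, fields) for a, b in pairs)
    return hits / len(pairs)

# Hypothetical calls for four alleles.
imputed = ["A*02:01", "A*24:02", "B*15:01", "B*40:01"]
genotyped = ["A*02:01", "A*24:20", "B*15:01", "B*44:02"]

two_digit = concordance(imputed, genotyped, fields=1)
four_digit = concordance(imputed, genotyped, fields=2)
print(two_digit, four_digit)
```

Concordance at 4-digit resolution can never exceed the 2-digit figure, since every 4-digit match is also a 2-digit match; this is consistent with the 95.6% versus 91.1% reported above.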
Accounting for one-channel depletion improves missing value imputation in 2-dye microarray data.

Science.gov (United States)

Ritz, Cecilia; Edén, Patrik

2008-01-19

For 2-dye microarray platforms, some missing values may arise from an un-measurably low RNA expression in one channel only. Information about such "one-channel depletion" is so far not included in algorithms for imputation of missing values. Calculating the mean deviation between imputed values and duplicate controls in five datasets, we show that KNN-based imputation gives a systematic bias of the imputed expression values of one-channel depleted spots. Evaluating the correction of this bias by cross-validation showed that the mean square deviation between imputed values and duplicates was reduced by up to 51%, depending on dataset. By including more information in the imputation step, we more accurately estimate missing expression values.
SigEMD: A powerful method for differential gene expression analysis in single-cell RNA sequencing data.

Science.gov (United States)

Wang, Tianyu; Nabavi, Sheida

2018-04-24

Differential gene expression analysis is one of the significant efforts in single cell RNA sequencing (scRNAseq) analysis to discover the specific changes in expression levels of individual cell types. Since scRNAseq exhibits multimodality, large amounts of zero counts, and sparsity, it is different from the traditional bulk RNA sequencing (RNAseq) data. The new challenges of scRNAseq data promote the development of new methods for identifying differentially expressed (DE) genes. In this study, we proposed a new method, SigEMD, that combines a data imputation approach, a logistic regression model and a nonparametric method based on the Earth Mover's Distance, to precisely and efficiently identify DE genes in scRNAseq data. The regression model and data imputation are used to reduce the impact of large amounts of zero counts, and the nonparametric method is used to improve the sensitivity of detecting DE genes from multimodal scRNAseq data. By additionally employing gene interaction network information to adjust the final states of DE genes, we further reduce the false positives of calling DE genes. We used simulated datasets and real datasets to evaluate the detection accuracy of the proposed method and to compare its performance with those of other differential expression analysis methods. Results indicate that the proposed method has an overall powerful performance in terms of precision in detection, sensitivity, and specificity. Copyright © 2018 Elsevier Inc. All rights reserved.
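The Earth Mover's Distance named in this abstract has a simple closed form in one dimension when both samples have the same size: sort both samples and average the absolute differences of matching order statistics. The sketch below shows only that special case as an illustration; it is not the SigEMD implementation, and the expression values are made up.

```python
def emd_1d(x, y):
    """Earth Mover's (Wasserstein-1) distance between two equal-sized 1-D samples."""
    if len(x) != len(y):
        raise ValueError("this simple form assumes samples of equal size")
    xs, ys = sorted(x), sorted(y)
    # With equal sizes and uniform weights, the optimal transport plan
    # pairs the k-th smallest value of x with the k-th smallest of y.
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

# Hypothetical expression values for one gene in two cell groups.
group_a = [0.0, 0.0, 0.0, 4.0]   # mostly zero counts, one high-expressing cell
group_b = [1.0, 1.0, 1.0, 1.0]   # uniformly low expression
distance = emd_1d(group_a, group_b)
print(distance)
```

The two groups have the same mean, so a comparison of means would see no difference, while the distance between the distributions is clearly nonzero. This is why a distribution-level measure suits multimodal, zero-heavy scRNAseq data.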
Multiple imputation strategies for zero-inflated cost data in economic evaluations : which method works best?

NARCIS (Netherlands)

MacNeil Vroomen, Janet; Eekhout, Iris; Dijkgraaf, Marcel G; van Hout, Hein; de Rooij, Sophia E; Heymans, Martijn W; Bosmans, Judith E

2016-01-01

Cost and effect data often have missing data because economic evaluations are frequently added onto clinical studies where cost data are rarely the primary outcome. The objective of this article was to investigate which multiple imputation strategy is most appropriate to use for missing
Random Forest as an Imputation Method for Education and Psychology Research: Its Impact on Item Fit and Difficulty of the Rasch Model

Science.gov (United States)

Golino, Hudson F.; Gomes, Cristiano M. A.

2016-01-01

This paper presents a non-parametric imputation technique, named random forest, from the machine learning field. The random forest procedure has two main tuning parameters: the number of trees grown in the prediction and the number of predictors used. Fifty experimental conditions were created in the imputation procedure, with different…
Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

NARCIS (Netherlands)

J. Huang (Jie); B. Howie (Bryan); S. McCarthy (Shane); Y. Memari (Yasin); K. Walter (Klaudia); J.L. Min (Josine L.); P. Danecek (Petr); G. Malerba (Giovanni); E. Trabetti (Elisabetta); H.-F. Zheng (Hou-Feng); G. Gambaro (Giovanni); J.B. Richards (Brent); R. Durbin (Richard); N.J. Timpson (Nicholas); J. Marchini (Jonathan); N. Soranzo (Nicole); S.H. Al Turki (Saeed); A. Amuzu (Antoinette); C. Anderson (Carl); R. Anney (Richard); D. Antony (Dinu); M.S. Artigas; M. Ayub (Muhammad); S. Bala (Senduran); J.C. Barrett (Jeffrey); I.E. Barroso (Inês); P.L. Beales (Philip); M. Benn (Marianne); J. Bentham (Jamie); S. Bhattacharya (Shoumo); E. Birney (Ewan); D.H.R. Blackwood (Douglas); M. Bobrow (Martin); E. Bochukova (Elena); P.F. Bolton (Patrick F.); R. Bounds (Rebecca); C. Boustred (Chris); G. Breen (Gerome); M. Calissano (Mattia); K. Carss (Keren); J.P. Casas (Juan Pablo); J.C. Chambers (John C.); R. Charlton (Ruth); K. Chatterjee (Krishna); L. Chen (Lu); A. Ciampi (Antonio); S. Cirak (Sebahattin); P. Clapham (Peter); G. Clement (Gail); G. Coates (Guy); M. Cocca (Massimiliano); D.A. Collier (David); C. Cosgrove (Catherine); T. Cox (Tony); N.J. Craddock (Nick); L. Crooks (Lucy); S. Curran (Sarah); D. Curtis (David); A. Daly (Allan); I.N.M. Day (Ian N.M.); A.G. Day-Williams (Aaron); G.V. Dedoussis (George); T. Down (Thomas); Y. Du (Yuanping); C.M. van Duijn (Cornelia); I. Dunham (Ian); T. Edkins (Ted); R. Ekong (Rosemary); P. Ellis (Peter); D.M. Evans (David); I.S. Farooqi (I. Sadaf); D.R. Fitzpatrick (David R.); P. Flicek (Paul); J. Floyd (James); A.R. Foley (A. Reghan); C.S. Franklin (Christopher S.); M. Futema (Marta); L. Gallagher (Louise); P. Gasparini (Paolo); T.R. Gaunt (Tom); M. Geihs (Matthias); D. Geschwind (Daniel); C.M.T. Greenwood (Celia); H. Griffin (Heather); D. Grozeva (Detelina); X. Guo (Xiaosen); X. Guo (Xueqin); H. Gurling (Hugh); D. Hart (Deborah); A.E. Hendricks (Audrey E.); P.A. Holmans (Peter A.); L. Huang (Liren); T. Hubbard (Tim); S.E. 
Humphries (Steve E.); M.E. Hurles (Matthew); P.G. Hysi (Pirro); V. Iotchkova (Valentina); A. Isaacs (Aaron); D.K. Jackson (David K.); Y. Jamshidi (Yalda); J. Johnson (Jon); C. Joyce (Chris); K.J. Karczewski (Konrad); J. Kaye (Jane); T. Keane (Thomas); J.P. Kemp (John); K. Kennedy (Karen); A. Kent (Alastair); J. Keogh (Julia); F. Khawaja (Farrah); M.E. Kleber (Marcus); M. Van Kogelenberg (Margriet); A. Kolb-Kokocinski (Anja); J.S. Kooner (Jaspal S.); G. Lachance (Genevieve); C. Langenberg (Claudia); C. Langford (Cordelia); D. Lawson (Daniel); I. Lee (Irene); E.M. van Leeuwen (Elisa); M. Lek (Monkol); R. Li (Rui); Y. Li (Yingrui); J. Liang (Jieqin); H. Lin (Hong); R. Liu (Ryan); J. Lönnqvist (Jouko); L.R. Lopes (Luis R.); M.C. Lopes (Margarida); J. Luan; D.G. MacArthur (Daniel G.); M. Mangino (Massimo); G. Marenne (Gaëlle); W. März (Winfried); J. Maslen (John); A. Matchan (Angela); I. Mathieson (Iain); P. McGuffin (Peter); A.M. McIntosh (Andrew); A.G. McKechanie (Andrew G.); A. McQuillin (Andrew); S. Metrustry (Sarah); N. Migone (Nicola); H.M. Mitchison (Hannah M.); A. Moayyeri (Alireza); J. Morris (James); R. Morris (Richard); D. Muddyman (Dawn); F. Muntoni; B.G. Nordestgaard (Børge G.); K. Northstone (Kate); M.C. O'donovan (Michael); S. O'Rahilly (Stephen); A. Onoufriadis (Alexandros); K. Oualkacha (Karim); M.J. Owen (Michael J.); A. Palotie (Aarno); K. Panoutsopoulou (Kalliope); V. Parker (Victoria); J.R. Parr (Jeremy R.); L. Paternoster (Lavinia); T. Paunio (Tiina); F. Payne (Felicity); S.J. Payne (Stewart J.); J.R.B. Perry (John); O.P.H. Pietiläinen (Olli); V. Plagnol (Vincent); R.C. Pollitt (Rebecca C.); S. Povey (Sue); M.A. Quail (Michael A.); L. Quaye (Lydia); L. Raymond (Lucy); K. Rehnström (Karola); C.K. Ridout (Cheryl K.); S.M. Ring (Susan); G.R.S. Ritchie (Graham R.S.); N. Roberts (Nicola); R.L. Robinson (Rachel L.); D.B. Savage (David); P.J. Scambler (Peter); S. Schiffels (Stephan); M. Schmidts (Miriam); N. Schoenmakers (Nadia); R.H. 
Scott (Richard H.); R.A. Scott (Robert); R.K. Semple (Robert K.); E. Serra (Eva); S.I. Sharp (Sally I.); A.C. Shaw (Adam C.); H.A. Shihab (Hashem A.); S.-Y. Shin (So-Youn); D. Skuse (David); K.S. Small (Kerrin); C. Smee (Carol); G.D. Smith; L. Southam (Lorraine); O. Spasic-Boskovic (Olivera); T.D. Spector (Timothy); D. St. Clair (David); B. St Pourcain (Beate); J. Stalker (Jim); E. Stevens (Elizabeth); J. Sun (Jianping); G. Surdulescu (Gabriela); J. Suvisaari (Jaana); P. Syrris (Petros); I. Tachmazidou (Ioanna); R. Taylor (Rohan); J. Tian (Jing); M.D. Tobin (Martin); D. Toniolo (Daniela); M. Traglia (Michela); A. Tybjaerg-Hansen; A.M. Valdes; A.M. Vandersteen (Anthony M.); A. Varbo (Anette); P. Vijayarangakannan (Parthiban); P.M. Visscher (Peter); L.V. Wain (Louise); J.T. Walters (James); G. Wang (Guangbiao); J. Wang (Jun); Y. Wang (Yu); K. Ward (Kirsten); E. Wheeler (Eleanor); P.H. Whincup (Peter); T. Whyte (Tamieka); H.J. Williams (Hywel J.); K.A. Williamson (Kathleen); C. Wilson (Crispian); S.G. Wilson (Scott); K. Wong (Kim); C. Xu (Changjiang); J. Yang (Jian); G. Zaza (Gianluigi); E. Zeggini (Eleftheria); F. Zhang (Feng); P. Zhang (Pingbo); W. Zhang (Weihua)

2015-01-01

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced
Imputed prices of greenhouse gases and land forests

International Nuclear Information System (INIS)

Uzawa, Hirofumi

1993-01-01

The theory of dynamic optimum formulated by Maeler gives us the basic theoretical framework within which it is possible to analyse the economic and, possibly, political circumstances under which the phenomenon of global warming occurs, and to search for the policy and institutional arrangements whereby it would be effectively arrested. The analysis developed here is an application of Maeler's theory to atmospheric quality. In the analysis a central role is played by the concept of imputed price in the dynamic context. Our determination of imputed prices of atmospheric carbon dioxide and land forests takes into account the difference in the stages of economic development. Indeed, the ratios of the imputed prices of atmospheric carbon dioxide and land forests over the per capita level of real national income are identical for all countries involved. (3 figures, 2 tables) (Author)
An efficient method to transcription factor binding sites imputation via simultaneous completion of multiple matrices with positional consistency.

Science.gov (United States)

Guo, Wei-Li; Huang, De-Shuang

2017-08-22

Transcription factors (TFs) are DNA-binding proteins that have a central role in regulating gene expression. Identification of DNA-binding sites of TFs is a key task in understanding transcriptional regulation, cellular processes and disease. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) enables genome-wide identification of in vivo TF binding sites. However, it is still difficult to map every TF in every cell line owing to cost and biological material availability, which poses an enormous obstacle for integrated analysis of gene regulation. To address this problem, we propose a novel computational approach, TFBSImpute, for predicting additional TF binding profiles by leveraging information from available ChIP-seq TF binding data. TFBSImpute fuses the dataset to a 3-mode tensor and imputes missing TF binding signals via simultaneous completion of multiple TF binding matrices with positional consistency. We show that signals predicted by our method achieve overall similarity with experimental data and that TFBSImpute significantly outperforms baseline approaches, by assessing the performance of imputation methods against observed ChIP-seq TF binding profiles. Besides, motif analysis shows that TFBSImpute performs better in capturing binding motifs enriched in observed data compared with baselines, indicating that the higher performance of TFBSImpute is not simply due to averaging related samples. We anticipate that our approach will constitute a useful complement to experimental mapping of TF binding, which is beneficial for further study of regulation mechanisms and disease.
A Time-Series Water Level Forecasting Model Based on Imputation and Variable Selection Method.

Science.gov (United States)

Yang, Jun-He; Cheng, Ching-Hsue; Chan, Chia-Pan

2017-01-01

Reservoirs are important for households and impact the national economy. This paper proposed a time-series forecasting model based on estimating a missing value followed by variable selection to forecast the reservoir's water level. This study collected data from the Taiwan Shimen Reservoir as well as daily atmospheric data from 2008 to 2015. The two datasets are concatenated into an integrated dataset based on ordering of the data as a research dataset. The proposed time-series forecasting model summarily has three foci. First, this study uses five imputation methods to directly delete the missing value. Second, we identified the key variable via factor analysis and then deleted the unimportant variables sequentially via the variable selection method. Finally, the proposed model uses a Random Forest to build the forecasting model of the reservoir's water level. This was done to compare with the listing method under the forecasting error. These experimental results indicate that the Random Forest forecasting model when applied to variable selection with full variables has better forecasting performance than the listing model. In addition, this experiment shows that the proposed variable selection can help determine five forecast methods used here to improve the forecasting capability.
Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

DEFF Research Database (Denmark)

Huang, Jie; Howie, Bryan; Mccarthy, Shane

2015-01-01

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low de...
Practical considerations for sensitivity analysis after multiple imputation applied to epidemiological studies with incomplete data

Science.gov (United States)

2012-01-01

Background Multiple Imputation as usually implemented assumes that data are Missing At Random (MAR), meaning that the underlying missing data mechanism, given the observed data, is independent of the unobserved data. To explore the sensitivity of the inferences to departures from the MAR assumption, we applied the method proposed by Carpenter et al. (2007). This approach aims to approximate inferences under a Missing Not At Random (MNAR) mechanism by reweighting estimates obtained after multiple imputation, where the weights depend on the assumed degree of departure from the MAR assumption. Methods The method is illustrated with epidemiological data from a surveillance system of hepatitis C virus (HCV) infection in France during the 2001–2007 period. The subpopulation studied included 4343 HCV infected patients who reported drug use. Risk factors for severe liver disease were assessed. After performing complete-case and multiple imputation analyses, we applied the sensitivity analysis to 3 risk factors of severe liver disease: past excessive alcohol consumption, HIV co-infection and infection with HCV genotype 3. Results In these data, the association between severe liver disease and HIV was underestimated if, given the observed data, the chance of observing HIV status is high when it is positive. Inferences for the two other risk factors were robust to plausible local departures from the MAR assumption. Conclusions We have demonstrated the practical utility of, and advocate, a pragmatic widely applicable approach to exploring plausible departures from the MAR assumption post multiple imputation. We have developed guidelines for applying this approach to epidemiological studies. PMID:22681630
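The reweighting idea in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: it assumes a logistic selection model in which the log-odds of a value being observed changes by delta per unit of that value, so that imputed data set m gets a weight proportional to exp(-delta * s_m), where s_m is the sum of the values imputed in that data set. The function name, argument names and numbers are hypothetical.

```python
import math


def mnar_weighted_estimate(estimates, imputed_sums, delta):
    """Reweight per-imputation estimates for an assumed departure from MAR.

    estimates    -- one point estimate per imputed data set (obtained under MAR)
    imputed_sums -- sum of the imputed values in each data set (assumed input)
    delta        -- assumed change in the log-odds of being observed per unit
                    of the variable; delta = 0 corresponds to MAR
    """
    log_w = [-delta * s for s in imputed_sums]
    shift = max(log_w)  # subtract the maximum before exponentiating, for stability
    raw = [math.exp(lw - shift) for lw in log_w]
    total = sum(raw)
    weights = [r / total for r in raw]
    estimate = sum(w * e for w, e in zip(weights, estimates))
    return estimate, weights


# Hypothetical numbers: five imputations of a log odds ratio.
estimates = [0.40, 0.55, 0.48, 0.62, 0.51]
imputed_sums = [12.0, 15.0, 13.0, 17.0, 14.0]

mar_estimate, _ = mnar_weighted_estimate(estimates, imputed_sums, delta=0.0)
mnar_estimate, mnar_weights = mnar_weighted_estimate(estimates, imputed_sums, delta=0.3)
```

With delta = 0 every imputation gets the same weight and the result is the ordinary average of the MAR estimates; varying delta over a plausible range shows how far the estimate moves as the assumed departure from MAR grows.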
DTW-APPROACH FOR UNCORRELATED MULTIVARIATE TIME SERIES IMPUTATION

OpenAIRE

Phan, Thi-Thu-Hong; Poisson Caillault, Emilie; Bigand, André; Lefebvre, Alain

2017-01-01

Missing data are inevitable in almost all domains of applied sciences. Data analysis with missing values can lead to a loss of efficiency and unreliable results, especially for large missing sub-sequence(s). Some well-known methods for multivariate time series imputation require high correlations between series or their features. In this paper, we propose an approach based on the shape-behaviour relation in low/un-correlated multivariate time series under an assumption of...
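The record above is truncated, so the authors' procedure is not reproduced here. As background, the sketch below shows only the classic dynamic time warping (DTW) cost that gives the approach its name, plus a hypothetical helper that scans a series for the window most similar to a query pattern. How such a window would then be used to fill a gap is an assumption left out of the sketch.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def most_similar_window(series, query):
    """Return (start index, cost) of the window of len(query) closest to query."""
    width = len(query)
    best_start, best_cost = None, float("inf")
    for start in range(len(series) - width + 1):
        c = dtw_distance(series[start:start + width], query)
        if c < best_cost:  # strict comparison keeps the earliest best window
            best_start, best_cost = start, c
    return best_start, best_cost


# Hypothetical series and query pattern.
series = [0, 1, 2, 3, 2, 1, 0, 5, 5]
start, cost = most_similar_window(series, [3, 2, 1])
```

Because DTW allows one point to align with several points of the other sequence, two series with the same shape but different local timing can still have zero cost, which is why it suits shape-based matching when correlations between series are weak.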
48 CFR 1830.7002-4 - Determining imputed cost of money.

Science.gov (United States)

2010-10-01

... money. 1830.7002-4 Section 1830.7002-4 Federal Acquisition Regulations System NATIONAL AERONAUTICS AND... Determining imputed cost of money. (a) Determine the imputed cost of money for an asset under construction, fabrication, or development by applying a cost of money rate (see 1830.7002-2) to the representative...
Multiple imputation to account for missing data in a survey: estimating the prevalence of osteoporosis.

Science.gov (United States)

Kmetic, Andrew; Joseph, Lawrence; Berger, Claudie; Tenenhouse, Alan

2002-07-01

Nonresponse bias is a concern in any epidemiologic survey in which a subset of selected individuals declines to participate. We reviewed multiple imputation, a widely applicable and easy to implement Bayesian methodology to adjust for nonresponse bias. To illustrate the method, we used data from the Canadian Multicentre Osteoporosis Study, a large cohort study of 9423 randomly selected Canadians, designed in part to estimate the prevalence of osteoporosis. Although subjects were randomly selected, only 42% of individuals who were contacted agreed to participate fully in the study. The study design included a brief questionnaire for those invitees who declined further participation in order to collect information on the major risk factors for osteoporosis. These risk factors (which included age, sex, previous fractures, family history of osteoporosis, and current smoking status) were then used to estimate the missing osteoporosis status for nonparticipants using multiple imputation. Both ignorable and nonignorable imputation models are considered. Our results suggest that selection bias in the study is of concern, but only slightly, in very elderly (age 80+ years), both women and men. Epidemiologists should consider using multiple imputation more often than is current practice.
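The abstract above does not say how the imputed data sets were combined. A common choice is Rubin's rules, sketched here under that assumption: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the average within-imputation variance plus (1 + 1/M) times the between-imputation variance. The prevalence figures are made up for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool M per-imputation estimates with Rubin's rules.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    w = mean(within_variances)         # average within-imputation variance
    b = variance(estimates)            # between-imputation variance (divides by M - 1)
    total = w + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical prevalence estimates from five imputed data sets.
prevalences = [0.15, 0.17, 0.16, 0.18, 0.14]
variances = [0.0004] * 5

pooled, total_var = pool_rubin(prevalences, variances)
```

The between-imputation term is what distinguishes multiple from single imputation: it carries the extra uncertainty caused by not knowing the missing values.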
Different methods for analysing and imputation missing values in wind speed series; La problematica de la calidad de la informacion en series de velocidad del viento-metodologias de analisis y imputacion de datos faltantes

Energy Technology Data Exchange (ETDEWEB)

Ferreira, A. M.

2004-07-01

This study concerns different methods for analysing and imputing missing values in wind speed series. The EM algorithm and a methodology derived from the sequential hot deck have been used. Series with imputed missing values are compared with the original, complete series using several criteria, such as the wind potential, and there appears to be a significant goodness of fit between the estimates and the real values. (Author)
TRIP: An interactive retrieving-inferring data imputation approach

KAUST Repository

Li, Zhixu

2016-06-25

Data imputation aims at filling in missing attribute values in databases. Existing imputation approaches to non-quantitative string data can be roughly put into two categories: (1) inferring-based approaches [2], and (2) retrieving-based approaches [1]. Specifically, the inferring-based approaches find substitutes or estimations for the missing ones from the complete part of the data set. However, they typically fall short in filling in unique missing attribute values which do not exist in the complete part of the data set [1]. The retrieving-based approaches resort to external resources for help by formulating proper web search queries to retrieve web pages containing the missing values from the Web, and then extracting the missing values from the retrieved web pages [1]. This web-based retrieving approach reaches a high imputation precision and recall, but on the other hand, issues a large number of web search queries, which brings a large overhead [1]. © 2016 IEEE.
TRIP: An interactive retrieving-inferring data imputation approach

KAUST Repository

Li, Zhixu; Qin, Lu; Cheng, Hong; Zhang, Xiangliang; Zhou, Xiaofang

2016-01-01

Data imputation aims at filling in missing attribute values in databases. Existing imputation approaches to non-quantitative string data can be roughly put into two categories: (1) inferring-based approaches [2], and (2) retrieving-based approaches [1]. Specifically, the inferring-based approaches find substitutes or estimations for the missing ones from the complete part of the data set. However, they typically fall short in filling in unique missing attribute values which do not exist in the complete part of the data set [1]. The retrieving-based approaches resort to external resources for help by formulating proper web search queries to retrieve web pages containing the missing values from the Web, and then extracting the missing values from the retrieved web pages [1]. This web-based retrieving approach reaches a high imputation precision and recall, but on the other hand, issues a large number of web search queries, which brings a large overhead [1]. © 2016 IEEE.

Genotype Imputation for Latinos Using the HapMap and 1000 Genomes Project Reference Panels

Directory of Open Access Journals (Sweden)

Xiaoyi Gao

2012-06-01

Genotype imputation is a vital tool in genome-wide association studies (GWAS) and meta-analyses of multiple GWAS results. Imputation enables researchers to increase genomic coverage and to pool data generated using different genotyping platforms. HapMap samples are often employed as the reference panel. More recently, the 1000 Genomes Project resource is becoming the primary source for reference panels. Multiple GWAS and meta-analyses are targeting Latinos, the most populous and fastest growing minority group in the US. However, genotype imputation resources for Latinos are rather limited compared to individuals of European ancestry at present, largely because of the lack of good reference data. One choice of reference panel for Latinos is one derived from the population of Mexican individuals in Los Angeles contained in the HapMap Phase 3 project and the 1000 Genomes Project. However, a detailed evaluation of the quality of the imputed genotypes derived from the public reference panels has not yet been reported. Using simulation studies, the Illumina OmniExpress GWAS data from the Los Angeles Latino Eye Study and the MACH software package, we evaluated the accuracy of genotype imputation in Latinos. Our results show that the 1000 Genomes Project AMR+CEU+YRI reference panel provides the highest imputation accuracy for Latinos, and that also including Asian samples in the panel can reduce imputation accuracy. We also provide the imputation accuracy for each autosomal chromosome using the 1000 Genomes Project panel for Latinos. Our results serve as a guide to future imputation-based analysis in Latinos.
Imputation of genotypes in Danish two-way crossbred pigs using low density panels

DEFF Research Database (Denmark)

Xiang, Tao; Christensen, Ole Fredslund; Legarra, Andres

Genotype imputation is commonly used as an initial step of genomic selection. Studies on humans, plants and ruminants suggested many factors would affect the performance of imputation. However, studies rarely investigated pigs, especially crossbred pigs. In this study, different scenarios...... of imputation from 5K SNPs to 7K SNPs on Danish Landrace, Yorkshire, and crossbred Landrace-Yorkshire were compared. In conclusion, genotype imputation on crossbreds performs equally well as in purebreds, when parental breeds are used as the reference panel. When the size of reference is considerably large...... SNPs. This dataset will be analyzed for genomic selection in a future study...
A Time-Series Water Level Forecasting Model Based on Imputation and Variable Selection Method

Directory of Open Access Journals (Sweden)

Jun-He Yang

2017-01-01

Reservoirs are important for households and impact the national economy. This paper proposed a time-series forecasting model based on estimating a missing value followed by variable selection to forecast the reservoir’s water level. This study collected data from the Taiwan Shimen Reservoir as well as daily atmospheric data from 2008 to 2015. The two datasets are concatenated into an integrated dataset based on ordering of the data as a research dataset. The proposed time-series forecasting model summarily has three foci. First, this study uses five imputation methods to directly delete the missing value. Second, we identified the key variable via factor analysis and then deleted the unimportant variables sequentially via the variable selection method. Finally, the proposed model uses a Random Forest to build the forecasting model of the reservoir’s water level. This was done to compare with the listing method under the forecasting error. These experimental results indicate that the Random Forest forecasting model when applied to variable selection with full variables has better forecasting performance than the listing model. In addition, this experiment shows that the proposed variable selection can help determine five forecast methods used here to improve the forecasting capability.
[Imputing missing data in public health: general concepts and application to dichotomous variables].

Science.gov (United States)

Hernández, Gilma; Moriña, David; Navarro, Albert

The presence of missing data in collected variables is common in health surveys, but the subsequent imputation thereof at the time of analysis is not. Working with imputed data may have certain benefits regarding the precision of the estimators and the unbiased identification of associations between variables. The imputation process is probably still little understood by many non-statisticians, who view this process as highly complex and with an uncertain goal. To clarify these questions, this note aims to provide a straightforward, non-exhaustive overview of the imputation process to enable public health researchers to ascertain its strengths, all in the context of dichotomous variables, which are commonplace in public health. To illustrate these concepts, an example in which missing data is handled by means of simple and multiple imputation is introduced. Copyright © 2017 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
Defining, evaluating, and removing bias induced by linear imputation in longitudinal clinical trials with MNAR missing data.

Science.gov (United States)

Helms, Ronald W; Reece, Laura Helms; Helms, Russell W; Helms, Mary W

2011-03-01

Missing not at random (MNAR) post-dropout missing data from a longitudinal clinical trial result in the collection of "biased data," which leads to biased estimators and tests of corrupted hypotheses. In a full rank linear model analysis the model equation, E[Y] = Xβ, leads to the definition of the primary parameter β = (X'X)^{-1}X'E[Y], and the definition of linear secondary parameters of the form θ = Lβ = L(X'X)^{-1}X'E[Y], including, for example, a parameter representing a "treatment effect." These parameters depend explicitly on E[Y], which raises the question: what is E[Y] when some elements of the incomplete random vector Y are not observed and MNAR, or when such a Y is "completed" via imputation? We develop a rigorous, readily interpretable definition of E[Y] in this context that leads directly to definitions of β, Bias(β̂) = E[β̂] - β, Bias(θ̂) = E[θ̂] - Lβ, and the extent of hypothesis corruption. These definitions provide a basis for evaluating, comparing, and removing biases induced by various linear imputation methods for MNAR incomplete data from longitudinal clinical trials. Linear imputation methods use earlier data from a subject to impute values for post-dropout missing values and include "Last Observation Carried Forward" (LOCF) and "Baseline Observation Carried Forward" (BOCF), among others. We illustrate the methods of evaluating, comparing, and removing biases and the effects of testing corresponding corrupted hypotheses via a hypothetical but very realistic longitudinal analgesic clinical trial.
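The two linear imputation methods named in the abstract above are simple enough to show directly. The sketch below is a minimal illustration, assuming each subject's visits are stored in order with None marking a missing value and that the baseline (first) value is always observed. The function names and pain scores are hypothetical.

```python
def locf(values):
    """Last Observation Carried Forward: fill None with the latest observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def bocf(values):
    """Baseline Observation Carried Forward: fill None with the first (baseline) value."""
    baseline = values[0]  # assumed to be observed
    return [baseline if v is None else v for v in values]


# Hypothetical pain scores for one subject who drops out after the third visit.
pain_scores = [7.0, 5.5, 4.0, None, None]

locf_scores = locf(pain_scores)   # [7.0, 5.5, 4.0, 4.0, 4.0]
bocf_scores = bocf(pain_scores)   # [7.0, 5.5, 4.0, 7.0, 7.0]
```

The two fills differ by three points at the final visit for the same subject, which makes concrete why the choice of linear imputation method changes the estimated treatment effect when dropout is MNAR.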
Imputing amino acid polymorphisms in human leukocyte antigens.

Directory of Open Access Journals (Sweden)

Xiaoming Jia

DNA sequence variation within human leukocyte antigen (HLA) genes mediates susceptibility to a wide range of human diseases. The complex genetic structure of the major histocompatibility complex (MHC) makes it difficult, however, to collect genotyping data in large cohorts. Long-range linkage disequilibrium between HLA loci and SNP markers across the MHC region offers an alternative approach through imputation to interrogate HLA variation in existing GWAS data sets. Here we describe a computational strategy, SNP2HLA, to impute classical alleles and amino acid polymorphisms at class I (HLA-A, -B, -C) and class II (-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1) loci. To characterize performance of SNP2HLA, we constructed two European ancestry reference panels, one based on data collected in HapMap-CEPH pedigrees (90 individuals) and another based on data collected by the Type 1 Diabetes Genetics Consortium (T1DGC, 5,225 individuals). We imputed HLA alleles in an independent data set from the British 1958 Birth Cohort (N = 918) with gold standard four-digit HLA types and SNPs genotyped using the Affymetrix GeneChip 500 K and Illumina Immunochip microarrays. We demonstrate that the sample size of the reference panel, rather than SNP density of the genotyping platform, is critical to achieve high imputation accuracy. Using the larger T1DGC reference panel, the average accuracy at four-digit resolution is 94.7% using the low-density Affymetrix GeneChip 500 K, and 96.7% using the high-density Illumina Immunochip. For amino acid polymorphisms within HLA genes, we achieve 98.6% and 99.3% accuracy using the Affymetrix GeneChip 500 K and Illumina Immunochip, respectively. Finally, we demonstrate how imputation and association testing at amino acid resolution can facilitate fine-mapping of primary MHC association signals, giving a specific example from type 1 diabetes.
Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy.

Science.gov (United States)

Johnson, Eric O; Hancock, Dana B; Levy, Joshua L; Gaddis, Nathan C; Saccone, Nancy L; Bierut, Laura J; Page, Grier P

2013-05-01

A great promise of publicly sharing genome-wide association data is the potential to create composite sets of controls. However, studies often use different genotyping arrays, and imputation to a common set of SNPs has shown substantial bias: a problem which has no broadly applicable solution. Based on the idea that using differing genotyped SNP sets as inputs creates differential imputation errors and thus bias in the composite set of controls, we examined the degree to which each of the following occurs: (1) imputation based on the union of genotyped SNPs (i.e., SNPs available on one or more arrays) results in bias, as evidenced by spurious associations (type 1 error) between imputed genotypes and arbitrarily assigned case/control status; (2) imputation based on the intersection of genotyped SNPs (i.e., SNPs available on all arrays) does not evidence such bias; and (3) imputation quality varies by the size of the intersection of genotyped SNP sets. Imputations were conducted in European Americans and African Americans with reference to HapMap phase II and III data. Imputation based on the union of genotyped SNPs across the Illumina 1M and 550v3 arrays showed spurious associations for 0.2 % of SNPs: ~2,000 false positives per million SNPs imputed. Biases remained problematic for very similar arrays (550v1 vs. 550v3) and were substantial for dissimilar arrays (Illumina 1M vs. Affymetrix 6.0). In all instances, imputing based on the intersection of genotyped SNPs (as few as 30 % of the total SNPs genotyped) eliminated such bias while still achieving good imputation quality.
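The union-versus-intersection distinction in the abstract above comes down to which genotyped SNPs are fed into imputation. The sketch below illustrates the two input sets with made-up rsIDs and checks the arithmetic behind the reported false-positive rate. The array contents and function names are hypothetical, and no imputation is performed.

```python
def pooled_inputs(snp_sets):
    """Union: SNPs genotyped on one or more arrays."""
    return set.union(*snp_sets)


def shared_inputs(snp_sets):
    """Intersection: SNPs genotyped on every array."""
    return set.intersection(*snp_sets)


# Made-up SNP identifiers standing in for two genotyping arrays.
array_a = {"rs1", "rs2", "rs3", "rs4"}
array_b = {"rs2", "rs3", "rs4", "rs5"}

union = pooled_inputs([array_a, array_b])
intersection = shared_inputs([array_a, array_b])
shared_fraction = len(intersection) / len(union)

# 0.2 % of SNPs showing spurious association, expressed per million imputed SNPs.
false_positives_per_million = round(0.2 / 100 * 1_000_000)
```

Restricting the input to the intersection gives every sample the same genotyped SNPs, so imputation errors no longer differ by array. The abstract reports that this removed the bias even when the intersection held as few as 30 % of the genotyped SNPs.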
Missing Value Imputation Based on Gaussian Mixture Model for the Internet of Things

Directory of Open Access Journals (Sweden)

Xiaobo Yan

2015-01-01

This paper addresses missing value imputation for the Internet of Things (IoT). Nowadays, the IoT is widely used in a variety of domains, such as the transportation and logistics domain and the healthcare domain. However, missing values are very common in the IoT for a variety of reasons, so the experimental data are often incomplete. As a result, some work that relies on IoT data cannot be carried out normally, and the accuracy and reliability of the data analysis results are reduced. Based on the characteristics of the data itself and the features of missing data in the IoT, this paper divides the missing data into three types and defines three corresponding missing value imputation problems. Then, we propose three new models to solve the corresponding problems: a model of missing value imputation based on context and linear mean (MCL), a model of missing value imputation based on binary search (MBS), and a model of missing value imputation based on a Gaussian mixture model (MGI). Experimental results showed that the three models can improve the accuracy, reliability, and stability of missing value imputation greatly and effectively.
Whole-Genome Sequencing Coupled to Imputation Discovers Genetic Signals for Anthropometric Traits

NARCIS (Netherlands)

I. Tachmazidou (Ioanna); Süveges, D. (Dániel); J. Min (Josine); G.R.S. Ritchie (Graham R.S.); Steinberg, J. (Julia); K. Walter (Klaudia); V. Iotchkova (Valentina); J.A. Schwartzentruber (Jeremy); J. Huang (Jian); Y. Memari (Yasin); McCarthy, S. (Shane); Crawford, A.A. (Andrew A.); C. Bombieri (Cristina); M. Cocca (Massimiliano); A.-E. Farmaki (Aliki-Eleni); T.R. Gaunt (Tom); P. Jousilahti (Pekka); M.N. Kooijman (Marjolein ); Lehne, B. (Benjamin); G. Malerba (Giovanni); S. Männistö (Satu); A. Matchan (Angela); M.C. Medina-Gomez (Carolina); S. Metrustry (Sarah); A. Nag (Abhishek); I. Ntalla (Ioanna); L. Paternoster (Lavinia); N.W. Rayner (Nigel William); C. Sala (Cinzia); W.R. Scott (William R.); H.A. Shihab (Hashem A.); L. Southam (Lorraine); B. St Pourcain (Beate); M. Traglia (Michela); K. Trajanoska (Katerina); Zaza, G. (Gialuigi); W. Zhang (Weihua); M.S. Artigas; Bansal, N. (Narinder); M. Benn (Marianne); Chen, Z. (Zhongsheng); P. Danecek (Petr); Lin, W.-Y. (Wei-Yu); A. Locke (Adam); J. Luan (Jian'An); A.K. Manning (Alisa); Mulas, A. (Antonella); C. Sidore (Carlo); A. Tybjaerg-Hansen; A. Varbo (Anette); M. Zoledziewska (Magdalena); C. Finan (Chris); Hatzikotoulas, K. (Konstantinos); A.E. Hendricks (Audrey E.); J.P. Kemp (John); A. Moayyeri (Alireza); Panoutsopoulou, K. (Kalliope); Szpak, M. (Michal); S.G. Wilson (Scott); M. Boehnke (Michael); F. Cucca (Francesco); Di Angelantonio, E. (Emanuele); C. Langenberg (Claudia); C.M. Lindgren (Cecilia M.); McCarthy, M.I. (Mark I.); A.P. Morris (Andrew); B.G. Nordestgaard (Børge); R.A. Scott (Robert); M.D. Tobin (Martin); N.J. Wareham (Nick); P.R. Burton (Paul); J.C. Chambers (John); Smith, G.D. (George Davey); G.V. Dedoussis (George); J.F. Felix (Janine); O.H. Franco (Oscar); Gambaro, G. (Giovanni); P. Gasparini (Paolo); C.J. Hammond (Christopher J.); A. Hofman (Albert); V.W.V. Jaddoe (Vincent); M.E. Kleber (Marcus); J.S. Kooner (Jaspal S.); M. Perola (Markus); C.L. Relton (Caroline); S.M. Ring (Susan); F. 
Rivadeneira Ramirez (Fernando); V. Salomaa (Veikko); T.D. Spector (Timothy); O. Stegle (Oliver); D. Toniolo (Daniela); A.G. Uitterlinden (André); I.E. Barroso (Inês); C.M.T. Greenwood (Celia); Perry, J.R.B. (John R.B.); Walker, B.R. (Brian R.); A.S. Butterworth (Adam); Y. Xue (Yali); R. Durbin (Richard); K.S. Small (Kerrin); N. Soranzo (Nicole); N.J. Timpson (Nicholas); E. Zeggini (Eleftheria)

2016-01-01

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the
Whole-Genome Sequencing Coupled to Imputation Discovers Genetic Signals for Anthropometric Traits

DEFF Research Database (Denmark)

Tachmazidou, Ioanna; Süveges, Dániel; Min, Josine L

2017-01-01

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader alleli...
FCMPSO: An Imputation for Missing Data Features in Heart Disease Classification

Science.gov (United States)

Salleh, Mohd Najib Mohd; Ashikin Samat, Nurul

2017-08-01

The application of data mining and machine learning in directing clinical research into possible hidden knowledge is becoming greatly influential in medical areas. Heart disease is a killer disease around the world, and early prevention through efficient methods can help to reduce mortality. Medical data may contain many uncertainties, as they are fuzzy and vague in nature. Imprecise feature data, such as absent or missing values, can affect the quality of classification results. Nevertheless, the other, complete features can still provide information about the incomplete ones. Therefore, an imputation approach based on Fuzzy C-Means and Particle Swarm Optimization (FCMPSO) is developed in the preprocessing stage to help fill in the missing values. Then, the completed dataset is used to train a classification algorithm, a Decision Tree. The experiment uses a Heart Disease dataset, and the performance is analysed using accuracy, precision, and ROC values. Results show that the performance of the Decision Tree increases after the application of FCMPSO for imputation.
Multiple imputation to account for measurement error in marginal structural models

Science.gov (United States)

Edwards, Jessie K.; Cole, Stephen R.; Westreich, Daniel; Crane, Heidi; Eron, Joseph J.; Mathews, W. Christopher; Moore, Richard; Boswell, Stephen L.; Lesko, Catherine R.; Mugavero, Michael J.

2015-01-01

Background Marginal structural models are an important tool for observational studies. These models typically assume that variables are measured without error. We describe a method to account for differential and non-differential measurement error in a marginal structural model. Methods We illustrate the method estimating the joint effects of antiretroviral therapy initiation and current smoking on all-cause mortality in a United States cohort of 12,290 patients with HIV followed for up to 5 years between 1998 and 2011. Smoking status was likely measured with error, but a subset of 3686 patients who reported smoking status on separate questionnaires composed an internal validation subgroup. We compared a standard joint marginal structural model fit using inverse probability weights to a model that also accounted for misclassification of smoking status using multiple imputation. Results In the standard analysis, current smoking was not associated with increased risk of mortality. After accounting for misclassification, current smoking without therapy was associated with increased mortality [hazard ratio (HR): 1.2 (95% CI: 0.6, 2.3)]. The HR for current smoking and therapy (0.4 (95% CI: 0.2, 0.7)) was similar to the HR for no smoking and therapy (0.4; 95% CI: 0.2, 0.6). Conclusions Multiple imputation can be used to account for measurement error in concert with methods for causal inference to strengthen results from observational studies. PMID:26214338
GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies.

Science.gov (United States)

Sulovari, Arvis; Li, Dawei

2014-07-19

Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build among individual studies. Similarly, imputation, commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in use of different genome builds and allele definitions. Incorrect assumptions of identical allele definition among combined GWAS lead to a large portion of discarded genotypes or incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. In this study, we have developed a tool, GACT, which stands for Genome build and Allele definition Conversion Tool, that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality, and our results indicated that inclusion of singletons in the reference had detrimental effects while ambiguous SNPs had no measurable effect. Unexpectedly, exclusion of genotypes with missing rate > 0.001 (40% of study SNPs) showed no significant decrease of imputation quality (even significantly higher when compared to the imputation with singletons in the reference), especially for rare SNPs. GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict, and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP-arrays or deep
Relative efficiency of joint-model and full-conditional-specification multiple imputation when conditional models are compatible: The general location model.

Science.gov (United States)

Seaman, Shaun R; Hughes, Rachael A

2018-06-01

Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with that joint model. We show that this asymptotic equivalence of imputation distributions does not imply that joint model multiple imputation and full-conditional specification multiple imputation will also yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they will be equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, these are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, full-conditional specification multiple imputation is shown to be potentially much more robust than joint model multiple imputation using the restricted general location model to misspecification of that model when there is substantial missingness in the outcome variable.
Towards a more efficient representation of imputation operators in TPOT

OpenAIRE

Garciarena, Unai; Mendiburu, Alexander; Santana, Roberto

2018-01-01

Automated Machine Learning encompasses a set of meta-algorithms intended to design and apply machine learning techniques (e.g., model selection, hyperparameter tuning, model assessment, etc.). TPOT, a software tool for optimizing machine learning pipelines based on genetic programming (GP), is a novel example of this kind of application. Recently we have proposed a way to introduce imputation methods as part of TPOT. While our approach was able to deal with problems with missing data, it can prod...
On multivariate imputation and forecasting of decadal wind speed missing data.

Science.gov (United States)

Wesonga, Ronald

2015-01-01

This paper demonstrates the application of multiple imputation by chained equations and time series forecasting to wind speed data. The study was motivated by the high prevalence of missing historical wind speed data. Multiple imputation by chained equations, under the fully conditional specification, provided reliable imputations of the missing wind speed data. Further, the forecasting model shows a smoothing parameter, alpha (0.014), close to zero, confirming that recent past observations are more suitable for use to forecast wind speeds. The maximum decadal wind speed for Entebbe International Airport was estimated to be 17.6 metres per second at a 0.05 level of significance with a bound on the error of estimation of 10.8 metres per second. The large bound on the error of estimation confirms the dynamic tendencies of wind speed at the airport under study.
Multiple Imputation to Account for Measurement Error in Marginal Structural Models.

Science.gov (United States)

Edwards, Jessie K; Cole, Stephen R; Westreich, Daniel; Crane, Heidi; Eron, Joseph J; Mathews, W Christopher; Moore, Richard; Boswell, Stephen L; Lesko, Catherine R; Mugavero, Michael J

2015-09-01

Marginal structural models are an important tool for observational studies. These models typically assume that variables are measured without error. We describe a method to account for differential and nondifferential measurement error in a marginal structural model. We illustrate the method estimating the joint effects of antiretroviral therapy initiation and current smoking on all-cause mortality in a United States cohort of 12,290 patients with HIV followed for up to 5 years between 1998 and 2011. Smoking status was likely measured with error, but a subset of 3,686 patients who reported smoking status on separate questionnaires composed an internal validation subgroup. We compared a standard joint marginal structural model fit using inverse probability weights to a model that also accounted for misclassification of smoking status using multiple imputation. In the standard analysis, current smoking was not associated with increased risk of mortality. After accounting for misclassification, current smoking without therapy was associated with increased mortality (hazard ratio [HR]: 1.2 [95% confidence interval [CI] = 0.6, 2.3]). The HR for current smoking and therapy [0.4 (95% CI = 0.2, 0.7)] was similar to the HR for no smoking and therapy (0.4; 95% CI = 0.2, 0.6). Multiple imputation can be used to account for measurement error in concert with methods for causal inference to strengthen results from observational studies.
Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research

Directory of Open Access Journals (Sweden)

Hardt Jochen

2012-12-01

Background Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary variables. Mechanisms of missingness were either 100% MCAR or 50% MAR + 50% MCAR. Auxiliary variables had low (r = .10) vs. moderate (r = .50) correlations with the X's and Y. Results The inclusion of auxiliary variables can improve a multiple imputation model. However, inclusion of too many variables leads to downward bias of regression coefficients and decreases precision. When the correlations are low, inclusion of auxiliary variables is not useful. Conclusion More research on auxiliary variables in multiple imputation should be performed. A preliminary rule of thumb could be that the ratio of variables to cases with complete data should not go below 1 : 3.
Multiple imputation of missing passenger boarding data in the national census of ferry operators

Science.gov (United States)

2008-08-01

This report presents findings from the 2006 National Census of Ferry Operators (NCFO) augmented with imputed values for passengers and passenger miles. Due to the imputation procedures used to calculate missing data, totals in Table 1 may not corresp...
Treatments of Missing Values in Large National Data Affect Conclusions: The Impact of Multiple Imputation on Arthroplasty Research.

Science.gov (United States)

Ondeck, Nathaniel T; Fu, Michael C; Skrip, Laura A; McLynn, Ryan P; Su, Edwin P; Grauer, Jonathan N

2018-03-01

Despite the advantages of large, national datasets, one continuing concern is missing data values. Complete case analysis, where only cases with complete data are analyzed, is commonly used rather than more statistically rigorous approaches such as multiple imputation. This study characterizes the potential selection bias introduced using complete case analysis and compares the results of common regressions using both techniques following unicompartmental knee arthroplasty. Patients undergoing unicompartmental knee arthroplasty were extracted from the 2005 to 2015 National Surgical Quality Improvement Program. As examples, the demographics of patients with and without missing preoperative albumin and hematocrit values were compared. Missing data were then treated with both complete case analysis and multiple imputation (an approach that reproduces the variation and associations that would have been present in a full dataset) and the conclusions of common regressions for adverse outcomes were compared. A total of 6117 patients were included, of which 56.7% were missing at least one value. Younger, female, and healthier patients were more likely to have missing preoperative albumin and hematocrit values. The use of complete case analysis removed 3467 patients from the study in comparison with multiple imputation which included all 6117 patients. The 2 methods of handling missing values led to differing associations of low preoperative laboratory values with commonly studied adverse outcomes. The use of complete case analysis can introduce selection bias and may lead to different conclusions in comparison with the statistically rigorous multiple imputation approach. Joint surgeons should consider the methods of handling missing values when interpreting arthroplasty research.

An Imputation Model for Dropouts in Unemployment Data

Directory of Open Access Journals (Sweden)

Nilsson Petra

2016-09-01

Incomplete unemployment data is a fundamental problem when evaluating labour market policies in several countries. Many unemployment spells end for unknown reasons; in the Swedish Public Employment Service’s register as many as 20 percent. This leads to an ambiguity regarding destination states (employment, unemployment, retired, etc.). According to complete combined administrative data, the employment rate among dropouts was close to 50 percent for the years 1992 to 2006, but from 2007 the employment rate has dropped to 40 percent or less. This article explores an imputation approach. We investigate imputation models estimated both on survey data from 2005/2006 and on complete combined administrative data from 2005/2006 and 2011/2012. The models are evaluated in terms of their ability to make correct predictions. The models have relatively high predictive power.
Mapping wildland fuels and forest structure for land management: a comparison of nearest neighbor imputation and other methods

Science.gov (United States)

Kenneth B. Pierce; Janet L. Ohmann; Michael C. Wimberly; Matthew J. Gregory; Jeremy S. Fried

2009-01-01

Land managers need consistent information about the geographic distribution of wildland fuels and forest structure over large areas to evaluate fire risk and plan fuel treatments. We compared spatial predictions for 12 fuel and forest structure variables across three regions in the western United States using gradient nearest neighbor (GNN) imputation, linear models (...
Limitations in Using Multiple Imputation to Harmonize Individual Participant Data for Meta-Analysis.

Science.gov (United States)

Siddique, Juned; de Chavez, Peter J; Howe, George; Cruden, Gracelyn; Brown, C Hendricks

2018-02-01

Individual participant data (IPD) meta-analysis is a meta-analysis in which the individual-level data for each study are obtained and used for synthesis. A common challenge in IPD meta-analysis is when variables of interest are measured differently in different studies. The term harmonization has been coined to describe the procedure of placing variables on the same scale in order to permit pooling of data from a large number of studies. Using data from an IPD meta-analysis of 19 adolescent depression trials, we describe a multiple imputation approach for harmonizing 10 depression measures across the 19 trials by treating those depression measures that were not used in a study as missing data. We then apply diagnostics to address the fit of our imputation model. Even after reducing the scale of our application, we were still unable to produce accurate imputations of the missing values. We describe those features of the data that made it difficult to harmonize the depression measures and provide some guidelines for using multiple imputation for harmonization in IPD meta-analysis.
Improved Correction of Misclassification Bias With Bootstrap Imputation.

Science.gov (United States)

van Walraven, Carl

2018-07-01

Diagnostic codes used in administrative database research can create bias due to misclassification. Quantitative bias analysis (QBA) can correct for this bias, requires only code sensitivity and specificity, but may return invalid results. Bootstrap imputation (BI) can also address misclassification bias but traditionally requires multivariate models to accurately estimate disease probability. This study compared misclassification bias correction using QBA and BI. Serum creatinine measures were used to determine severe renal failure status in 100,000 hospitalized patients. Prevalence of severe renal failure in 86 patient strata and its association with 43 covariates was determined and compared with results in which renal failure status was determined using diagnostic codes (sensitivity 71.3%, specificity 96.2%). Differences in results (misclassification bias) were then corrected with QBA or BI (using progressively more complex methods to estimate disease probability). In total, 7.4% of patients had severe renal failure. Imputing disease status with diagnostic codes exaggerated prevalence estimates [median relative change (range), 16.6% (0.8%-74.5%)] and its association with covariates [median (range) exponentiated absolute parameter estimate difference, 1.16 (1.01-2.04)]. QBA produced invalid results 9.3% of the time and increased bias in estimates of both disease prevalence and covariate associations. BI decreased misclassification bias with increasingly accurate disease probability estimates. QBA can produce invalid results and increase misclassification bias. BI avoids invalid results and can importantly decrease misclassification bias when accurate disease probability estimates are used.
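For orientation, the simplest form of quantitative bias analysis for a misclassified binary status is the Rogan-Gladen correction, which needs only code sensitivity and specificity. The Python sketch below illustrates that textbook formula and is not claimed to be the study's procedure; the function names and the two observed prevalences are hypothetical, while the sensitivity and specificity are the values quoted in the abstract. It also shows one way such a correction can return an invalid result: the corrected value can fall outside [0, 1].

```python
def rogan_gladen(observed_prev, sensitivity, specificity):
    """Correct an observed prevalence for misclassification.

    corrected = (observed + specificity - 1) / (sensitivity + specificity - 1)
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_prev + specificity - 1.0) / denom


def is_valid_probability(p):
    """True when p is a usable prevalence estimate."""
    return 0.0 <= p <= 1.0


# Sensitivity and specificity as quoted in the abstract.
SENSITIVITY, SPECIFICITY = 0.713, 0.962

# Hypothetical observed (code-based) prevalences in two strata.
corrected_ok = rogan_gladen(0.10, SENSITIVITY, SPECIFICITY)
corrected_bad = rogan_gladen(0.02, SENSITIVITY, SPECIFICITY)

print(corrected_ok, is_valid_probability(corrected_ok))
print(corrected_bad, is_valid_probability(corrected_bad))
```

In the second stratum the observed prevalence is lower than the false-positive rate (1 minus specificity), so the corrected estimate is negative and would have to be flagged as invalid.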
Combining item response theory with multiple imputation to equate health assessment questionnaires.

Science.gov (United States)

Gu, Chenyang; Gutman, Roee

2017-09-01

The assessment of patients' functional status across the continuum of care requires a common patient assessment tool. However, assessment tools that are used in various health care settings differ and cannot be easily contrasted. For example, the Functional Independence Measure (FIM) is used to evaluate the functional status of patients who stay in inpatient rehabilitation facilities, the Minimum Data Set (MDS) is collected for all patients who stay in skilled nursing facilities, and the Outcome and Assessment Information Set (OASIS) is collected if they choose home health care provided by home health agencies. All three instruments or questionnaires include functional status items, but the specific items, rating scales, and instructions for scoring different activities vary between the different settings. We consider equating different health assessment questionnaires as a missing data problem, and propose a variant of predictive mean matching method that relies on Item Response Theory (IRT) models to impute unmeasured item responses. Using real data sets, we simulated missing measurements and compared our proposed approach to existing methods for missing data imputation. We show that, for all of the estimands considered, and in most of the experimental conditions that were examined, the proposed approach provides valid inferences, and generally has better coverages, relatively smaller biases, and shorter interval estimates. The proposed method is further illustrated using a real data set.
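To make the underlying idea concrete, ordinary predictive mean matching (PMM) can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the IRT-based variant proposed in the paper: it assumes a single predictor, uses a least-squares point estimate rather than drawing regression parameters from a posterior, and every name and data value is hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, k=3, seed=0):
    """Fill None entries of ys with observed values borrowed from donors.

    For each missing case, the k observed cases whose predicted means are
    closest to the missing case's predicted mean form the donor pool, and
    one donor's observed value is drawn at random.
    """
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed],
                                [y for _, y in observed])

    def predict(x):
        return intercept + slope * x

    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)  # observed values are left untouched
            continue
        target = predict(x)
        donors = sorted(observed,
                        key=lambda d: abs(predict(d[0]) - target))[:k]
        completed.append(rng.choice(donors)[1])
    return completed


# Hypothetical scores: x is fully observed, y has two missing entries.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, None, 4.2, None, 6.1]
completed = pmm_impute(xs, ys)
print(completed)
```

Because imputed values are always copied from observed cases, they stay within the range of values that actually occur, which is one reason PMM is attractive for bounded rating-scale items.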
Increasing imputation and prediction accuracy for Chinese Holsteins using joint Chinese-Nordic reference population

DEFF Research Database (Denmark)

Ma, Peipei; Lund, Mogens Sandø; Ding, X

2015-01-01

This study investigated the effect of including Nordic Holsteins in the reference population on the imputation accuracy and prediction accuracy for Chinese Holsteins. The data used in this study include 85 Chinese Holstein bulls genotyped with both 54K chip and 777K (HD) chip, 2862 Chinese cows...... was improved slightly when using the marker data imputed based on the combined HD reference data, compared with using the marker data imputed based on the Chinese HD reference data only. On the other hand, when using the combined reference population including 4398 Nordic Holstein bulls, the accuracy...... to increase reference population rather than increasing marker density...
Single-nucleotide polymorphism discovery by high-throughput sequencing in sorghum

Directory of Open Access Journals (Sweden)

White Frank F

2011-07-01

Background Eight diverse sorghum (Sorghum bicolor L. Moench) accessions were subjected to short-read genome sequencing to characterize the distribution of single-nucleotide polymorphisms (SNPs). Two strategies were used for DNA library preparation. Missing SNP genotype data were imputed by local haplotype comparison. The effect of library type and genomic diversity on SNP discovery and imputation are evaluated. Results Alignment of eight genome equivalents (6 Gb) to the public reference genome revealed 283,000 SNPs at ≥82% confirmation probability. Sequencing from libraries constructed to limit sequencing to start at defined restriction sites led to genotyping 10-fold more SNPs in all 8 accessions, and correctly imputing 11% more missing data, than from semirandom libraries. The SNP yield advantage of the reduced-representation method was less than expected, since up to one fifth of reads started at noncanonical restriction sites and up to one third of restriction sites predicted in silico to yield unique alignments were not sampled at near-saturation. For imputation accuracy, the availability of a genomically similar accession in the germplasm panel was more important than panel size or sequencing coverage. Conclusions A sequence quantity of 3 million 50-base reads per accession using a BsrFI library would conservatively provide satisfactory genotyping of 96,000 sorghum SNPs. For most reliable SNP-genotype imputation in shallowly sequenced genomes, germplasm panels should consist of pairs or groups of genomically similar entries. These results may help in designing strategies for economical genotyping-by-sequencing of large numbers of plant accessions.
A note on the relationships between multiple imputation, maximum likelihood and fully Bayesian methods for missing responses in linear regression models.

Science.gov (United States)

Chen, Qingxia; Ibrahim, Joseph G

2014-07-01

Multiple Imputation, Maximum Likelihood and Fully Bayesian methods are the three most commonly used model-based approaches in missing data problems. Although it is easy to show that when the responses are missing at random (MAR), the complete case analysis is unbiased and efficient, the aforementioned methods are still commonly used in practice for this setting. To examine the performance of and relationships between these three methods in this setting, we derive and investigate small sample and asymptotic expressions of the estimates and standard errors, and fully examine how these estimates are related for the three approaches in the linear regression model when the responses are MAR. We show that when the responses are MAR in the linear model, the estimates of the regression coefficients using these three methods are asymptotically equivalent to the complete case estimates under general conditions. One simulation and a real data set from a liver cancer clinical trial are given to compare the properties of these methods when the responses are MAR.
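Multiple-imputation estimates and standard errors of the kind compared here are conventionally obtained by pooling the per-dataset results with Rubin's rules. The Python sketch below shows that standard pooling step only; it does not reproduce the paper's small-sample or asymptotic derivations, and the function name and example numbers are hypothetical.

```python
import math


def pool_rubin(estimates, variances):
    """Pool m completed-data estimates of one coefficient with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    q_bar = sum(estimates) / m                     # pooled point estimate
    within = sum(variances) / m                    # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between     # total variance
    return q_bar, math.sqrt(total)


# Hypothetical regression coefficient estimated on three imputed datasets.
pooled_est, pooled_se = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(pooled_est, pooled_se)
```

The between-imputation term is what distinguishes the pooled standard error from that of any single completed dataset: it carries the extra uncertainty due to the values being imputed rather than observed.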
Multiple imputation for multivariate data with missing and below-threshold measurements: time-series concentrations of pollutants in the Arctic.

Science.gov (United States)

Hopke, P K; Liu, C; Rubin, D B

2001-03-01

Many chemical and environmental data sets are complicated by the existence of fully missing values or censored values known to lie below detection thresholds. For example, week-long samples of airborne particulate matter were obtained at Alert, NWT, Canada, between 1980 and 1991, where some of the concentrations of 24 particulate constituents were coarsened in the sense of being either fully missing or below detection limits. To facilitate scientific analysis, it is appealing to create complete data by filling in missing values so that standard complete-data methods can be applied. We briefly review commonly used strategies for handling missing values and focus on the multiple-imputation approach, which generally leads to valid inferences when faced with missing data. Three statistical models are developed for multiply imputing the missing values of airborne particulate matter. We expect that these models are useful for creating multiple imputations in a variety of incomplete multivariate time series data sets.
A suggested approach for imputation of missing dietary data for young children in daycare

OpenAIRE

Stevens, June; Ou, Fang-Shu; Truesdale, Kimberly P.; Zeng, Donglin; Vaughn, Amber E.; Pratt, Charlotte; Ward, Dianne S.

2015-01-01

Background: Parent-reported 24-h diet recalls are an accepted method of estimating intake in young children. However, many children eat while at childcare making accurate proxy reports by parents difficult. Objective: The goal of this study was to demonstrate a method to impute missing weekday lunch and daytime snack nutrient data for daycare children and to explore the concurrent predictive and criterion validity of the method. Design: Data were from children aged 2-5 years in the My Parenting...
Imputing forest carbon stock estimates from inventory plots to a nationally continuous coverage

Directory of Open Access Journals (Sweden)

Wilson Barry Tyler

2013-01-01

The U.S. has been providing national-scale estimates of forest carbon (C) stocks and stock change to meet United Nations Framework Convention on Climate Change (UNFCCC) reporting requirements for years. Although these currently are provided as national estimates by pool and year to meet greenhouse gas monitoring requirements, there is growing need to disaggregate these estimates to finer scales to enable strategic forest management and monitoring activities focused on various ecosystem services such as C storage enhancement. Through application of a nearest-neighbor imputation approach, spatially extant estimates of forest C density were developed for the conterminous U.S. using the U.S.’s annual forest inventory. Results suggest that an existing forest inventory plot imputation approach can be readily modified to provide raster maps of C density across a range of pools (e.g., live tree to soil organic carbon) and spatial scales (e.g., sub-county to biome). Comparisons among imputed maps indicate strong regional differences across C pools. The C density of pools closely related to detrital input (e.g., dead wood) is often highest in forests suffering from recent mortality events such as those in the northern Rocky Mountains (e.g., beetle infestations). In contrast, live tree carbon density is often highest on the highest quality forest sites such as those found in the Pacific Northwest. Validation results suggest strong agreement between the estimates produced from the forest inventory plots and those from the imputed maps, particularly when the C pool is closely associated with the imputation model (e.g., aboveground live biomass and live tree basal area), with weaker agreement for detrital pools (e.g., standing dead trees). Forest inventory imputed plot maps provide an efficient and flexible approach to monitoring diverse C pools at national (e.g., UNFCCC) and regional scales (e.g., Reducing Emissions from Deforestation and Forest
Age at menopause: imputing age at menopause for women with a hysterectomy with application to risk of postmenopausal breast cancer

Science.gov (United States)

Rosner, Bernard; Colditz, Graham A.

2011-01-01

Purpose Age at menopause, a major marker in the reproductive life, may bias results for evaluation of breast cancer risk after menopause. Methods We follow 38,948 premenopausal women in 1980 and identify 2,586 who reported hysterectomy without bilateral oophorectomy, and 31,626 who reported natural menopause during 22 years of follow-up. We evaluate risk factors for natural menopause, impute age at natural menopause for women reporting hysterectomy without bilateral oophorectomy and estimate the hazard of reaching natural menopause in the next 2 years. We apply this imputed age at menopause to both increase sample size and to evaluate the relation between postmenopausal exposures and risk of breast cancer. Results Age, cigarette smoking, age at menarche, pregnancy history, body mass index, history of benign breast disease, and history of breast cancer were each significantly related to age at natural menopause; duration of oral contraceptive use and family history of breast cancer were not. The imputation increased sample size substantially and although some risk factors after menopause were weaker in the expanded model (height, and alcohol use), use of hormone therapy is less biased. Conclusions Imputing age at menopause increases sample size, broadens generalizability making it applicable to women with hysterectomy, and reduces bias. PMID:21441037
Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

Science.gov (United States)

Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

2017-06-01

The recent popularity of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but these almost always rely on asymptotic distributions and are consequently not adequate for small sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may be used even in small sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.
Accuracy of hemoglobin A1c imputation using fasting plasma glucose in diabetes research using electronic health records data

Directory of Open Access Journals (Sweden)

Stanley Xu

2014-05-01

In studies that use electronic health record data, imputation of important data elements such as glycated hemoglobin (A1c) has become common. However, few studies have systematically examined the validity of various imputation strategies for missing A1c values. We derived a complete dataset using an incident diabetes population that has no missing values in A1c, fasting and random plasma glucose (FPG and RPG), age, and gender. We then created missing A1c values under two assumptions: missing completely at random (MCAR) and missing at random (MAR). We then imputed A1c values, compared the imputed values to the true A1c values, and used these data to assess the impact of A1c on initiation of antihyperglycemic therapy. Under MCAR, imputation of A1c based on FPG (1) estimated a continuous A1c within ± 1.88% of the true A1c 68.3% of the time; (2) estimated a categorical A1c within ± one category from the true A1c about 50% of the time. Including RPG in imputation slightly improved the precision but did not improve the accuracy. Under MAR, including gender and age in addition to FPG improved the accuracy of imputed continuous A1c but not categorical A1c. Moreover, imputation of up to 33% of missing A1c values did not change the accuracy and precision and did not alter the impact of A1c on initiation of antihyperglycemic therapy. When using A1c values as a predictor variable, a simple imputation algorithm based only on age, sex, and fasting plasma glucose gave acceptable results.
Combination of individual tree detection and area-based approach in imputation of forest variables using airborne laser data

Science.gov (United States)

Vastaranta, Mikko; Kankare, Ville; Holopainen, Markus; Yu, Xiaowei; Hyyppä, Juha; Hyyppä, Hannu

2012-01-01

The two main approaches to deriving forest variables from laser-scanning data are the statistical area-based approach (ABA) and individual tree detection (ITD). With ITD it is feasible to acquire single tree information, as in field measurements. Here, ITD was used for measuring training data for the ABA. In addition to automatic ITD (ITD auto), we tested a combination of ITD auto and visual interpretation (ITD visual). ITD visual had two stages: in the first, ITD auto was carried out and in the second, the results of the ITD auto were visually corrected by interpreting three-dimensional laser point clouds. The field data comprised 509 circular plots (r = 10 m) that were divided equally for testing and training. ITD-derived forest variables were used for training the ABA and the accuracies of the k-most similar neighbor (k-MSN) imputations were evaluated and compared with the ABA trained with traditional measurements. The root-mean-squared error (RMSE) in the mean volume was 24.8%, 25.9%, and 27.2% with the ABA trained with field measurements, ITD auto, and ITD visual, respectively. When ITD methods were applied in acquiring training data, the mean volume, basal area, and basal area-weighted mean diameter were underestimated in the ABA by 2.7-9.2%. This project constituted a pilot study for using ITD measurements as training data for the ABA. Further studies are needed to reduce the bias and to determine the accuracy obtained in imputation of species-specific variables. The method could be applied in areas with sparse road networks or when the costs of fieldwork must be minimized.
Differential network analysis with multiply imputed lipidomic data.

Directory of Open Access Journals (Sweden)

Maiju Kujala

Full Text Available The importance of lipids for cell function and health has been widely recognized, e.g., a disorder in the lipid composition of cells has been related to cardiovascular disease (CVD) caused by atherosclerosis. Lipidomics analyses are characterized by a large, yet not huge, number of mutually correlated measured variables, and their associations with outcomes are potentially of a complex nature. Differential network analysis provides a formal statistical method capable of inferential analysis to examine differences in network structures of the lipids under two biological conditions. It also guides us to identify potential relationships requiring further biological investigation. We provide a recipe to conduct a permutation test on association scores resulting from partial least squares regression with multiply imputed lipidomic data from the LUdwigshafen RIsk and Cardiovascular Health (LURIC) study, particularly paying attention to the left-censored missing values typical for a wide range of data sets in life sciences. Left-censored missing values are low-level concentrations that are known to exist somewhere between zero and a lower limit of quantification. To make full use of the LURIC data with the missing values, we utilize state-of-the-art multiple imputation techniques and propose solutions to the challenges that incomplete data sets bring to differential network analysis. The customized network analysis helps us to understand the complexities of the underlying biological processes by identifying lipids and lipid classes that interact with each other, and by recognizing the most important differentially expressed lipids between two subgroups of coronary artery disease (CAD) patients: the patients that had a fatal CVD event and the ones who remained stable during the two-year follow-up.
Imputation of the rare HOXB13 G84E mutation and cancer risk in a large population-based cohort.

Directory of Open Access Journals (Sweden)

Thomas J Hoffmann

2015-01-01

Full Text Available An efficient approach to characterizing the disease burden of rare genetic variants is to impute them into large well-phenotyped cohorts with existing genome-wide genotype data using large sequenced reference panels. The success of this approach hinges on the accuracy of rare variant imputation, which remains controversial. For example, a recent study suggested that one cannot adequately impute the HOXB13 G84E mutation associated with prostate cancer risk (carrier frequency of 0.0034 in European ancestry participants in the 1000 Genomes Project). We show that by utilizing the 1000 Genomes Project data plus an enriched reference panel of mutation carriers we were able to accurately impute the G84E mutation into a large cohort of 83,285 non-Hispanic White participants from the Kaiser Permanente Research Program on Genes, Environment and Health Genetic Epidemiology Research on Adult Health and Aging cohort. Imputation authenticity was confirmed via a novel classification and regression tree method, and then empirically validated by analyzing a subset of these subjects plus an additional 1,789 men from Kaiser specifically genotyped for the G84E mutation (r^2 = 0.57, 95% CI = 0.37–0.77). We then show the value of this approach by using the imputed data to investigate the impact of the G84E mutation on age-specific prostate cancer risk and on risk of fourteen other cancers in the cohort. The age-specific risk of prostate cancer among G84E mutation carriers was higher than among non-carriers. Risk estimates from Kaplan-Meier curves were 36.7% versus 13.6% by age 72, and 64.2% versus 24.2% by age 80, for G84E mutation carriers and non-carriers, respectively (p = 3.4 × 10^-12). The G84E mutation was also associated with an increase in risk for the fourteen other most common cancers considered collectively (p = 5.8 × 10^-4) and more so in cases diagnosed with multiple cancer types, both those including and not including prostate cancer, strongly suggesting
Imputation-based analysis of association studies: candidate regions and quantitative traits.

Directory of Open Access Journals (Sweden)

Bertrand Servin

2007-07-01

Full Text Available We introduce a new framework for the analysis of association studies, designed to allow untyped variants to be more effectively and directly tested for association with a phenotype. The idea is to combine knowledge on patterns of correlation among SNPs (e.g., from the International HapMap project or resequencing data in a candidate region of interest) with genotype data at tag SNPs collected on a phenotyped study sample, to estimate ("impute") unmeasured genotypes, and then assess association between the phenotype and these estimated genotypes. Compared with standard single-SNP tests, this approach results in increased power to detect association, even in cases in which the causal variant is typed, with the greatest gain occurring when multiple causal variants are present. It also provides more interpretable explanations for observed associations, including assessing, for each SNP, the strength of the evidence that it (rather than another correlated SNP) is causal. Although we focus on association studies with quantitative phenotype and a relatively restricted region (e.g., a candidate gene), the framework is applicable and computationally practical for whole genome association studies. Methods described here are implemented in a software package, Bim-Bam, available from the Stephens Lab website http://stephenslab.uchicago.edu/software.html.
Analyzing the changing gender wage gap based on multiply imputed right censored wages

OpenAIRE

Gartner, Hermann; Rässler, Susanne

2005-01-01

"In order to analyze the gender wage gap with the German IAB-employment register we have to solve the problem of censored wages at the upper limit of the social security system. We treat this problem as a missing data problem. We regard the missingness mechanism as not missing at random (NMAR, according to Little and Rubin, 1987, 2002) as well as missing by design. The censored wages are multiply imputed by draws of a random variable from a truncated distribution. The multiple imputation is b...
Simple nuclear norm based algorithms for imputing missing data and forecasting in time series

OpenAIRE

Butcher, Holly Louise; Gillard, Jonathan William

2017-01-01

There has been much recent progress on the use of the nuclear norm for the so-called matrix completion problem (the problem of imputing missing values of a matrix). In this paper we investigate the use of the nuclear norm for modelling time series, with particular attention to imputing missing data and forecasting. We introduce a simple alternating projections type algorithm based on the nuclear norm for these tasks, and consider a number of practical examples.
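To make the matrix completion idea concrete, the sketch below fills missing matrix entries by repeatedly soft-thresholding singular values, which is the proximal step associated with the nuclear norm. This is a generic iteration of that kind and is not the alternating projections algorithm of the paper, whose details the abstract does not give. The function name soft_impute and the parameters lam and n_iter are illustrative assumptions.

```python
import numpy as np

def soft_impute(X, lam=1.0, n_iter=300):
    """Nuclear-norm-regularised matrix completion (illustrative sketch).

    X      : 2-D array with np.nan marking the missing entries.
    lam    : soft-threshold applied to the singular values (assumed value).
    n_iter : fixed number of iterations (no convergence check, for brevity).
    Returns a low-rank estimate of the full matrix.
    """
    X = np.asarray(X, dtype=float)
    mask = ~np.isnan(X)
    # Start by filling the missing entries with zeros.
    Z = np.where(mask, X, 0.0)
    L = Z
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        # Shrinking singular values towards zero penalises the nuclear norm.
        s = np.maximum(s - lam, 0.0)
        L = (U * s) @ Vt
        # Keep observed entries fixed; refresh only the missing ones.
        Z = np.where(mask, X, L)
    return L
```

A univariate time series would first have to be arranged as a matrix (for instance a Hankel, or trajectory, matrix) before such a routine applies; that embedding step is omitted here.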

A spatial haplotype copying model with applications to genotype imputation.

Science.gov (United States)

Yang, Wen-Yun; Hormozdiari, Farhad; Eskin, Eleazar; Pasaniuc, Bogdan

2015-05-01

Ever since its introduction, the haplotype copy model has proven to be one of the most successful approaches for modeling genetic variation in human populations, with applications ranging from ancestry inference to genotype phasing and imputation. Motivated by coalescent theory, this approach assumes that any chromosome (haplotype) can be modeled as a mosaic of segments copied from a set of chromosomes sampled from the same population. At the core of the model is the assumption that any chromosome from the sample is equally likely to contribute a priori to the copying process. Motivated by recent works that model genetic variation in a geographic continuum, we propose a new spatial-aware haplotype copy model that jointly models geography and the haplotype copying process. We extend hidden Markov models of haplotype diversity such that at any given location, haplotypes that are closest in the genetic-geographic continuum map are a priori more likely to contribute to the copying process than distant ones. Through simulations starting from the 1000 Genomes data, we show that our model achieves superior accuracy in genotype imputation over the standard spatial-unaware haplotype copy model. In addition, we show the utility of our model in selecting a small personalized reference panel for imputation that leads to both improved accuracy as well as to a lower computational runtime than the standard approach. Finally, we show our proposed model can be used to localize individuals on the genetic-geographical map on the basis of their genotype data.
Assessment of Consequences of Replacement of System of the Uniform Tax on Imputed Income Patent System of the Taxation

Directory of Open Access Journals (Sweden)

Galina A. Manokhina

2012-11-01

Full Text Available The article highlights the main questions concerning the possible consequences of replacing the currently operating single tax on imputed income with the patent system of taxation. The main advantages and drawbacks of the new taxation system are shown, including the opinion that introducing the patent system of taxation as an auxiliary system would be more effective than replacing one special tax regime with another.
Comparison of Imputation Methods for Handling Missing Categorical Data with Univariate Pattern|| Una comparación de métodos de imputación de variables categóricas con patrón univariado

Directory of Open Access Journals (Sweden)

Torres Munguía, Juan Armando

2014-06-01

Full Text Available This paper examines the estimation of sample proportions in the presence of univariate missing categorical data. A database about smoking habits (2011 National Addiction Survey of Mexico) was used to create simulated yet realistic datasets at missingness rates of 5% and 15%, each for MCAR, MAR and MNAR mechanisms. Then the performance of six methods for addressing missingness is evaluated: listwise deletion, mode imputation, random imputation, hot-deck, imputation by polytomous regression and random forests. Results showed that the most effective methods for dealing with missing categorical data in most of the scenarios assessed in this paper were the hot-deck and polytomous regression approaches.
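Three of the simpler methods named above can be sketched in a few lines. The functions below are illustrative only: the abstract does not describe the study's implementations, the hot-deck variant shown (a random donor from the same class) is one of several in use, and all names and data are hypothetical.

```python
import random
from collections import Counter

def mode_impute(values):
    """Replace None with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

def random_impute(values, rng):
    """Replace None with a category drawn at random from the observed values."""
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]

def hot_deck_impute(values, classes, rng):
    """Replace None with the value of a random donor from the same class.

    classes holds a fully observed grouping variable (e.g. sex or age band).
    If a class has no donors, all observed values are used as a fallback.
    """
    observed = [v for v in values if v is not None]
    donors = {}
    for v, c in zip(values, classes):
        if v is not None:
            donors.setdefault(c, []).append(v)
    return [rng.choice(donors.get(c, observed)) if v is None else v
            for v, c in zip(values, classes)]
```

Passing an explicit generator such as `random.Random(0)` keeps the random draws reproducible. Mode imputation inflates the share of the most common category, which is one reason it can distort estimated proportions.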
Trend in BMI z-score among Private Schools’ Students in Delhi using Multiple Imputation for Growth Curve Model

Directory of Open Access Journals (Sweden)

Vinay K Gupta

2016-06-01

Full Text Available Objective: The aim of the study is to assess the trend in mean BMI z-score among private schools’ students from their anthropometric records when there were missing values in the outcome. Methodology: The anthropometric measurements of students from class 1 to 12 were taken from the records of two private schools in Delhi, India from 2005 to 2010. These records comprise unbalanced longitudinal data; that is, not all the students had measurements recorded in each year. The trend in mean BMI z-score was estimated through a growth curve model. Prior to that, missing values of BMI z-score were imputed through multiple imputation using the same model. A complete case analysis was also performed after excluding missing values to compare the results with those obtained from analysis of multiply imputed data. Results: The mean BMI z-score among school students significantly decreased over time in imputed data (β = -0.2030, se = 0.0889, p = 0.0232) after adjusting for age, gender, class and school. Complete case analysis also shows a decrease in mean BMI z-score, though it was not statistically significant (β = -0.2861, se = 0.0987, p = 0.065). Conclusions: The estimates obtained from multiple imputation analysis were better than those of the complete case analysis in terms of lower standard errors. We showed that anthropometric measurements from school records can be used to monitor the weight status of children and adolescents, and multiple imputation using a growth curve model can be useful while analyzing such data.
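The abstract does not say how estimates from the individual imputed datasets were combined. The usual approach is Rubin's rules, sketched below under that assumption: the pooled estimate is the mean of the per-dataset estimates, and the total variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The function name pool_rubin and the numbers in the usage note are hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m completed-data estimates with Rubin's rules (sketch).

    estimates : coefficient estimate from each imputed dataset.
    variances : squared standard error from each imputed dataset.
    Returns (pooled estimate, pooled standard error).
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()            # pooled point estimate
    within = u.mean()           # average within-imputation variance
    between = q.var(ddof=1)     # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, np.sqrt(total)
```

For instance, `pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` gives a pooled estimate of 2.0 with total variance 0.5 + (4/3) × 1 = 11/6. The between-imputation term is what carries the uncertainty due to the missing values into the standard error.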
Cohort-specific imputation of gene expression improves prediction of warfarin dose for African Americans

Directory of Open Access Journals (Sweden)

Assaf Gottlieb

2017-11-01

Full Text Available Abstract Background Genome-wide association studies are useful for discovering genotype–phenotype associations but are limited because they require large cohorts to identify a signal, which can be population-specific. Mapping genetic variation to genes improves power and allows the effects of both protein-coding variation as well as variation in expression to be combined into “gene level” effects. Methods Previous work has shown that warfarin dose can be predicted using information from genetic variation that affects protein-coding regions. Here, we introduce a method that improves dose prediction by integrating tissue-specific gene expression. In particular, we use drug pathways and expression quantitative trait loci knowledge to impute gene expression—on the assumption that differential expression of key pathway genes may impact dose requirement. We focus on 116 genes from the pharmacokinetic and pharmacodynamic pathways of warfarin within training and validation sets comprising both European and African-descent individuals. Results We build gene-tissue signatures associated with warfarin dose in a cohort-specific manner and identify a signature of 11 gene-tissue pairs that significantly augments the International Warfarin Pharmacogenetics Consortium dosage-prediction algorithm in both populations. Conclusions Our results demonstrate that imputed expression can improve dose prediction and bridge population-specific compositions. MATLAB code is available at https://github.com/assafgo/warfarin-cohort
RIDDLE: Race and ethnicity Imputation from Disease history with Deep LEarning

KAUST Repository

Kim, Ji-Sung; Gao, Xin; Rzhetsky, Andrey

2018-01-01

are predictive of race and ethnicity. We used these characterizations of informative features to perform a systematic comparison of differential disease patterns by race and ethnicity. The fact that clinical histories are informative for imputing race
Nonparametric autocovariance estimation from censored time series by Gaussian imputation.

Science.gov (United States)

Park, Jung Wook; Genton, Marc G; Ghosh, Sujit K

2009-02-01

One of the most frequently used methods to model the autocovariance function of a second-order stationary time series is to use the parametric framework of autoregressive and moving average models developed by Box and Jenkins. However, such parametric models, though very flexible, may not always be adequate to model autocovariance functions with sharp changes. Furthermore, if the data do not follow the parametric model and are censored at a certain value, the estimation results may not be reliable. We develop a Gaussian imputation method to estimate an autocovariance structure via nonparametric estimation of the autocovariance function in order to address both censoring and incorrect model specification. We demonstrate the effectiveness of the technique in terms of bias and efficiency with simulations under various rates of censoring and underlying models. We describe its application to a time series of silicon concentrations in the Arctic.
Improving accuracy of genomic prediction in Brangus cattle by adding animals with imputed low-density SNP genotypes.

Science.gov (United States)

Lopes, F B; Wu, X-L; Li, H; Xu, J; Perkins, T; Genho, J; Ferretti, R; Tait, R G; Bauck, S; Rosa, G J M

2018-02-01

Reliable genomic prediction of breeding values for quantitative traits requires the availability of a sufficient number of animals with genotypes and phenotypes in the training set. As of 31 October 2016, there were 3,797 Brangus animals with genotypes and phenotypes. These Brangus animals were genotyped using different commercial SNP chips. Of them, the largest group consisted of 1,535 animals genotyped by the GGP-LDV4 SNP chip. The remaining 2,262 genotypes were imputed to the SNP content of the GGP-LDV4 chip, so that the number of animals available for training the genomic prediction models was more than doubled. The present study showed that the pooling of animals with either original or imputed 40K SNP genotypes substantially increased genomic prediction accuracies on the ten traits. By supplementing imputed genotypes, the relative gains in genomic prediction accuracies on estimated breeding values (EBV) were from 12.60% to 31.27%, and the relative gain in genomic prediction accuracies on de-regressed EBV was slightly smaller (i.e. 0.87%-18.75%). The present study also compared the performance of five genomic prediction models and two cross-validation methods. The five genomic models predicted EBV and de-regressed EBV of the ten traits similarly well. Of the two cross-validation methods, leave-one-out cross-validation maximized the number of animals at the stage of training for genomic prediction. Genomic prediction accuracy (GPA) on the ten quantitative traits was validated in 1,106 newly genotyped Brangus animals based on the SNP effects estimated in the previous set of 3,797 Brangus animals, and they were slightly lower than GPA in the original data. The present study was the first to leverage currently available genotype and phenotype resources in order to harness genomic prediction in Brangus beef cattle. © 2018 Blackwell Verlag GmbH.
Using mi impute chained to fit ANCOVA models in randomized trials with censored dependent and independent variables

DEFF Research Database (Denmark)

Andersen, Andreas; Rieckmann, Andreas

2016-01-01

In this article, we illustrate how to use mi impute chained with intreg to fit an analysis of covariance analysis of censored and nondetectable immunological concentrations measured in a randomized pretest–posttest design.
Missing data treatments matter: an analysis of multiple imputation for anterior cervical discectomy and fusion procedures.

Science.gov (United States)

Ondeck, Nathaniel T; Fu, Michael C; Skrip, Laura A; McLynn, Ryan P; Cui, Jonathan J; Basques, Bryce A; Albert, Todd J; Grauer, Jonathan N

2018-04-09

The presence of missing data is a limitation of large datasets, including the National Surgical Quality Improvement Program (NSQIP). In addressing this issue, most studies use complete case analysis, which excludes cases with missing data, thus potentially introducing selection bias. Multiple imputation, a statistically rigorous approach that approximates missing data and preserves sample size, may be an improvement over complete case analysis. The present study aims to evaluate the impact of using multiple imputation in comparison with complete case analysis for assessing the associations between preoperative laboratory values and adverse outcomes following anterior cervical discectomy and fusion (ACDF) procedures. This is a retrospective review of prospectively collected data. Patients undergoing one-level ACDF were identified in NSQIP 2012-2015. Perioperative adverse outcome variables assessed included the occurrence of any adverse event, severe adverse events, and hospital readmission. Missing preoperative albumin and hematocrit values were handled using complete case analysis and multiple imputation. These preoperative laboratory levels were then tested for associations with 30-day postoperative outcomes using logistic regression. A total of 11,999 patients were included. Of this cohort, 63.5% of patients had missing preoperative albumin and 9.9% had missing preoperative hematocrit. When using complete case analysis, only 4,311 patients were studied. The removed patients were significantly younger, healthier, of a common body mass index, and male. Logistic regression analysis failed to identify either preoperative hypoalbuminemia or preoperative anemia as significantly associated with adverse outcomes. When employing multiple imputation, all 11,999 patients were included. Preoperative hypoalbuminemia was significantly associated with the occurrence of any adverse event and severe adverse events. Preoperative anemia was significantly associated with the
Estimating past hepatitis C infection risk from reported risk factor histories: implications for imputing age of infection and modeling fibrosis progression

Directory of Open Access Journals (Sweden)

Busch Michael P

2007-12-01

Full Text Available Abstract Background Chronic hepatitis C virus infection is prevalent and often causes hepatic fibrosis, which can progress to cirrhosis and cause liver cancer or liver failure. Study of fibrosis progression often relies on imputing the time of infection, often as the reported age of first injection drug use. We sought to examine the accuracy of such imputation and implications for modeling factors that influence progression rates. Methods We analyzed cross-sectional data on hepatitis C antibody status and reported risk factor histories from two large studies, the Women's Interagency HIV Study and the Urban Health Study, using modern survival analysis methods for current status data to model past infection risk year by year. We compared fitted distributions of past infection risk to reported age of first injection drug use. Results Although injection drug use appeared to be a very strong risk factor, models for both studies showed that many subjects had considerable probability of having been infected substantially before or after their reported age of first injection drug use. Persons reporting younger age of first injection drug use were more likely to have been infected after, and persons reporting older age of first injection drug use were more likely to have been infected before. Conclusion In cross-sectional studies of fibrosis progression where date of HCV infection is estimated from risk factor histories, modern methods such as multiple imputation should be used to account for the substantial uncertainty about when infection occurred. The models presented here can provide the inputs needed by such methods. Using reported age of first injection drug use as the time of infection in studies of fibrosis progression is likely to produce a spuriously strong association of younger age of infection with slower rate of progression.
Cohort-specific imputation of gene expression improves prediction of warfarin dose for African Americans.

Science.gov (United States)

Gottlieb, Assaf; Daneshjou, Roxana; DeGorter, Marianne; Bourgeois, Stephane; Svensson, Peter J; Wadelius, Mia; Deloukas, Panos; Montgomery, Stephen B; Altman, Russ B

2017-11-24

Genome-wide association studies are useful for discovering genotype-phenotype associations but are limited because they require large cohorts to identify a signal, which can be population-specific. Mapping genetic variation to genes improves power and allows the effects of both protein-coding variation as well as variation in expression to be combined into "gene level" effects. Previous work has shown that warfarin dose can be predicted using information from genetic variation that affects protein-coding regions. Here, we introduce a method that improves dose prediction by integrating tissue-specific gene expression. In particular, we use drug pathways and expression quantitative trait loci knowledge to impute gene expression, on the assumption that differential expression of key pathway genes may impact dose requirement. We focus on 116 genes from the pharmacokinetic and pharmacodynamic pathways of warfarin within training and validation sets comprising both European and African-descent individuals. We build gene-tissue signatures associated with warfarin dose in a cohort-specific manner and identify a signature of 11 gene-tissue pairs that significantly augments the International Warfarin Pharmacogenetics Consortium dosage-prediction algorithm in both populations. Our results demonstrate that imputed expression can improve dose prediction and bridge population-specific compositions. MATLAB code is available at https://github.com/assafgo/warfarin-cohort.
Multiple imputation of rainfall missing data in the Iberian Mediterranean context

Science.gov (United States)

Miró, Juan Javier; Caselles, Vicente; Estrela, María José

2017-11-01

Given the increasing need for complete rainfall data networks, diverse methods for filling gaps in observed precipitation series have been proposed in recent years, progressively more advanced than traditional approaches to the problem. The present study validated 10 methods (6 linear, 2 non-linear and 2 hybrid) that allow multiple imputation, i.e., filling at the same time the missing data of multiple incomplete series in a dense network of neighboring stations. These were applied to daily and monthly rainfall in two sectors of the Júcar River Basin Authority (east Iberian Peninsula), which is characterized by high spatial irregularity and difficulty of rainfall estimation. A classification of precipitation according to its genetic origin was applied as pre-processing, and a quantile-mapping adjustment as a post-processing technique. The results showed, in general, better performance for the non-linear and hybrid methods, with the non-linear PCA (NLPCA) method considerably outperforming the Self Organizing Maps (SOM) method among the non-linear approaches. Among the linear methods, the Regularized Expectation Maximization method (RegEM) was the best, but far from NLPCA. Applying EOF filtering as post-processing of NLPCA (hybrid approach) yielded the best results.
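The quantile-mapping post-processing mentioned above can be illustrated with a simple empirical version: each imputed value is replaced by the quantile of the observed data that matches its rank among the imputed values. This is a minimal sketch of the general technique and not the adjustment used in the study, whose details the abstract does not give. The function name quantile_map and the example values are hypothetical.

```python
import numpy as np

def quantile_map(imputed, observed):
    """Empirical quantile mapping (illustrative sketch).

    imputed  : values produced by an imputation model.
    observed : reference sample whose distribution should be matched.
    Returns the imputed values remapped onto the observed distribution,
    preserving their rank order.
    """
    imputed = np.asarray(imputed, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Rank of each imputed value (0 = smallest); ties get distinct ranks.
    ranks = np.argsort(np.argsort(imputed))
    # Convert ranks to plotting-position probabilities in (0, 1).
    probs = (ranks + 0.5) / imputed.size
    # Read off the matching quantiles of the observed sample.
    return np.quantile(observed, probs)
```

With imputed values `[1.0, 3.0, 2.0]` and an observed sample `[0, 0, 5, 10, 40]`, the result is `[0.0, 20.0, 5.0]`: the ordering of the imputed values is kept, while their spread follows the skewed observed distribution, which matters for rainfall because regression-type estimates tend to smooth out extremes.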
Nearest neighbor imputation using spatial-temporal correlations in wireless sensor networks.

Science.gov (United States)

Li, YuanYuan; Parker, Lynne E

2014-01-01

Missing data is common in Wireless Sensor Networks (WSNs), especially with multi-hop communications. There are many reasons for this phenomenon, such as unstable wireless communications, synchronization issues, and unreliable sensors. Unfortunately, missing data creates a number of problems for WSNs. First, since most sensor nodes in the network are battery-powered, it is too expensive to have the nodes retransmit missing data across the network. Data re-transmission may also cause time delays when detecting abnormal changes in an environment. Furthermore, localized reasoning techniques on sensor nodes (such as machine learning algorithms to classify states of the environment) are generally not robust enough to handle missing data. Since sensor data collected by a WSN is generally correlated in time and space, we illustrate how replacing missing sensor values with spatially and temporally correlated sensor values can significantly improve the network's performance. However, our studies show that it is important to determine which nodes are spatially and temporally correlated with each other. Simple techniques based on Euclidean distance are not sufficient for complex environmental deployments. Thus, we have developed a novel Nearest Neighbor (NN) imputation method that estimates missing data in WSNs by learning spatial and temporal correlations between sensor nodes. To improve the search time, we utilize a k-d tree data structure, which is a non-parametric, data-driven binary search tree. Instead of using traditional mean and variance of each dimension for k-d tree construction, and Euclidean distance for k-d tree search, we use weighted variances and weighted Euclidean distances based on measured percentages of missing data. We have evaluated this approach through experiments on sensor data from a volcano dataset collected by a network of Crossbow motes, as well as experiments using sensor data from a highway traffic monitoring application. Our experimental
Applying an efficient K-nearest neighbor search to forest attribute imputation

Science.gov (United States)

Andrew O. Finley; Ronald E. McRoberts; Alan R. Ek

2006-01-01

This paper explores the utility of an efficient nearest neighbor (NN) search algorithm for applications in multi-source kNN forest attribute imputation. The search algorithm reduces the number of distance calculations between a given target vector and each reference vector, thereby, decreasing the time needed to discover the NN subset. Results of five trials show gains...
Local exome sequences facilitate imputation of less common variants and increase power of genome wide association studies.

Directory of Open Access Journals (Sweden)

Peter K Joshi

Full Text Available The analysis of less common variants in genome-wide association studies promises to elucidate complex trait genetics but is hampered by low power to reliably detect association. We show that addition of population-specific exome sequence data to global reference data allows more accurate imputation, particularly of less common SNPs (minor allele frequency 1-10%) in two very different European populations. The imputation improvement corresponds to an increase in effective sample size of 28-38%, for SNPs with a minor allele frequency in the range 1-3%.
Missing data in clinical trials: control-based mean imputation and sensitivity analysis.

Science.gov (United States)

Mehrotra, Devan V; Liu, Fang; Permutt, Thomas

2017-09-01

In some randomized (drug versus placebo) clinical trials, the estimand of interest is the between-treatment difference in population means of a clinical endpoint that is free from the confounding effects of "rescue" medication (e.g., HbA1c change from baseline at 24 weeks that would be observed without rescue medication regardless of whether or when the assigned treatment was discontinued). In such settings, a missing data problem arises if some patients prematurely discontinue from the trial or initiate rescue medication while in the trial, the latter necessitating the discarding of post-rescue data. We caution that the commonly used mixed-effects model repeated measures analysis with the embedded missing at random assumption can deliver an exaggerated estimate of the aforementioned estimand of interest. This happens, in part, due to implicit imputation of an overly optimistic mean for "dropouts" (i.e., patients with missing endpoint data of interest) in the drug arm. We propose an alternative approach in which the missing mean for the drug arm dropouts is explicitly replaced with either the estimated mean of the entire endpoint distribution under placebo (primary analysis) or a sequence of increasingly more conservative means within a tipping point framework (sensitivity analysis); patient-level imputation is not required. A supplemental "dropout = failure" analysis is considered in which a common poor outcome is imputed for all dropouts followed by a between-treatment comparison using quantile regression. All analyses address the same estimand and can adjust for baseline covariates. Three examples and simulation results are used to support our recommendations. Copyright © 2017 John Wiley & Sons, Ltd.
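The arithmetic behind replacing the dropouts' mean can be sketched as a weighted average: the drug-arm mean combines the completers' mean with a substituted mean for dropouts, weighted by the dropout fraction. The sketch below follows only the description in the abstract. The function names, the delta shift used for the tipping-point variant, and the example numbers are assumptions for illustration, and the standard errors and covariate adjustment discussed by the authors are not covered.

```python
def control_based_mean(completer_mean, dropout_fraction, placebo_mean, delta=0.0):
    """Drug-arm mean with the dropouts' mean replaced by the placebo mean.

    completer_mean   : observed endpoint mean among drug-arm completers.
    dropout_fraction : proportion of drug-arm patients with missing endpoint.
    placebo_mean     : estimated endpoint mean under placebo.
    delta            : optional shift applied to the substituted mean, used
                       to step through a tipping-point sensitivity analysis
                       (the direction of "more conservative" depends on the
                       endpoint).
    """
    substituted = placebo_mean + delta
    return (1.0 - dropout_fraction) * completer_mean + dropout_fraction * substituted

def treatment_difference(completer_mean, dropout_fraction, placebo_mean, delta=0.0):
    """Between-treatment difference (drug minus placebo) under the sketch above."""
    drug_mean = control_based_mean(completer_mean, dropout_fraction,
                                   placebo_mean, delta)
    return drug_mean - placebo_mean
```

With hypothetical values (completer mean change of -1.0, 20% dropouts, placebo mean change of -0.2), the drug-arm mean is 0.8 × (-1.0) + 0.2 × (-0.2) = -0.84 and the treatment difference is -0.64, smaller in magnitude than the -0.8 obtained from completers alone.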
Using beta coefficients to impute missing correlations in meta-analysis research: Reasons for caution.

Science.gov (United States)

Roth, Philip L; Le, Huy; Oh, In-Sue; Van Iddekinge, Chad H; Bobko, Philip

2018-06-01

Meta-analysis has become a well-accepted method for synthesizing empirical research about a given phenomenon. Many meta-analyses focus on synthesizing correlations across primary studies, but some primary studies do not report correlations. Peterson and Brown (2005) suggested that researchers could use standardized regression weights (i.e., beta coefficients) to impute missing correlations. Indeed, their beta estimation procedures (BEPs) have been used in meta-analyses in a wide variety of fields. In this study, the authors evaluated the accuracy of BEPs in meta-analysis. We first examined how use of BEPs might affect results from a published meta-analysis. We then developed a series of Monte Carlo simulations that systematically compared the use of existing correlations (that were not missing) to data sets that incorporated BEPs (that impute missing correlations from corresponding beta coefficients). These simulations estimated ρ̄ (mean population correlation) and SDρ (true standard deviation) across a variety of meta-analytic conditions. Results from both the existing meta-analysis and the Monte Carlo simulations revealed that BEPs were associated with potentially large biases when estimating ρ̄ and even larger biases when estimating SDρ. Using only existing correlations often substantially outperformed use of BEPs and virtually never performed worse than BEPs. Overall, the authors urge a return to the standard practice of using only existing correlations in meta-analysis. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Estimation of Tree Lists from Airborne Laser Scanning Using Tree Model Clustering and k-MSN Imputation

Directory of Open Access Journals (Sweden)

Jörgen Wallerman

2013-04-01

Full Text Available Individual tree crowns may be delineated from airborne laser scanning (ALS) data by segmentation of surface models or by 3D analysis. Segmentation of surface models benefits from using a priori knowledge about the proportions of tree crowns, which has not yet been utilized for 3D analysis to any great extent. In this study, an existing surface segmentation method was used as a basis for a new tree model 3D clustering method applied to ALS returns in 104 circular field plots with 12 m radius in pine-dominated boreal forest (64°14'N, 19°50'E). For each cluster below the tallest canopy layer, a parabolic surface was fitted to model a tree crown. The tree model clustering identified more trees than segmentation of the surface model, especially smaller trees below the tallest canopy layer. Stem attributes were estimated with k-Most Similar Neighbours (k-MSN) imputation of the clusters based on field-measured trees. The accuracy at plot level from the k-MSN imputation (stem density root mean square error or RMSE 32.7%; stem volume RMSE 28.3%) was similar to the corresponding results from the surface model (stem density RMSE 33.6%; stem volume RMSE 26.1%) with leave-one-out cross-validation for one field plot at a time. Three-dimensional analysis of ALS data should also be evaluated in multi-layered forests since it identified a larger number of small trees below the tallest canopy layer.
Imputation of genotypes from low density (50,000 markers) to high density (700,000 markers) of cows from research herds in Europe, North America, and Australasia using 2 reference populations

DEFF Research Database (Denmark)

Pryce, J E; Johnston, J; Hayes, B J

2014-01-01

detection in genome-wide association studies and the accuracy of genomic selection may increase when the low-density genotypes are imputed to higher density. Genotype data were available from 10 research herds: 5 from Europe [Denmark, Germany, Ireland, the Netherlands, and the United Kingdom (UK)], 2 from...... reference populations. Although it was not possible to use a combined reference population, which would probably result in the highest accuracies of imputation, differences arising from using 2 high-density reference populations on imputing 50,000-marker genotypes of 583 animals (from the UK) were...... information exploited. The UK animals were also included in the North American data set (n = 1,579) that was imputed to high density using a reference population of 2,018 bulls. After editing, 591,213 genotypes on 5,999 animals from 10 research herds remained. The correlation between imputed allele...

Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

Science.gov (United States)

van Leeuwen, Elisabeth M.; Karssen, Lennart C.; Deelen, Joris; Isaacs, Aaron; Medina-Gomez, Carolina; Mbarek, Hamdi; Kanterakis, Alexandros; Trompet, Stella; Postmus, Iris; Verweij, Niek; van Enckevort, David J.; Huffman, Jennifer E.; White, Charles C.; Feitosa, Mary F.; Bartz, Traci M.; Manichaikul, Ani; Joshi, Peter K.; Peloso, Gina M.; Deelen, Patrick; van Dijk, Freerk; Willemsen, Gonneke; de Geus, Eco J.; Milaneschi, Yuri; Penninx, Brenda W.J.H.; Francioli, Laurent C.; Menelaou, Androniki; Pulit, Sara L.; Rivadeneira, Fernando; Hofman, Albert; Oostra, Ben A.; Franco, Oscar H.; Leach, Irene Mateo; Beekman, Marian; de Craen, Anton J.M.; Uh, Hae-Won; Trochet, Holly; Hocking, Lynne J.; Porteous, David J.; Sattar, Naveed; Packard, Chris J.; Buckley, Brendan M.; Brody, Jennifer A.; Bis, Joshua C.; Rotter, Jerome I.; Mychaleckyj, Josyf C.; Campbell, Harry; Duan, Qing; Lange, Leslie A.; Wilson, James F.; Hayward, Caroline; Polasek, Ozren; Vitart, Veronique; Rudan, Igor; Wright, Alan F.; Rich, Stephen S.; Psaty, Bruce M.; Borecki, Ingrid B.; Kearney, Patricia M.; Stott, David J.; Adrienne Cupples, L.; Neerincx, Pieter B.T.; Elbers, Clara C.; Francesco Palamara, Pier; Pe'er, Itsik; Abdellaoui, Abdel; Kloosterman, Wigard P.; van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F.J.; Stoneking, Mark; de Knijff, Peter; Kayser, Manfred; Veldink, Jan H.; van den Berg, Leonard H.; Byelas, Heorhiy; den Dunnen, Johan T.; Dijkstra, Martijn; Amin, Najaf; Joeri van der Velde, K.; van Setten, Jessica; Kattenberg, Mathijs; van Schaik, Barbera D.C.; Bot, Jan; Nijman, Isaäc J.; Mei, Hailiang; Koval, Vyacheslav; Ye, Kai; Lameijer, Eric-Wubbo; Moed, Matthijs H.; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Sunyaev, Shamil R.; Sohail, Mashaal; Hormozdiari, Fereydoun; Marschall, Tobias; Schönhuth, Alexander; Guryev, Victor; Suchiman, H. 
Eka D.; Wolffenbuttel, Bruce H.; Platteel, Mathieu; Pitts, Steven J.; Potluri, Shobha; Cox, David R.; Li, Qibin; Li, Yingrui; Du, Yuanping; Chen, Ruoyan; Cao, Hongzhi; Li, Ning; Cao, Sujie; Wang, Jun; Bovenberg, Jasper A.; Jukema, J. Wouter; van der Harst, Pim; Sijbrands, Eric J.; Hottenga, Jouke-Jan; Uitterlinden, Andre G.; Swertz, Morris A.; van Ommen, Gert-Jan B.; de Bakker, Paul I.W.; Eline Slagboom, P.; Boomsma, Dorret I.; Wijmenga, Cisca; van Duijn, Cornelia M.

2015-01-01

Variants associated with blood lipid levels may be population-specific. To identify low-frequency variants associated with this phenotype, population-specific reference panels may be used. Here we impute nine large Dutch biobanks (~35,000 samples) with the population-specific reference panel created by the Genome of the Netherlands Project and perform association testing with blood lipid levels. We report the discovery of five novel associations at four loci (P value <6.61 × 10−4), including a rare missense variant in ABCA6 (rs77542162, p.Cys1359Arg, frequency 0.034), which is predicted to be deleterious. The frequency of this ABCA6 variant is 3.65-fold increased in the Dutch and its effect (βLDL-C=0.135, βTC=0.140) is estimated to be very similar to those observed for single variants in well-known lipid genes, such as LDLR. PMID:25751400
Mapping change of older forest with nearest-neighbor imputation and Landsat time-series

Science.gov (United States)

Janet L. Ohmann; Matthew J. Gregory; Heather M. Roberts; Warren B. Cohen; Robert E. Kennedy; Zhiqiang. Yang

2012-01-01

The Northwest Forest Plan (NWFP), which aims to conserve late-successional and old-growth forests (older forests) and associated species, established new policies on federal lands in the Pacific Northwest USA. As part of monitoring for the NWFP, we tested nearest-neighbor imputation for mapping change in older forest, defined by threshold values for forest attributes...
Estimating Classification Errors Under Edit Restrictions in Composite Survey-Register Data Using Multiple Imputation Latent Class Modelling (MILC)

Directory of Open Access Journals (Sweden)

Boeschoten Laura

2017-12-01

Full Text Available Both registers and surveys can contain classification errors. These errors can be estimated by making use of a composite data set. We propose a new method based on latent class modelling to estimate the number of classification errors across several sources while taking into account impossible combinations with scores on other variables. Furthermore, the latent class model, by multiply imputing a new variable, enhances the quality of statistics based on the composite data set. The performance of this method is investigated by a simulation study, which shows that whether or not the method can be applied depends on the entropy R2 of the latent class model and the type of analysis a researcher is planning to do. Finally, the method is applied to public data from Statistics Netherlands.
Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation.

Science.gov (United States)

Silva, Carlos Alberto; Klauberg, Carine; Hudak, Andrew T; Vierling, Lee A; Liesenberg, Veraldo; Bernett, Luiz G; Scheraiber, Clewerson F; Schoeninger, Emerson R

2018-01-01

Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and non-parametric k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR-derived metrics were calculated at 53 regular sample plots, and we used imputation models to retrieve the forest attributes at plot and landscape levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj.R2) and root mean squared differences (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies are needed to improve the prediction accuracy for TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against conventional inventory procedures.
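The idea of imputing plot attributes from their nearest neighbors in LiDAR-metric space can be sketched in a few lines. The Python example below is a minimal illustration under stated assumptions, not the study's model: it uses plain Euclidean distance on standardized metrics (the abstract does not say which distance was used), and the function names, the choice of k, and all data values are hypothetical.

```python
import numpy as np


def knn_impute(ref_metrics, ref_attrs, target_metrics, k=3):
    """Impute attributes for target units as the mean attribute of the
    k nearest reference plots in standardized metric space."""
    ref_metrics = np.asarray(ref_metrics, dtype=float)
    ref_attrs = np.asarray(ref_attrs, dtype=float)
    target_metrics = np.asarray(target_metrics, dtype=float)

    # Standardize with reference-plot statistics so no metric dominates.
    mu = ref_metrics.mean(axis=0)
    sd = ref_metrics.std(axis=0)
    sd[sd == 0] = 1.0
    ref_z = (ref_metrics - mu) / sd
    tgt_z = (target_metrics - mu) / sd

    # Pairwise Euclidean distances, shape (n_targets, n_references).
    dists = np.linalg.norm(tgt_z[:, None, :] - ref_z[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return ref_attrs[nearest].mean(axis=1)


def rmsd_percent(observed, predicted):
    """Root mean squared difference as a percentage of the observed mean."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmsd = np.sqrt(np.mean((observed - predicted) ** 2))
    return 100.0 * rmsd / observed.mean()


# Hypothetical data: one LiDAR height metric per plot, one attribute (height, m).
reference_metric = [[0.0], [1.0], [10.0], [11.0]]
reference_height = [10.0, 12.0, 30.0, 32.0]
imputed = knn_impute(reference_metric, reference_height, [[0.5], [10.5]], k=2)
print(imputed)
```

Averaging the k neighbors is one common choice; distance-weighted averages or a single nearest neighbor (k = 1) are alternatives that trade smoothness against preservation of the observed attribute distribution.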
Optimized Use of Low-Depth Genotyping-by-Sequencing for Genomic Prediction Among Multi-Parental Family Pools and Single Plants in Perennial Ryegrass (Lolium perenne L.)

Directory of Open Access Journals (Sweden)

Fabio Cericola

2018-03-01

Full Text Available Ryegrass single plants, bi-parental family pools, and multi-parental family pools are often genotyped, based on allele-frequencies using genotyping-by-sequencing (GBS) assays. GBS assays can be performed at low-coverage depth to reduce costs. However, reducing the coverage depth leads to a higher proportion of missing data, and leads to a reduction in accuracy when identifying the allele-frequency at each locus. As a consequence of the latter, genomic relationship matrices (GRMs) will be biased. This bias in GRMs affects variance estimates and the accuracy of GBLUP for genomic prediction (GBLUP-GP). We derived equations that describe the bias from low-coverage sequencing as an effect of binomial sampling of sequence reads, and allowed for any ploidy level of the sample considered. This allowed us to combine individual and pool genotypes in one GRM, treating pool-genotypes as a polyploid genotype, equal to the total ploidy-level of the parents of the pool. Using simulated data, we verified the magnitude of the GRM bias at different coverage depths for three different kinds of ryegrass breeding material: individual genotypes from single plants, pool-genotypes from F2 families, and pool-genotypes from synthetic varieties. To better handle missing data, we also tested imputation procedures, which are suited for analyzing allele-frequency genomic data. The relative advantages of the bias-correction and the imputation of missing data were evaluated using real data. We examined a large dataset, including single plants, F2 families, and synthetic varieties genotyped in three GBS assays, each with a different coverage depth, and evaluated them for heading date, crown rust resistance, and seed yield. Cross validations were used to test the accuracy using GBLUP approaches, demonstrating the feasibility of predicting among different breeding material. Bias-corrected GRMs proved to increase predictive accuracies when compared with standard approaches to
Performance of bias-correction methods for exposure measurement error using repeated measurements with and without missing data.

Science.gov (United States)

Batistatou, Evridiki; McNamee, Roseanne

2012-12-10

It is known that measurement error leads to bias in assessing exposure effects, which can, however, be corrected if independent replicates are available. For expensive replicates, two-stage (2S) studies that produce data 'missing by design' may be preferred over a single-stage (1S) study, because in the second stage, measurement of replicates is restricted to a sample of first-stage subjects. Motivated by an occupational study on the acute effect of carbon black exposure on respiratory morbidity, we compare the performance of several bias-correction methods for both designs in a simulation study: an instrumental variable method (EVROS IV) based on grouping strategies, which had been recommended especially when measurement error is large, the regression calibration and the simulation extrapolation methods. For the 2S design, either the problem of 'missing' data was ignored or the 'missing' data were imputed using multiple imputations. Both in 1S and 2S designs, in the case of small or moderate measurement error, regression calibration was shown to be the preferred approach in terms of root mean square error. For 2S designs, regression calibration as implemented by Stata software is not recommended, in contrast to our implementation of this method, although the 'problematic' implementation improved substantially with the use of multiple imputations. The EVROS IV method, under a good/fairly good grouping, outperforms the regression calibration approach in both design scenarios when exposure mismeasurement is severe. Both in 1S and 2S designs with moderate or large measurement error, simulation extrapolation severely failed to correct for bias. Copyright © 2012 John Wiley & Sons, Ltd.
What are the appropriate methods for analyzing patient-reported outcomes in randomized trials when data are missing?

Science.gov (United States)

Hamel, J F; Sebille, V; Le Neel, T; Kubis, G; Boyer, F C; Hardouin, J B

2017-12-01

Subjective health measurements using Patient Reported Outcomes (PRO) are increasingly used in randomized trials, particularly for patient group comparisons. Two main types of analytical strategies can be used for such data: Classical Test Theory (CTT) and Item Response Theory (IRT) models. These two strategies display very similar characteristics when data are complete, but in the common case when data are missing, whether IRT or CTT would be the most appropriate remains unknown and was investigated using simulations. We simulated PRO data such as quality of life data. Missing responses to items were simulated as being completely random, depending on an observable covariate or on an unobserved latent trait. The considered CTT-based methods allowed comparing scores using complete-case analysis, personal mean imputations or multiple imputations based on a two-way procedure. The IRT-based method was the Wald test on a Rasch model including a group covariate. The IRT-based method and the multiple-imputations-based method for CTT displayed the highest observed power and were the only unbiased methods whatever the kind of missing data. Online software and Stata® modules compatible with the built-in mi impute suite are provided for performing such analyses. Traditional procedures (listwise deletion and personal mean imputations) should be avoided, due to inevitable problems of biases and lack of power.
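For readers unfamiliar with the term, personal mean imputation replaces each missing item response with the mean of the items that the same respondent did answer. The Python sketch below only illustrates that mechanism, which the abstract advises against; the function name and the response matrix are hypothetical, and missing responses are assumed to be coded as NaN.

```python
import numpy as np


def personal_mean_impute(responses):
    """Fill each missing item (NaN) with the respondent's own mean over
    the items they answered. Rows are respondents, columns are items.
    Respondents with no answered items are left as NaN."""
    x = np.asarray(responses, dtype=float)
    observed = ~np.isnan(x)

    counts = observed.sum(axis=1)
    sums = np.where(observed, x, 0.0).sum(axis=1)

    row_means = np.full(x.shape[0], np.nan)
    has_data = counts > 0
    row_means[has_data] = sums[has_data] / counts[has_data]

    # Keep observed values; substitute the row mean elsewhere.
    return np.where(observed, x, row_means[:, None])


# Hypothetical 4-item questionnaire, three respondents.
raw = [
    [1, 2, np.nan, 3],
    [4, np.nan, np.nan, 2],
    [np.nan, np.nan, np.nan, np.nan],
]
filled = personal_mean_impute(raw)
scores = filled.sum(axis=1)
print(filled)
print(scores)
```

Because every imputed value sits exactly at the respondent's own mean, the filled-in items add no between-item variability, which is one reason such single-value substitutions can distort variances and test statistics.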
Is missing geographic positioning system data in accelerometry studies a problem, and is imputation the solution?

DEFF Research Database (Denmark)

Meseck, Kristin; Jankowska, Marta M; Schipperijn, Jasper

2016-01-01

The main purpose of the present study was to assess the impact of global positioning system (GPS) signal lapse on physical activity analyses, to discover any existing associations between missing GPS data and environmental and demographic attributes, and to determine whether imputation is an accurate...
Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

NARCIS (Netherlands)

van Leeuwen, E.M.; Karssen, L.C.; Deelen, J.; Isaacs, A.; Medina-Gomez, C.; Mbarek, H.; Kanterakis, A.; Trompet, S.; Postmus, I.; Verweij, N.; van Enckevort, D.; Huffman, J.E.; White, C.C.; Feitosa, M.F.; Bartz, T.M.; Manichaikul, A.; Joshi, P.K.; Peloso, G.M.; Deelen, P.; Dijk, F.; Willemsen, G.; de Geus, E.J.C.; Milaneschi, Y.; Penninx, B.W.J.H.; Francioli, L.C.; Menelaou, A.; Pulit, S.L.; Rivadeneira, F.; Hofman, A.; Oostra, B.A.; Franco, O.H.; Mateo Leach, I.; Beekman, M.; de Craen, A.J.; Uh, H.W.; Trochet, H.; Hocking, L.J.; Porteous, D.J.; Sattar, N.; Packard, C.J.; Buckley, B.M.; Brody, J.A.; Bis, J.C.; Rotter, J.I.; Mychaleckyj, J.C.; Campbell, H.; Duan, Q.; Lange, L.A.; Wilson, J.F.; Hayward, C.; Polasek, O.; Vitart, V.; Rudan, I.; Wright, A.F.; Rich, S.S.; Psaty, B.M.; Borecki, I.B.; Kearney, P.M.; Stott, D.J.; Cupples, L.A.; Jukema, J.W.; van der Harst, P.; Sijbrands, E.J.; Hottenga, J.J.; Uitterlinden, A.G.; Swertz, M.A.; van Ommen, G.J.B; Bakker, P.I.W.; Slagboom, P.E.; Boomsma, D.I.; Wijmenga, C.; van Duijn, C.M.

2015-01-01

Variants associated with blood lipid levels may be population-specific. To identify low-frequency variants associated with this phenotype, population-specific reference panels may be used. Here we impute nine large Dutch biobanks (∼35,000 samples) with the population-specific reference panel created
Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation

Directory of Open Access Journals (Sweden)

Carlos Alberto Silva

Full Text Available Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and non-parametric k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR-derived metrics were calculated at 53 regular sample plots, and we used imputation models to retrieve the forest attributes at plot and landscape levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj.R2) and root mean squared differences (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies are needed to improve the prediction accuracy for TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against conventional inventory procedures.
Imputation of Baseline LDL Cholesterol Concentration in Patients with Familial Hypercholesterolemia on Statins or Ezetimibe.

Science.gov (United States)

Ruel, Isabelle; Aljenedil, Sumayah; Sadri, Iman; de Varennes, Émilie; Hegele, Robert A; Couture, Patrick; Bergeron, Jean; Wanneh, Eric; Baass, Alexis; Dufour, Robert; Gaudet, Daniel; Brisson, Diane; Brunham, Liam R; Francis, Gordon A; Cermakova, Lubomira; Brophy, James M; Ryomoto, Arnold; Mancini, G B John; Genest, Jacques

2018-02-01

Familial hypercholesterolemia (FH) is the most frequent genetic disorder seen clinically and is characterized by increased LDL cholesterol (LDL-C) (>95th percentile), family history of increased LDL-C, premature atherosclerotic cardiovascular disease (ASCVD) in the patient or in first-degree relatives, presence of tendinous xanthomas or premature corneal arcus, or presence of a pathogenic mutation in the LDLR, PCSK9, or APOB genes. A diagnosis of FH has important clinical implications with respect to lifelong risk of ASCVD and requirement for intensive pharmacological therapy. The concentration of baseline LDL-C (untreated) is essential for the diagnosis of FH but is often not available because the individual is already on statin therapy. To validate a new algorithm to impute baseline LDL-C, we examined 1297 patients. The baseline LDL-C was compared with the imputed baseline obtained within 18 months of the initiation of therapy. We compared the percent reduction in LDL-C on treatment from baseline with the published percent reductions. After eliminating individuals with missing data, nonstandard doses of statins, or medications other than statins or ezetimibe, we provide data on 951 patients. The mean ± SE baseline LDL-C was 243.0 (2.2) mg/dL [6.28 (0.06) mmol/L], and the mean ± SE imputed baseline LDL-C was 244.2 (2.6) mg/dL [6.31 (0.07) mmol/L] (P = 0.48). There was no difference in response according to the patient's sex or in percent reduction between observed and expected for individual doses or types of statin or ezetimibe. We provide a validated estimation of baseline LDL-C for patients with FH that may help clinicians in making a diagnosis. © 2017 American Association for Clinical Chemistry.
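One plausible way to back-calculate an untreated value from an on-treatment measurement is to invert an expected percent reduction. The Python sketch below assumes that form; the abstract does not spell out the algorithm, so the function names are hypothetical and the 50% reduction used in the example is an illustrative placeholder, not a published drug-specific value. The mg/dL to mmol/L factor of 38.67 is the standard conversion for cholesterol and agrees with the paired units reported in the abstract.

```python
MGDL_PER_MMOLL = 38.67  # standard cholesterol conversion factor


def impute_baseline_ldl(on_treatment_mgdl, expected_fraction_reduction):
    """Back-calculate untreated LDL-C (mg/dL), assuming treatment lowers
    LDL-C by a known fraction of baseline (hypothetical input)."""
    if not 0.0 <= expected_fraction_reduction < 1.0:
        raise ValueError("expected_fraction_reduction must be in [0, 1)")
    return on_treatment_mgdl / (1.0 - expected_fraction_reduction)


def mgdl_to_mmoll(value_mgdl):
    """Convert an LDL-C concentration from mg/dL to mmol/L."""
    return value_mgdl / MGDL_PER_MMOLL


# Hypothetical patient: 120 mg/dL on a therapy assumed to lower LDL-C by 50%.
baseline = impute_baseline_ldl(120.0, 0.50)
print(baseline, round(mgdl_to_mmoll(baseline), 2))

# Consistency check against the units reported in the abstract.
print(round(mgdl_to_mmoll(243.0), 2))
```

In practice the expected reduction would depend on the drug and dose, so a lookup table of published reductions would replace the single placeholder fraction used here.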
The population genomics of archaeological transition in west Iberia: Investigation of ancient substructure using imputation and haplotype-based methods.

Directory of Open Access Journals (Sweden)

Rui Martiniano

2017-07-01

Full Text Available We analyse new genomic data (0.05-2.95x) from 14 ancient individuals from Portugal distributed from the Middle Neolithic (4200-3500 BC) to the Middle Bronze Age (1740-1430 BC) and impute genomewide diploid genotypes in these together with published ancient Eurasians. While discontinuity is evident in the transition to agriculture across the region, sensitive haplotype-based analyses suggest a significant degree of local hunter-gatherer contribution to later Iberian Neolithic populations. A more subtle genetic influx is also apparent in the Bronze Age, detectable from analyses including haplotype sharing with both ancient and modern genomes, D-statistics and Y-chromosome lineages. However, the limited nature of this introgression contrasts with the major Steppe migration turnovers within third Millennium northern Europe and echoes the survival of non-Indo-European language in Iberia. Changes in genomic estimates of individual height across Europe are also associated with these major cultural transitions, and ancestral components continue to correlate with modern differences in stature.
Handling missing data in cluster randomized trials: A demonstration of multiple imputation with PAN through SAS

Directory of Open Access Journals (Sweden)

Jiangxiu Zhou

2014-09-01

Full Text Available The purpose of this study is to demonstrate a way of dealing with missing data in clustered randomized trials by doing multiple imputation (MI) with the PAN package in R through SAS. The procedure for doing MI with PAN through SAS is demonstrated in detail in order for researchers to be able to use this procedure with their own data. An illustration of the technique with empirical data was also included. In this illustration the PAN results were compared with pairwise deletion and three types of MI: (1) Normal Model MI (NM-MI) ignoring the cluster structure; (2) NM-MI with dummy-coded cluster variables (fixed cluster structure); and (3) a hybrid NM-MI which imputes half the time ignoring the cluster structure, and the other half including the dummy-coded cluster variables. The empirical analysis showed that using PAN and the other strategies produced comparable parameter estimates. However, the dummy-coded MI overestimated the intraclass correlation, whereas MI ignoring the cluster structure and the hybrid MI underestimated the intraclass correlation. When compared with PAN, the p-value and standard error for the treatment effect were higher with dummy-coded MI, and lower with MI ignoring the cluster structure, the hybrid MI approach, and pairwise deletion. Previous studies have shown that NM-MI is not appropriate for handling missing data in clustered randomized trials. This approach, in addition to the pairwise deletion approach, leads to a biased intraclass correlation and faulty statistical conclusions. Imputation in clustered randomized trials should be performed with PAN. We have demonstrated an easy way for using PAN through SAS.
Multiple imputation for estimating the risk of developing dementia and its impact on survival.

Science.gov (United States)

Yu, Binbing; Saczynski, Jane S; Launer, Lenore

2010-10-01

Dementia, Alzheimer's disease in particular, is one of the major causes of disability and decreased quality of life among the elderly and a leading obstacle to successful aging. Given the profound impact on public health, much research has focused on the age-specific risk of developing dementia and the impact on survival. Early work has discussed various methods of estimating age-specific incidence of dementia, among which the illness-death model is popular for modeling disease progression. In this article we use multiple imputation to fit multi-state models for survival data with interval censoring and left truncation. This approach allows semi-Markov models in which survival after dementia depends on onset age. Such models can be used to estimate the cumulative risk of developing dementia in the presence of the competing risk of dementia-free death. Simulations are carried out to examine the performance of the proposed method. Data from the Honolulu Asia Aging Study are analyzed to estimate the age-specific and cumulative risks of dementia and to examine the effect of major risk factors on dementia onset and death.
Single-Case Designs and Qualitative Methods: Applying a Mixed Methods Research Perspective

Science.gov (United States)

Hitchcock, John H.; Nastasi, Bonnie K.; Summerville, Meredith

2010-01-01

The purpose of this conceptual paper is to describe a design that mixes single-case (sometimes referred to as single-subject) and qualitative methods, hereafter referred to as a single-case mixed methods design (SCD-MM). Minimal attention has been given to the topic of applying qualitative methods to SCD work in the literature. These two…
Imputing historical statistics, soils information, and other land-use data to crop area

Science.gov (United States)

Perry, C. R., Jr.; Willis, R. W.; Lautenschlager, L.

1982-01-01

In foreign crop condition monitoring, satellite acquired imagery is routinely used. To facilitate interpretation of this imagery, it is advantageous to have estimates of the crop types and their extent for small area units, i.e., grid cells on a map represent, at 60 deg latitude, an area nominally 25 by 25 nautical miles in size. The feasibility of imputing historical crop statistics, soils information, and other ancillary data to crop area for a province in Argentina is studied.
Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists.

Science.gov (United States)

Zhu, Xun; Wolfgruber, Thomas K; Tasato, Austin; Arisdakessian, Cédric; Garmire, David G; Garmire, Lana X

2017-12-05

Single-cell RNA sequencing (scRNA-Seq) is an increasingly popular platform to study heterogeneity at the single-cell level. Computational methods to process scRNA-Seq data are not very accessible to bench scientists as they require a significant amount of bioinformatic skills. We have developed Granatum, a web-based scRNA-Seq analysis pipeline to make analysis more broadly accessible to researchers. Without a single line of programming code, users can click through the pipeline, setting parameters and visualizing results via the interactive graphical interface. Granatum conveniently walks users through various steps of scRNA-Seq analysis. It has a comprehensive list of modules, including plate merging and batch-effect removal, outlier-sample removal, gene-expression normalization, imputation, gene filtering, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein network interaction visualization, and pseudo-time cell series construction. Granatum enables broad adoption of scRNA-Seq technology by empowering bench scientists with an easy-to-use graphical interface for scRNA-Seq data analysis. The package is freely available for research use at http://garmiregroup.org/granatum/app.
Computing tools for implementing standards for single-case designs.

Science.gov (United States)

Chen, Li-Ting; Peng, Chao-Ying Joanne; Chen, Ming-E

2015-11-01

In the single-case design (SCD) literature, five sets of standards have been formulated and distinguished: design standards, assessment standards, analysis standards, reporting standards, and research synthesis standards. This article reviews computing tools that can assist researchers and practitioners in meeting the analysis standards recommended by the What Works Clearinghouse: Procedures and Standards Handbook-the WWC standards. These tools consist of specialized web-based calculators or downloadable software for SCD data, and algorithms or programs written in Excel, SAS procedures, SPSS commands/Macros, or the R programming language. We aligned these tools with the WWC standards and evaluated them for accuracy and treatment of missing data, using two published data sets. All tools were tested to be accurate. When missing data were present, most tools either gave an error message or conducted analysis based on the available data. Only one program used a single imputation method. This article concludes with suggestions for an inclusive computing tool or environment, additional research on the treatment of missing data, and reasonable and flexible interpretations of the WWC standards. © The Author(s) 2015.
31 CFR 19.630 - May the Department of the Treasury impute conduct of one person to another?

Science.gov (United States)

2010-07-01

... 31 Money and Finance: Treasury 1 2010-07-01 2010-07-01 false May the Department of the Treasury impute conduct of one person to another? 19.630 Section 19.630 Money and Finance: Treasury Office of the Secretary of the Treasury GOVERNMENTWIDE DEBARMENT AND SUSPENSION (NONPROCUREMENT) General Principles...
The Use of Imputed Sibling Genotypes in Sibship-Based Association Analysis: On Modeling Alternatives, Power and Model Misspecification

NARCIS (Netherlands)

Minica, C.C.; Dolan, C.V.; Willemsen, G.; Vink, J.M.; Boomsma, D.I.

2013-01-01

When phenotypic, but no genotypic data are available for relatives of participants in genetic association studies, previous research has shown that family-based imputed genotypes can boost the statistical power when included in such studies. Here, using simulations, we compared the performance of

Semiautomatic imputation of activity travel diaries : use of global positioning system traces, prompted recall, and context-sensitive learning algorithms

NARCIS (Netherlands)

Moiseeva, A.; Jessurun, A.J.; Timmermans, H.J.P.; Stopher, P.

2016-01-01

Anastasia Moiseeva, Joran Jessurun and Harry Timmermans (2010), ‘Semiautomatic Imputation of Activity Travel Diaries: Use of Global Positioning System Traces, Prompted Recall, and Context-Sensitive Learning Algorithms’, Transportation Research Record: Journal of the Transportation Research Board,
Non-imputability, criminal dangerousness and curative safety measures: myths and realities

Directory of Open Access Journals (Sweden)

Frank Harbottle Quirós

2017-04-01

Curative safety measures are imposed in criminal proceedings on non-imputable persons, provided that a prognosis affirmatively establishes their criminal dangerousness. Although this statement seems elementary, several myths about these legal institutions persist in judicial practice, and their versions may vary, to a greater or lesser extent, between countries. In this context, the present article formulates ten myths based on the experience of Costa Rica and offers explanations that seek to weaken or dispel them, inviting the reader to reflect on them.
29 CFR 1471.630 - May the Federal Mediation and Conciliation Service impute conduct of one person to another?

Science.gov (United States)

2010-07-01

... 29 Labor 4 2010-07-01 2010-07-01 false May the Federal Mediation and Conciliation Service impute...) FEDERAL MEDIATION AND CONCILIATION SERVICE GOVERNMENTWIDE DEBARMENT AND SUSPENSION (NONPROCUREMENT) General Principles Relating to Suspension and Debarment Actions § 1471.630 May the Federal Mediation and...
Sensitivity analysis in multiple imputation in effectiveness studies of psychotherapy.

Science.gov (United States)

Crameri, Aureliano; von Wyl, Agnes; Koemeda, Margit; Schulthess, Peter; Tschuschke, Volker

2015-01-01

The importance of preventing and treating incomplete data in effectiveness studies is nowadays emphasized. However, most of the publications focus on randomized clinical trials (RCT). One flexible technique for statistical inference with missing data is multiple imputation (MI). Since methods such as MI rely on the assumption of missing data being at random (MAR), a sensitivity analysis for testing the robustness against departures from this assumption is required. In this paper we present a sensitivity analysis technique based on posterior predictive checking, which takes into consideration the concept of clinical significance used in the evaluation of intra-individual changes. We demonstrate the possibilities this technique can offer with the example of irregular longitudinal data collected with the Outcome Questionnaire-45 (OQ-45) and the Helping Alliance Questionnaire (HAQ) in a sample of 260 outpatients. The sensitivity analysis can be used to (1) quantify the degree of bias introduced by missing not at random data (MNAR) in a worst reasonable case scenario, (2) compare the performance of different analysis methods for dealing with missing data, or (3) detect the influence of possible violations to the model assumptions (e.g., lack of normality). Moreover, our analysis showed that ratings from the patient's and therapist's version of the HAQ could significantly improve the predictive value of the routine outcome monitoring based on the OQ-45. Since analysis dropouts always occur, repeated measurements with the OQ-45 and the HAQ analyzed with MI are useful to improve the accuracy of outcome estimates in quality assurance assessments and non-randomized effectiveness studies in the field of outpatient psychotherapy.
A Review of Methods for Missing Data.

Science.gov (United States)

Pigott, Therese D.

2001-01-01

Reviews methods for handling missing data in a research study. Model-based methods, such as maximum likelihood using the EM algorithm and multiple imputation, hold more promise than ad hoc methods. Although model-based methods require more specialized computer programs and assumptions about the nature of missing data, these methods are appropriate…
Missing Value Imputation Based on Gaussian Mixture Model for the Internet of Things

OpenAIRE

Yan, Xiaobo; Xiong, Weiqing; Hu, Liang; Wang, Feng; Zhao, Kuo

2015-01-01

This paper addresses missing value imputation for the Internet of Things (IoT). Nowadays, the IoT has been used widely and commonly by a variety of domains, such as transportation and logistics domain and healthcare domain. However, missing values are very common in the IoT for a variety of reasons, which results in the fact that the experimental data are incomplete. As a result of this, some work, which is related to the data of the IoT, can’t be carried out normally. And it leads to the red...
Impute DC link (IDCL) cell based power converters and control thereof

Science.gov (United States)

Divan, Deepakraj M.; Prasai, Anish; Hernendez, Jorge; Moghe, Rohit; Iyer, Amrit; Kandula, Rajendra Prasad

2016-04-26

Power flow controllers based on Imputed DC Link (IDCL) cells are provided. The IDCL cell is a self-contained power electronic building block (PEBB). The IDCL cell may be stacked in series and parallel to achieve power flow control at higher voltage and current levels. Each IDCL cell may comprise a gate drive, a voltage sharing module, and a thermal management component in order to facilitate easy integration of the cell into a variety of applications. By providing direct AC conversion, the IDCL cell based AC/AC converters reduce device count, eliminate the use of electrolytic capacitors that have life and reliability issues, and improve system efficiency compared with a similarly rated back-to-back inverter system.
Trends in study design and the statistical methods employed in a leading general medicine journal.

Science.gov (United States)

Gosho, M; Sato, Y; Nagashima, K; Takahashi, S

2018-02-01

Study design and statistical methods have become core components of medical research, and the methodology has become more multifaceted and complicated over time. The study of the comprehensive details and current trends of study design and statistical methods is required to support the future implementation of well-planned clinical studies providing information about evidence-based medicine. Our purpose was to illustrate study design and statistical methods employed in recent medical literature. This was an extension study of Sato et al. (N Engl J Med 2017; 376: 1086-1087), which reviewed 238 articles published in 2015 in the New England Journal of Medicine (NEJM) and briefly summarized the statistical methods employed in NEJM. Using the same database, we performed a new investigation of the detailed trends in study design and individual statistical methods that were not reported in the Sato study. Due to the CONSORT statement, prespecification and justification of sample size are obligatory in planning intervention studies. Although standard survival methods (eg Kaplan-Meier estimator and Cox regression model) were most frequently applied, the Gray test and Fine-Gray proportional hazard model for considering competing risks were sometimes used for a more valid statistical inference. With respect to handling missing data, model-based methods, which are valid for missing-at-random data, were more frequently used than single imputation methods. These methods are not recommended as a primary analysis, but they have been applied in many clinical trials. Group sequential design with interim analyses was one of the standard designs, and novel design, such as adaptive dose selection and sample size re-estimation, was sometimes employed in NEJM. Model-based approaches for handling missing data should replace single imputation methods for primary analysis in the light of the information found in some publications. Use of adaptive design with interim analyses is increasing
GRIMP: A web- and grid-based tool for high-speed analysis of large-scale genome-wide association using imputed data.

NARCIS (Netherlands)

K. Estrada Gil (Karol); A. Abuseiris (Anis); F.G. Grosveld (Frank); A.G. Uitterlinden (André); T.A. Knoch (Tobias); F. Rivadeneira Ramirez (Fernando)

2009-01-01

The current fast growth of genome-wide association studies (GWAS) combined with now common computationally expensive imputation requires the online access of large user groups to high-performance computing resources capable of analyzing rapidly and efficiently millions of genetic
Using the Superpopulation Model for Imputations and Variance Computation in Survey Sampling

Directory of Open Access Journals (Sweden)

Petr Novák

2012-03-01

This study is aimed at variance computation techniques for estimates of population characteristics based on survey sampling and imputation. We use the superpopulation regression model, which means that the target variable values for each statistical unit are treated as random realizations of a linear regression model with weighted variance. We focus on regression models with one auxiliary variable and no intercept, which have many applications and straightforward interpretation in business statistics. Furthermore, we deal with cases where the estimates are not independent and thus the covariance must be computed. We also consider chained regression models with auxiliary variables as random variables instead of constants.
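As a rough illustration of the no-intercept regression model described in this abstract, the following Python sketch imputes missing target values from one auxiliary variable. It assumes that the residual variance is proportional to the auxiliary variable, in which case the weighted least-squares slope reduces to a ratio of sums. That variance structure, the function name `ratio_impute`, and the toy data are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def ratio_impute(y, x):
    """Impute missing y from x under y_i = beta * x_i + e_i (no intercept).

    Assumption: Var(e_i) is proportional to x_i, so the weighted
    least-squares estimate of beta is sum(y_obs) / sum(x_obs).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(y)
    beta = y[observed].sum() / x[observed].sum()
    y_filled = y.copy()
    # Missing target values are replaced by their model prediction.
    y_filled[~observed] = beta * x[~observed]
    return y_filled, beta

# Hypothetical toy data: target y with one missing value, auxiliary x.
y = np.array([10.0, 20.0, np.nan, 40.0])
x = np.array([1.0, 2.0, 3.0, 4.0])
filled, beta = ratio_impute(y, x)
```

A different variance assumption would change the weights and therefore the slope estimate, so the ratio form here should be read as one special case rather than the general model.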
Double sampling with multiple imputation to answer large sample meta-research questions: Introduction and illustration by evaluating adherence to two simple CONSORT guidelines

Directory of Open Access Journals (Sweden)

Patrice L. Capers

2015-03-01

BACKGROUND: Meta-research can involve manual retrieval and evaluation of research, which is resource intensive. Creation of high throughput methods (e.g., search heuristics, crowdsourcing) has improved feasibility of large meta-research questions, but possibly at the cost of accuracy. OBJECTIVE: To evaluate the use of double sampling combined with multiple imputation (DS+MI) to address meta-research questions, using as an example adherence of PubMed entries to two simple Consolidated Standards of Reporting Trials (CONSORT) guidelines for titles and abstracts. METHODS: For the DS large sample, we retrieved all PubMed entries satisfying the filters: RCT; human; abstract available; and English language (n=322,107). For the DS subsample, we randomly sampled 500 entries from the large sample. The large sample was evaluated with a lower rigor, higher throughput (RLOTHI) method using search heuristics, while the subsample was evaluated using a higher rigor, lower throughput (RHITLO) human rating method. Multiple imputation of the missing-completely-at-random RHITLO data for the large sample was informed by: RHITLO data from the subsample; RLOTHI data from the large sample; whether a study was an RCT; and country and year of publication. RESULTS: The RHITLO and RLOTHI methods in the subsample largely agreed (phi coefficients: title=1.00, abstract=0.92). Compliance with abstract and title criteria has increased over time, with non-US countries improving more rapidly. DS+MI logistic regression estimates were more precise than subsample estimates (e.g., 95% CI for change in title and abstract compliance by Year: subsample RHITLO 1.050-1.174 vs. DS+MI 1.082-1.151). As evidence of improved accuracy, DS+MI coefficient estimates were closer to RHITLO than the large sample RLOTHI. CONCLUSIONS: Our results support our hypothesis that DS+MI would result in improved precision and accuracy. This method is flexible and may provide a practical way to examine large corpora of
A reliable method for the counting and control of single ions for single-dopant controlled devices

International Nuclear Information System (INIS)

Shinada, T; Kurosawa, T; Nakayama, H; Zhu, Y; Hori, M; Ohdomari, I

2008-01-01

By 2016, transistor device size will be just 10 nm. However, a transistor that is doped at a typical concentration of 10^18 atoms cm^-3 has only one dopant atom in the active channel region. Therefore, it can be predicted that conventional doping methods such as ion implantation and thermal diffusion will not be available ten years from now. We have been developing a single-ion implantation (SII) method that enables us to implant dopant ions one-by-one into semiconductors until the desired number is reached. Here we report a simple but reliable method to control the number of single-dopant atoms by detecting the change in drain current induced by single-ion implantation. The drain current decreases in a stepwise fashion as a result of the clusters of displaced Si atoms created by every single-ion incidence. This result indicates that the single-ion detection method we have developed is capable of detecting single-ion incidence with 100% efficiency. Our method potentially could pave the way to future single-atom devices, including a solid-state quantum computer.
Methods for forming particles from single source precursors

Science.gov (United States)

Fox, Robert V [Idaho Falls, ID; Rodriguez, Rene G [Pocatello, ID; Pak, Joshua [Pocatello, ID

2011-08-23

Single source precursors are subjected to carbon dioxide to form particles of material. The carbon dioxide may be in a supercritical state. Single source precursors also may be subjected to supercritical fluids other than supercritical carbon dioxide to form particles of material. The methods may be used to form nanoparticles. In some embodiments, the methods are used to form chalcopyrite materials. Devices such as, for example, semiconductor devices may be fabricated that include such particles. Methods of forming semiconductor devices include subjecting single source precursors to carbon dioxide to form particles of semiconductor material, and establishing electrical contact between the particles and an electrode.
New insights into the pharmacogenomics of antidepressant response from the GENDEP and STAR*D studies: rare variant analysis and high-density imputation.

Science.gov (United States)

Fabbri, C; Tansey, K E; Perlis, R H; Hauser, J; Henigsberg, N; Maier, W; Mors, O; Placentino, A; Rietschel, M; Souery, D; Breen, G; Curtis, C; Sang-Hyuk, L; Newhouse, S; Patel, H; Guipponi, M; Perroud, N; Bondolfi, G; O'Donovan, M; Lewis, G; Biernacka, J M; Weinshilboum, R M; Farmer, A; Aitchison, K J; Craig, I; McGuffin, P; Uher, R; Lewis, C M

2017-11-21

Genome-wide association studies have generally failed to identify polymorphisms associated with antidepressant response. Possible reasons include limited coverage of genetic variants that this study tried to address by exome genotyping and dense imputation. A meta-analysis of Genome-Based Therapeutic Drugs for Depression (GENDEP) and Sequenced Treatment Alternatives to Relieve Depression (STAR*D) studies was performed at the single-nucleotide polymorphism (SNP), gene and pathway levels. Coverage of genetic variants was increased compared with previous studies by adding exome genotypes to previously available genome-wide data and using the Haplotype Reference Consortium panel for imputation. Standard quality control was applied. Phenotypes were symptom improvement and remission after 12 weeks of antidepressant treatment. Significant findings were investigated in NEWMEDS consortium samples and Pharmacogenomic Research Network Antidepressant Medication Pharmacogenomic Study (PGRN-AMPS) for replication. A total of 7 062 950 SNPs were analyzed in GENDEP (n=738) and STAR*D (n=1409). rs116692768 (P=1.80e-08, ITGA9 (integrin α9)) and rs76191705 (P=2.59e-08, NRXN3 (neurexin 3)) were significantly associated with symptom improvement during citalopram/escitalopram treatment. At the gene level, no consistent effect was found. At the pathway level, the Gene Ontology (GO) terms GO: 0005694 (chromosome) and GO: 0044427 (chromosomal part) were associated with improvement (corrected P=0.007 and 0.045, respectively). The association between rs116692768 and symptom improvement was replicated in PGRN-AMPS (P=0.047), whereas rs76191705 was not. The two SNPs did not replicate in NEWMEDS. ITGA9 codes for a membrane receptor for neurotrophins and NRXN3 is a transmembrane neuronal adhesion receptor involved in synaptic differentiation. Despite their meaningful biological rationale for being involved in antidepressant effect, replication was partial. Further studies may help in clarifying
RIDDLE: Race and ethnicity Imputation from Disease history with Deep LEarning.

Directory of Open Access Journals (Sweden)

Ji-Sung Kim

2018-04-01

Anonymized electronic medical records are an increasingly popular source of research data. However, these datasets often lack race and ethnicity information. This creates problems for researchers modeling human disease, as race and ethnicity are powerful confounders for many health exposures and treatment outcomes; race and ethnicity are closely linked to population-specific genetic variation. We showed that deep neural networks generate more accurate estimates for missing racial and ethnic information than competing methods (e.g., logistic regression, random forest, support vector machines, and gradient-boosted decision trees). RIDDLE yielded significantly better classification performance across all metrics that were considered: accuracy, cross-entropy loss (error), precision, recall, and area under the curve for receiver operating characteristic plots (all p < 10^-9). We made specific efforts to interpret the trained neural network models to identify, quantify, and visualize medical features which are predictive of race and ethnicity. We used these characterizations of informative features to perform a systematic comparison of differential disease patterns by race and ethnicity. The fact that clinical histories are informative for imputing race and ethnicity could reflect (1) a skewed distribution of blue- and white-collar professions across racial and ethnic groups, (2) uneven accessibility and subjective importance of prophylactic health, (3) possible variation in lifestyle, such as dietary habits, and (4) differences in background genetic variation which predispose to diseases.
RIDDLE: Race and ethnicity Imputation from Disease history with Deep LEarning

KAUST Repository

Kim, Ji-Sung

2018-04-26

Anonymized electronic medical records are an increasingly popular source of research data. However, these datasets often lack race and ethnicity information. This creates problems for researchers modeling human disease, as race and ethnicity are powerful confounders for many health exposures and treatment outcomes; race and ethnicity are closely linked to population-specific genetic variation. We showed that deep neural networks generate more accurate estimates for missing racial and ethnic information than competing methods (e.g., logistic regression, random forest, support vector machines, and gradient-boosted decision trees). RIDDLE yielded significantly better classification performance across all metrics that were considered: accuracy, cross-entropy loss (error), precision, recall, and area under the curve for receiver operating characteristic plots (all p < 10^-9). We made specific efforts to interpret the trained neural network models to identify, quantify, and visualize medical features which are predictive of race and ethnicity. We used these characterizations of informative features to perform a systematic comparison of differential disease patterns by race and ethnicity. The fact that clinical histories are informative for imputing race and ethnicity could reflect (1) a skewed distribution of blue- and white-collar professions across racial and ethnic groups, (2) uneven accessibility and subjective importance of prophylactic health, (3) possible variation in lifestyle, such as dietary habits, and (4) differences in background genetic variation which predispose to diseases.
A brief introduction to single-molecule fluorescence methods

NARCIS (Netherlands)

Wildenberg, S.M.J.L.; Prevo, B.; Peterman, E.J.G.; Peterman, EJG; Wuite, GJL

2011-01-01

One of the more popular single-molecule approaches in biological science is single-molecule fluorescence microscopy, which is the subject of the following section of this volume. Fluorescence methods provide the sensitivity required to study biology on the single-molecule level, but they also allow
A brief introduction to single-molecule fluorescence methods

NARCIS (Netherlands)

van den Wildenberg, Siet M.J.L.; Prevo, Bram; Peterman, Erwin J.G.

2018-01-01

One of the more popular single-molecule approaches in biological science is single-molecule fluorescence microscopy, which will be the subject of the following section of this volume. Fluorescence methods provide the sensitivity required to study biology on the single-molecule level, but they also
Servizi finanziari imputati e interdipendenze settoriali: un'analisi settoriale del ruolo del credito nel sistema economico. (Imputed bank services and sectoral interdependences: a structural analysis of the role of credit in the economy)

Directory of Open Access Journals (Sweden)

C. BIANCHI

2013-12-01

National accounts systems based on the SEC methodology are usually thought to entail the practical impossibility of carrying out any meaningful allocation of imputed bank services among the individual branches of economic activity. As a consequence, the total value of the net interest earned by the credit system as a whole is considered as a cost entry and a negative component of added value in an ad-hoc additional industry, to be aggregated to the main credit one in the typical input-output analysis. The present work shows how this prohibits the structural analysis of the role of credit in the system of interdependencies. A method is proposed in which imputed services of credit are distributed by branches, on the basis of existing statistics, proving valuable in assessing the significance of certain quantities of national accounts, such as operating results. The analysis highlights the dual nature of credit as having both a highly intermediate character and a high value-added content. It is able to strongly influence the production costs of the other branches, without being influenced by them. These properties give the banking industry a very high potential for inflation. JEL: E51, G21
Genomic Selection for Predicting Fusarium Head Blight Resistance in a Wheat Breeding Program

Directory of Open Access Journals (Sweden)

Marcio P. Arruda

2015-11-01

Genomic selection (GS) is a breeding method that uses marker–trait models to predict unobserved phenotypes. This study developed GS models for predicting traits associated with resistance to Fusarium head blight (FHB) in wheat (Triticum aestivum L.). We used genotyping-by-sequencing (GBS) to identify 5054 single-nucleotide polymorphisms (SNPs), which were then treated as predictor variables in GS analysis. We compared how the prediction accuracy of the genomic-estimated breeding values (GEBVs) was affected by (i) five genotypic imputation methods (random forest imputation [RFI], expectation maximization imputation [EMI], k-nearest neighbor imputation [kNNI], singular value decomposition imputation [SVDI], and the mean imputation [MNI]); (ii) three statistical models (ridge-regression best linear unbiased predictor [RR-BLUP], least absolute shrinkage and selection operator [LASSO], and elastic net); (iii) marker density (500, 1500, 3000, and 4500 SNPs); (iv) training population (TP) size (96, 144, 192, and 218); (v) marker-based and pedigree-based relationship matrices; and (vi) control for relatedness in TPs and validation populations (VPs). No discernable differences in prediction accuracy were observed among imputation methods. The RR-BLUP outperformed other models in nearly all scenarios. Accuracies decreased substantially when marker number decreased to 3000 or 1500 SNPs, depending on the trait; when sample size of the training set was less than 192; when using pedigree-based instead of marker-based matrix; or when no control for relatedness was implemented. Overall, moderate to high prediction accuracies were observed in this study, suggesting that GS is a very promising breeding strategy for FHB resistance in wheat.
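Two of the simpler imputation methods named in this abstract, mean imputation and k-nearest neighbor imputation, can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: genotypes are coded 0/1/2 with NaN for missing calls, neighbors are ranked by mean squared difference over shared markers, and the function names and toy matrix are hypothetical. It does not reproduce the software or settings used in the study.

```python
import numpy as np

def mean_impute(markers):
    """Replace each missing genotype (NaN) with its marker (column) mean."""
    filled = np.array(markers, dtype=float)
    col_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled

def knn_impute(markers, k=2):
    """Fill each missing genotype with the mean of that marker over the
    k most similar lines (rows). Similarity is the mean squared
    difference over markers observed in both lines (an assumption)."""
    data = np.array(markers, dtype=float)
    filled = data.copy()
    for i, j in zip(*np.where(np.isnan(data))):
        candidates = []
        for other in range(data.shape[0]):
            if other == i or np.isnan(data[other, j]):
                continue
            shared = ~np.isnan(data[i]) & ~np.isnan(data[other])
            if shared.any():
                dist = np.mean((data[i, shared] - data[other, shared]) ** 2)
                candidates.append((dist, data[other, j]))
        candidates.sort(key=lambda pair: pair[0])
        nearest = [value for _, value in candidates[:k]]
        if nearest:  # stays NaN when no usable neighbor exists
            filled[i, j] = np.mean(nearest)
    return filled

# Hypothetical toy matrix: 4 lines (rows) by 4 markers (columns).
genotypes = np.array([
    [0.0, 1.0, 2.0, np.nan],
    [0.0, 1.0, 2.0, 2.0],
    [0.0, 1.0, 1.0, 2.0],
    [2.0, 0.0, 0.0, 0.0],
])
by_mean = mean_impute(genotypes)
by_knn = knn_impute(genotypes, k=2)
```

On this toy matrix the two methods disagree: the column mean is pulled down by the dissimilar fourth line, whereas the neighbor-based fill uses only the two lines that resemble the incomplete one.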

Avoid Filling Swiss Cheese with Whipped Cream; Imputation Techniques and Evaluation Procedures for Cross-Country Time Series

OpenAIRE

Michael Weber; Michaela Denk

2011-01-01

International organizations collect data from national authorities to create multivariate cross-sectional time series for their analyses. As data from countries with not yet well-established statistical systems may be incomplete, the bridging of data gaps is a crucial challenge. This paper investigates data structures and missing data patterns in the cross-sectional time series framework, reviews missing value imputation techniques used for micro data in official statistics, and discusses the...
Family-based Association Analyses of Imputed Genotypes Reveal Genome-Wide Significant Association of Alzheimer’s disease with OSBPL6, PTPRG and PDCL3

Science.gov (United States)

Herold, Christine; Hooli, Basavaraj V.; Mullin, Kristina; Liu, Tian; Roehr, Johannes T; Mattheisen, Manuel; Parrado, Antonio R.; Bertram, Lars; Lange, Christoph; Tanzi, Rudolph E.

2015-01-01

The genetic basis of Alzheimer's disease (AD) is complex and heterogeneous. Over 200 highly penetrant pathogenic variants in the genes APP, PSEN1 and PSEN2 cause a subset of early-onset familial Alzheimer's disease (EOFAD). On the other hand, susceptibility to late-onset forms of AD (LOAD) is indisputably associated to the ε4 allele in the gene APOE, and more recently to variants in more than two-dozen additional genes identified in the large-scale genome-wide association studies (GWAS) and meta-analyses reports. Taken together however, although the heritability in AD is estimated to be as high as 80%, a large proportion of the underlying genetic factors still remain to be elucidated. In this study we performed a systematic family-based genome-wide association and meta-analysis on close to 15 million imputed variants from three large collections of AD families (~3,500 subjects from 1,070 families). Using a multivariate phenotype combining affection status and onset age, meta-analysis of the association results revealed three single nucleotide polymorphisms (SNPs) that achieved genome-wide significance for association with AD risk: rs7609954 in the gene PTPRG (P-value = 3.98·10−08), rs1347297 in the gene OSBPL6 (P-value = 4.53·10−08), and rs1513625 near PDCL3 (P-value = 4.28·10−08). In addition, rs72953347 in OSBPL6 (P-value = 6.36·10−07) and two SNPs in the gene CDKAL1 showed marginally significant association with LOAD (rs10456232, P-value: 4.76·10−07; rs62400067, P-value: 3.54·10−07). In summary, family-based GWAS meta-analysis of imputed SNPs revealed novel genomic variants in (or near) PTPRG, OSBPL6, and PDCL3 that influence risk for AD with genome-wide significance. PMID:26830138
Missing data methods for dealing with missing items in quality of life questionnaires. A comparison by simulation of personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques applied to the SF-36 in the French 2003 decennial health survey.

Science.gov (United States)

Peyre, Hugo; Leplège, Alain; Coste, Joël

2011-03-01

Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory. Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (of 23,018 subjects representative of the French population and having completed the SF-36) and various patterns of missing data were generated according to three different item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin's "missing completely at random," "missing at random," and "missing not at random"). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36. For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score and especially to hot deck in terms of accuracy and precision; however, the use of personal mean score was associated with insignificant bias (relative bias [...] personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
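The personal mean score technique compared in this abstract can be illustrated with a short Python sketch: a respondent's missing items on a scale are replaced by the mean of the items that respondent did answer. The minimum-answered threshold of one half, the function name, and the toy responses are assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np

def personal_mean_score(items, min_answered_fraction=0.5):
    """Fill a respondent's missing items with the mean of the items
    they answered on the same scale.

    Assumption: a row is completed only when at least
    `min_answered_fraction` of its items were answered; otherwise
    the row is left untouched.
    """
    filled = np.array(items, dtype=float)
    for row in filled:  # each row is a view into `filled`
        missing = np.isnan(row)
        if not missing.any():
            continue
        answered = ~missing
        if answered.mean() >= min_answered_fraction:
            row[missing] = row[answered].mean()
    return filled

# Hypothetical responses: rows are respondents, columns are scale items.
responses = np.array([
    [3.0, 4.0, np.nan, 5.0],        # one item missing: filled with the row mean
    [1.0, np.nan, np.nan, np.nan],  # too few answers: left as is
])
completed = personal_mean_score(responses)
```

Because each fill uses only the respondent's own answers, the method needs no model fitted across subjects, which is part of why it is attractive for routine use.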
On Matrix Sampling and Imputation of Context Questionnaires with Implications for the Generation of Plausible Values in Large-Scale Assessments

Science.gov (United States)

Kaplan, David; Su, Dan

2016-01-01

This article presents findings on the consequences of matrix sampling of context questionnaires for the generation of plausible values in large-scale assessments. Three studies are conducted. Study 1 uses data from PISA 2012 to examine several different forms of missing data imputation within the chained equations framework: predictive mean…
21 CFR 1404.630 - May the Office of National Drug Control Policy impute conduct of one person to another?

Science.gov (United States)

2010-04-01

... 21 Food and Drugs 9 2010-04-01 2010-04-01 false May the Office of National Drug Control Policy impute conduct of one person to another? 1404.630 Section 1404.630 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY GOVERNMENTWIDE DEBARMENT AND SUSPENSION (NONPROCUREMENT) General Principles Relating to Suspension and Debarment Actions § 1404.630...
A new strategy for enhancing imputation quality of rare variants from next-generation sequencing data via combining SNP and exome chip data

NARCIS (Netherlands)

Y.J. Kim (Young Jin); J. Lee (Juyoung); B.-J. Kim (Bong-Jo); T. Park (Taesung); G.R. Abecasis (Gonçalo); M.A.A. De Almeida (Marcio); D. Altshuler (David); J.L. Asimit (Jennifer L.); G. Atzmon (Gil); M. Barber (Mathew); A. Barzilai (Ari); N.L. Beer (Nicola L.); G.I. Bell (Graeme I.); J. Below (Jennifer); T. Blackwell (Tom); J. Blangero (John); M. Boehnke (Michael); D.W. Bowden (Donald W.); N.P. Burtt (Noël); J.C. Chambers (John); H. Chen (Han); P. Chen (Ping); P.S. Chines (Peter); S. Choi (Sungkyoung); C. Churchhouse (Claire); P. Cingolani (Pablo); B.K. Cornes (Belinda); N.J. Cox (Nancy); A.G. Day-Williams (Aaron); A. Duggirala (Aparna); J. Dupuis (Josée); T. Dyer (Thomas); S. Feng (Shuang); J. Fernandez-Tajes (Juan); T. Ferreira (Teresa); T.E. Fingerlin (Tasha E.); J. Flannick (Jason); J.C. Florez (Jose); P. Fontanillas (Pierre); T.M. Frayling (Timothy); C. Fuchsberger (Christian); E. Gamazon (Eric); K. Gaulton (Kyle); S. Ghosh (Saurabh); B. Glaser (Benjamin); A.L. Gloyn (Anna); R.L. Grossman (Robert L.); J. Grundstad (Jason); C. Hanis (Craig); A. Heath (Allison); H. Highland (Heather); M. Horikoshi (Momoko); I.-S. Huh (Ik-Soo); J.R. Huyghe (Jeroen R.); M.K. Ikram (Kamran); K.A. Jablonski (Kathleen); Y. Jun (Yang); N. Kato (Norihiro); J. Kim (Jayoun); Y.J. Kim (Young Jin); B.-J. Kim (Bong-Jo); J. Lee (Juyoung); C.R. King (C. Ryan); J.S. Kooner (Jaspal S.); M.-S. Kwon (Min-Seok); H.K. Im (Hae Kyung); M. Laakso (Markku); K.K.-Y. Lam (Kevin Koi-Yau); J. Lee (Jaehoon); S. Lee (Selyeong); S. Lee (Sungyoung); D.M. Lehman (Donna M.); H. Li (Heng); C.M. Lindgren (Cecilia); X. Liu (Xuanyao); O.E. Livne (Oren E.); A.E. Locke (Adam E.); A. Mahajan (Anubha); J.B. Maller (Julian B.); A.K. Manning (Alisa K.); T.J. Maxwell (Taylor J.); A. Mazoure (Alexander); M.I. McCarthy (Mark); J.B. Meigs (James B.); B. Min (Byungju); K.L. Mohlke (Karen); A.P. Morris (Andrew); S. Musani (Solomon); Y. Nagai (Yoshihiko); M.C.Y. Ng (Maggie C.Y.); D. Nicolae (Dan); S. Oh (Sohee); N.D. Palmer (Nicholette); T. Park (Taesung); T.I. Pollin (Toni I.); I. Prokopenko (Inga); D. Reich (David); M.A. Rivas (Manuel); L.J. Scott (Laura); M. Seielstad (Mark); Y.S. Cho (Yoon Shin); X. Sim (Xueling); R. Sladek (Rob); P. Smith (Philip); I. Tachmazidou (Ioanna); E.S. Tai (Shyong); Y.Y. Teo (Yik Ying); T.M. Teslovich (Tanya M.); J. Torres (Jason); V. Trubetskoy (Vasily); S.M. Willems (Sara); A.L. Williams (Amy L.); J.G. Wilson (James); S. Wiltshire (Steven); S. Won (Sungho); A.R. Wood (Andrew); W. Xu (Wang); J. Yoon (Joon); M. Zawistowski (Matthew); E. Zeggini (Eleftheria); W. Zhang (Weihua); S. Zöllner (Sebastian)

2015-01-01

Background: Rare variants have gathered increasing attention as a possible alternative source of missing heritability. Since next generation sequencing technology is not yet cost-effective for large-scale genomic studies, a widely used alternative approach is imputation. However, the
Methods library of embedded R functions at Statistics Norway

Directory of Open Access Journals (Sweden)

Øyvind Langsrud

2017-11-01

Statistics Norway is modernising the production processes. An important element in this work is a library of functions for statistical computations. In principle, the functions in such a methods library can be programmed in several languages. A modernised production environment demands that these functions can be reused for different statistics products, and that they are embedded within a common IT system. The embedding should be done in such a way that the users of the methods do not need to know the underlying programming language. As a proof of concept, Statistics Norway soon has established a methods library offering a limited number of methods for macro-editing, imputation and confidentiality. This is done within an area of municipal statistics with R as the only programming language. This paper presents the details and experiences from this work. The problem of fitting real-world applications to simple and strict standards is discussed and exemplified by the development of solutions to regression imputation and table suppression.
A quantitative comparison of single-cell whole genome amplification methods.

Directory of Open Access Journals (Sweden)

Charles F A de Bourcy

Single-cell sequencing is emerging as an important tool for studies of genomic heterogeneity. Whole genome amplification (WGA) is a key step in single-cell sequencing workflows and a multitude of methods have been introduced. Here, we compare three state-of-the-art methods on both bulk and single-cell samples of E. coli DNA: Multiple Displacement Amplification (MDA), Multiple Annealing and Looping Based Amplification Cycles (MALBAC), and the PicoPLEX single-cell WGA kit (NEB-WGA). We considered the effects of reaction gain on coverage uniformity, error rates and the level of background contamination. We compared the suitability of the different WGA methods for the detection of copy-number variations, for the detection of single-nucleotide polymorphisms and for de-novo genome assembly. No single method performed best across all criteria and significant differences in characteristics were observed; the choice of which amplifier to use will depend strongly on the details of the type of question being asked in any given experiment.
Twinning processes in Cu-Al-Ni martensite single crystals investigated by neutron single crystal diffraction method

International Nuclear Information System (INIS)

Molnar, P.; Sittner, P.; Novak, V.; Lukas, P.

2008-01-01

A neutron single crystal diffraction method for inspecting the quality of martensite single crystals is introduced. True interface-free martensite single crystals are indispensable for, e.g., measurement of elastic constants of phases by ultrasonic techniques. The neutron diffraction method was used to detect and distinguish the presence of individual lattice correspondence variants of the 2H orthorhombic martensite phase in Cu-Al-Ni as well as to follow the activity of twinning processes during the deformation test on the martensite variant single crystals. When preparing the martensite single variant prism-shaped crystals by the compression deformation method, a small fraction of a second, unwanted martensitic variant (compound twin) typically remains in the prism samples. Due to the very low stress (∼1 MPa) for compound twinning in many shape memory alloys, it is quite difficult not only to deplete the martensite prisms of all internal interfaces but, above all, to keep them in the martensite single variant state for the long time needed for further investigations.
Comparison of Model Reliabilities from Single-Step and Bivariate Blending Methods

DEFF Research Database (Denmark)

Taskinen, Matti; Mäntysaari, Esa; Lidauer, Martin

2013-01-01

Model-based reliabilities in genetic evaluation are compared between three methods: animal model BLUP, single-step BLUP, and bivariate blending after genomic BLUP. The original bivariate blending is revised in this work to better account for animal models. The study data is extracted from...... be calculated. Model reliabilities by the single-step and the bivariate blending methods were higher than by animal model due to genomic information. Compared to the single-step method, the bivariate blending method reliability estimates were, in general, lower. Computationally bivariate blending method was......, on the other hand, lighter than the single-step method....
Treating pre-instrumental data as "missing" data: using a tree-ring-based paleoclimate record and imputations to reconstruct streamflow in the Missouri River Basin

Science.gov (United States)

Ho, M. W.; Lall, U.; Cook, E. R.

2015-12-01

Advances in paleoclimatology in the past few decades have provided opportunities to expand the temporal perspective of the hydrological and climatological variability across the world. The North American region is particularly fortunate in this respect, as a relatively dense network of high-resolution paleoclimate proxy records has been assembled. One such network is the annually-resolved Living Blended Drought Atlas (LBDA): a paleoclimate reconstruction of the Palmer Drought Severity Index (PDSI) that covers North America on a 0.5° × 0.5° grid based on tree-ring chronologies. However, the use of the LBDA to assess North American streamflow variability requires a model by which streamflow may be reconstructed. Paleoclimate reconstructions have typically used models that first seek to quantify the relationship between the paleoclimate variable and the environmental variable of interest before extrapolating the relationship back in time. In contrast, the pre-instrumental streamflow is here considered as "missing" data. A method of imputing the "missing" streamflow data, prior to the instrumental record, is applied through multiple imputation using chained equations for streamflow in the Missouri River Basin. In this method, the distribution of the instrumental streamflow and LBDA is used to estimate sets of plausible values for the "missing" streamflow data resulting in a ~600 year-long streamflow reconstruction. Past research into external climate forcings, oceanic-atmospheric variability and its teleconnections, and assessments of rare multi-centennial instrumental records demonstrate that large temporal oscillations in hydrological conditions are unlikely to be captured in most instrumental records. The reconstruction of multi-centennial records of streamflow will enable comprehensive assessments of current and future water resource infrastructure and operations under the existing scope of natural climate variability.
Comparison of methods for dealing with missing values in the EPV-R.

Science.gov (United States)

Paniagua, David; Amor, Pedro J; Echeburúa, Enrique; Abad, Francisco J

2017-08-01

The development of an effective instrument to assess the risk of partner violence is a topic of great social relevance. This study evaluates the scale of “Predicción del Riesgo de Violencia Grave Contra la Pareja” –Revisada– (EPV-R - Severe Intimate Partner Violence Risk Prediction Scale-Revised), a tool developed in Spain, which is facing the problem of how to treat the high rate of missing values, as is usual in this type of scale. First, responses to the EPV-R in a sample of 1215 male abusers who were reported to the police were used to analyze the patterns of occurrence of missing values, as well as the factor structure. Second, we analyzed the performance of various imputation methods using simulated data that emulates the missing data mechanism found in the empirical database. The imputation procedure originally proposed by the authors of the scale provides acceptable results, although the application of a method based on the Item Response Theory could provide greater accuracy and offers some additional advantages. Item Response Theory appears to be a useful tool for imputing missing data in this type of questionnaire.
Methods of Mitigating Double Taxation

OpenAIRE

Lindhe, Tobias

2002-01-01

This paper presents a comprehensive overview of existing methods of mitigating double taxation of corporate income within a standard cost of capital model. Two of the most well-known and most utilized methods, the imputation and the split rate systems, do not mitigate double taxation in corporations where the marginal investment is financed with retained earnings. However, all methods are effective when the marginal investment is financed with new share issues. The corporate tax rate, fiscal ...
Czochralski method of growing single crystals. State-of-art

International Nuclear Information System (INIS)

Bukowski, A.; Zabierowski, P.

1999-01-01

The modern Czochralski method of growing single crystals is described, and an example of the Czochralski process is given. The advantages that led to the rapid progress of the method are presented, along with the limitations that motivated further research and new solutions. As examples, two different directions in the development of the technique are described: silicon single crystal growth in a magnetic field, and silicon crystal growth with continuous liquid feed. (author)
The UIC 406 capacity method used on single track sections

DEFF Research Database (Denmark)

Landex, Alex; Kaas, Anders H.; Jacobsen, Erik M.

2007-01-01

This paper describes the relatively new UIC 406 capacity method which is an easy and effective way of calculating capacity consumption on railway lines. However, it is possible to expound the method in different ways which can lead to different capacity consumptions. This paper describes the UIC...... 406 method for single track lines and how it is expounded in Denmark. Many capacity analyses using the UIC 406 capacity method for double track lines have been carried out and presented internationally but only a few capacity analyses using the UIC 406 capacity method on single track lines have been...... presented. Therefore, the differences between capacity analysis for double track lines and single track lines are discussed in the beginning of this paper. Many of the principles of the UIC 406 capacity analyses on double track lines can be used on single track lines – at least when more than one train...
Application of Multiple Imputation for Missing Values in Three-Way Three-Mode Multi-Environment Trial Data.

Science.gov (United States)

Tian, Ting; McLachlan, Geoffrey J; Dieters, Mark J; Basford, Kaye E

2015-01-01

It is a common occurrence in plant breeding programs to observe missing values in three-way three-mode multi-environment trial (MET) data. We proposed modifications of models for estimating missing observations for these data arrays, and developed a novel approach in terms of hierarchical clustering. Multiple imputation (MI) was used in four ways: multiple agglomerative hierarchical clustering, normal distribution model, normal regression model, and predictive mean match. The latter three models used both Bayesian analysis and non-Bayesian analysis, while the first approach used a clustering procedure with randomly selected attributes and assigned real values from the nearest neighbour to the one with missing observations. Different proportions of data entries in six complete datasets were randomly selected to be missing and the MI methods were compared based on the efficiency and accuracy of estimating those values. The results indicated that the models using Bayesian analysis had slightly higher estimation accuracy than those using non-Bayesian analysis, but they were more time-consuming. However, the novel approach of multiple agglomerative hierarchical clustering demonstrated the best overall performance.
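The predictive mean match idea named in this abstract can be illustrated with a minimal Python sketch. It is a deliberately simplified, single-imputation variant with one covariate and a single nearest donor, not the authors' multiple-imputation procedure for three-way MET arrays. The function names and the data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def pmm_impute(xs, ys):
    """Fill None entries of ys with the observed y of the donor whose
    predicted mean is closest to the predicted mean of the missing case."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = fit_line([x for x, _ in observed], [y for _, y in observed])
    donors = [(a + b * x, y) for x, y in observed]  # (predicted mean, observed value)
    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            target = a + b * x
            _, y = min(donors, key=lambda d: abs(d[0] - target))
        filled.append(y)
    return filled


# Hypothetical data: x is a fully observed covariate, y has two gaps.
x = [1.0, 2.5, 3.0, 4.0, 5.5, 6.0]
y = [2.1, None, 6.2, 7.9, None, 12.1]
completed = pmm_impute(x, y)
print(completed)
```

Because every imputed value is copied from an observed case, the filled-in values always lie within the range of real data. A multiple-imputation version would instead draw at random among several close donors and repeat the procedure to produce several completed datasets.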
Development of nondestructive screening methods for single kernel characterization of wheat

DEFF Research Database (Denmark)

Nielsen, J.P.; Pedersen, D.K.; Munck, L.

2003-01-01

The development of nondestructive screening methods for single seed protein, vitreousness, density, and hardness index has been studied for single kernels of European wheat. A single kernel procedure was applied involving image analysis, near-infrared transmittance (NIT) spectroscopy, laboratory...... predictability. However, by applying an averaging approach, in which single seed replicate measurements are mathematically simulated, a very good NIT prediction model was achieved. This suggests that the single seed NIT spectra contain hardness information, but that a single seed hardness method with higher......
ADALIMUMAB FOR MAINTENANCE THERAPY FOR ONE YEAR IN CROHN'S DISEASE: results of a Latin American single-center observational study

Directory of Open Access Journals (Sweden)

Paulo Gustavo KOTZE

2014-03-01

Context: Adalimumab is a fully human antibody that inhibits TNF alpha, with a significant efficacy for long-term maintenance of remission. Studies with this agent in Latin American Crohn's disease patients are scarce. Objectives: The objective of this study was to outline clinical remission rates after 12 months of adalimumab therapy for Crohn's disease patients. Methods: Retrospective, single-center, observational study of a Brazilian case series of Crohn's disease patients under adalimumab therapy. Variables analyzed: demographic data, Montreal classification, concomitant medication, remission rates after 1, 4, 6 and 12 months. Remission was defined as Harvey-Bradshaw Index ≤4, and non-responder-imputation and last-observation-carried-forward analyses were used. The influence of infliximab on remission rates was analyzed by Fisher and Chi-square tests (P<0.05). Results: Fifty patients, with median age of 35 years at therapy initiation, were included. Remission rates after 12 months of therapy were 54% under non-responder-imputation and 88% under last-observation-carried-forward analysis. After 12 months, remission in patients with previous infliximab occurred in 69.23% as compared to 94.59% in infliximab-naïve patients (P = 0.033). Conclusions: Adalimumab was effective in maintaining clinical remission after 12 months of therapy, with an adequate safety profile, and was also more effective in infliximab-naïve patients.
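The gap between the two 12-month remission rates in this abstract comes from how missing final visits are handled. A minimal Python sketch, using made-up Harvey-Bradshaw Index values (not the study's data) and hypothetical helper names, shows how non-responder imputation (NRI) and last-observation-carried-forward (LOCF) can score the same patients differently:

```python
from typing import Optional, Sequence

REMISSION_THRESHOLD = 4  # remission defined as Harvey-Bradshaw Index <= 4


def remission_nri(scores: Sequence[Optional[int]]) -> bool:
    """NRI: a patient with no score at the final visit counts as a non-responder."""
    final = scores[-1]
    return final is not None and final <= REMISSION_THRESHOLD


def remission_locf(scores: Sequence[Optional[int]]) -> bool:
    """LOCF: a missing final score is replaced by the last observed score."""
    observed = [s for s in scores if s is not None]
    return bool(observed) and observed[-1] <= REMISSION_THRESHOLD


def remission_rate(patients, rule) -> float:
    """Fraction of patients classified as in remission under the given rule."""
    return sum(rule(p) for p in patients) / len(patients)


# Hypothetical HBI scores at months 1, 4, 6 and 12; None marks a missed visit.
patients = [
    [3, 2, 2, 1],        # in remission throughout
    [6, 4, 3, None],     # dropped out while in remission
    [7, 6, 5, 5],        # never in remission
    [2, 3, None, None],  # dropped out early while in remission
]

nri_rate = remission_rate(patients, remission_nri)
locf_rate = remission_rate(patients, remission_locf)
print(f"NRI: {nri_rate:.0%}, LOCF: {locf_rate:.0%}")
```

In this toy cohort, the two patients who dropped out while in remission count as failures under NRI and as successes under LOCF, which is why NRI is usually read as the conservative estimate.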
Development of two dimensional electrophoresis method using single chain DNA

International Nuclear Information System (INIS)

Ikeda, Junichi; Hidaka, So

1998-01-01

By combining a separation method based on molecular weight with a method that distinguishes single-base differences, the aim was to develop a two dimensional electrophoresis method for single chain DNA labeled with radioisotope (RI). The goal was to isolate the root module implantation control gene from differences between the electrophoretic patterns of parent and variant strands. First, a Single Strand Conformation Polymorphism (SSCP) method using a concentration gradient gel was investigated. It was found that the intervals between double chain and single chain DNAs expanded, but the intervals between the two single chain DNAs did not. Next, a combination of the non-denaturing acrylamide electrophoresis method and the Denaturing Gradient Gel Electrophoresis (DGGE) method was examined. The hybrid DNA developed by two dimensional electrophoresis was arranged on two lines, but no band of DNA denatured by a high concentration of urea could be found among them. Therefore, no favorable result was obtained in this fiscal year's experiments, and it was thought to be impossible to detect the differences with the method used. (G.K.)
GNSS Single Frequency, Single Epoch Reliable Attitude Determination Method with Baseline Vector Constraint

Directory of Open Access Journals (Sweden)

Ang Gong

2015-12-01

For Global Navigation Satellite System (GNSS) single frequency, single epoch attitude determination, this paper proposes a new reliable method with baseline vector constraint. First, prior knowledge of baseline length, heading, and pitch obtained from other navigation equipment or sensors is used to reconstruct the objective function rigorously. Then, the searching strategy is improved. It substitutes a gradually enlarged ellipsoidal search space for the non-ellipsoidal search space to ensure that correct ambiguity candidates are within it and to allow the searching process to be carried out directly by the least squares ambiguity decorrelation algorithm (LAMBDA) method. Among all vector candidates, some are further eliminated by a derived approximate inequality, which accelerates the searching process. Experimental results show that, compared to the traditional method with only a baseline length constraint, this new method can utilize a priori three-dimensional baseline knowledge to fix ambiguity reliably and achieve a high success rate. Experimental tests also verify that it is not very sensitive to baseline vector error and can perform robustly when the angular error is not great.

Methods of forming single source precursors, methods of forming polymeric single source precursors, and single source precursors formed by such methods

Science.gov (United States)

Fox, Robert V.; Rodriguez, Rene G.; Pak, Joshua J.; Sun, Chivin; Margulieux, Kelsey R.; Holland, Andrew W.

2014-09-09

Methods of forming single source precursors (SSPs) include forming intermediate products having the empirical formula $\tfrac{1}{2}\{L_2N(\mu\text{-}X)_2M'X_2\}_2$, and reacting MER with the intermediate products to form SSPs of the formula $L_2N(\mu\text{-}ER)_2M'(ER)_2$, wherein L is a Lewis base, M is a Group IA atom, N is a Group IB atom, M' is a Group IIIB atom, each E is a Group VIB atom, each X is a Group VIIA atom or a nitrate group, and each R group is an alkyl, aryl, vinyl, (per)fluoro alkyl, (per)fluoro aryl, silane, or carbamato group. Methods of forming polymeric or copolymeric SSPs include reacting at least one of $HE^1R^1E^1H$ and MER with one or more substances having the empirical formula $L_2N(\mu\text{-}ER)_2M'(ER)_2$ or $L_2N(\mu\text{-}X)_2M'(X)_2$ to form a polymeric or copolymeric SSP. New SSPs and intermediate products are formed by such methods.
The Experiment Method for Manufacturing Grid Development on Single Computer

Institute of Scientific and Technical Information of China (English)

XIAO Youan; ZHOU Zude

2006-01-01

In this paper, an experiment method for Manufacturing Grid application system development in a single personal computer environment is proposed. The characteristic of the proposed method is that it constructs a full prototype Manufacturing Grid application system hosted on a single personal computer using virtual machine technology. Firstly, it builds all the Manufacturing Grid physical resource nodes on an abstraction layer of a single personal computer with virtual machine technology. Secondly, all the virtual Manufacturing Grid resource nodes are connected through a virtual network and the application software is deployed on each Manufacturing Grid node. This yields a prototype Manufacturing Grid application system that runs on the single personal computer and on which experiments can be carried out. Compared with the known experiment methods for Manufacturing Grid application system development, the proposed method keeps their advantages, such as low cost and simple operation, and makes it easy to obtain reliable experimental results. The Manufacturing Grid application system constructed with the proposed method has high scalability, stability and reliability. It can be migrated to the real application environment rapidly.
Discovery and Fine-Mapping of Glycaemic and Obesity-Related Trait Loci Using High-Density Imputation.

Science.gov (United States)

Horikoshi, Momoko; Mägi, Reedik; van de Bunt, Martijn; Surakka, Ida; Sarin, Antti-Pekka; Mahajan, Anubha; Marullo, Letizia; Thorleifsson, Gudmar; Hägg, Sara; Hottenga, Jouke-Jan; Ladenvall, Claes; Ried, Janina S; Winkler, Thomas W; Willems, Sara M; Pervjakova, Natalia; Esko, Tõnu; Beekman, Marian; Nelson, Christopher P; Willenborg, Christina; Wiltshire, Steven; Ferreira, Teresa; Fernandez, Juan; Gaulton, Kyle J; Steinthorsdottir, Valgerdur; Hamsten, Anders; Magnusson, Patrik K E; Willemsen, Gonneke; Milaneschi, Yuri; Robertson, Neil R; Groves, Christopher J; Bennett, Amanda J; Lehtimäki, Terho; Viikari, Jorma S; Rung, Johan; Lyssenko, Valeriya; Perola, Markus; Heid, Iris M; Herder, Christian; Grallert, Harald; Müller-Nurasyid, Martina; Roden, Michael; Hypponen, Elina; Isaacs, Aaron; van Leeuwen, Elisabeth M; Karssen, Lennart C; Mihailov, Evelin; Houwing-Duistermaat, Jeanine J; de Craen, Anton J M; Deelen, Joris; Havulinna, Aki S; Blades, Matthew; Hengstenberg, Christian; Erdmann, Jeanette; Schunkert, Heribert; Kaprio, Jaakko; Tobin, Martin D; Samani, Nilesh J; Lind, Lars; Salomaa, Veikko; Lindgren, Cecilia M; Slagboom, P Eline; Metspalu, Andres; van Duijn, Cornelia M; Eriksson, Johan G; Peters, Annette; Gieger, Christian; Jula, Antti; Groop, Leif; Raitakari, Olli T; Power, Chris; Penninx, Brenda W J H; de Geus, Eco; Smit, Johannes H; Boomsma, Dorret I; Pedersen, Nancy L; Ingelsson, Erik; Thorsteinsdottir, Unnur; Stefansson, Kari; Ripatti, Samuli; Prokopenko, Inga; McCarthy, Mark I; Morris, Andrew P

2015-07-01

Reference panels from the 1000 Genomes (1000G) Project Consortium provide near complete coverage of common and low-frequency genetic variation with minor allele frequency ≥0.5% across European ancestry populations. Within the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium, we have undertaken the first large-scale meta-analysis of genome-wide association studies (GWAS), supplemented by 1000G imputation, for four quantitative glycaemic and obesity-related traits, in up to 87,048 individuals of European ancestry. We identified two loci for body mass index (BMI) at genome-wide significance, and two for fasting glucose (FG), none of which has been previously reported in larger meta-analysis efforts to combine GWAS of European ancestry. Through conditional analysis, we also detected multiple distinct signals of association mapping to established loci for waist-hip ratio adjusted for BMI (RSPO3) and FG (GCK and G6PC2). The index variant for one association signal at the G6PC2 locus is a low-frequency coding allele, H177Y, which has recently been demonstrated to have a functional role in glucose regulation. Fine-mapping analyses revealed that the non-coding variants most likely to drive association signals at established and novel loci were enriched for overlap with enhancer elements, which for FG mapped to promoter and transcription factor binding sites in pancreatic islets, in particular. Our study demonstrates that 1000G imputation and genetic fine-mapping of common and low-frequency variant association signals at GWAS loci, integrated with genomic annotation in relevant tissues, can provide insight into the functional and regulatory mechanisms through which their effects on glycaemic and obesity-related traits are mediated.
A Brief Introduction to Single-Molecule Fluorescence Methods.

Science.gov (United States)

van den Wildenberg, Siet M J L; Prevo, Bram; Peterman, Erwin J G

2018-01-01

One of the more popular single-molecule approaches in biological science is single-molecule fluorescence microscopy, which will be the subject of the following section of this volume. Fluorescence methods provide the sensitivity required to study biology on the single-molecule level, but they also allow access to useful measurable parameters on time and length scales relevant for the biomolecular world. Before several detailed experimental approaches are addressed, we will first give a general overview of single-molecule fluorescence microscopy. We start by discussing the phenomenon of fluorescence in general and the history of single-molecule fluorescence microscopy. Next, we will review fluorescent probes in more detail and the equipment required to visualize them on the single-molecule level. We will end with a description of parameters measurable with such approaches, ranging from protein counting and tracking, single-molecule localization super-resolution microscopy, to distance measurements with Förster Resonance Energy Transfer and orientation measurements with fluorescence polarization.
Optimizing the calculation of DM,CO and VC via the single breath single oxygen tension DLCO/NO method.

Science.gov (United States)

Coffman, Kirsten E; Taylor, Bryan J; Carlson, Alex R; Wentz, Robert J; Johnson, Bruce D

2016-01-15

Alveolar-capillary membrane conductance (D(M,CO)) and pulmonary-capillary blood volume (V(C)) are calculated via lung diffusing capacity for carbon monoxide (DL(CO)) and nitric oxide (DL(NO)) using the single breath, single oxygen tension (single-FiO2) method. However, two calculation parameters, the reaction rate of carbon monoxide with blood (θ(CO)) and the D(M,NO)/D(M,CO) ratio (α-ratio), are controversial. This study systematically determined optimal θ(CO) and α-ratio values to be used in the single-FiO2 method that yielded the most similar D(M,CO) and V(C) values compared to the 'gold-standard' multiple-FiO2 method. Eleven healthy subjects performed single breath DL(CO)/DL(NO) maneuvers at rest and during exercise. D(M,CO) and V(C) were calculated via the single-FiO2 and multiple-FiO2 methods by implementing seven θ(CO) equations and a range of previously reported α-ratios. The RP θ(CO) equation (Reeves, R.B., Park, H.K., 1992. Respiration Physiology 88 1-21) and an α-ratio of 4.0-4.4 yielded D(M,CO) and V(C) values that were most similar between methods. The RP θ(CO) equation and an experimental α-ratio should be used in future studies. Copyright © 2015 Elsevier B.V. All rights reserved.
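The role of the two disputed parameters is easier to see in the Roughton-Forster relation, 1/DL(CO) = 1/D(M,CO) + 1/(θ(CO)·V(C)). The Python sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes DL(NO) equals D(M,NO), so that D(M,CO) = DL(NO)/α, and it takes θ(CO) as an input rather than computing it from any published equation. The function name and all numeric inputs are hypothetical.

```python
from typing import Tuple


def membrane_and_capillary(dl_co: float, dl_no: float,
                           theta_co: float, alpha: float) -> Tuple[float, float]:
    """Return (DM_CO, VC) from single-breath DL_CO and DL_NO.

    Assumes DL_NO == DM_NO (blood resistance to NO neglected), so
    DM_CO = DL_NO / alpha, then solves the Roughton-Forster relation
    1/DL_CO = 1/DM_CO + 1/(theta_CO * VC) for VC.
    """
    dm_co = dl_no / alpha
    blood_resistance = 1.0 / dl_co - 1.0 / dm_co  # equals 1/(theta_CO * VC)
    if blood_resistance <= 0:
        raise ValueError("DL_CO must be smaller than DM_CO")
    vc = 1.0 / (theta_co * blood_resistance)
    return dm_co, vc


# Hypothetical inputs: DL in mL/min/mmHg, theta_CO in mL CO/min/mmHg per mL blood.
dm_co, vc = membrane_and_capillary(dl_co=25.0, dl_no=150.0, theta_co=0.75, alpha=4.0)
print(f"DM_CO = {dm_co:.1f} mL/min/mmHg, VC = {vc:.1f} mL")
```

Both outputs move with the chosen parameters: a larger α lowers D(M,CO) and raises V(C), and a larger θ(CO) lowers V(C), which is why the choice of θ(CO) equation and α-ratio matters for agreement between methods.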
Single beam pass migmacell method and apparatus

International Nuclear Information System (INIS)

Maglich, B.C.; Nering, J.E.; Mazarakis, M.G.; Miller, R.A.

1976-01-01

The invention provides improvements in migmacell apparatus and method by dispensing with the need for metastable confinement of injected molecular ions for multiple precession periods. Injected molecular ions undergo a 'single pass' through the reaction volume. By preconditioning the injected beam such that it contains a population distribution of molecules in higher vibrational states than in the case of a normal distribution, injected molecules in the single pass experience collisionless dissociation in the migmacell under magnetic influence, i.e., so-called Lorentz dissociation. Dissociation ions then form atomic migma.
Validation of single-sample doubly labeled water method

International Nuclear Information System (INIS)

Webster, M.D.; Weathers, W.W.

1989-01-01

We have experimentally validated a single-sample variant of the doubly labeled water method for measuring metabolic rate and water turnover in a very small passerine bird, the verdin (Auriparus flaviceps). We measured CO2 production using the Haldane gravimetric technique and compared these values with estimates derived from isotopic data. Doubly labeled water results based on the one-sample calculations differed from Haldane values by less than 0.5% on average (range -8.3 to 11.2%, n = 9). Water flux computed by the single-sample method differed by -1.5% on average from results for the same birds based on the standard, two-sample technique (range -13.7 to 2.0%, n = 9).
Extending the use of GWAS data by combining data from different genetic platforms

NARCIS (Netherlands)

van Iperen, E. P. A.; Hovingh, G. K.; Asselbergs, F. W.; Zwinderman, A. H.

2017-01-01

In the past decade many Genome-wide Association Studies (GWAS) were performed that discovered new associations between single-nucleotide polymorphisms (SNPs) and various phenotypes. Imputation methods are widely used in GWAS. They facilitate the phenotype association with variants that are not
A new method of preparing single-walled carbon nanotubes

Indian Academy of Sciences (India)

A novel method of purification for single-walled carbon nanotubes, prepared by an arc-discharge method, is described. The method involves a combination of acid washing followed by high temperature hydrogen treatment to remove the metal nanoparticles and amorphous carbon present in the as-synthesized single-walled ...
Improved Ancestry Estimation for both Genotyping and Sequencing Data using Projection Procrustes Analysis and Genotype Imputation

Science.gov (United States)

Wang, Chaolong; Zhan, Xiaowei; Liang, Liming; Abecasis, Gonçalo R.; Lin, Xihong

2015-01-01

Accurate estimation of individual ancestry is important in genetic association studies, especially when a large number of samples are collected from multiple sources. However, existing approaches developed for genome-wide SNP data do not work well with modest amounts of genetic data, such as in targeted sequencing or exome chip genotyping experiments. We propose a statistical framework to estimate individual ancestry in a principal component ancestry map generated by a reference set of individuals. This framework extends and improves upon our previous method for estimating ancestry using low-coverage sequence reads (LASER 1.0) to analyze either genotyping or sequencing data. In particular, we introduce a projection Procrustes analysis approach that uses high-dimensional principal components to estimate ancestry in a low-dimensional reference space. Using extensive simulations and empirical data examples, we show that our new method (LASER 2.0), combined with genotype imputation on the reference individuals, can substantially outperform LASER 1.0 in estimating fine-scale genetic ancestry. Specifically, LASER 2.0 can accurately estimate fine-scale ancestry within Europe using either exome chip genotypes or targeted sequencing data with off-target coverage as low as 0.05×. Under the framework of LASER 2.0, we can estimate individual ancestry in a shared reference space for samples assayed at different loci or by different techniques. Therefore, our ancestry estimation method will accelerate discovery in disease association studies not only by helping model ancestry within individual studies but also by facilitating combined analysis of genetic data from multiple sources. PMID:26027497
Methods for Gas Sensing with Single-Walled Carbon Nanotubes

Science.gov (United States)

Kaul, Anupama B. (Inventor)

2013-01-01

Methods for gas sensing with single-walled carbon nanotubes are described. The methods comprise biasing at least one carbon nanotube and exposing to a gas environment to detect variation in temperature as an electrical response.
Method for manufacturing a single crystal nanowire

NARCIS (Netherlands)

van den Berg, Albert; Bomer, Johan G.; Carlen, Edwin; Chen, S.; Kraaijenhagen, Roderik Adriaan; Pinedo, Herbert Michael

2013-01-01

A method for manufacturing a single crystal nano-structure is provided comprising the steps of providing a device layer with a 100 structure on a substrate; providing a stress layer onto the device layer; patterning the stress layer along the 110 direction of the device layer; selectively removing
Method for manufacturing a single crystal nanowire

NARCIS (Netherlands)

van den Berg, Albert; Bomer, Johan G.; Carlen, Edwin; Chen, S.; Kraaijenhagen, R.A.; Pinedo, Herbert Michael

2010-01-01

A method for manufacturing a single crystal nano-structure is provided comprising the steps of providing a device layer with a 100 structure on a substrate; providing a stress layer onto the device layer; patterning the stress layer along the 110 direction of the device layer; selectively removing
Self-seeded single-frequency laser peening method

Science.gov (United States)

Dane, C Brent; Hackel, Lloyd A; Harris, Fritz B

2012-06-26

A method of operating a laser to obtain an output pulse having a single wavelength comprises inducing an intracavity loss into a laser resonator in an amount that prevents oscillation during the time that energy from the pump source is being stored in the gain medium. Gain is built up in the gain medium with energy from the pump source until formation of a single-frequency relaxation oscillation pulse in the resonator. Upon detection of the onset of the relaxation oscillation pulse, the intracavity loss is reduced, such as by Q-switching, so that the built-up gain stored in the gain medium is output from the resonator in the form of an output pulse at a single frequency. An electronically controllable output coupler is controlled to affect output pulse characteristics. The laser acts as a master oscillator in a master oscillator power amplifier configuration. The laser is used for laser peening.
Imaging by the SSFSE single slice method at different viscosities of bile

International Nuclear Information System (INIS)

Kubo, Hiroya; Usui, Motoki; Fukunaga, Kenichi; Yamamoto, Naruto; Ikegami, Toshimi

2001-01-01

The single shot fast spin echo single thick slice method (single slice method) is a technique that visualizes the water component alone using a heavy T2. However, this method is considered to be markedly affected by changes in the viscosity of the material because a very long TE is used, and changes in the T2 value, which are related to viscosity, directly affect imaging. In this study, we evaluated the relationship between the effects of TE and the T2 value of bile in the single slice method and also examined the relationship between the signal intensity of bile on T1- and T2-weighted images and imaging by MR cholangiography (MRC). It was difficult to image bile with high viscosities at a usual effective TE level of 700-1,500 ms. With regard to the relationship between the signal intensity of bile and MRC imaging, all T2 values of the bile samples showing relatively high signal intensities on the T1-weighted images suggested high viscosities, and MRC imaging of these bile samples was poor. In conclusion, MRC imaging of bile with high viscosities was poor with the single slice method. Imaging by the single slice method alone of bile showing a relatively high signal intensity on T1-weighted images should be avoided, and combination with other MRC sequences should be used. (author)
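The TE dependence described in this abstract can be illustrated with the standard mono-exponential spin-echo decay, S = S0 · exp(−TE / T2). The sketch below is only an illustration of that textbook relation, not the authors' analysis; the function name `echo_signal` and both T2 values are hypothetical and were not taken from the study.

```python
import math

def echo_signal(te_ms: float, t2_ms: float, s0: float = 1.0) -> float:
    """Mono-exponential T2 decay: S = S0 * exp(-TE / T2)."""
    return s0 * math.exp(-te_ms / t2_ms)

# Hypothetical T2 values in milliseconds, chosen for illustration only.
T2_LOW_VISCOSITY = 1000.0
T2_HIGH_VISCOSITY = 100.0

# Compare the remaining signal at the two ends of the TE range quoted above.
for te in (700.0, 1500.0):
    low = echo_signal(te, T2_LOW_VISCOSITY)
    high = echo_signal(te, T2_HIGH_VISCOSITY)
    print(f"TE={te:.0f} ms  low-viscosity={low:.3f}  high-viscosity={high:.6f}")
```

Under these assumed values, the short-T2 sample retains almost no signal at the long echo times, which is in line with the poor MRC imaging of viscous bile reported in the abstract.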
Statistical methods for change-point detection in surface temperature records

Science.gov (United States)

Pintar, A. L.; Possolo, A.; Zhang, N. F.

2013-09-01

We describe several statistical methods to detect possible change-points in a time series of values of surface temperature measured at a meteorological station, and to assess the statistical significance of such changes, taking into account the natural variability of the measured values, and the autocorrelations between them. These methods serve to determine whether the record may suffer from biases unrelated to the climate signal, hence whether there may be a need for adjustments as considered by M. J. Menne and C. N. Williams (2009) "Homogenization of Temperature Series via Pairwise Comparisons", Journal of Climate 22 (7), 1700-1717. We also review methods to characterize patterns of seasonality (seasonal decomposition using monthly medians or robust local regression), and explain the role they play in the imputation of missing values, and in enabling robust decompositions of the measured values into a seasonal component, a possible climate signal, and a station-specific remainder. The methods for change-point detection that we describe include statistical process control, wavelet multi-resolution analysis, adaptive weights smoothing, and a Bayesian procedure, all of which are applicable to single station records.
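Statistical process control is one of the change-point approaches named in this abstract. A minimal sketch of one such technique, a two-sided tabular CUSUM, is given below. It is not the authors' procedure: it ignores autocorrelation and seasonality, and the function name `cusum_first_alarm` and the parameters `k` (allowance) and `h` (decision threshold) are assumptions made for illustration.

```python
from statistics import mean, stdev

def cusum_first_alarm(values, baseline_n, k=0.5, h=5.0):
    """Two-sided tabular CUSUM on standardized values.

    The mean and standard deviation are estimated from the first
    `baseline_n` observations. Returns the index of the first
    observation at which either cumulative sum exceeds `h`,
    or None if no alarm is raised.
    """
    baseline = values[:baseline_n]
    mu, sigma = mean(baseline), stdev(baseline)
    upper = lower = 0.0
    for i, x in enumerate(values):
        z = (x - mu) / sigma
        upper = max(0.0, upper + z - k)  # accumulates upward shifts
        lower = max(0.0, lower - z - k)  # accumulates downward shifts
        if upper > h or lower > h:
            return i
    return None

# Synthetic record: 20 stable readings followed by a step of +1.0.
series = [10.1, 9.9] * 10 + [11.1, 10.9] * 10
print(cusum_first_alarm(series, baseline_n=20))
```

A real temperature record would need deseasonalizing and an allowance for serial correlation before a rule like this is applied, as the abstract itself stresses.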
Methods of forming single source precursors, methods of forming polymeric single source precursors, and single source precursors and intermediate products formed by such methods

Science.gov (United States)

Fox, Robert V.; Rodriguez, Rene G.; Pak, Joshua J.; Sun, Chivin; Margulieux, Kelsey R.; Holland, Andrew W.

2012-12-04

Methods of forming single source precursors (SSPs) include forming intermediate products having the empirical formula 1/2{L₂N(μ-X)₂M'X₂}₂, and reacting MER with the intermediate products to form SSPs of the formula L₂N(μ-ER)₂M'(ER)₂, wherein L is a Lewis base, M is a Group IA atom, N is a Group IB atom, M' is a Group IIIB atom, each E is a Group VIB atom, each X is a Group VIIA atom or a nitrate group, and each R group is an alkyl, aryl, vinyl, (per)fluoro alkyl, (per)fluoro aryl, silane, or carbamato group. Methods of forming polymeric or copolymeric SSPs include reacting at least one of HE¹R¹E¹H and MER with one or more substances having the empirical formula L₂N(μ-ER)₂M'(ER)₂ or L₂N(μ-X)₂M'(X)₂ to form a polymeric or copolymeric SSP. New SSPs and intermediate products are formed by such methods.
Quantitative Single-letter Sequencing: a method for simultaneously monitoring numerous known allelic variants in single DNA samples

Directory of Open Access Journals (Sweden)

Duborjal Hervé

2008-02-01

Full Text Available Abstract Background Pathogens such as fungi, bacteria and especially viruses, are highly variable even within an individual host, intensifying the difficulty of distinguishing and accurately quantifying numerous allelic variants co-existing in a single nucleic acid sample. The majority of currently available techniques are based on real-time PCR or primer extension and often require multiplexing adjustments that impose a practical limitation on the number of alleles that can be monitored simultaneously at a single locus. Results Here, we describe a novel method that allows the simultaneous quantification of numerous allelic variants in a single reaction tube and without multiplexing. Quantitative Single-letter Sequencing (QSS) begins with a single PCR amplification step using a pair of primers flanking the polymorphic region of interest. Next, PCR products are submitted to single-letter sequencing with a fluorescently-labelled primer located upstream of the polymorphic region. The resulting monochromatic electropherogram shows numerous specific diagnostic peaks, attributable to specific variants, signifying their presence/absence in the DNA sample. Moreover, peak fluorescence can be quantified and used to estimate the frequency of the corresponding variant in the DNA population. Using engineered allelic markers in the genome of Cauliflower mosaic virus, we reliably monitored six different viral genotypes in DNA extracted from infected plants. Evaluation of the intrinsic variance of this method, as applied to both artificial plasmid DNA mixes and viral genome populations, demonstrates that QSS is a robust and reliable method of detection and quantification for variants with a relative frequency of between 0.05 and 1. Conclusion This simple method is easily transferable to many other biological systems and questions, including those involving high throughput analysis, and can be performed in any laboratory since it does not require specialized
Imaging by the SSFSE single slice method at different viscosities of bile

Energy Technology Data Exchange (ETDEWEB)

Kubo, Hiroya; Usui, Motoki; Fukunaga, Kenichi; Yamamoto, Naruto; Ikegami, Toshimi [Kawasaki Hospital, Kobe (Japan)

2001-11-01

The single shot fast spin echo single thick slice method (single slice method) is a technique that visualizes the water component alone using a heavy T2. However, this method is considered to be markedly affected by changes in the viscosity of the material because a very long TE is used, and changes in the T2 value, which are related to viscosity, directly affect imaging. In this study, we evaluated the relationship between the effects of TE and the T2 value of bile in the single slice method and also examined the relationship between the signal intensity of bile on T1- and T2-weighted images and imaging by MR cholangiography (MRC). It was difficult to image bile with high viscosities at a usual effective TE level of 700-1,500 ms. With regard to the relationship between the signal intensity of bile and MRC imaging, all T2 values of the bile samples showing relatively high signal intensities on the T1-weighted images suggested high viscosities, and MRC imaging of these bile samples was poor. In conclusion, MRC imaging of bile with high viscosities was poor with the single slice method. Imaging by the single slice method alone of bile showing a relatively high signal intensity on T1-weighted images should be avoided, and combination with other MRC sequences should be used. (author)

Method: a single nucleotide polymorphism genotyping method for Wheat streak mosaic virus

Science.gov (United States)

2012-01-01

Background The September 11, 2001 attacks on the World Trade Center and the Pentagon increased the concern about the potential for terrorist attacks on many vulnerable sectors of the US, including agriculture. The concentrated nature of crops, easily obtainable biological agents, and highly detrimental impacts make agroterrorism a potential threat. Although procedures for an effective criminal investigation and attribution following such an attack are available, important enhancements are still needed, one of which is the capability for fine discrimination among pathogen strains. The purpose of this study was to develop a molecular typing assay for use in a forensic investigation, using Wheat streak mosaic virus (WSMV) as a model plant virus. Method This genotyping technique utilizes single base primer extension to generate a genetic fingerprint. Fifteen single nucleotide polymorphisms (SNPs) within the coat protein and helper component-protease genes were selected as the genetic markers for this assay. Assay optimization and sensitivity testing was conducted using synthetic targets. WSMV strains and field isolates were collected from regions around the world and used to evaluate the assay for discrimination. The assay specificity was tested against a panel of near-neighbors consisting of genetic and environmental near-neighbors. Result Each WSMV strain or field isolate tested produced a unique SNP fingerprint, with the exception of three isolates collected within the same geographic location that produced indistinguishable fingerprints. The results were consistent among replicates, demonstrating the reproducibility of the assay. No SNP fingerprints were generated from organisms included in the near-neighbor panel, suggesting the assay is specific for WSMV. Using synthetic targets, a complete profile could be generated from as low as 7.15 fmoles of cDNA. Conclusion The molecular typing method presented is one tool that could be incorporated into the forensic
Method of stabilizing single channel analyzers

International Nuclear Information System (INIS)

Fasching, G.E.; Patton, G.H.

1975-01-01

A method and apparatus to reduce the drift of single channel analyzers are described. Essentially, this invention employs a time-sharing or multiplexing technique to ensure that the outputs from two single channel analyzers (SCAs) maintain the same count ratio regardless of variations in the threshold voltage source or voltage changes. The multiplexing is accomplished when a flip-flop, actuated by a clock, changes state to switch the outputs from the individual SCAs before these outputs are sent to a ratio counting scaler. In the particular system embodiment disclosed that illustrates this invention, the sulfur content of coal is determined by subjecting the coal to radiation from a neutron producing source. A photomultiplier and detector system converts the transmitted gamma radiation to an analog voltage signal and sends that signal, after amplification, to an SCA system that contains the invention. Therein, at least two single channel analyzers scan the analog signal over different parts of a spectral region. The two outputs may then be sent to a digital multiplexer so that the output from the multiplexer contains counts falling within two distinct segments of the region. By dividing the counts from the multiplexer by each other, the percentage of sulfur within the coal sample under observation may be determined. (U.S.)
The comparison of cardiovascular risk scores using two methods of substituting missing risk factor data in patient medical records

Directory of Open Access Journals (Sweden)

Andrew Dalton

2011-07-01

Conclusions A simple method of substituting missing risk factor data can produce reliable estimates of CVD risk scores. Targeted screening for high CVD risk, using pre-existing electronic medical record data, does not require multiple imputation methods in risk estimation.
Villa Marie Nursing Home, Grange, Templemore Road, Roscrea, Tipperary.

LENUS (Irish Health Repository)

Hardouin, Jean-Benoit

2011-07-14

Abstract Background Nowadays, more and more clinical scales consisting of responses given by patients to a set of items (Patient Reported Outcomes - PRO) are validated with models based on Item Response Theory, and more specifically, with a Rasch model. In the validation sample, missing data are frequent. The aim of this paper is to compare sixteen methods for handling the missing data (mainly based on simple imputation) in the context of psychometric validation of PRO by a Rasch model. The main indexes used for validation by a Rasch model are compared. Methods A simulation study was performed to consider several cases, notably whether the missing values are informative or not, and the rate of missing data. Results Several imputation methods produce bias in the psychometric indexes (generally, the imputation methods artificially improve the psychometric qualities of the scale). In particular, this is the case with the method based on the Personal Mean Score (PMS), which is the most commonly used imputation method in practice. Conclusions Several imputation methods should be avoided, in particular PMS imputation. From a general point of view, it is important to use an imputation method that considers both the ability of the patient (measured for example by his/her score) and the difficulty of the item (measured for example by its rate of favourable responses). Another recommendation is to always consider adding a random process to the imputation method, because such a process reduces the bias. Last, the analysis carried out without imputation of the missing data (available case analysis) is an interesting alternative to simple imputation in this context.
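To make the Personal Mean Score idea concrete, the sketch below shows a deterministic PMS fill and a variant with a random draw, both for dichotomous (0/1) items. It is a hedged illustration only: the abstract does not specify the sixteen methods compared, so the function names `pms_impute` and `pms_impute_random`, the rounding rule, and the Bernoulli draw are all assumptions.

```python
import random

def pms_impute(responses):
    """Personal Mean Score: fill missing items (None) with the rounded
    mean of the respondent's observed items.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    observed = [r for r in responses if r is not None]
    if not observed:
        return list(responses)  # nothing to base an imputation on
    fill = round(sum(observed) / len(observed))
    return [fill if r is None else r for r in responses]

def pms_impute_random(responses, rng):
    """Stochastic variant for 0/1 items: each missing item is drawn as a
    Bernoulli trial with success probability equal to the personal mean."""
    observed = [r for r in responses if r is not None]
    if not observed:
        return list(responses)
    p = sum(observed) / len(observed)
    return [(1 if rng.random() < p else 0) if r is None else r
            for r in responses]

answers = [1, 1, None, 0, 1]
print(pms_impute(answers))
print(pms_impute_random(answers, random.Random(2011)))
```

Neither function accounts for item difficulty, which the abstract recommends taking into account alongside the patient's ability.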
Frequency guided methods for demodulation of a single fringe pattern.

Science.gov (United States)

Wang, Haixia; Kemao, Qian

2009-08-17

Phase demodulation from a single fringe pattern is a challenging task but of interest. A frequency-guided regularized phase tracker and a frequency-guided sequential demodulation method with Levenberg-Marquardt optimization are proposed to demodulate a single fringe pattern. Demodulation path guided by the local frequency from the highest to the lowest is applied in both methods. Since critical points have low local frequency values, they are processed last so that the spurious sign problem caused by these points is avoided. These two methods can be considered as alternatives to the effective fringe follower regularized phase tracker. Demodulation results from one computer-simulated and two experimental fringe patterns using the proposed methods will be demonstrated. (c) 2009 Optical Society of America
Method: a single nucleotide polymorphism genotyping method for Wheat streak mosaic virus.

Science.gov (United States)

Rogers, Stephanie M; Payton, Mark; Allen, Robert W; Melcher, Ulrich; Carver, Jesse; Fletcher, Jacqueline

2012-05-17

The September 11, 2001 attacks on the World Trade Center and the Pentagon increased the concern about the potential for terrorist attacks on many vulnerable sectors of the US, including agriculture. The concentrated nature of crops, easily obtainable biological agents, and highly detrimental impacts make agroterrorism a potential threat. Although procedures for an effective criminal investigation and attribution following such an attack are available, important enhancements are still needed, one of which is the capability for fine discrimination among pathogen strains. The purpose of this study was to develop a molecular typing assay for use in a forensic investigation, using Wheat streak mosaic virus (WSMV) as a model plant virus. This genotyping technique utilizes single base primer extension to generate a genetic fingerprint. Fifteen single nucleotide polymorphisms (SNPs) within the coat protein and helper component-protease genes were selected as the genetic markers for this assay. Assay optimization and sensitivity testing was conducted using synthetic targets. WSMV strains and field isolates were collected from regions around the world and used to evaluate the assay for discrimination. The assay specificity was tested against a panel of near-neighbors consisting of genetic and environmental near-neighbors. Each WSMV strain or field isolate tested produced a unique SNP fingerprint, with the exception of three isolates collected within the same geographic location that produced indistinguishable fingerprints. The results were consistent among replicates, demonstrating the reproducibility of the assay. No SNP fingerprints were generated from organisms included in the near-neighbor panel, suggesting the assay is specific for WSMV. Using synthetic targets, a complete profile could be generated from as low as 7.15 fmoles of cDNA. The molecular typing method presented is one tool that could be incorporated into the forensic science tool box after a thorough
Real stabilization method for nuclear single-particle resonances

International Nuclear Information System (INIS)

Zhang Li; Zhou Shangui; Meng Jie; Zhao Enguang

2008-01-01

We develop the real stabilization method within the framework of the relativistic mean-field (RMF) model. With the self-consistent nuclear potentials from the RMF model, the real stabilization method is used to study single-particle resonant states in spherical nuclei. As examples, the energies, widths, and wave functions of low-lying neutron resonant states in ¹²⁰Sn are obtained. These results are compared with those from the scattering phase-shift method and the analytic continuation in the coupling constant approach, and satisfactory agreement is found.
Comparing a single-stage geocoding method to a multi-stage geocoding method: how much and where do they disagree?

Directory of Open Access Journals (Sweden)

Rice Kenneth

2007-03-01

Full Text Available Abstract Background Geocoding methods vary among spatial epidemiology studies. Errors in the geocoding process and differential match rates may reduce study validity. We compared two geocoding methods using 8,157 Washington State addresses. The multi-stage geocoding method implemented by the state health department used a sequence of local and national reference files. The single-stage method used a single national reference file. For each address geocoded by both methods, we measured the distance between the locations assigned by each method. Area-level characteristics were collected from census data, and modeled as predictors of the discordance between geocoded address coordinates. Results The multi-stage method had a higher match rate than the single-stage method: 99% versus 95%. Of the 7,686 addresses geocoded by both methods, 96% were geocoded to the same census tract by both methods and 98% were geocoded to locations within 1 km of each other by the two methods. The distance between geocoded coordinates for the same address was higher in sparsely populated and low poverty areas, and in counties with local reference files. Conclusion The multi-stage geocoding method had a higher match rate than the single-stage method. An examination of differences in the location assigned to the same address suggested that study results may be most sensitive to the choice of geocoding method in sparsely populated or low-poverty areas.
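The discordance measure in this abstract, the distance between two coordinates assigned to the same address, can be sketched with the haversine great-circle formula. The abstract does not say how the authors computed distance, so the haversine choice, the helper names `haversine_km` and `share_within`, and the sample coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def share_within(pairs, threshold_km=1.0):
    """Fraction of address pairs whose two geocodes lie within threshold_km."""
    close = sum(1 for (a, b) in pairs if haversine_km(*a, *b) <= threshold_km)
    return close / len(pairs)

# Hypothetical geocodes for two addresses: (method A, method B).
pairs = [
    ((47.6062, -122.3321), (47.6065, -122.3325)),  # a few tens of metres apart
    ((46.6021, -120.5059), (46.7021, -120.5059)),  # about 11 km apart
]
print(share_within(pairs))
```

Applied to all addresses matched by both methods, a summary of this kind corresponds to the "within 1 km" percentage reported in the results.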
Cheap arbitrary high order methods for single integrand SDEs

DEFF Research Database (Denmark)

Debrabant, Kristian; Kværnø, Anne

2017-01-01

For a particular class of Stratonovich SDE problems, here denoted as single integrand SDEs, we prove that by applying a deterministic Runge-Kutta method of order $p_d$ we obtain methods converging in the mean-square and weak sense with order $\lfloor p_d/2\rfloor$. The reason is that the B-series...
Principles of crystallization, and methods of single crystal growth

International Nuclear Information System (INIS)

Chacra, T.

2010-01-01

Most single crystals (monocrystals) have distinctive optical, electrical, or magnetic properties, which make them key elements in most modern technical devices: they may be used as lenses, prisms, or gratings in optical devices, as filters in X-ray and spectrographic devices, or as conductors and semiconductors in the electronics and computer industries. Furthermore, single crystals are used in transducer devices, and they are indispensable elements in laser and maser emission technology. Crystal Growth Technology (CGT) started and developed in international universities and scientific institutions, targeting single crystals with significant properties and industrial applications that could attract international crystal growth centers to adopt the industrial production and marketing of such crystals. Unfortunately, Arab universities in general, and Syrian universities in particular, give this field of science very little attention. The purpose of this work is to draw the attention of crystallographers, physicists, and chemists in Arab universities and research centers to the importance of crystal growth and, as a first stage, to encourage the establishment of simple laboratories for the growth of single crystals. Such laboratories can be supplied with equipment that is partly available or can be manufactured in the local market. Many references (articles, papers, diagrams, etc.) have been studied to derive the most important theoretical principles of phase transitions, especially of crystallization. The conclusions of this study are summarized in three principles: thermodynamic, morphologic, and kinetic. The study is completed by a brief description of the main single crystal growth methods, with sketches of the equipment used in each method, which can be considered preliminary designs for the equipment of a new crystal growth laboratory. (author)
Single-mismatch 2LSB embedding method of steganography

OpenAIRE

Khalind, Omed; Aziz, Benjamin

2013-01-01

This paper proposes a new method of 2LSB embedding steganography in still images. The proposed method considers a single mismatch in each 2LSB embedding between the 2LSB of the pixel value and the 2-bits of the secret message, while the 2LSB replacement overwrites the 2LSB of the image’s pixel value with 2-bits of the secret message. The number of bit-changes needed for the proposed method is 0.375 bits from the 2LSBs of the cover image, and is much less than the 2LSB replacement which is 0.5...
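The baseline mentioned in this abstract, 2LSB replacement, can be checked by brute force. The sketch below overwrites the two least-significant bits of a pixel with two message bits and averages the number of flipped bits over all 16 combinations, assuming uniformly distributed pixel LSBs and message bits. It does not implement the proposed single-mismatch method, whose details are not given in the truncated abstract; the function names are hypothetical.

```python
def replace_2lsb(pixel: int, bits: int) -> int:
    """Overwrite the two least-significant bits of `pixel` with `bits`."""
    return (pixel & ~0b11) | (bits & 0b11)

def bit_changes(before: int, after: int) -> int:
    """Number of the two LSBs that differ between `before` and `after`."""
    return bin((before ^ after) & 0b11).count("1")

# Enumerate every pairing of pixel 2LSBs and 2-bit message values.
total = sum(bit_changes(p, replace_2lsb(p, m))
            for p in range(4) for m in range(4))
avg_per_pixel = total / 16         # flipped bits per embedded 2-bit symbol
avg_per_bit = avg_per_pixel / 2    # flipped bits per embedded message bit

print(avg_per_pixel, avg_per_bit)
```

Under the uniformity assumption, this enumeration gives one flipped bit per pixel on average, that is, 0.5 per embedded message bit, which appears to match the figure quoted for 2LSB replacement.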
METHOD FOR MANUFACTURING A SINGLE CRYSTAL NANO-WIRE

NARCIS (Netherlands)

Van Den Berg, Albert; Bomer, Johan; Carlen Edwin, Thomas; Chen, Songyue; Kraaijenhagen Roderik, Adriaan; Pinedo Herbert, Michael

2012-01-01

A method for manufacturing a single crystal nano-structure includes providing a device layer with a 100 structure on a substrate; providing a stress layer onto the device layer; patterning the stress layer along the 110 direction of the device layer; selectively removing parts of the stress layer to
METHOD FOR MANUFACTURING A SINGLE CRYSTAL NANO-WIRE.

NARCIS (Netherlands)

Van Den Berg, Albert; Bomer, Johan; Carlen Edwin, Thomas; Chen, Songyue; Kraaijenhagen Roderik, Adriaan; Pinedo Herbert, Michael

2011-01-01

A method for manufacturing a single crystal nano-structure is provided comprising the steps of providing a device layer with a 100 structure on a substrate; providing a stress layer onto the device layer; patterning the stress layer along the 110 direction of the device layer; selectively removing
Nanolithography based contacting method for electrical measurements on single template synthesized nanowires

DEFF Research Database (Denmark)

Fusil, S.; Piraux, L.; Mátéfi-Tempfli, Stefan

2005-01-01

A reliable method enabling electrical measurements on single nanowires prepared by electrodeposition in an alumina template is described. This technique is based on electrically controlled nanoindentation of a thin insulating resist deposited on the top face of the template filled by the nanowires....... We show that this method is very flexible, allowing us to electrically address single nanowires of controlled length down to 100 nm and of desired composition. Using this approach, current densities as large as 10 A cm were successfully injected through a point contact on a single magnetic...
A Synchronization Method for Single-Phase Grid-Tied Inverters

DEFF Research Database (Denmark)

Hadjidemetriou, Lenos; Kyriakides, Elias; Yang, Yongheng

2016-01-01

The controllers of single-phase grid-tied inverters require improvements to enable distribution generation systems to meet the grid codes/standards with respect to power quality and the fault ride through capability. In that case, the response of the selected synchronization technique is crucial...... for the performance of the entire grid-tied inverter. In this paper, a new synchronization method with good dynamics and high accuracy under a highly distorted voltage is proposed. This method uses a Multi-Harmonic Decoupling Cell (MHDC), which thus can cancel out the oscillations on the synchronization signals due...... to the harmonic voltage distortion while maintaining the dynamic response of the synchronization. Therefore, the accurate and dynamic response of the proposed MHDC-PLL can be beneficial for the performance of the whole single-phase grid-tied inverter....
A MISO-ARX-Based Method for Single-Trial Evoked Potential Extraction

Directory of Open Access Journals (Sweden)

Nannan Yu

2017-01-01

Full Text Available In this paper, we propose a novel method for solving the single-trial evoked potential (EP) estimation problem. In this method, the single-trial EP is considered as a complex containing many components, which may originate from different functional brain sites; these components can be distinguished according to their respective latencies and amplitudes and are extracted simultaneously by multiple-input single-output autoregressive modeling with exogenous input (MISO-ARX). The extraction process is performed in three stages: first, we use a reference EP as a template and decompose it into a set of components, which serve as subtemplates for the remaining steps. Then, a dictionary is constructed with these subtemplates, and EPs are preliminarily extracted by sparse coding in order to roughly estimate the latency of each component. Finally, the single-trial measurement is parametrically modeled by MISO-ARX while characterizing spontaneous electroencephalographic activity as an autoregression model driven by white noise and with each component of the EP modeled by autoregressive-moving-average filtering of the subtemplates. Once optimized, all components of the EP can be extracted. Compared with ARX, our method has greater tracking capabilities of specific components of the EP complex as each component is modeled individually in MISO-ARX. We provide exhaustive experimental results to show the effectiveness and feasibility of our method.
TODIM Method for Single-Valued Neutrosophic Multiple Attribute Decision Making

Directory of Open Access Journals (Sweden)

Dong-Sheng Xu

2017-10-01

Full Text Available Recently, the TODIM method has been used to solve multiple attribute decision making (MADM) problems. Single-valued neutrosophic sets (SVNSs) are useful tools to depict the uncertainty of MADM. In this paper, we extend the TODIM method to MADM with single-valued neutrosophic numbers (SVNNs). Firstly, the definition, comparison, and distance of SVNNs are briefly presented, and the steps of the classical TODIM method for MADM problems are introduced. Then, the extended classical TODIM method is proposed to deal with MADM problems with SVNNs; its significant characteristic is that it can fully consider the decision makers’ bounded rationality, which is a real action in decision making. Furthermore, we extend the proposed model to interval neutrosophic sets (INSs). Finally, a numerical example is presented.
Improvement of Source Number Estimation Method for Single Channel Signal.

Directory of Open Access Journals (Sweden)

Zhi Dong

Full Text Available Source number estimation methods for single channel signals have been investigated, and improvements for each method are suggested in this work. Firstly, the single channel data is converted to multi-channel form by a delay process. Then, algorithms used in array signal processing, such as Gerschgorin's disk estimation (GDE) and minimum description length (MDL), are introduced to estimate the source number of the received signal. Previous results have shown that MDL, which is based on information theoretic criteria (ITC), performs better than GDE at low SNR. However, it cannot handle signals containing colored noise. In contrast, the GDE method can eliminate the influence of colored noise, but its performance at low SNR is not satisfactory. To resolve this trade-off, this work improves both methods. A diagonal loading technique is employed to improve the MDL method, and a jackknife technique is used to optimize the data covariance matrix in order to improve the performance of the GDE method. Simulation results show that the performance of the original methods is substantially improved.
Imputing Variants in HLA-DR Beta Genes Reveals That HLA-DRB1 Is Solely Associated with Rheumatoid Arthritis and Systemic Lupus Erythematosus.

Directory of Open Access Journals (Sweden)

Kwangwoo Kim

Full Text Available The genetic association of HLA-DRB1 with rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE) is well documented, but association with other HLA-DR beta genes (HLA-DRB3, HLA-DRB4 and HLA-DRB5) has not been thoroughly studied, despite their similar functions and chromosomal positions. We examined variants in all functional HLA-DR beta genes in RA and SLE patients and controls, down to the amino-acid level, to better understand disease association with the HLA-DR locus. To this end, we improved an existing HLA reference panel to impute variants in all protein-coding HLA-DR beta genes. Using the reference panel, HLA variants were inferred from high-density SNP data of 9,271 RA-control subjects and 5,342 SLE-control subjects. Disease association tests were performed by logistic regression and log-likelihood ratio tests. After imputation using the newly constructed HLA reference panel and statistical analysis, we observed that HLA-DRB1 variants better accounted for the association between MHC and susceptibility to RA and SLE than did the other three HLA-DRB variants. Moreover, there were no secondary effects in HLA-DRB3, HLA-DRB4, or HLA-DRB5 in RA or SLE. Of all the HLA-DR beta chain paralogs, those encoded by HLA-DRB1 solely or dominantly influence susceptibility to RA and SLE.
An Asymmetrical Space Vector Method for Single Phase Induction Motor

DEFF Research Database (Denmark)

Cui, Yuanhai; Blaabjerg, Frede; Andersen, Gert Karmisholt

2002-01-01

Single-phase induction motors are the workhorses of low-power applications worldwide, and variable-speed operation is often necessary. Normally it is achieved either by a mechanical method or by controlling the capacitor connected to the auxiliary winding. Each of these methods has some drawback which...

A method of combined single-cell electrophysiology and electroporation.

Science.gov (United States)

Graham, Lyle J; Del Abajo, Ricardo; Gener, Thomas; Fernandez, Eduardo

2007-02-15

This paper describes a method of extracellular recording and subsequent electroporation with the same electrode in single retinal ganglion cells in vitro. We demonstrate anatomical identification of neurons whose receptive fields were measured quantitatively. We discuss how this simple method should also be applicable for the delivery of a variety of intracellular agents, including gene delivery, to physiologically characterized neurons, both in vitro and in vivo.
A New Power Calculation Method for Single-Phase Grid-Connected Systems

DEFF Research Database (Denmark)

Yang, Yongheng; Blaabjerg, Frede

2013-01-01

A new method to calculate average active power and reactive power for single-phase systems is proposed in this paper. It can be used in different applications where the output active power and reactive power need to be calculated accurately and fast. For example, a grid-connected photovoltaic system in low voltage ride through operation mode requires a power feedback for the power control loop. Commonly, a Discrete Fourier Transform (DFT) based power calculation method can be adopted in such systems. However, the DFT method introduces at least a one-cycle time delay. The new power calculation method, which is based on the adaptive filtering technique, can achieve a faster response. The performance of the proposed method is verified by experiments and demonstrated in a 1 kW single-phase grid-connected system operating under different conditions. Experimental results show the effectiveness...
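For context, the DFT-based baseline mentioned in the abstract can be sketched as below. This is an illustrative reading of a generic DFT power calculation, not the authors' adaptive-filter method: it assumes exactly one fundamental cycle of uniformly sampled voltage and current, and the function names are hypothetical.

```python
import cmath
import math


def fundamental_phasor(samples):
    """Peak-amplitude phasor of the fundamental, from one full cycle of samples."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(samples))
    return 2 * acc / n


def dft_power(voltage, current):
    """Return (P, Q) from one cycle of samples; Q is positive for lagging current."""
    v = fundamental_phasor(voltage)
    i = fundamental_phasor(current)
    s = 0.5 * v * i.conjugate()  # complex power from peak-value phasors
    return s.real, s.imag


# Usage: 10 V peak voltage, 2 A peak current lagging by 60 degrees.
N = 64
volts = [10 * math.cos(2 * math.pi * k / N) for k in range(N)]
amps = [2 * math.cos(2 * math.pi * k / N - math.pi / 3) for k in range(N)]
P, Q = dft_power(volts, amps)
```

Because each phasor needs a full cycle of samples before it is valid, the result lags a change in operating conditions by at least one cycle, which is the delay the abstract refers to.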
Protein structural model selection by combining consensus and single scoring methods.

Directory of Open Access Journals (Sweden)

Zhiquan He

Full Text Available Quality assessment (QA) for predicted protein structural models is an important and challenging research problem in protein structure prediction. Consensus Global Distance Test (CGDT) methods assess each decoy (predicted structural model) based on its structural similarity to all others in a decoy set and have been shown to work well when good decoys are in a majority cluster. Scoring functions evaluate each single decoy based on its structural properties. Both methods have their merits and limitations. In this paper, we present a novel method called PWCom, which uses two neural networks in sequence to combine CGDT and single-model scoring methods such as RW, DDFire and OPUS-Ca. Specifically, for every pair of decoys, the difference of the corresponding feature vectors is input to the first neural network, which predicts whether the two decoys differ significantly in terms of their GDT scores to the native. If so, the second neural network is used to decide which of the two is closer to the native structure. The quality score for each decoy in the pool is based on the number of times it wins during the pairwise comparisons. Test results on three benchmark datasets from different model generation methods showed that PWCom significantly improves over consensus GDT and single scoring methods. The QA server (MUFOLD-Server) applying this method in the CASP 10 QA category was ranked second in terms of Pearson and Spearman correlation performance.
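The pairwise win-counting step described above can be sketched as follows. This is only an outline of the scoring logic under stated assumptions: the two neural networks are replaced by caller-supplied placeholder functions (`differs` and `first_is_better` are hypothetical names), and each decoy is reduced to a single toy score rather than a feature vector.

```python
def rank_by_pairwise_wins(decoys, differs, first_is_better):
    """Rank decoys by how many pairwise comparisons they win.

    decoys: dict mapping decoy name -> features (any object the callbacks accept).
    differs(a, b): stand-in for the first network; True if the pair differs significantly.
    first_is_better(a, b): stand-in for the second network; True if a is closer to native.
    """
    names = list(decoys)
    wins = {name: 0 for name in names}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            a, b = names[i], names[j]
            if not differs(decoys[a], decoys[b]):
                continue  # pair judged too similar to call; no win awarded
            winner = a if first_is_better(decoys[a], decoys[b]) else b
            wins[winner] += 1
    return sorted(wins.items(), key=lambda item: item[1], reverse=True)


# Usage with toy scalar "features" in place of real feature vectors.
toy = {"d1": 0.80, "d2": 0.55, "d3": 0.78, "d4": 0.30}
ranking = rank_by_pairwise_wins(
    toy,
    differs=lambda x, y: abs(x - y) > 0.05,
    first_is_better=lambda x, y: x > y,
)
```

In this toy run, d1 and d3 are judged too close to compare, so they tie on wins. A pure win count has no built-in tie-break, so near-identical decoys can share a rank.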
Single- versus multiple-sample method to measure glomerular filtration rate.

Science.gov (United States)

Delanaye, Pierre; Flamant, Martin; Dubourg, Laurence; Vidal-Petiot, Emmanuelle; Lemoine, Sandrine; Cavalier, Etienne; Schaeffner, Elke; Ebert, Natalie; Pottel, Hans

2018-01-08

There are many different ways to measure glomerular filtration rate (GFR) using various exogenous filtration markers, each having their own strengths and limitations. However, not only the marker, but also the methodology may vary in many ways, including the use of urinary or plasma clearance, and, in the case of plasma clearance, the number of time points used to calculate the area under the concentration-time curve, ranging from only one (Jacobsson method) to eight (or more) blood samples. We collected the results obtained from 5106 plasma clearances (iohexol or 51Cr-ethylenediaminetetraacetic acid (EDTA)) using three to four time points, allowing GFR calculation using the slope-intercept method and the Bröchner-Mortensen correction. For each time point, the Jacobsson formula was applied to obtain the single-sample GFR. We used Bland-Altman plots to determine the accuracy of the Jacobsson method at each time point. The single-sample method showed within 10% concordances with the multiple-sample method of 66.4%, 83.6%, 91.4% and 96.0% at the time points 120, 180, 240 and ≥300 min, respectively. Concordance was poorer at lower GFR levels, and this trend is in parallel with increasing age. Results were similar in males and females. Some discordance was found in the obese subjects. Single-sample GFR is highly concordant with a multiple-sample strategy, except in the low GFR range (<30 mL/min). © The Author 2018. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
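The "within 10% concordance" figures quoted above can be read as the share of single-sample results falling within 10% of the multiple-sample reference. A minimal sketch under that assumption (the function name and the example values are hypothetical, not study data):

```python
def within_pct_concordance(single, reference, pct=10.0):
    """Percentage of single-sample values within +/- pct% of the paired reference value."""
    if len(single) != len(reference) or not single:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for s, r in zip(single, reference)
        if abs(s - r) <= pct / 100.0 * abs(r)
    )
    return 100.0 * hits / len(single)


# Usage with made-up GFR values in mL/min.
single_sample = [95.0, 60.0, 32.0, 120.0]
multi_sample = [100.0, 50.0, 30.0, 100.0]
concordance = within_pct_concordance(single_sample, multi_sample)
```

A relative tolerance tightens in absolute terms as GFR falls, which may be part of why concordance is reported as poorer in the low-GFR range.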
METHODS FOR CLUSTERING TIME SERIES DATA ACQUIRED FROM MOBILE HEALTH APPS.

Science.gov (United States)

Tignor, Nicole; Wang, Pei; Genes, Nicholas; Rogers, Linda; Hershman, Steven G; Scott, Erick R; Zweig, Micol; Yvonne Chan, Yu-Feng; Schadt, Eric E

2017-01-01

In our recent Asthma Mobile Health Study (AMHS), thousands of asthma patients across the country contributed medical data through the iPhone Asthma Health App on a daily basis for an extended period of time. The collected data included daily self-reported asthma symptoms, symptom triggers, and real time geographic location information. The AMHS is just one of many studies occurring in the context of now many thousands of mobile health apps aimed at improving wellness and better managing chronic disease conditions, leveraging the passive and active collection of data from mobile, handheld smart devices. The ability to identify patient groups or patterns of symptoms that might predict adverse outcomes such as asthma exacerbations or hospitalizations from these types of large, prospectively collected data sets, would be of significant general interest. However, conventional clustering methods cannot be applied to these types of longitudinally collected data, especially survey data actively collected from app users, given heterogeneous patterns of missing values due to: 1) varying survey response rates among different users, 2) varying survey response rates over time of each user, and 3) non-overlapping periods of enrollment among different users. To handle such complicated missing data structure, we proposed a probability imputation model to infer missing data. We also employed a consensus clustering strategy in tandem with the multiple imputation procedure. Through simulation studies under a range of scenarios reflecting real data conditions, we identified favorable performance of the proposed method over other strategies that impute the missing value through low-rank matrix completion. When applying the proposed new method to study asthma triggers and symptoms collected as part of the AMHS, we identified several patient groups with distinct phenotype patterns. Further validation of the methods described in this paper might be used to identify clinically important
Processing multiphoton states through operation on a single photon: Methods and applications

International Nuclear Information System (INIS)

Lin Qing; He Bing; Bergou, Janos A.; Ren, Yuhang

2009-01-01

Multiphoton states are widely applied in quantum information technology. By the methods presented in this paper, the structure of a multiphoton state in the form of multiple single-photon qubit products can be mapped to a single-photon qudit, which could also be in a separable product with other photons. This makes possible the manipulation of such multiphoton states by processing single-photon states. The optical realization of unknown qubit discrimination [B. He, J. A. Bergou, and Y.-H. Ren, Phys. Rev. A 76, 032301 (2007)] is simplified with the transformation methods. Another application is the construction of quantum logic gates, where the inverse transformations back to the input state spaces are also necessary. We especially show that the modified setups to implement the transformations can realize the deterministic multicontrol gates (including Toffoli gate) operating directly on the products of single-photon qubits.
A high-order positivity-preserving single-stage single-step method for the ideal magnetohydrodynamic equations

Science.gov (United States)

Christlieb, Andrew J.; Feng, Xiao; Seal, David C.; Tang, Qi

2016-07-01

We propose a high-order finite difference weighted ENO (WENO) method for the ideal magnetohydrodynamics (MHD) equations. The proposed method is single-stage (i.e., it has no internal stages to store), single-step (i.e., it has no time history that needs to be stored), maintains a discrete divergence-free condition on the magnetic field, and has the capacity to preserve the positivity of the density and pressure. To accomplish this, we use a Taylor discretization of the Picard integral formulation (PIF) of the finite difference WENO method proposed in Christlieb et al. (2015) [23], where the focus is on a high-order discretization of the fluxes (as opposed to the conserved variables). We use the version where fluxes are expanded to third-order accuracy in time, and for the fluid variables space is discretized using the classical fifth-order finite difference WENO discretization. We use constrained transport in order to obtain divergence-free magnetic fields, which means that we simultaneously evolve the magnetohydrodynamic (that has an evolution equation for the magnetic field) and magnetic potential equations alongside each other, and set the magnetic field to be the (discrete) curl of the magnetic potential after each time step. In this work, we compute these derivatives to fourth-order accuracy. In order to retain a single-stage, single-step method, we develop a novel Lax-Wendroff discretization for the evolution of the magnetic potential, where we start with technology used for Hamilton-Jacobi equations in order to construct a non-oscillatory magnetic field. The end result is an algorithm that is similar to our previous work Christlieb et al. (2014) [8], but this time the time stepping is replaced through a Taylor method with the addition of a positivity-preserving limiter. Finally, positivity preservation is realized by introducing a parameterized flux limiter that considers a linear combination of high and low-order numerical fluxes. The choice of the free
Missing Value Imputation Improves Mortality Risk Prediction Following Cardiac Surgery: An Investigation of an Australian Patient Cohort.

Science.gov (United States)

Karim, Md Nazmul; Reid, Christopher M; Tran, Lavinia; Cochrane, Andrew; Billah, Baki

2017-03-01

The aim of this study was to evaluate the impact of missing values on the prediction performance of the model predicting 30-day mortality following cardiac surgery as an example. Information from 83,309 eligible patients, who underwent cardiac surgery, recorded in the Australia and New Zealand Society of Cardiac and Thoracic Surgeons (ANZSCTS) database registry between 2001 and 2014, was used. An existing 30-day mortality risk prediction model developed from ANZSCTS database was re-estimated using the complete cases (CC) analysis and using multiple imputation (MI) analysis. Agreement between the risks generated by the CC and MI analysis approaches was assessed by the Bland-Altman method. Performances of the two models were compared. One or more missing predictor variables were present in 15.8% of the patients in the dataset. The Bland-Altman plot demonstrated significant disagreement between the risk scores (prisk of mortality. Compared to CC analysis, MI analysis resulted in an average of 8.5% decrease in standard error, a measure of uncertainty. The MI model provided better prediction of mortality risk (observed: 2.69%; MI: 2.63% versus CC: 2.37%, Pvalues improved the 30-day mortality risk prediction following cardiac surgery. Copyright © 2016 Australian and New Zealand Society of Cardiac and Thoracic Surgeons (ANZSCTS) and the Cardiac Society of Australia and New Zealand (CSANZ). Published by Elsevier B.V. All rights reserved.
Discovery and fine-mapping of adiposity loci using high density imputation of genome-wide association studies in individuals of African ancestry: African Ancestry Anthropometry Genetics Consortium.

Science.gov (United States)

Ng, Maggie C Y; Graff, Mariaelisa; Lu, Yingchang; Justice, Anne E; Mudgal, Poorva; Liu, Ching-Ti; Young, Kristin; Yanek, Lisa R; Feitosa, Mary F; Wojczynski, Mary K; Rand, Kristin; Brody, Jennifer A; Cade, Brian E; Dimitrov, Latchezar; Duan, Qing; Guo, Xiuqing; Lange, Leslie A; Nalls, Michael A; Okut, Hayrettin; Tajuddin, Salman M; Tayo, Bamidele O; Vedantam, Sailaja; Bradfield, Jonathan P; Chen, Guanjie; Chen, Wei-Min; Chesi, Alessandra; Irvin, Marguerite R; Padhukasahasram, Badri; Smith, Jennifer A; Zheng, Wei; Allison, Matthew A; Ambrosone, Christine B; Bandera, Elisa V; Bartz, Traci M; Berndt, Sonja I; Bernstein, Leslie; Blot, William J; Bottinger, Erwin P; Carpten, John; Chanock, Stephen J; Chen, Yii-Der Ida; Conti, David V; Cooper, Richard S; Fornage, Myriam; Freedman, Barry I; Garcia, Melissa; Goodman, Phyllis J; Hsu, Yu-Han H; Hu, Jennifer; Huff, Chad D; Ingles, Sue A; John, Esther M; Kittles, Rick; Klein, Eric; Li, Jin; McKnight, Barbara; Nayak, Uma; Nemesure, Barbara; Ogunniyi, Adesola; Olshan, Andrew; Press, Michael F; Rohde, Rebecca; Rybicki, Benjamin A; Salako, Babatunde; Sanderson, Maureen; Shao, Yaming; Siscovick, David S; Stanford, Janet L; Stevens, Victoria L; Stram, Alex; Strom, Sara S; Vaidya, Dhananjay; Witte, John S; Yao, Jie; Zhu, Xiaofeng; Ziegler, Regina G; Zonderman, Alan B; Adeyemo, Adebowale; Ambs, Stefan; Cushman, Mary; Faul, Jessica D; Hakonarson, Hakon; Levin, Albert M; Nathanson, Katherine L; Ware, Erin B; Weir, David R; Zhao, Wei; Zhi, Degui; Arnett, Donna K; Grant, Struan F A; Kardia, Sharon L R; Oloapde, Olufunmilayo I; Rao, D C; Rotimi, Charles N; Sale, Michele M; Williams, L Keoki; Zemel, Babette S; Becker, Diane M; Borecki, Ingrid B; Evans, Michele K; Harris, Tamara B; Hirschhorn, Joel N; Li, Yun; Patel, Sanjay R; Psaty, Bruce M; Rotter, Jerome I; Wilson, James G; Bowden, Donald W; Cupples, L Adrienne; Haiman, Christopher A; Loos, Ruth J F; North, Kari E

2017-04-01

Genome-wide association studies (GWAS) have identified >300 loci associated with measures of adiposity including body mass index (BMI) and waist-to-hip ratio (adjusted for BMI, WHRadjBMI), but few have been identified through screening of the African ancestry genomes. We performed large scale meta-analyses and replications in up to 52,895 individuals for BMI and up to 23,095 individuals for WHRadjBMI from the African Ancestry Anthropometry Genetics Consortium (AAAGC) using 1000 Genomes phase 1 imputed GWAS to improve coverage of both common and low frequency variants in the low linkage disequilibrium African ancestry genomes. In the sex-combined analyses, we identified one novel locus (TCF7L2/HABP2) for WHRadjBMI and eight previously established loci at P African ancestry individuals. An additional novel locus (SPRYD7/DLEU2) was identified for WHRadjBMI when combined with European GWAS. In the sex-stratified analyses, we identified three novel loci for BMI (INTS10/LPL and MLC1 in men, IRX4/IRX2 in women) and four for WHRadjBMI (SSX2IP, CASC8, PDE3B and ZDHHC1/HSD11B2 in women) in individuals of African ancestry or both African and European ancestry. For four of the novel variants, the minor allele frequency was low (African ancestry sex-combined and sex-stratified analyses, 26 BMI loci and 17 WHRadjBMI loci contained ≤ 20 variants in the credible sets that jointly account for 99% posterior probability of driving the associations. The lead variants in 13 of these loci had a high probability of being causal. As compared to our previous HapMap imputed GWAS for BMI and WHRadjBMI including up to 71,412 and 27,350 African ancestry individuals, respectively, our results suggest that 1000 Genomes imputation showed modest improvement in identifying GWAS loci including low frequency variants. Trans-ethnic meta-analyses further improved fine mapping of putative causal variants in loci shared between the African and European ancestry populations.
Multi-Level Wavelet Shannon Entropy-Based Method for Single-Sensor Fault Location

Directory of Open Access Journals (Sweden)

Qiaoning Yang

2015-10-01

Full Text Available In actual application, sensors are prone to failure because of harsh environments, battery drain, and sensor aging. Sensor fault location is an important step for follow-up sensor fault detection. In this paper, two new multi-level wavelet Shannon entropies (multi-level wavelet time Shannon entropy and multi-level wavelet time-energy Shannon entropy) are defined. They take full advantage of the sensor fault frequency distribution and energy distribution across multiple subbands in the wavelet domain. Based on the multi-level wavelet Shannon entropy, a method is proposed for single-sensor fault location. The method first uses a criterion of maximum energy-to-Shannon entropy ratio to select the appropriate wavelet base for signal analysis. Then multi-level wavelet time Shannon entropy and multi-level wavelet time-energy Shannon entropy are used to locate the fault. The method is validated using practical chemical gas concentration data from a gas sensor array. Compared with wavelet time Shannon entropy and wavelet energy Shannon entropy, the experimental results demonstrate that the proposed method can achieve accurate location of a single sensor fault and has good anti-noise ability. The proposed method is feasible and effective for single-sensor fault location.
Evaluation of single and double centrifugation tube methods for concentrating equine platelets.

Science.gov (United States)

Argüelles, D; Carmona, J U; Pastor, J; Iborra, A; Viñals, L; Martínez, P; Bach, E; Prades, M

2006-10-01

The aim of this study was to evaluate single and double centrifugation tube methods for concentrating equine platelets. Whole blood samples were collected from clinically normal horses and processed by use of single and double centrifugation tube methods to obtain four platelet concentrates (PCs): PC-A, PC-B, PC-C, and PC-D, which were analyzed using a flow cytometry hematology system for hemogram and additional platelet parameters (mean platelet volume, platelet distribution width, mean platelet component concentration, mean platelet component distribution width). Concentrations of transforming growth factor beta 1 (TGF-beta(1)) were determined in all the samples. Platelet concentrations for PC-A, PC-B, PC-C, and PC-D were 45%, 44%, 71%, and 21% higher, respectively, compared to the same values for citrated whole blood samples. TGF-beta(1) concentrations for PC-A, PC-B, PC-C, and PC-D were 38%, 44%, 44%, and 37% higher, respectively, compared to citrated whole blood sample values. In conclusion, the single and double centrifugation tube methods are reliable methods for concentrating equine platelets and for obtaining potentially therapeutic TGF-beta(1) levels.
Estimation of missing values in solar radiation data using piecewise interpolation methods: Case study at Penang city

International Nuclear Information System (INIS)

Zainudin, Mohd Lutfi; Saaban, Azizan; Bakar, Mohd Nazari Abu

2015-01-01

Solar radiation values are recorded by an automatic weather station using a device called a pyranometer. The device records all dispersed radiation values, and these data are very useful for experimental work and for the development of solar devices. In addition, modeling and designing solar radiation system applications requires complete data observations. Unfortunately, incomplete solar radiation data occur frequently because of technical problems, mainly in the monitoring device. To address this, missing values are estimated so that absent values can be replaced with imputed data. This paper evaluates several piecewise interpolation techniques, such as linear, spline, cubic, and nearest neighbor, for dealing with missing values in hourly solar radiation data. It then extends the work by investigating the potential use of the cubic Bezier technique and the cubic Said-Ball method as estimators. The results show that the cubic Bezier and Said-Ball methods perform best compared with the other piecewise imputation techniques.
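The simplest of the piecewise techniques listed above, linear interpolation, can be sketched as follows. This is an illustrative example only: the function name is hypothetical, missing readings are assumed to be marked with `None`, and the cubic Bezier and Said-Ball estimators studied in the paper are not reproduced here.

```python
def interpolate_linear(values):
    """Fill interior gaps (None) by linear interpolation between the nearest known neighbors.

    Leading and trailing gaps are left as None, since there is no
    known value on one side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# Usage with made-up hourly readings; None marks a missing hour.
hourly = [100.0, None, None, None, 500.0, None, 300.0, None]
completed = interpolate_linear(hourly)
```

Each gap is filled from the two readings that bracket it, so a long gap across the midday peak would be flattened. That is one reason smoother piecewise curves are worth comparing for solar radiation data.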
Optimal sampling strategies to assess inulin clearance in children by the inulin single-injection method

NARCIS (Netherlands)

van Rossum, Lyonne K.; Mathot, Ron A. A.; Cransberg, Karlien; Vulto, Arnold G.

2003-01-01

Glomerular filtration rate in patients can be determined by estimating the plasma clearance of inulin with the single-injection method. In this method, a single bolus injection of inulin is administered and several blood samples are collected. For practical and convenient application of this method
A prospective study of calf factors affecting age, body size, and body condition score at first calving of holstein dairy heifers.

Science.gov (United States)

Heinrichs, A J; Heinrichs, B S; Harel, O; Rogers, G W; Place, N T

2005-08-01

Data were collected prospectively on parameters related to first calving on 18 farms located in Northeastern Pennsylvania. This project was designed to study possible residual effects of calf management practices and events occurring during the first 16 wk of life on age, BW, skeletal growth, and body condition score at first calving. Multiple imputation method for handling missing data was incorporated in these analyses. This method has the advantage over ad hoc single imputations because the appropriate error structure is maintained. Much similarity was found between the multiple imputation method and a traditional mixed model analysis, except that some estimates from the multiple imputation method seemed more logical in their effects on the parameter measured. Factors related to increased age at first calving were increased difficulty of delivery, antibiotic treatment of sick calves, increased amount of milk or milk replacer fed before weaning, reduced quality of forage fed to weaned calves, maximum humidity, mean daily temperature, and maximum ammonia levels in calf housing areas. Body weight at calving tended to increase with parity of the dam, increased amount of grain fed to calves, increased ammonia levels, and increased mean temperature of the calf housing area. Body condition score at calving tended to be positively influenced by delivery score at first calving, dam parity, and milk or milk replacer dry matter intake. Withers height at calving was positively affected by treatment of animals with antibiotics and increased mean temperature in the calf area. This study demonstrated that nutrition, housing, and management factors that affect health and growth of calves have long-term effects on the animal at least through first calving.
Gallium arsenide single crystal solar cell structure and method of making

Science.gov (United States)

Stirn, Richard J. (Inventor)

1983-01-01

A production method and structure for a thin-film GaAs crystal for a solar cell on a single-crystal silicon substrate (10) comprising the steps of growing a single-crystal interlayer (12) of material having a closer match in lattice and thermal expansion with single-crystal GaAs than the single-crystal silicon of the substrate, and epitaxially growing a single-crystal film (14) on the interlayer. The material of the interlayer may be germanium or graded germanium-silicon alloy, with low germanium content at the silicon substrate interface, and high germanium content at the upper surface. The surface of the interface layer (12) is annealed for recrystallization by a pulsed beam of energy (laser or electron) prior to growing the interlayer. The solar cell structure may be grown as a single-crystal n+/p shallow homojunction film or as a p/n or n/p junction film. A Ga(Al)As heteroface film may be grown over the GaAs film.
Dealing with gene expression missing data.

Science.gov (United States)

Brás, L P; Menezes, J C

2006-05-01

A comparative evaluation of different methods is presented for estimating missing values in microarray data: weighted K-nearest neighbours imputation (KNNimpute), regression-based methods such as local least squares imputation (LLSimpute) and partial least squares imputation (PLSimpute) and Bayesian principal component analysis (BPCA). The influence on prediction accuracy of factors such as the methods' parameters, the type of data relationships used in the estimation process (i.e. row-wise, column-wise or both), missing rate and pattern and type of experiment [time series (TS), non-time series (NTS) or mixed (MIX) experiments] is elucidated. Improvements based on the iterative use of data (iterative LLS and PLS imputation--ILLSimpute and IPLSimpute), the need to perform initial imputations (modified PLS and Helland PLS imputation--MPLSimpute and HPLSimpute) and the type of relationships employed (KNNarray, LLSarray, HPLSarray and alternating PLS--APLSimpute) are proposed. Overall, it is shown that data set properties (type of experiment, missing rate and pattern) affect the data similarity structure, therefore influencing the methods' performance. LLSimpute and ILLSimpute are preferable in the presence of data with a stronger similarity structure (TS and MIX experiments), whereas PLS-based methods (MPLSimpute, IPLSimpute and APLSimpute) are preferable when estimating NTS missing data.
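As a point of reference for the first method named above, a row-wise weighted K-nearest-neighbour imputation can be sketched as below. This is a simplified illustration, not the KNNimpute implementation evaluated in the paper: it assumes missing entries are marked with `None`, uses a root-mean-square distance over the columns two rows share, and weights neighbours by inverse distance. The function name `knn_impute` is hypothetical.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries using the k nearest rows that have a value in that column."""
    result = [list(row) for row in matrix]
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(matrix):
                if other_idx == r or other[c] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue  # no common observed columns, distance undefined
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[c]))
            nearest = sorted(candidates, key=lambda pair: pair[0])[:k]
            if not nearest:
                continue  # nothing to borrow from; leave the gap in place
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]  # small offset avoids division by zero
            weighted_sum = sum(w * v for w, (_, v) in zip(weights, nearest))
            result[r][c] = weighted_sum / sum(weights)
    return result


# Usage: rows are genes, columns are arrays; None marks a missing expression value.
expression = [[0.0, None], [-1.0, 10.0], [1.0, 20.0]]
imputed = knn_impute(expression, k=2)
```

Distances are computed from the original matrix only, so values imputed earlier in the pass never influence later ones. The iterative variants mentioned in the abstract relax exactly this restriction.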
Missing data in randomized clinical trials for weight loss: scope of the problem, state of the field, and performance of statistical methods.

Directory of Open Access Journals (Sweden)

Mai A Elobeid

2009-08-01

Full Text Available Dropouts and missing data are nearly ubiquitous in obesity randomized controlled trials, threatening the validity and generalizability of conclusions. Herein, we meta-analytically evaluate the extent of missing data, the frequency with which various analytic methods are employed to accommodate dropouts, and the performance of multiple statistical methods. We searched PubMed and Cochrane databases (2000-2006) for articles published in English and manually searched bibliographic references. Articles of pharmaceutical randomized controlled trials with weight loss or weight gain prevention as major endpoints were included. Two authors independently reviewed each publication for inclusion. 121 articles met the inclusion criteria. Two authors independently extracted treatment, sample size, drop-out rates, study duration, and statistical method used to handle missing data from all articles and resolved disagreements by consensus. In the meta-analysis, drop-out rates were substantial, with the survival (non-dropout) rates being approximated by an exponential decay curve ($e^{-\lambda t}$), where $\lambda$ was estimated to be .0088 (95% bootstrap confidence interval: .0076 to .0100) and $t$ represents time in weeks. The estimated drop-out rate at 1 year was 37%. Most studies used last observation carried forward as the primary analytic method to handle missing data. We also obtained 12 raw obesity randomized controlled trial datasets for empirical analyses. Analyses of raw randomized controlled trial data suggested that both mixed models and multiple imputation performed well, but that multiple imputation may be more robust when missing data are extensive. Our analysis offers an equation for predictions of dropout rates useful for future study planning. Our raw data analyses suggest that multiple imputation is better than other methods for handling missing data in obesity randomized controlled trials, followed closely by mixed models. We suggest these methods supplant last
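The retention curve quoted above lends itself to a quick planning calculation. The sketch below simply evaluates the stated exponential with the reported estimate of 0.0088 per week; the function names are hypothetical, and treating one year as 52 weeks is an assumption.

```python
import math

LAMBDA_PER_WEEK = 0.0088  # point estimate reported in the abstract


def retention_rate(weeks, lam=LAMBDA_PER_WEEK):
    """Expected fraction of participants still enrolled after the given number of weeks."""
    return math.exp(-lam * weeks)


def dropout_rate(weeks, lam=LAMBDA_PER_WEEK):
    """Expected fraction of participants lost by the given number of weeks."""
    return 1.0 - retention_rate(weeks, lam)


# One year, taken as 52 weeks: about 0.37, matching the 37% quoted above.
one_year_dropout = dropout_rate(52)
```

Passing the confidence bounds (.0076 and .0100) as `lam` gives a rough range for sample-size planning, under the assumption that a new trial follows the same decay pattern.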
Principle Component Analysis with Incomplete Data: A simulation of R pcaMethods package in Constructing an Environmental Quality Index with Missing Data

Science.gov (United States)

Missing data is a common problem in the application of statistical techniques. In principal component analysis (PCA), a technique for dimensionality reduction, incomplete data points are either discarded or imputed using interpolation methods. Such approaches are less valid when ...

Calculation methods for single-sided natural ventilation - simplified or detailed?

DEFF Research Database (Denmark)

Larsen, Tine Steen; Plesner, Christoffer; Leprince, Valérie

2016-01-01

A great energy saving potential lies within increased use of natural ventilation, not only during summer and midseason periods, where it is mainly used today, but also during winter periods, where the outdoor air holds a great cooling potential for ventilative cooling if draft problems can be handled. This paper presents a newly developed simplified calculation method for single-sided natural ventilation, which is proposed for the revised standard FprEN 16798-7 (earlier EN 15242:2007) for design of ventilative cooling. The aim for predicting ventilative cooling is to find the most suitable..., while maintaining an acceptable correlation with measurements on average and the authors consider the simplified calculation method well suited for the use in standards such as FprEN 16798-7 for the ventilative cooling effects from single-sided natural ventilation. The comparison of different design...
The Accuracy and Bias of Single-Step Genomic Prediction for Populations Under Selection

Directory of Open Access Journals (Sweden)

Wan-Ling Hsu

2017-08-01

Full Text Available In single-step analyses, missing genotypes are explicitly or implicitly imputed, and this requires centering the observed genotypes using the means of the unselected founders. If genotypes are only available for selected individuals, centering on the unselected founder mean is not straightforward. Here, computer simulation is used to study an alternative analysis that does not require centering genotypes but fits the mean μg of unselected individuals as a fixed effect. Starting with observed diplotypes from 721 cattle, a five-generation population was simulated with sire selection to produce 40,000 individuals with phenotypes, of which the 1000 sires had genotypes. The next generation of 8000 genotyped individuals was used for validation. Evaluations were undertaken with (J) or without (N) μg when marker covariates were not centered; and with (JC) or without (C) μg when all observed and imputed marker covariates were centered. Centering did not influence accuracy of genomic prediction, but fitting μg did. Accuracies were improved when the panel comprised only quantitative trait loci (QTL); models JC and J had accuracies of 99.4%, whereas models C and N had accuracies of 90.2%. When only markers were in the panel, the 4 models had accuracies of 80.4%. In panels that included QTL, fitting μg in the model improved accuracy, but had little impact when the panel contained only markers. In populations undergoing selection, fitting μg in the model is recommended to avoid bias and reduction in prediction accuracy due to selection.
Value and depreciation of mineral resources over the very long run: An empirical contrast of different methods

OpenAIRE

Rubio Varas, M. del Mar

2005-01-01

The paper contrasts empirically the results of alternative methods for estimating the value and the depreciation of mineral resources. The historical data of Mexico and Venezuela, covering the period 1920s-1980s, is used to contrast the results of several methods. These are the present value, the net price method, the user cost method and the imputed income method. The paper establishes that the net price and the user cost are not competing methods as such, but alternative adjustments to diff...
Exploring the Interplay between Rescue Drugs, Data Imputation, and Study Outcomes: Conceptual Review and Qualitative Analysis of an Acute Pain Data Set.

Science.gov (United States)

Singla, Neil K; Meske, Diana S; Desjardins, Paul J

2017-12-01

In placebo-controlled acute surgical pain studies, provisions must be made for study subjects to receive adequate analgesic therapy. As such, most protocols allow study subjects to receive a pre-specified regimen of open-label analgesic drugs (rescue drugs) as needed. The selection of an appropriate rescue regimen is a critical experimental design choice. We hypothesized that a rescue regimen that is too liberal could lead to all study arms receiving similar levels of pain relief (thereby confounding experimental results), while a regimen that is too stringent could lead to a high subject dropout rate (giving rise to a preponderance of missing data). Despite the importance of rescue regimen as a study design feature, there exist no published review articles or meta-analyses focusing on the impact of rescue therapy on experimental outcomes. Therefore, when selecting a rescue regimen, researchers must rely on clinical factors (what analgesics do patients usually receive in similar surgical scenarios) and/or anecdotal evidence. In the following article, we attempt to bridge this gap by reviewing and discussing the experimental impacts of rescue therapy on a common acute surgical pain population: first metatarsal bunionectomy. The function of this analysis is to (1) create a framework for discussion and future exploration of rescue as a methodological study design feature, (2) discuss the interplay between data imputation techniques and rescue drugs, and (3) inform the readership regarding the impact of data imputation techniques on the validity of study conclusions. Our findings indicate that liberal rescue may degrade assay sensitivity, while stringent rescue may lead to unacceptably high dropout rates.
A single-beam titration method for the quantification of open-path Fourier transform infrared spectroscopy

International Nuclear Information System (INIS)

Sung, Lung-Yu; Lu, Chia-Jung

2014-01-01

This study introduced a quantitative method that can be used to measure the concentration of analytes directly from a single-beam spectrum of open-path Fourier Transform Infrared Spectroscopy (OP-FTIR). The peak shapes of the analytes in a single-beam spectrum were gradually canceled (i.e., “titrated”) by dividing out an aliquot of a standard transmittance spectrum with a known concentration, and the sum of the squared differential synthetic spectrum was calculated as an indicator for the end point of this titration. The quantity of a standard transmittance spectrum that is needed to reach the end point can be used to calculate the concentrations of the analytes. A NIST traceable gas standard containing six known compounds was used to compare the quantitative accuracy of this titration method with that of classical least squares (CLS) using a closed-cell FTIR spectrum. The continuous FTIR analysis of an industrial exhaust stack showed that concentration trends were consistent between the CLS and titration methods. The titration method allowed the quantification to be performed without the need for a clean single-beam background spectrum, which was beneficial for the field measurement of OP-FTIR. Persistent constituents of the atmosphere, such as NH3, CH4 and CO, were successfully quantified using the single-beam titration method with OP-FTIR data that is normally inaccurate when using the CLS method due to the lack of a suitable background spectrum. Also, the synthetic spectrum at the titration end point contained virtually no peaks of analytes, but it did contain the remaining information needed to provide an alternative means of obtaining an ideal single-beam background for OP-FTIR. - Highlights: • Establish single beam titration quantification method for OP-FTIR. • Define the indicator for the end-point of spectrum titration. • An ideal background spectrum can be obtained using single beam titration. • Compare the quantification between titration
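The titration idea described in this abstract can be illustrated on synthetic data. The sketch below is not the authors' implementation: it assumes Beer-Lambert scaling (transmittance at c concentration units equals the one-unit standard raised to the power c), a single Gaussian absorption band, a gently sloping background, and a simple grid search; all names and numbers are hypothetical.

```python
import math


def squared_differential(spectrum):
    """Sum of squared first differences; small when no sharp peaks remain."""
    return sum((b - a) ** 2 for a, b in zip(spectrum, spectrum[1:]))


def titrate(single_beam, standard_transmittance, max_units=1.5, step=0.01):
    """Divide out increasing aliquots of the standard and return the amount
    (in concentration units of the standard) that minimizes the indicator."""
    best_units, best_score = 0.0, float("inf")
    for j in range(int(round(max_units / step)) + 1):
        units = j * step
        residual = [s / (t ** units)
                    for s, t in zip(single_beam, standard_transmittance)]
        score = squared_differential(residual)
        if score < best_score:
            best_units, best_score = units, score
    return best_units


# Synthetic example: one Gaussian band on a smooth, sloping instrument background.
n_points = 200
absorbance = [math.exp(-((i - 100) / 5.0) ** 2) for i in range(n_points)]
standard = [math.exp(-a) for a in absorbance]          # one concentration unit
background = [1.0 + 0.001 * i for i in range(n_points)]
true_units = 0.7
single_beam = [b * t ** true_units for b, t in zip(background, standard)]

estimate = titrate(single_beam, standard)
print(f"estimated concentration: {estimate:.2f} units (true {true_units})")
```

No clean background spectrum is supplied to `titrate`, which mirrors the point made in the abstract; the residual at the end point is itself an estimate of the background.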
Single well surfactant test to evaluate surfactant floods using multi tracer method

Science.gov (United States)

Sheely, Clyde Q.

1979-01-01

Data useful for evaluating the effectiveness of or designing an enhanced recovery process said process involving mobilizing and moving hydrocarbons through a hydrocarbon bearing subterranean formation from an injection well to a production well by injecting a mobilizing fluid into the injection well, comprising (a) determining hydrocarbon saturation in a volume in the formation near a well bore penetrating formation, (b) injecting sufficient mobilizing fluid to mobilize and move hydrocarbons from a volume in the formation near the well bore, and (c) determining the hydrocarbon saturation in a volume including at least a part of the volume of (b) by an improved single well surfactant method comprising injecting 2 or more slugs of water containing the primary tracer separated by water slugs containing no primary tracer. Alternatively, the plurality of ester tracers can be injected in a single slug said tracers penetrating varying distances into the formation wherein the esters have different partition coefficients and essentially equal reaction times. The single well tracer method employed is disclosed in U.S. Pat. No. 3,623,842. This method designated the single well surfactant test (SWST) is useful for evaluating the effect of surfactant floods, polymer floods, carbon dioxide floods, micellar floods, caustic floods and the like in subterranean formations in much less time and at much reduced cost compared to conventional multiwell pilot tests.
Accurate single-scattering simulation of ice cloud using the invariant-imbedding T-matrix method and the physical-geometric optics method

Science.gov (United States)

Sun, B.; Yang, P.; Kattawar, G. W.; Zhang, X.

2017-12-01

The ice cloud single-scattering properties can be accurately simulated using the invariant-imbedding T-matrix method (IITM) and the physical-geometric optics method (PGOM). The IITM has been parallelized using the Message Passing Interface (MPI) method to remove the memory limitation so that the IITM can be used to obtain the single-scattering properties of ice clouds for sizes in the geometric optics regime. Furthermore, the results associated with random orientations can be analytically achieved once the T-matrix is given. The PGOM is also parallelized in conjunction with random orientations. The single-scattering properties of a hexagonal prism with height 400 (in units of λ/2π, where λ is the incident wavelength) and an aspect ratio of 1 (defined as the height over twice the bottom side length) are given by using the parallelized IITM and compared to the counterparts using the parallelized PGOM. The two results are in close agreement. Furthermore, the integrated single-scattering properties, including the asymmetry factor, the extinction cross-section, and the scattering cross-section, are given over a complete size range. The present results show a smooth transition from the exact IITM solution to the approximate PGOM result. Because the calculation of the IITM method has reached the geometric regime, the IITM and the PGOM can be efficiently employed to accurately compute the single-scattering properties of ice cloud in a wide spectral range.
Determining Complex Structures using Docking Method with Single Particle Scattering Data

Directory of Open Access Journals (Sweden)

Haiguang Liu

2017-04-01

Full Text Available Protein complexes are critical for many molecular functions. Due to intrinsic flexibility and dynamics of complexes, their structures are more difficult to determine using conventional experimental methods, in contrast to individual subunits. One of the major challenges is the crystallization of protein complexes. Using X-ray free electron lasers (XFELs), it is possible to collect scattering signals from non-crystalline protein complexes, but data interpretation is more difficult because of unknown orientations. Here, we propose a hybrid approach to determine protein complex structures by combining XFEL single particle scattering data with computational docking methods. Using simulated data, we demonstrate that a small set of single particle scattering data collected at random orientations can be used to distinguish the native complex structure from the decoys generated using docking algorithms. The results also indicate that a small set of single particle scattering data is superior to a spherically averaged intensity profile in distinguishing complex structures. Given the fact that XFEL experimental data are difficult to acquire and at low abundance, this hybrid approach should find wide applications in data interpretations.
A simple and rapid method for high-resolution visualization of single-ion tracks

Directory of Open Access Journals (Sweden)

Masaaki Omichi

2014-11-01

Full Text Available Prompt determination of spatial points of single-ion tracks plays a key role in high-energy particle induced-cancer therapy and gene/plant mutations. In this study, a simple method for the high-resolution visualization of single-ion tracks without etching was developed through the use of polyacrylic acid (PAA)-N, N’-methylene bisacrylamide (MBAAm) blend films. One of the steps of the proposed method includes exposure of the irradiated films to water vapor for several minutes. Water vapor was found to promote the cross-linking reaction of PAA and MBAAm to form a bulky cross-linked structure; the ion-track scars were detectable at a nanometer scale by atomic force microscopy. This study demonstrated that each scar is easily distinguishable, and the amount of generated radicals of the ion tracks can be estimated by measuring the height of the scars, even in highly dense ion tracks. This method is suitable for the visualization of the penumbra region in a single-ion track with a high spatial resolution of 50 nm, which is sufficiently small to confirm that a single ion hits a cell nucleus with a size ranging between 5 and 20 μm.
A simple and rapid method for high-resolution visualization of single-ion tracks

Energy Technology Data Exchange (ETDEWEB)

Omichi, Masaaki [Department of Applied Chemistry, Graduate School of Engineering, Osaka University, Osaka 565-0871 (Japan); Center for Collaborative Research, Anan National College of Technology, Anan, Tokushima 774-0017 (Japan); Choi, Wookjin; Sakamaki, Daisuke; Seki, Shu, E-mail: seki@chem.eng.osaka-u.ac.jp [Department of Applied Chemistry, Graduate School of Engineering, Osaka University, Osaka 565-0871 (Japan); Tsukuda, Satoshi [Institute of Multidisciplinary Research for Advanced Materials, Tohoku University, Sendai, Miyagi 980-8577 (Japan); Sugimoto, Masaki [Japan Atomic Energy Agency, Takasaki Advanced Radiation Research Institute, Gunma, Gunma 370-1292 (Japan)

2014-11-15

Prompt determination of spatial points of single-ion tracks plays a key role in high-energy particle induced-cancer therapy and gene/plant mutations. In this study, a simple method for the high-resolution visualization of single-ion tracks without etching was developed through the use of polyacrylic acid (PAA)-N, N’-methylene bisacrylamide (MBAAm) blend films. One of the steps of the proposed method includes exposure of the irradiated films to water vapor for several minutes. Water vapor was found to promote the cross-linking reaction of PAA and MBAAm to form a bulky cross-linked structure; the ion-track scars were detectable at a nanometer scale by atomic force microscopy. This study demonstrated that each scar is easily distinguishable, and the amount of generated radicals of the ion tracks can be estimated by measuring the height of the scars, even in highly dense ion tracks. This method is suitable for the visualization of the penumbra region in a single-ion track with a high spatial resolution of 50 nm, which is sufficiently small to confirm that a single ion hits a cell nucleus with a size ranging between 5 and 20 μm.
Incomplete Data in Smart Grid: Treatment of Values in Electric Vehicle Charging Data

Energy Technology Data Exchange (ETDEWEB)

Majipour, Mostafa; Chu, Peter; Gadh, Rajit; Pota, Hemanshu R.

2014-11-03

In this paper, five imputation methods, namely Constant (zero), Mean, Median, Maximum Likelihood, and Multiple Imputation, have been applied to compensate for missing values in Electric Vehicle (EV) charging data. The outcome of each of these methods has been used as the input to a prediction algorithm to forecast the EV load in the next 24 hours at each individual outlet. The data is real-world data at the outlet level from the UCLA campus parking lots. Given the sparsity of the data, both Median and Constant (=zero) imputations improved the prediction results. Since in most missing value cases in our database all values of that instance are missing, the multivariate imputation methods did not improve the results significantly compared to univariate approaches.
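The three univariate methods named in this abstract (constant zero, mean, median) are simple enough to sketch directly. The code below is a minimal illustration and not the authors' pipeline; the function name, the use of `None` to mark a missing reading, and the sample values are all assumptions made for the example. Maximum Likelihood and Multiple Imputation are not covered.

```python
from statistics import mean, median


def impute(values, strategy):
    """Return a copy of `values` with every None replaced by a single fill value.

    strategy: "zero", "mean" or "median" (computed from the observed entries).
    """
    observed = [v for v in values if v is not None]
    if strategy == "zero":
        fill = 0.0
    elif strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical hourly energy readings (kWh) at one outlet; None marks a gap.
readings = [1.0, None, 3.0, 8.0]
for name in ("zero", "mean", "median"):
    print(name, impute(readings, name))
```

On skewed data such as this sample the mean (4.0) and median (3.0) fills differ noticeably, which is one reason the choice of method can change downstream forecasts.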
Effect of amaranth dye on the growth and properties of conventional and SR method grown KAP single crystals

Science.gov (United States)

Babu Rao, G.; P., Rajesh; Ramasamy, P.

2018-04-01

KAP single crystals with 0.1 mol% added amaranth were grown from aqueous solutions by both the slow evaporation solution technique and the Sankaranarayanan-Ramasamy (SR) method. A single crystal 45 mm in length and 12 mm in diameter was grown at a rate of 1.5 mm/day using the SR method. A transmittance of 87% is obtained for the SR method grown amaranth added KAP single crystal. Highly intense luminescence at 661 nm is obtained from the amaranth added conventional and SR method grown KAP single crystals. The amaranth added KAP single crystal possesses good mechanical and laser damage threshold stability.
The Choice Method of Selected Material has influence single evaporation flash method

International Nuclear Information System (INIS)

Sunaryo, Geni Rina; Sumijanto; Nurul L, Siti

2000-01-01

The final objective of this research is to design a mini-scale desalination installation. The work started in 1997/1998 and has continued for three years: an assessment of various desalination systems was carried out in the first year and a thermodynamic study in the second. In this third year, a literature study on material resistance to external pressure was carried out. The pressure in the single-evaporator flashing method depends mainly on the temperature applied in the system. This paper describes the stage configuration and the method of selecting materials for the main evaporator vessel, tubes, tube plates, water boxes, pipework, and valves for multistage flash (MSF) distillation. The selection of materials for MSF is based on economic considerations: low cost, high resistance, and ease of maintenance.
Who cares and how much? The imputed economic contribution to the Canadian healthcare system of middle-aged and older unpaid caregivers providing care to the elderly.

Science.gov (United States)

Hollander, Marcus J; Liu, Guiping; Chappell, Neena L

2009-01-01

Canadians provide significant amounts of unpaid care to elderly family members and friends with long-term health problems. While some information is available on the nature of the tasks unpaid caregivers perform, and the amounts of time they spend on these tasks, the contribution of unpaid caregivers is often hidden. (It is recognized that some caregiving may be for short periods of time or may entail matters better described as "help" or "assistance," such as providing transportation. However, we use caregiving to cover the full range of unpaid care provided from some basic help to personal care.) Aggregate estimates of the market costs to replace the unpaid care provided are important to governments for policy development as they provide a means to situate the contributions of unpaid caregivers within Canada's healthcare system. The purpose of this study was to obtain an assessment of the imputed costs of replacing the unpaid care provided by Canadians to the elderly. (Imputed costs is used to refer to costs that would be incurred if the care provided by an unpaid caregiver was, instead, provided by a paid caregiver, on a direct hour-for-hour substitution basis.) The economic value of unpaid care as understood in this study is defined as the cost to replace the services provided by unpaid caregivers at rates for paid care providers.
Scatter measurement and correction method for cone-beam CT based on single grating scan

Science.gov (United States)

Huang, Kuidong; Shi, Wenlong; Wang, Xinyu; Dong, Yin; Chang, Taoqi; Zhang, Hua; Zhang, Dinghua

2017-06-01

In cone-beam computed tomography (CBCT) systems based on flat-panel detector imaging, the presence of scatter significantly reduces the quality of slices. Based on the concept of collimation, this paper presents a scatter measurement and correction method based on a single grating scan. First, according to the characteristics of CBCT imaging, the scan method using a single grating and the design requirements of the grating are analyzed and determined. Second, by analyzing the composition of object projection images and object-and-grating projection images, the processing method for the scatter image at a single projection angle is proposed. In addition, to avoid additional scans, this paper proposes an angle interpolation method of scatter images to reduce scan cost. Finally, the experimental results show that the scatter images obtained by this method are accurate and reliable, and the effect of scatter correction is obvious. When the additional object-and-grating projection images are collected and interpolated at intervals of 30 deg, the scatter correction error of slices can still be controlled within 3%.
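The abstract does not say which interpolation scheme is used between measured angles. As a rough sketch, assuming plain pixel-wise linear interpolation between the two nearest measured scatter images (with wrap-around at 360 deg), the step could look like the following; the function name and the tiny one-pixel images are hypothetical.

```python
def interpolate_scatter(measured, angle, step=30):
    """Estimate the scatter image at `angle` (degrees, 0 <= angle < 360).

    measured: dict mapping each measured angle (multiples of `step`) to a
              scatter image stored as a list of pixel rows.
    Assumes linear blending between the two bracketing measured angles.
    """
    lower = int(angle // step) * step % 360
    upper = (lower + step) % 360          # wraps from 330 back to 0
    weight = (angle % step) / step        # 0 at `lower`, approaching 1 at `upper`
    return [
        [(1.0 - weight) * a + weight * b for a, b in zip(row_lo, row_hi)]
        for row_lo, row_hi in zip(measured[lower], measured[upper])
    ]


# Hypothetical 1x1 "images" whose single pixel equals the measurement angle.
measured = {a: [[float(a)]] for a in range(0, 360, 30)}
print(interpolate_scatter(measured, 45))    # halfway between the 30 and 60 deg images
print(interpolate_scatter(measured, 345))   # halfway between the 330 and 0 deg images
```

A scatter-corrected projection would then be the raw projection minus the interpolated scatter image at the same angle.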
Method for single crystal growth of photovoltaic perovskite material and devices

Science.gov (United States)

Huang, Jinsong; Dong, Qingfeng

2017-11-07

Systems and methods for perovskite single crystal growth include using a low temperature solution process that employs a temperature gradient in a perovskite solution in a container, also including at least one small perovskite single crystal, and a substrate in the solution upon which substrate a perovskite crystal nucleates and grows, in part due to the temperature gradient in the solution and in part due to a temperature gradient in the substrate. For example, a top portion of the substrate external to the solution may be cooled.
A single-step method for rapid extraction of total lipids from green microalgae.

Directory of Open Access Journals (Sweden)

Martin Axelsson

Full Text Available Microalgae produce a wide range of lipid compounds of potential commercial interest. Total lipid extraction performed by conventional extraction methods relying on the chloroform-methanol solvent system is too laborious and time consuming for screening large numbers of samples. In this study, three previous extraction methods devised by Folch et al. (1957), Bligh and Dyer (1959) and Selstam and Öquist (1985) were compared and a faster single-step procedure was developed for extraction of total lipids from green microalgae. In the single-step procedure, 8 ml of a 2:1 chloroform-methanol (v/v) mixture was added to fresh or frozen microalgal paste or pulverized dry algal biomass contained in a glass centrifuge tube. The biomass was manually suspended by vigorously shaking the tube for a few seconds and 2 ml of a 0.73% NaCl water solution was added. Phase separation was facilitated by 2 min of centrifugation at 350 g and the lower phase was recovered for analysis. An uncharacterized microalgal polyculture and the green microalgae Scenedesmus dimorphus, Selenastrum minutum, and Chlorella protothecoides were subjected to the different extraction methods and various techniques of biomass homogenization. The less labour intensive single-step procedure presented here allowed simultaneous recovery of total lipid extracts from multiple samples of green microalgae with quantitative yields and fatty acid profiles comparable to those of the previous methods. While the single-step procedure is highly correlated in lipid extractability (r² = 0.985) to the previous method of Folch et al. (1957), it allowed at least five times higher sample throughput.
Method for Estimating Evaporative Potential (IM/CLO) from ASTM Standard Single Wind Velocity Measures

Science.gov (United States)

2016-08-10

USARIEM TECHNICAL REPORT T16-14: METHOD FOR ESTIMATING EVAPORATIVE POTENTIAL (IM/CLO) FROM ASTM STANDARD SINGLE WIND VELOCITY MEASURES. Adam W. Potter, Biophysics and Biomedical Modeling Division, U.S. Army Research Institute of Environmental... DISCLAIMER: The opinions or assertions contained herein are the private views of the...
Power coordinated control method with frequency support capability for hybrid single/three-phase microgrid

DEFF Research Database (Denmark)

Zhou, Xiaoping; Chen, Yandong; Zhou, Leming

2018-01-01

Due to the intermittent output power of distributed generations (DGs) and the variability of loads, voltage fluctuation and three-phase power imbalance easily occur when hybrid single/three-phase microgrid operates in islanded mode. To address these issues, the power exchange unit (PEU) and energy storage unit (ESU) are added into hybrid single/three-phase microgrid, and a power coordinated control method with frequency support capability is proposed for hybrid single/three-phase microgrid in this study. PEU is connected with three single-phase microgrids to coordinate power exchange among three phases and provide frequency support for hybrid microgrid. Meanwhile, a power coordinated control method based on the droop control is proposed for PEU to alleviate three-phase power imbalance and reduce voltage fluctuation of hybrid microgrid. Besides, ESU is injected into the DC-link to buffer...
The single-sink fixed-charge transportation problem: Applications and solution methods

DEFF Research Database (Denmark)

Goertz, Simon; Klose, Andreas

2007-01-01

The single-sink fixed-charge transportation problem (SSFCTP) consists in finding a minimum cost flow from a number of supplier nodes to a single demand node. Shipping costs comprise costs proportional to the amount shipped as well as a fixed-charge. Although the SSFCTP is an important special case of the well-known fixed-charge transportation problem, just a few methods for solving this problem have been proposed in the literature. After summarising some applications of this problem arising in manufacturing and transportation, we give an overview on approximation algorithms and worst-case results...
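To make the problem statement concrete, the sketch below solves a tiny SSFCTP instance by brute force: it enumerates every set of suppliers that could be opened and, within each set, fills the demand from the cheapest unit cost upward. This is only an illustration for very small instances, not one of the methods surveyed in the paper; the supplier data and function name are hypothetical.

```python
from itertools import combinations


def solve_ssfctp(suppliers, demand):
    """Exhaustive search for a small single-sink fixed-charge instance.

    suppliers: dict name -> (capacity, unit_cost, fixed_charge)
    Returns (minimum total cost, shipping plan as dict name -> quantity).
    """
    best_cost, best_plan = float("inf"), None
    names = list(suppliers)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if sum(suppliers[s][0] for s in subset) < demand:
                continue  # this set of suppliers cannot cover the demand
            remaining, cost, plan = demand, 0.0, {}
            # Every supplier in the subset pays its fixed charge; the variable
            # part is then a continuous knapsack solved greedily by unit cost.
            for s in sorted(subset, key=lambda name: suppliers[name][1]):
                capacity, unit_cost, fixed_charge = suppliers[s]
                quantity = min(capacity, remaining)
                cost += fixed_charge + unit_cost * quantity
                plan[s] = quantity
                remaining -= quantity
            if cost < best_cost:
                best_cost, best_plan = cost, plan
    return best_cost, best_plan


# Hypothetical instance: (capacity, unit cost, fixed charge) per supplier.
suppliers = {"A": (10, 1, 20), "B": (8, 2, 5), "C": (6, 4, 1)}
cost, plan = solve_ssfctp(suppliers, demand=12)
print(cost, plan)
```

In this instance the supplier with the lowest unit cost (A) is left unused because its fixed charge outweighs the savings, which is the trade-off that makes the problem harder than an ordinary transportation problem. The enumeration grows as 2^n, so it is useful only as a reference for checking faster methods.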

Photovoltaic device using single wall carbon nanotubes and method of fabricating the same

Science.gov (United States)

Biris, Alexandru S.; Li, Zhongrui

2012-11-06

A photovoltaic device and methods for forming the same. In one embodiment, the photovoltaic device has a silicon substrate, and a film comprising a plurality of single wall carbon nanotubes disposed on the silicon substrate, wherein the plurality of single wall carbon nanotubes forms a plurality of heterojunctions with the silicon in the substrate.
Multivariate missing data in hydrology - Review and applications

Science.gov (United States)

Ben Aissia, Mohamed-Aymen; Chebana, Fateh; Ouarda, Taha B. M. J.

2017-12-01

Water resources planning and management require complete data sets of a number of hydrological variables, such as flood peaks and volumes. However, hydrologists are often faced with the problem of missing data (MD) in hydrological databases. Several methods are used to deal with the imputation of MD. During the last decade, multivariate approaches have gained popularity in the field of hydrology, especially in hydrological frequency analysis (HFA). However, the treatment of MD remains neglected in the multivariate HFA literature, where the focus has been mainly on the modeling component. For a complete analysis and in order to optimize the use of data, MD should also be treated in the multivariate setting prior to modeling and inference. Imputation of MD in the multivariate hydrological framework can have direct implications on the quality of the estimation. Indeed, the dependence between the series represents important additional information that can be included in the imputation process. The objective of the present paper is to highlight the importance of treating MD in multivariate hydrological frequency analysis by reviewing and applying multivariate imputation methods and by comparing univariate and multivariate imputation methods. An application is carried out for multiple flood attributes on three sites in order to evaluate the performance of the different methods based on the leave-one-out procedure. The results indicate that the performance of imputation methods can be improved by adopting the multivariate setting, compared to mean substitution and interpolation methods, especially when using the copula-based approach.
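The leave-one-out evaluation mentioned in this abstract can be sketched for the two univariate baselines it names, mean substitution and interpolation. The code below is a simplified illustration under stated assumptions: a single series, linear interpolation between immediate neighbours, and root-mean-square error as the score. The copula-based multivariate approach is not reproduced, and the data and function names are hypothetical.

```python
import math


def mean_substitution(series, i):
    """Impute position i with the mean of all other observations."""
    others = series[:i] + series[i + 1:]
    return sum(others) / len(others)


def linear_interpolation(series, i):
    """Impute position i as the midpoint of its two neighbours."""
    return (series[i - 1] + series[i + 1]) / 2.0


def loo_rmse(series, imputer):
    """Hide each interior value in turn, impute it, and return the RMSE."""
    squared_errors = [
        (imputer(series, i) - series[i]) ** 2
        for i in range(1, len(series) - 1)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical flood-peak series with a steady trend.
flows = [2.0, 4.0, 6.0, 8.0, 10.0]
print("mean substitution RMSE:", round(loo_rmse(flows, mean_substitution), 3))
print("interpolation RMSE:    ", round(loo_rmse(flows, linear_interpolation), 3))
```

On a trending series like this one, interpolation recovers the hidden values exactly while mean substitution does not; real hydrological series are noisier, so the gap would be smaller.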
A method for measuring three-dimensional mandibular kinematics in vivo using single-plane fluoroscopy

Science.gov (United States)

Chen, C-C; Lin, C-C; Chen, Y-J; Hong, S-W; Lu, T-W

2013-01-01

Objectives Accurate measurement of the three-dimensional (3D) motion of the mandible in vivo is essential for relevant clinical applications. Existing techniques are either of limited accuracy or require the use of transoral devices that interfere with jaw movements. This study aimed to develop further an existing method for measuring 3D, in vivo mandibular kinematics using single-plane fluoroscopy; to determine the accuracy of the method; and to demonstrate its clinical applicability via measurements on a healthy subject during opening/closing and chewing movements. Methods The proposed method was based on the registration of single-plane fluoroscopy images and 3D low-radiation cone beam CT data. It was validated using roentgen single-plane photogrammetric analysis at static positions and during opening/closing and chewing movements. Results The method was found to have measurement errors of 0.1 ± 0.9 mm for all translations and 0.2° ± 0.6° for all rotations in static conditions, and of 1.0 ± 1.4 mm for all translations and 0.2° ± 0.7° for all rotations in dynamic conditions. Conclusions The proposed method is considered an accurate method for quantifying the 3D mandibular motion in vivo. Without relying on transoral devices, the method has advantages over existing methods, especially in the assessment of patients with missing or unstable teeth, making it useful for the research and clinical assessment of the temporomandibular joint and chewing function. PMID:22842637
Numerical Simulation of Yttrium Aluminum Garnet(YAG) Single Crystal Growth by Resistance Heating Czochralski(CZ) Method

Energy Technology Data Exchange (ETDEWEB)

You, Myeong Hyeon; Cha, Pil Ryung [Kookmin University, Seoul (Korea, Republic of)

2017-01-15

Yttrium Aluminum Garnet (YAG) single crystal has received much attention as the high power solid-state laser’s key component in industrial and medical applications. Various growth methods have been proposed, and currently the induction-heating Czochralski (IHCZ) growth method is mainly used to grow YAG single crystal. Due to the intrinsic properties of the IHCZ method, however, the solid/liquid interface has a downward convex shape and a sharp tip at the center, which causes a core defect and reduces productivity. To produce YAG single crystals with both excellent quality and higher yield, it is essential to control the core defects. In this study, using computer simulations we demonstrate that the resistance-heating CZ (RHCZ) method may avoid a downward convex interface and produce core defect free YAG single crystal. We studied the effects of various design parameters on the interface shape and found that there was an optimum combination of design parameter and operating conditions that produced a flat solid-liquid interface.
Time Series Forecasting with Missing Values

Directory of Open Access Journals (Sweden)

Shin-Fu Wu

2015-11-01

Time series prediction has become more popular in various kinds of applications such as weather prediction, control engineering, financial analysis, industrial monitoring, etc. To deal with real-world problems, we are often faced with missing values in the data due to sensor malfunctions or human errors. Traditionally, the missing values are simply omitted or replaced by means of imputation methods. However, omitting those missing values may cause temporal discontinuity. Imputation methods, on the other hand, may alter the original time series. In this study, we propose a novel forecasting method based on least squares support vector machine (LSSVM). We employ input patterns with temporal information, which is defined as the local time index (LTI). Time series data as well as local time indexes are fed to the LSSVM for forecasting without imputation. We compare the forecasting performance of our method with other imputation methods. Experimental results show that the proposed method is promising and is worth further investigation.
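
The abstract does not spell out how the local time index is built, so the following is only a rough sketch of one plausible reading, not the authors' definition. It turns a gappy series into training patterns without imputing anything: each pattern holds the last `k` observed values plus how many steps before the target each one was recorded. The function name `build_patterns`, the use of `None` for missing samples, and the offset encoding are all assumptions made for illustration.

```python
def build_patterns(series, k):
    """Build (features, target) pairs from a series with gaps (None = missing).

    Each feature vector holds the k most recent observed values before the
    target, followed by how many steps back each of them lies. That offset
    plays the role of a local time index in this sketch (an assumption).
    """
    # Keep only the observed samples, remembering their original time step.
    observed = [(t, v) for t, v in enumerate(series) if v is not None]
    patterns = []
    for idx in range(k, len(observed)):
        t_target, target = observed[idx]
        window = observed[idx - k:idx]
        values = [v for _, v in window]
        offsets = [t_target - t for t, _ in window]
        patterns.append((values + offsets, target))
    return patterns


# Hypothetical series with three missing samples.
series = [1.0, 2.0, None, 4.0, None, None, 7.0, 8.0]
for features, target in build_patterns(series, k=2):
    print(features, "->", target)
```

The resulting feature vectors could then be passed to any regressor; the abstract uses an LSSVM, which is not implemented here.
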
Combining information from linkage and association mapping for next-generation sequencing longitudinal family data.

Science.gov (United States)

Balliu, Brunilda; Uh, Hae-Won; Tsonaka, Roula; Boehringer, Stefan; Helmer, Quinta; Houwing-Duistermaat, Jeanine J

2014-01-01

In this analysis, we investigate the contributions that linkage-based methods, such as identical-by-descent mapping, can make to association mapping to identify rare variants in next-generation sequencing data. First, we identify regions in which cases share more segments identical-by-descent around a putative causal variant than do controls. Second, we use a two-stage mixed-effect model approach to summarize the single-nucleotide polymorphism data within each region and include them as covariates in the model for the phenotype. We assess the impact of linkage disequilibrium in determining identical-by-descent states between individuals by using markers with and without linkage disequilibrium for the first part and the impact of imputation in testing for association by using imputed genome-wide association studies or raw sequence markers for the second part. We apply the method to next-generation sequencing longitudinal family data from Genetic Association Workshop 18 and identify a significant region at chromosome 3: 40249244-41025167 (p-value = $2.3 \times 10^{-3}$).
Improving the singles rate method for modeling accidental coincidences in high-resolution PET

International Nuclear Information System (INIS)

Oliver, Josep F; Rafecas, Magdalena

2010-01-01

Random coincidences ('randoms') are one of the main sources of image degradation in PET imaging. In order to correct for this effect, an accurate method to estimate the contribution of random events is necessary. This aspect becomes especially relevant for high-resolution PET scanners where the highest image quality is sought and accurate quantitative analysis is undertaken. One common approach to estimate randoms is the so-called singles rate method (SR) widely used because of its good statistical properties. SR is based on the measurement of the singles rate in each detector element. However, recent studies suggest that SR systematically overestimates the correct random rate. This overestimation can be particularly marked for low energy thresholds, below 250 keV used in some applications and could entail a significant image degradation. In this work, we investigate the performance of SR as a function of the activity, geometry of the source and energy acceptance window used. We also investigate the performance of an alternative method, which we call 'singles trues' (ST) that improves SR by properly modeling the presence of true coincidences in the sample. Nevertheless, in any real data acquisition the knowledge of which singles are members of a true coincidence is lost. Therefore, we propose an iterative method, STi, that provides an estimation based on ST but which only requires the knowledge of measurable quantities: prompts and singles. Due to inter-crystal scatter, for wide energy windows ST only partially corrects SR overestimations. While SR deviations are in the range 86-300% (depending on the source geometry), the ST deviations are systematically smaller and contained in the range 4-60%. STi fails to reproduce the ST results, although for not too high activities the deviation with respect to ST is only a few percent. For conventional energy windows, i.e. those without inter-crystal scatter, the ST method corrects the SR overestimations, and deviations from
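
For orientation, the singles rate estimate discussed above is usually written in its textbook form as R_ij = 2 τ S_i S_j, where S_i and S_j are the singles rates of the two detector elements and 2τ is the width of the coincidence window. The sketch below shows only that baseline SR estimate; it does not attempt the ST or STi corrections, whose formulas are not given in the abstract. The function name and the numbers are hypothetical.

```python
def singles_rate_randoms(singles_i, singles_j, tau):
    """Estimate the random-coincidence rate for one detector pair.

    singles_i, singles_j: singles count rates (counts/s) of the two detectors.
    tau: coincidence time constant (s); the full window width is 2 * tau.
    """
    return 2.0 * tau * singles_i * singles_j


# Hypothetical numbers: 50 kcps in each detector and tau = 5 ns.
rate = singles_rate_randoms(5.0e4, 5.0e4, 5.0e-9)
print(f"estimated randoms: {rate:.1f} counts/s")
```

Because the estimate treats every single as uncorrelated, singles that actually belong to true coincidences inflate the product, which is consistent with the overestimation the abstract reports.
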
Education and health and well-being: direct and indirect effects with multiple mediators and interactions with multiple imputed data in Stata.

Science.gov (United States)

Sheikh, Mashhood Ahmed; Abelsen, Birgit; Olsen, Jan Abel

2017-11-01

Previous methods for assessing mediation assume no multiplicative interactions. The inverse odds weighting (IOW) approach has been presented as a method that can be used even when interactions exist. The substantive aim of this study was to assess the indirect effect of education on health and well-being via four indicators of adult socioeconomic status (SES): income, management position, occupational hierarchy position and subjective social status. 8516 men and women from the Tromsø Study (Norway) were followed for 17 years. Education was measured at age 25-74 years, while SES and health and well-being were measured at age 42-91 years. Natural direct and indirect effects (NIE) were estimated using weighted Poisson regression models with IOW. Stata code is provided that makes it easy to assess mediation in any multiple imputed dataset with multiple mediators and interactions. Low education was associated with lower SES. Consequently, low SES was associated with being unhealthy and having a low level of well-being. The effect (NIE) of education on health and well-being is mediated by income, management position, occupational hierarchy position and subjective social status. This study contributes to the literature on mediation analysis, as well as the literature on the importance of education for health-related quality of life and subjective well-being. The influence of education on health and well-being had different pathways in this Norwegian sample. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Fourth-order perturbative extension of the single-double excitation coupled-cluster method

International Nuclear Information System (INIS)

Derevianko, Andrei; Emmons, Erik D.

2002-01-01

Fourth-order many-body corrections to matrix elements for atoms with one valence electron are derived. The obtained diagrams are classified using coupled-cluster-inspired separation into contributions from n-particle excitations from the lowest-order wave function. The complete set of fourth-order diagrams involves only connected single, double, and triple excitations and disconnected quadruple excitations. Approximately half of the fourth-order diagrams are not accounted for by the popular coupled-cluster method truncated at single and double excitations (CCSD). Explicit formulas are tabulated for the entire set of fourth-order diagrams missed by the CCSD method and its linearized version, i.e., contributions from connected triple and disconnected quadruple excitations. A partial summation scheme of the derived fourth-order contributions to all orders of perturbation theory is proposed
Method of Promoting Single Crystal Growth During Melt Growth of Semiconductors

Science.gov (United States)

Su, Ching-Hua (Inventor)

2013-01-01

The method of the invention promotes single crystal growth during fabrication of melt growth semiconductors. A growth ampoule and its tip have a semiconductor source material placed therein. The growth ampoule is placed in a first thermal environment that raises the temperature of the semiconductor source material to its liquidus temperature. The growth ampoule is then transitioned to a second thermal environment that causes the semiconductor source material in the growth ampoule's tip to attain a temperature that is below the semiconductor source material's solidus temperature. The growth ampoule so-transitioned is then mechanically perturbed to induce single crystal growth at the growth ampoule's tip.
Method for preparation and readout of polyatomic molecules in single quantum states

Science.gov (United States)

Patterson, David

2018-03-01

Polyatomic molecular ions contain many desirable attributes of a useful quantum system, including rich internal degrees of freedom and highly controllable coupling to the environment. To date, the vast majority of state-specific experimental work on molecular ions has concentrated on diatomic species. The ability to prepare and read out polyatomic molecules in single quantum states would enable diverse experimental avenues not available with diatomics, including new applications in precision measurement, sensitive chemical and chiral analysis at the single-molecule level, and precise studies of Hz-level molecular tunneling dynamics. While cooling the motional state of a polyatomic ion via sympathetic cooling with a laser-cooled atomic ion is straightforward, coupling this motional state to the internal state of the molecule has proven challenging. Here we propose a method for readout and projective measurement of the internal state of a trapped polyatomic ion. The method exploits the rich manifold of technically accessible rotational states in the molecule to realize robust state preparation and readout with far less stringent engineering than quantum logic methods recently demonstrated on diatomic molecules. The method can be applied to any reasonably small (≲10 atoms) polyatomic ion with an anisotropic polarizability.
A New Method for Single-Epoch Ambiguity Resolution with Indoor Pseudolite Positioning.

Science.gov (United States)

Li, Xin; Zhang, Peng; Guo, Jiming; Wang, Jinling; Qiu, Weining

2017-04-21

Ambiguity resolution (AR) is crucial for high-precision indoor pseudolite positioning. Because of the characteristics of the pseudolite positioning system (the geometry of the stationary pseudolites is invariant, the indoor signal is easily interrupted, and the first-order linear truncation error cannot be ignored), a new AR method based on the idea of the ambiguity function method (AFM) is proposed in this paper. The proposed method is a single-epoch and nonlinear method that is especially well-suited for indoor pseudolite positioning. Considering the very low computational efficiency of conventional AFM, we adopt an improved particle swarm optimization (IPSO) algorithm to search for the best solution in the coordinate domain, and the variances of a least squares adjustment are checked to ensure the reliability of the resolved ambiguities. Several experiments, including static and kinematic tests, are conducted to verify the validity of the proposed AR method. Numerical results show that the IPSO significantly improved the computational efficiency of AFM and has a more elaborate search ability compared to the conventional grid searching method. For the indoor pseudolite system, which had an initial approximate coordinate precision better than 0.2 m, the AFM exhibited good performance in both static and kinematic tests. With the correct ambiguities obtained from our proposed method, indoor pseudolite positioning can achieve centimeter-level precision using a low-cost single-frequency software receiver.
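
The idea behind the ambiguity function method can be illustrated with a small sketch. Because the cosine is periodic, the score below depends only on the fractional part of each carrier phase, so the unknown integer ambiguities drop out and the score peaks at the true position. This is a simplified 2D, noise-free illustration that ignores receiver clock error and differencing, and it uses the conventional grid search mentioned in the abstract rather than the authors' IPSO search. The transmitter layout, the 0.19 m wavelength and all function names are assumptions.

```python
import math

WAVELENGTH = 0.19  # metres; roughly the GPS L1 carrier, assumed here


def afm_score(candidate, transmitters, frac_phases, wavelength=WAVELENGTH):
    """Sum of cos(2*pi*(observed - predicted phase)) over all transmitters.

    Phases are in cycles. Integer cycle counts cancel inside the cosine,
    so only the fractional observed phase is needed.
    """
    total = 0.0
    for (tx, ty), phase in zip(transmitters, frac_phases):
        predicted = math.hypot(candidate[0] - tx, candidate[1] - ty) / wavelength
        total += math.cos(2.0 * math.pi * (phase - predicted))
    return total


def grid_search(approx, half_width, step, transmitters, frac_phases):
    """Return the grid point with the highest ambiguity-function score."""
    n = round(half_width / step)
    best, best_score = None, -math.inf
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            cand = (approx[0] + i * step, approx[1] + j * step)
            score = afm_score(cand, transmitters, frac_phases)
            if score > best_score:
                best, best_score = cand, score
    return best, best_score


# Synthetic, noise-free example (all values hypothetical).
transmitters = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 12.0)]
truth = (3.20, 4.10)
frac_phases = [
    (math.hypot(truth[0] - tx, truth[1] - ty) / WAVELENGTH) % 1.0
    for tx, ty in transmitters
]
best, score = grid_search((3.10, 4.00), 0.20, 0.01, transmitters, frac_phases)
print(best, round(score, 3))
```

The search box of ±0.2 m mirrors the initial coordinate precision quoted in the abstract. The exhaustive grid evaluates every candidate, which is the cost a smarter search such as particle swarm optimization is meant to reduce.
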
Polygenic analysis of genome-wide SNP data identifies common variants on allergic rhinitis

DEFF Research Database (Denmark)

Mohammadnejad, Afsaneh; Brasch-Andersen, Charlotte; Haagerup, Annette

Background: Allergic Rhinitis (AR) is a complex disorder that affects many people around the world. There is a high genetic contribution to the development of the AR, as twins and family studies have estimated heritability of more than 33%. Due to the complex nature of the disease, single SNP...... analysis has limited power in identifying the genetic variations for AR. We combined genome-wide association analysis (GWAS) with polygenic risk score (PRS) in exploring the genetic basis underlying the disease. Methods: We collected clinical data on 631 Danish subjects with AR cases consisting of 434...... sibling pairs and unrelated individuals and control subjects of 197 unrelated individuals. SNP genotyping was done by Affymetrix Genome-Wide Human SNP Array 5.0. SNP imputation was performed using "IMPUTE2". Using additive effect model, GWAS was conducted in discovery sample, the genotypes...
The measurement of $K_{IC}$ in single crystal SiC using the indentation method

International Nuclear Information System (INIS)

Henshall, J.L.; Brookes, C.A.

1985-01-01

The present work has concentrated on investigating the underlying fracture toughness behaviour of SiC single crystals. This material was chosen because of the commercial importance of the various polycrystalline forms of SiC and the relative ready availability of reasonably sized single crystals. This study has examined the feasibility of using the indentation technique to determine $K_{IC}$ in SiC single crystals. This requires much less complex experimentation and also affords the possibility of being able to use this method to study the orientation dependence of $K_{IC}$ in a similar manner to that used to investigate anisotropy in indentation hardness behaviour. A single crystal of 6H-SiC was used for all the hardness and conventional $K_{IC}$ results reported here. The particular polytype and orientation were determined using the Laue X-ray method. All the measurements were made under ambient conditions. Three-point bend tests, with a 6 mm span on single edge notched beams (SENB), orientated such that the plane of the notch was $\{11\bar{2}0\}$ and the crack propagation direction were used for the conventional $K_{IC}$ tests. The hardness indentations were all made on one particular SENB test piece after it had been fractured. The results are discussed. (author)
Microfluidic platform for multiplexed detection in single cells and methods thereof

Science.gov (United States)

Wu, Meiye; Singh, Anup K.

2018-05-01

The present invention relates to a microfluidic device and platform configured to conduct multiplexed analysis within the device. In particular, the device allows multiple targets to be detected on a single-cell level. Also provided are methods of performing multiplexed analyses to detect one or more target nucleic acids, proteins, and post-translational modifications.
Clinical methods for single-shot instant MR imaging of the heart

International Nuclear Information System (INIS)

Cohen, M.S.; Weisskoff, R.; Rzedzian, R.

1989-01-01

The authors have compared cardiac protocols for instant MR methods that acquire complete images in 32 msec. Four protocols are compared: continuous scanning at a fixed TR with retrospective reordering; pseudogating by using a TR 50 msec greater than the R-R interval; progressive time delay (PTD), in which the delay from the R wave is electronically advanced; and real-time (RT) imaging at 16 images/sec, which enabled complete movies to be obtained in a single heartbeat. Spin-echo techniques have been used for the first three protocols; the RT method used gradient echoes
A single-probe heat pulse method for estimating sap velocity in trees.

Science.gov (United States)

López-Bernal, Álvaro; Testi, Luca; Villalobos, Francisco J

2017-10-01

Available sap flow methods are still far from being simple, cheap and reliable enough to be used beyond very specific research purposes. This study presents and tests a new single-probe heat pulse (SPHP) method for monitoring sap velocity in trees using a single-probe sensor, rather than the multi-probe arrangements used up to now. Based on the fundamental conduction-convection principles of heat transport in sapwood, convective velocity ($V_h$) is estimated from the temperature increase in the heater after the application of a heat pulse (ΔT). The method was validated against measurements performed with the compensation heat pulse (CHP) technique in field trees of six different species. To do so, a dedicated three-probe sensor capable of simultaneously applying both methods was produced and used. Experimental measurements in the six species showed an excellent agreement between SPHP and CHP outputs for moderate to high flow rates, confirming the applicability of the method. In relation to other sap flow methods, SPHP presents several significant advantages: it requires low power inputs, it uses technically simpler and potentially cheaper instrumentation, the physical damage to the tree is minimal and artefacts caused by incorrect probe spacing and alignment are removed. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Comparison on genomic predictions using GBLUP models and two single-step blending methods with different relationship matrices in the Nordic Holstein population

DEFF Research Database (Denmark)

Gao, Hongding; Christensen, Ole Fredslund; Madsen, Per

2012-01-01

Background A single-step blending approach allows genomic prediction using information of genotyped and non-genotyped animals simultaneously. However, the combined relationship matrix in a single-step method may need to be adjusted because marker-based and pedigree-based relationship matrices may...... not be on the same scale. The same may apply when a GBLUP model includes both genomic breeding values and residual polygenic effects. The objective of this study was to compare single-step blending methods and GBLUP methods with and without adjustment of the genomic relationship matrix for genomic prediction of 16......) a simple GBLUP method, 2) a GBLUP method with a polygenic effect, 3) an adjusted GBLUP method with a polygenic effect, 4) a single-step blending method, and 5) an adjusted single-step blending method. In the adjusted GBLUP and single-step methods, the genomic relationship matrix was adjusted...
Single-well tracer methods for hydrogeologic evaluation of target aquifers

International Nuclear Information System (INIS)

Hall, S.H.

1994-11-01

Designing an efficient well field for an aquifer thermal energy storage (ATES) project requires measuring local groundwater flow parameters as well as estimating horizontal and vertical inhomogeneity. Effective porosity determines the volume of aquifer needed to store a given volume of heated or chilled water. Ground-water flow velocity governs the migration of the thermal plume, and dispersion and heat exchange along the flow path reduce the thermal intensity of the recovered plume. Stratigraphic variations in the aquifer will affect plume dispersion, may bias the apparent rate of migration of the plume, and can prevent efficient hydraulic communication between wells. Single-well tracer methods using a conservative flow tracer such as bromide, along with pumping tests and water-level measurements, provide a rapid and cost-effective means for estimating flow parameters. A drift-and-pumpback tracer test yields effective porosity and flow velocity. Point-dilution tracer testing, using new instrumentation for downhole tracer measurement and a new method for calibrating the point-dilution test itself, yields depth-discrete hydraulic conductivity as it is affected by stratigraphy, and can be used to estimate well transmissivity. Experience in conducting both drift-and-pumpback and point-dilution tests at three different test sites has yielded important information that highlights both the power and the limitations of the single-well tracer methods. These sites are the University of Alabama Student Recreation Center (UASRC) ATES well field and the VA Medical Center (VA) ATES well field, both located in Tuscaloosa, Alabama, and the Hanford bioremediation test site north of Richland, Washington.
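
How a drift-and-pumpback test can yield both quantities may be seen from a simple mass balance. This is an idealized sketch, not necessarily the procedure used in the report. Assume the tracer drifts a distance r = v·t_drift with the natural flow, and that recovering its centre of mass requires pumping the cylindrical volume Q·t_pump = π r² b n, with any further drift during pumping neglected. Combining this with Darcy's law, v = K·I/n, gives n = π b K² I² t_drift² / (Q t_pump) and v = Q t_pump / (π b K I t_drift²). All parameter names below are assumptions for illustration.

```python
import math


def drift_pumpback(pump_rate, pump_time, thickness, conductivity, gradient,
                   drift_time):
    """Effective porosity and seepage velocity from an idealized test.

    pump_rate: recovery pumping rate Q (m^3/s)
    pump_time: pumping time to recover the tracer centre of mass (s)
    thickness: aquifer thickness b (m)
    conductivity: hydraulic conductivity K (m/s)
    gradient: hydraulic gradient I (dimensionless)
    drift_time: time the tracer drifted before pumping began (s)
    """
    darcy_flux = conductivity * gradient
    porosity = (math.pi * thickness * darcy_flux ** 2 * drift_time ** 2
                / (pump_rate * pump_time))
    velocity = (pump_rate * pump_time
                / (math.pi * thickness * darcy_flux * drift_time ** 2))
    return porosity, velocity


# Hypothetical test: 5-day drift, 10 m thick aquifer, 1 L/s pumpback.
n, v = drift_pumpback(pump_rate=1.0e-3, pump_time=1.4657e5, thickness=10.0,
                      conductivity=2.5e-4, gradient=0.01,
                      drift_time=5 * 86400.0)
print(f"effective porosity = {n:.3f}, velocity = {v:.2e} m/s")
```

Note that hydraulic conductivity and gradient must come from separate measurements, which matches the abstract's remark that the tracer tests are used along with pumping tests and water-level measurements.
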
Adapting the mode profile of planar waveguides to single-mode fibers : a novel method

NARCIS (Netherlands)

Smit, M.K.; Vreede, De A.H.

1991-01-01

A novel method for coupling single-mode fibers to planar optical circuits with small waveguide dimensions is proposed. The method eliminates the need to apply microoptics or to adapt the waveguide dimensions within the planar circuit to the fiber dimensions. Alignment tolerances are comparable to

Genome-wide association identifies nine common variants associated with fasting proinsulin levels and provides new insights into the pathophysiology of type 2 diabetes

OpenAIRE

Strawbridge, Rona; Dupuis, Josée; Prokopenko, Inga; Barker, Adam; Ahlqvist, Emma; Rybin, Denis; Petrie, John; Bouatia-Naji, Nabila; Dimas, Antigone; Wheeler, Eleanor; Chen, Han; Voight, Benjamin; Taneera, Jalal; Kanoni, Stavroula; Peden, John

2011-01-01

OBJECTIVE - Proinsulin is a precursor of mature insulin and C-peptide. Higher circulating proinsulin levels are associated with impaired β-cell function, raised glucose levels, insulin resistance, and type 2 diabetes (T2D). Studies of the insulin processing pathway could provide new insights about T2D pathophysiology. RESEARCH DESIGN AND METHODS - We have conducted a meta-analysis of genome-wide association tests of ~2.5 million genotyped or imputed single nucleotide polymorphisms...
Matrigel Mattress: A Method for the Generation of Single Contracting Human-Induced Pluripotent Stem Cell-Derived Cardiomyocytes.

Science.gov (United States)

Feaster, Tromondae K; Cadar, Adrian G; Wang, Lili; Williams, Charles H; Chun, Young Wook; Hempel, Jonathan E; Bloodworth, Nathaniel; Merryman, W David; Lim, Chee Chew; Wu, Joseph C; Knollmann, Björn C; Hong, Charles C

2015-12-04

The lack of measurable single-cell contractility of human-induced pluripotent stem cell-derived cardiac myocytes (hiPSC-CMs) currently limits the utility of hiPSC-CMs for evaluating contractile performance for both basic research and drug discovery. To develop a culture method that rapidly generates contracting single hiPSC-CMs and allows quantification of cell shortening with standard equipment used for studying adult CMs. Single hiPSC-CMs were cultured for 5 to 7 days on a 0.4- to 0.8-mm thick mattress of undiluted Matrigel (mattress hiPSC-CMs) and compared with hiPSC-CMs maintained on a control substrate (method enables the rapid generation of robustly contracting hiPSC-CMs and enhances maturation. This new method allows quantification of contractile performance at the single-cell level, which should be valuable to disease modeling, drug discovery, and preclinical cardiotoxicity testing. © 2015 American Heart Association, Inc.
The Seepage Simulation of Single Hole and Composite Gas Drainage Based on LB Method

Science.gov (United States)

Chen, Yanhao; Zhong, Qiu; Gong, Zhenzhao

2018-01-01

Gas drainage is the most effective method to prevent and control coal mine gas-related disasters. It is very important to study the seepage law of gas in fissured coal. The LB method is a simplified computational model based on the micro-scale and is especially suited to the study of seepage problems. Based on a mathematical model of fracture seepage for single-hole gas drainage, and using the LB method to simulate gas flow numerically during coal gas drainage, this paper plots gas pressure cloud maps, flow path diagrams and flow velocity vector diagrams for single-hole drainage and for combined drainage with symmetric slots, asymmetric slots and slots of different widths. It analyses the influence on the gas seepage field under the various working conditions, discusses an effective drainage method with slots on both sides of the center hole, and makes a preliminary exploration of combined gas drainage.
A Single Image Dehazing Method Using Average Saturation Prior

Directory of Open Access Journals (Sweden)

Zhenfei Gu

2017-01-01

Outdoor images captured in bad weather are prone to yield poor visibility, which is a fatal problem for most computer vision applications. The majority of existing dehazing methods rely on an atmospheric scattering model and therefore share a common limitation; that is, the model is only valid when the atmosphere is homogeneous. In this paper, we propose an improved atmospheric scattering model to overcome this inherent limitation. By adopting the proposed model, a corresponding dehazing method is also presented. In this method, we first create a haze density distribution map of a hazy image, which enables us to segment the hazy image into scenes according to the haze density similarity. Then, in order to improve the atmospheric light estimation accuracy, we define an effective weight assignment function to locate a candidate scene based on the scene segmentation results and therefore avoid most potential errors. Next, we propose a simple but powerful prior named the average saturation prior (ASP), which is a statistic of extensive high-definition outdoor images. Using this prior combined with the improved atmospheric scattering model, we can directly estimate the scene atmospheric scattering coefficient and restore the scene albedo. The experimental results verify that our model is physically valid, and the proposed method outperforms several state-of-the-art single image dehazing methods in terms of both robustness and effectiveness.
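
As background, the standard homogeneous atmospheric scattering model that most dehazing methods start from is I = J·t + A·(1 − t), with transmission t = exp(−β·d), where I is the observed intensity, J the haze-free scene radiance, A the atmospheric light, β the scattering coefficient and d the scene depth. The sketch below applies that standard model to a single pixel value and inverts it. It is not the improved model or the average saturation prior proposed in the paper, whose details the abstract does not give. The function names and the lower bound `t_min` are assumptions.

```python
import math


def transmission(beta, depth):
    """Fraction of scene light that survives scattering over the given depth."""
    return math.exp(-beta * depth)


def hazy_pixel(scene, airlight, t):
    """Forward model: attenuated scene radiance plus scattered atmospheric light."""
    return scene * t + airlight * (1.0 - t)


def restore_pixel(observed, airlight, t, t_min=0.1):
    """Invert the model; t is clamped so noise is not amplified where haze is dense."""
    t = max(t, t_min)
    return (observed - airlight) / t + airlight


# Hypothetical pixel: dark surface seen through moderate haze.
t = transmission(beta=1.0, depth=0.7)
observed = hazy_pixel(scene=0.2, airlight=0.9, t=t)
print(round(observed, 4), round(restore_pixel(observed, 0.9, t), 4))
```

In practice β (or t) and A are unknown and must be estimated from the image itself, which is the role the abstract assigns to the haze density map, the weight assignment function and the prior.
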
The Rebirth of the Theory of Imputation in the Science of Criminal Law: to an Overcoming Stage or an Involution to Pre-Scientific Conceptions?

Directory of Open Access Journals (Sweden)

Nicolás Santiago Cordini

2015-06-01

The Science of Criminal Law is going through a moment that can be characterized as a “crisis”. Faced with this situation, theories that define themselves as “theories of imputation” have proliferated; these abandon, in whole or in part, the hitherto dominant theory of crime. The aim of this article is to analyze three theories grouped under the concept of imputation and to determine to what extent they conserve or depart from the categories proposed by the theory of crime. We then establish to what extent these theories constitute an advance for the Science of Criminal Law or, on the contrary, are manifestations of a retreat to a pre-scientific stage.
Anisotropic surface hole-transport property of triphenylamine-derivative single crystal prepared by solution method

Energy Technology Data Exchange (ETDEWEB)

Umeda, Minoru, E-mail: mumeda@vos.nagaokaut.ac.jp [Nagaoka University of Technology, Kamitomioka, Nagaoka, Niigata 940-2188 (Japan); Katagiri, Mitsuhiko; Shironita, Sayoko [Nagaoka University of Technology, Kamitomioka, Nagaoka, Niigata 940-2188 (Japan); Nagayama, Norio [Nagaoka University of Technology, Kamitomioka, Nagaoka, Niigata 940-2188 (Japan); Ricoh Company, Ltd., Nishisawada, Numazu, Shizuoka 410-0007 (Japan)

2016-12-01

Highlights: • A hole transport molecule was investigated based on its electrochemical redox characteristics. • The solubility and supersolubility curves of the molecule were measured in order to prepare a large crystal. • The polarization micrograph and XRD results revealed that a single crystal was obtained. • An anisotropic surface conduction, in which the long-axis direction exceeds that of the amorphous layer, was observed. • The anisotropic surface conduction was well explained by the molecular stacked structure. - Abstract: This paper reports the anisotropic hole transport at the triphenylamine-derivative single crystal surface prepared by a solution method. Triphenylamine derivatives are commonly used in a hole-transport material for organic photoconductors of laser-beam printers, in which the materials are used as an amorphous form. For developing organic photovoltaics using the photoconductor’s technology, preparation of a single crystal seems to be a specific way by realizing the high mobility of an organic semiconductor. In this study, a single crystal of 4-(2,2-diphenylethenyl)-N,N-bis(4-methylphenyl)-benzenamine (TPA) was prepared and its anisotropic hole-transport property measured. First, the hole-transport property of the TPA was investigated based on its chemical structure and electrochemical redox characteristics. Next, a large-scale single crystal formation at a high rate was developed by employing a solution method based on its solubility and supersolubility curves. The grown TPA was found to be a single crystal based on the polarization micrograph observation and crystallographic analysis. For the TPA single crystal, an anisotropic surface conduction was found, which was well explained by its molecular stack structure. The measured current in the long-axis direction is one order of magnitude greater than that of amorphous TPA.
Methods for the preparation of large quantities of complex single-stranded oligonucleotide libraries.

Science.gov (United States)

Murgha, Yusuf E; Rouillard, Jean-Marie; Gulari, Erdogan

2014-01-01

Custom-defined oligonucleotide collections have a broad range of applications in fields of synthetic biology, targeted sequencing, and cytogenetics. Also, they are used to encode information for technologies like RNA interference, protein engineering and DNA-encoded libraries. High-throughput parallel DNA synthesis technologies developed for the manufacture of DNA microarrays can produce libraries of large numbers of different oligonucleotides, but in very limited amounts. Here, we compare three approaches to prepare large quantities of single-stranded oligonucleotide libraries derived from microarray synthesized collections. The first approach, alkaline melting of double-stranded PCR amplified libraries with a biotinylated strand captured on streptavidin coated magnetic beads, results in little or no non-biotinylated ssDNA. The second method, wherein the phosphorylated strand of PCR amplified libraries is nucleolytically hydrolyzed, is recommended when small amounts of libraries are needed. The third method, combining in vitro transcription of PCR amplified libraries with reverse transcription of the RNA product into single-stranded cDNA, is our recommended method to produce large amounts of oligonucleotide libraries. Finally, we propose a method to remove any primer binding sequences introduced during library amplification.
Accuracy of single count methods of WL determination for open-pit uranium mines

International Nuclear Information System (INIS)

Solomon, S.B.; Kennedy, K. N.

1983-01-01

A study of single count methods of WL determination was made using a database representative of Australian open pit uranium mine conditions. The aim of the study was to check the existence of the optimum time delay corresponding to the Rolle method, to determine the accuracy of the conversion factor for Australian conditions and to examine any systematic use of data bases of representative radon daughter concentration.
Efficacy and safety of cross-cylinder photorefractive keratectomy versus single method in medium-high astigmatism: a randomized clinical trial.

Science.gov (United States)

Sedghipour, Mohammad R; Lotfi, Afshin; Sadeghilar, Ayaz; Banan, Saeeid

2012-09-07

BACKGROUND: To compare efficacy and safety of photorefractive keratectomy (PRK) by cross-cylinder with single methods in medium-high astigmatism. DESIGN: Randomized clinical trial study. PARTICIPANTS: Fifty patients with medium-high compound myopic astigmatism were enrolled between September 2007 and September 2008. METHODS: PRK was performed on 100 eyes of 50 patients with compound myopic astigmatism. Each patient underwent PRK by cross-cylinder approach in one eye and single method on the contralateral eye. Vector analysis was used to assess astigmatic results. MAIN OUTCOME MEASURES: Improvement of visual acuity (Snellen chart), refraction, aberrometry. RESULTS: Uncorrected visual acuity (UCVA) equal to 20/40 or better after six months was achieved in 98% of eyes in the cross-cylinder method versus 96% in the single method. Mean preoperative spherical equivalent (SE) was -5.2 ± 2.1 D in the cross-cylinder method versus -5.1 ± 0.5 D in the single method. At six months, the mean SE was -0.5 ± 0.4 D and -0.6 ± 0.3 D, respectively. Mean IOS was 0.4 ± 0.3 in the cross-cylinder group and 0.4 ± 0.4 in the single group. Mean postoperative absolute change in total root-mean-square higher order aberrations in the cross-cylinder group and single group were 0.16 μm and 0.17 μm, respectively. None of the mentioned differences appeared to be statistically significant. CONCLUSIONS: Both PRK methods appeared to be safe and effective in correcting medium-high astigmatism. © 2012 The Author. Clinical and Experimental Ophthalmology © 2012 Royal Australian and New Zealand College of Ophthalmologists.
Quantitative optical extinction-based parametric method for sizing a single core-shell Ag-Ag2O nanoparticle

International Nuclear Information System (INIS)

Santillan, J M J; Scaffardi, L B; Schinca, D C

2011-01-01

This paper develops a parametric method for determining the core radius and shell thickness in small silver-silver-oxide core-shell nanoparticles (NPs) based on single particle optical extinction spectroscopy. The method is based on the study of the relationship between plasmon peak wavelength, full width at half maximum (FWHM) and contrast of the extinction spectra as a function of core radius and shell thickness. This study reveals that plasmon peak wavelength is strongly dependent on shell thickness, whereas FWHM and contrast depend on both variables. These characteristics may be used for establishing an easy and fast stepwise procedure to size core-shell NPs from a single particle absorption spectrum. The importance of the method lies in the possibility of monitoring the growth of the silver-oxide layer around small spherical silver NPs in real time. Using the electrostatic approximation of Mie theory, core-shell single particle extinction spectra were calculated for a silver particle's core size smaller than about 20 nm and different thicknesses of silver oxide around it. Analysis of the obtained curves shows a very particular characteristic of the plasmon peak of small silver-silver-oxide NPs: its position is strongly dependent on oxide thickness and weakly dependent on the core radius. Even a very thin oxide layer shifts the plasmon peak noticeably, enabling plasmon tuning with appropriate shell thickness. This characteristic, together with the behaviour of FWHM and contrast of the extinction spectra, can be combined into a parametric method for sizing both core and shell of single silver NPs in a medium using only optical information. In turn, shell thickness can be related to oxygen content in the NP's surrounding medium. The method proposed is applied to size silver NPs from a single particle extinction spectrum. The results are compared with full optical spectrum fitting using the electrostatic approximation in Mie theory. The method
Comparison of variations detection between whole-genome amplification methods used in single-cell resequencing

DEFF Research Database (Denmark)

Hou, Yong; Wu, Kui; Shi, Xulian

2015-01-01

methods, focusing particularly on variations detection. Low-coverage whole-genome sequencing revealed that DOP-PCR had the highest duplication ratio, but an even read distribution and the best reproducibility and accuracy for detection of copy-number variations (CNVs). However, MDA had significantly...... performance using SCRS amplified by different WGA methods. It will guide researchers to determine which WGA method is best suited to individual experimental needs at single-cell level....
Penalized regression procedures for variable selection in the potential outcomes framework.

Science.gov (United States)

Ghosh, Debashis; Zhu, Yeying; Coffman, Donna L

2015-05-10

A recent topic of much interest in causal inference is model selection. In this article, we describe a framework in which to consider penalized regression approaches to variable selection for causal effects. The framework leads to a simple 'impute, then select' class of procedures that is agnostic to the type of imputation algorithm as well as penalized regression used. It also clarifies how model selection involves a multivariate regression model for causal inference problems and that these methods can be applied for identifying subgroups in which treatment effects are homogeneous. Analogies and links with the literature on machine learning methods, missing data, and imputation are drawn. A difference least absolute shrinkage and selection operator algorithm is defined, along with its multiple imputation analogs. The procedures are illustrated using a well-known right-heart catheterization dataset. Copyright © 2015 John Wiley & Sons, Ltd.
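The 'impute, then select' idea in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the imputation step is a 1-nearest-neighbour match across treatment arms (the abstract says the framework is agnostic to the imputation algorithm), the selection step is a plain coordinate-descent LASSO on the imputed outcome difference, and the data, the function names and the penalty `lam=0.2` are all hypothetical.

```python
import random

def impute_potential_outcomes(X, t, y):
    """Fill each unit's unobserved potential outcome with the observed outcome
    of its nearest neighbour (squared Euclidean distance) in the opposite arm."""
    n = len(X)
    y1, y0 = [], []
    for i in range(n):
        others = [k for k in range(n) if t[k] != t[i]]
        j = min(others, key=lambda k: sum((a - b) ** 2 for a, b in zip(X[i], X[k])))
        y1.append(y[i] if t[i] == 1 else y[j])
        y0.append(y[i] if t[i] == 0 else y[j])
    return y1, y0

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(X, d, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||d - b0 - X*beta||^2 + lam*||beta||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(d) / n
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with feature j left out
            r = [d[i] - b0 - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / z
        # unpenalised intercept
        b0 = sum(d[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)) / n
    return b0, beta

# Hypothetical data: only covariate 0 modifies the treatment effect.
random.seed(0)
n = 200
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
t = [random.randint(0, 1) for _ in range(n)]
y = [x[1] + ti * (1.0 + 2.0 * x[0]) + random.gauss(0, 0.1) for x, ti in zip(X, t)]

y1, y0 = impute_potential_outcomes(X, t, y)      # step 1: impute
d = [a - b for a, b in zip(y1, y0)]              # unit-level effect estimates
b0, beta = lasso(X, d, lam=0.2)                  # step 2: select
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("covariates with non-zero coefficient:", selected)
```

Regressing the imputed difference rather than the outcome itself means that covariates which only shift the baseline outcome (covariate 1 here) tend to drop out, while effect modifiers remain.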
Comparing Single-Point and Multi-point Calibration Methods in Modulated DSC

Energy Technology Data Exchange (ETDEWEB)

Van Buskirk, Caleb Griffith [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2017-06-14

Heat capacity measurements for High Density Polyethylene (HDPE) and Ultra-high Molecular Weight Polyethylene (UHMWPE) were performed using Modulated Differential Scanning Calorimetry (mDSC) over a wide temperature range, -70 to 115 °C, with a TA Instruments Q2000 mDSC. The default calibration method for this instrument involves measuring the heat capacity of a sapphire standard at a single temperature near the middle of the temperature range of interest. However, this method often fails for temperature ranges that exceed a 50 °C interval, likely because of drift or non-linearity in the instrument's heat capacity readings over time or over the temperature range. Therefore, in this study a method was developed to calibrate the instrument using multiple temperatures and the same sapphire standard.
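The contrast between a single-temperature and a multi-temperature calibration can be illustrated with a short sketch. It assumes, hypothetically, that calibration means a multiplicative factor K(T) = reference heat capacity / measured heat capacity of the standard, interpolated linearly between calibration temperatures; the abstract does not state the actual procedure, and every number below is made up for illustration (none is sapphire reference data).

```python
def calibration_factors(temps, cp_reference, cp_measured):
    """Per-temperature factor K(T) = reference Cp / measured Cp of the standard."""
    return sorted((T, ref / meas)
                  for T, ref, meas in zip(temps, cp_reference, cp_measured))

def interpolate_factor(factors, T):
    """Piecewise-linear K(T); values outside the calibrated range are clamped."""
    if T <= factors[0][0]:
        return factors[0][1]
    if T >= factors[-1][0]:
        return factors[-1][1]
    for (t0, k0), (t1, k1) in zip(factors, factors[1:]):
        if t0 <= T <= t1:
            return k0 + (k1 - k0) * (T - t0) / (t1 - t0)

# Made-up readings of a standard at three temperatures (degrees C).
temps = [-70.0, 20.0, 115.0]
cp_ref = [0.60, 0.77, 0.90]    # hypothetical reference values
cp_meas = [0.50, 0.70, 0.90]   # hypothetical instrument readings

factors = calibration_factors(temps, cp_ref, cp_meas)
single_point_k = dict(factors)[20.0]              # one factor for the whole range
multi_point_k = interpolate_factor(factors, -25.0)
print(single_point_k, multi_point_k)
```

With a single mid-range factor, any drift of K with temperature is ignored; the interpolated factor follows it across the range.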
Redundancy-free single-particle equation-of-motion method for nuclei. Pt. 1

International Nuclear Information System (INIS)

Rolnick, P.; Goswami, A.; Oregon Univ., Eugene

1986-01-01

The problem of coupling an odd nucleon to the collective states of an even core is considered in the intermediate-coupling limit. It is now well known that such intermediate-coupling calculations in spherical open-shell nuclei necessitate the inclusion of ground-state correlation or backward coupling, which gives rise to an overcomplete basis set of states for the diagonalization of the Hamiltonian. In a recent letter, we have derived a technique to free the single-particle equation-of-motion method of redundancy. Here we shall apply this redundancy-free equation-of-motion method to intermediate-coupling calculations in two regions of near-spherical odd-mass nuclei where forward coupling alone has not been successful. It is shown that qualitative effects of backward coupling previously reported are not spurious effects of double counting, although they are significantly modified by the removal of redundancy. We also discuss what further modifications of the theory will be needed in order to treat the dynamical interplay of collective and single-particle modes in nuclei self-consistently on the same footing. (orig.)
Sources of variability for the single-comparator method in a heavy-water reactor

International Nuclear Information System (INIS)

Damsgaard, E.; Heydorn, K.

1978-11-01

The well-thermalized flux in the heavy-water-moderated DR 3 reactor at Risoe prompted us to investigate to what extent a single comparator could be used for multi-element determination instead of multiple comparators. The reliability of the single-comparator method is limited by the thermal-to-epithermal ratio, and experiments were designed to determine the variations in this ratio throughout a reactor operating period (4 weeks including a shut-down period of 4-5 days). The bi-isotopic method using zirconium as monitor was chosen, because 94Zr and 96Zr exhibit a large difference in their I_o/Σ_th values, and would permit determination of the flux ratio with a precision sufficient to determine variations. One of the irradiation facilities comprises a rotating magazine with 3 channels, each of which can hold five aluminium cans. In this rig, five cans, each holding a polyvial with 1 ml of aqueous zirconium solution, were irradiated simultaneously in one channel. Irradiations were carried out in the first and the third week of 4 periods. In another facility consisting of a pneumatic tube system, two samples were simultaneously irradiated on top of each other in a polyethylene rabbit. Experiments were carried out once a week for 4 periods. All samples were counted on a Ge(Li) detector for 95Zr, 97mNb and 97Nb. The thermal-to-epithermal flux ratio was calculated from the induced activity, the nuclear data for the two zirconium isotopes and the detector efficiency. By analysis of variance the total variation of the flux ratio was separated into a random variation between reactor periods, and systematic differences between the positions, as well as the weeks in the operating period. If the variations are in statistical control, the error resulting from use of the single-comparator method in multi-element determination can be estimated for any combination of irradiation position and day in the operating period. With the measured flux ratio variations in DR
Comparison of Single and Multi-Scale Method for Leaf and Wood Points Classification from Terrestrial Laser Scanning Data

Science.gov (United States)

Wei, Hongqiang; Zhou, Guiyun; Zhou, Junjie

2018-04-01

The classification of leaf and wood points is an essential preprocessing step for extracting inventory measurements and canopy characterization of trees from terrestrial laser scanning (TLS) data. The geometry-based approach is one of the most widely used classification methods. In the geometry-based method, it is common practice to extract salient features at one single scale before the features are used for classification. It remains unclear how the scale(s) used affect the classification accuracy and efficiency. To assess the scale effect on the classification accuracy and efficiency, we extracted single-scale and multi-scale salient features from the point clouds of two oak trees of different sizes and classified the points into leaf and wood. Our experimental results show that the balanced accuracy of the multi-scale method is higher than the average balanced accuracy of the single-scale method by about 10% for both trees. The average speed-up ratio of the single-scale classifiers over the multi-scale classifier for each tree is higher than 30.
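The abstract above reports results as balanced accuracy. For readers unfamiliar with the metric, a minimal sketch follows, assuming the usual definition (the mean of the per-class recalls); the labels are invented and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so a rare class counts as much as a common one."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)

# Invented example: leaf points far outnumber wood points.
y_true = ["leaf"] * 8 + ["wood"] * 2
y_pred = ["leaf"] * 8 + ["leaf", "wood"]

plain = sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)
print(plain, balanced_accuracy(y_true, y_pred))
```

Plain accuracy rewards labelling everything as the majority class; balanced accuracy does not, which matters when leaf and wood points are unevenly represented in a point cloud.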
Power quality improvement of single-phase photovoltaic systems through a robust synchronization method

DEFF Research Database (Denmark)

Hadjidemetriou, Lenos; Kyriakides, Elias; Yang, Yongheng

2014-01-01

An increasing amount of single-phase photovoltaic (PV) systems on the distribution network requires more advanced synchronization methods in order to meet the grid codes with respect to power quality and fault ride through capability. The response of the synchronization technique selected...... is crucial for the performance of PV inverters. In this paper, a new synchronization method with good dynamics and accurate response under highly distorted voltage is proposed. This method uses a Multi-Harmonic Decoupling Cell (MHDC), which cancels out the oscillations on the synchronization signals due...
Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

Science.gov (United States)

Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

2018-03-09

High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.
Physical model construction for electrical anisotropy of single crystal zinc oxide micro/nanobelt using finite element method

International Nuclear Information System (INIS)

Yu, Guangbin; Tang, Chaolong; Song, Jinhui; Lu, Wenqiang

2014-01-01

Based on conductivity characterization of single crystal zinc oxide (ZnO) micro/nanobelt (MB/NB), we further investigate the physical mechanism of nonlinear intrinsic resistance-length characteristic using finite element method. By taking the same parameters used in experiment, a model of nonlinear anisotropic resistance change with single crystal MB/NB has been deduced, which matched the experiment characterization well. The nonlinear resistance-length comes from the different electron moving speed in various crystal planes. As the direct outcome, crystallography of the anisotropic semiconducting MB/NB has been identified, which could serve as a simple but effective method to identify crystal growth direction of single crystal semiconducting or conductive nanomaterial
An imputation/copula-based stochastic individual tree growth model for mixed species Acadian forests: a case study using the Nova Scotia permanent sample plot network

Directory of Open Access Journals (Sweden)

John A. Kershaw Jr

2017-09-01

Full Text Available Background A novel approach to modelling individual tree growth dynamics is proposed. The approach combines multiple imputation and copula sampling to produce a stochastic individual tree growth and yield projection system. Methods The Nova Scotia, Canada permanent sample plot network is used as a case study to develop and test the modelling approach. Predictions from this model are compared to predictions from the Acadian variant of the Forest Vegetation Simulator, a widely used statistical individual tree growth and yield model. Results Diameter and height growth rates were predicted with error rates consistent with those produced using statistical models. Mortality and ingrowth error rates were higher than those observed for diameter and height, but also were within the bounds produced by traditional approaches for predicting these rates. Ingrowth species composition was very poorly predicted. The model was capable of reproducing a wide range of stand dynamic trajectories and in some cases reproduced trajectories that the statistical model was incapable of reproducing. Conclusions The model has potential to be used as a benchmarking tool for evaluating statistical and process models and may provide a mechanism to separate signal from noise and improve our ability to analyze and learn from large regional datasets that often have underlying flaws in sample design.

Nondestructive assessment of single-span timber bridges using a vibration-based method

Science.gov (United States)

Xiping Wang; James P. Wacker; Angus M. Morison; John W. Forsman; John R. Erickson; Robert J. Ross

2005-01-01

This paper describes an effort to develop a global dynamic testing technique for evaluating the overall stiffness of timber bridge superstructures. A forced vibration method was used to measure the natural frequency of single-span timber bridges in the laboratory and field. An analytical model based on simple beam theory was proposed to represent the relationship...
Gauge-Invariant Formulation of Time-Dependent Configuration Interaction Singles Method

Directory of Open Access Journals (Sweden)

Takeshi Sato

2018-03-01

Full Text Available We propose a gauge-invariant formulation of the channel orbital-based time-dependent configuration interaction singles (TDCIS) method [Phys. Rev. A, 74, 043420 (2006)], one of the powerful ab initio methods to investigate electron dynamics in atoms and molecules subject to an external laser field. In the present formulation, we derive the equations of motion (EOMs) in the velocity gauge using gauge-transformed time-dependent, not fixed, orbitals that are equivalent to the conventional EOMs in the length gauge using fixed orbitals. The new velocity-gauge EOMs avoid the use of the length-gauge dipole operator, which diverges at large distance, and allow us to exploit computational advantages of the velocity-gauge treatment over the length-gauge one, e.g., a faster convergence in simulations with intense and long-wavelength lasers, and the feasibility of exterior complex scaling as an absorbing boundary. The reformulated TDCIS method is applied to an exactly solvable model of a one-dimensional helium atom in an intense laser field to numerically demonstrate the gauge invariance. We also discuss the consistent method for evaluating the time derivative of an observable, which is relevant, e.g., in simulating high-harmonic generation.
Numerical simulation on single bubble rising behavior in liquid metal using moving particle semi-implicit method

International Nuclear Information System (INIS)

Zuo Juanli; Tian Wenxi; Qiu Suizheng; Chen Ronghua; Su Guanghui

2011-01-01

The gas-lift pump in a liquid-metal-cooled fast reactor (LMFR) is an innovative conceptual design to enhance the natural circulation ability of the reactor core. The two-phase flow character of gas-liquid metal significantly improves the natural circulation capacity and reactor safety. In the present basic study, the rising behavior of a single nitrogen bubble in five kinds of liquid metals (lead-bismuth alloy, liquid potassium, sodium, potassium-sodium alloy and lithium-lead alloy) was numerically simulated using the moving particle semi-implicit (MPS) method. The whole growing process of a single nitrogen bubble in liquid metal was captured. The bubble shape and rising speed of a single nitrogen bubble in each liquid metal were compared. The comparison between simulation results using the MPS method and the Grace graphical correlation shows good agreement. (authors)
Synthesis of Large-Scale Single-Crystalline Monolayer WS2 Using a Semi-Sealed Method

Directory of Open Access Journals (Sweden)

Feifei Lan

2018-02-01

Full Text Available As a two-dimensional semiconductor, WS2 has attracted great attention due to its rich physical properties and potential applications. However, it is still difficult to synthesize monolayer single-crystalline WS2 at larger scale. Here, we report the growth of large-scale triangular single-crystalline WS2 with a semi-sealed installation by chemical vapor deposition (CVD). Through this method, triangular single-crystalline WS2 with an average length of more than 300 μm was obtained. The largest one was about 405 μm in length. WS2 triangles with different sizes and thicknesses were analyzed by optical microscope and atomic force microscope (AFM). Their optical properties were evaluated by Raman and photoluminescence (PL) spectra. This report paves the way to fabricating large-scale single-crystalline monolayer WS2, which is useful for the growth of high-quality WS2 and its potential applications in the future.
Attempts at estimating mixed venous carbon dioxide tension by the single-breath method.

Science.gov (United States)

Ohta, H; Takatani, O; Matsuoka, T

1989-01-01

The single-breath method was originally proposed by Kim et al. [1] for estimating the blood carbon dioxide tension and cardiac output. Its reliability has not been proven. The present study was undertaken, using dogs, to compare the mixed venous carbon dioxide tension (PVCO2) calculated by the single-breath method with the PVCO2 measured in mixed venous blood, and to evaluate the influence of variations in the exhalation duration and the volume of expired air usually discarded from computations as the deadspace. Among the exhalation durations of 15, 30 and 45 s tested, the 15 s duration was found to be too short to obtain an analyzable O2-CO2 curve, but at either 30 or 45 s, the calculated values of PVCO2 were comparable to the measured PVCO2. A significant agreement between calculated and measured PVCO2 was obtained when the expired gas with PCO2 less than 22 Torr was considered as deadspace gas.
Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

Science.gov (United States)

Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

2014-01-01

Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
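Dynamic Time Warping, one ingredient of the approach described above, can be sketched as a classic dynamic program. This is the textbook DTW distance with an absolute-difference local cost, offered only as an illustration; how the authors combine it with KNN and Gene Ontology information is not given in the abstract and is not reproduced here. The example series are invented.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimum cumulative |a[i] - b[j]| over monotone alignments."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances, b[j-1] reused
                                    cost[i][j - 1],      # b advances, a[i-1] reused
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Invented expression profiles: the second is a time-stretched copy of the first.
profile_a = [1.0, 2.0, 3.0]
profile_b = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(profile_a, profile_b))   # stretched copy aligns at zero cost
print(dtw_distance([0.0, 0.0], [1.0, 1.0]))
```

Because the alignment may stretch or compress the time axis, two genes whose expression follows the same pattern with a delay can still come out as close neighbours, which is the property a DTW-based neighbour search for imputation would rely on.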
Optimal numerical methods for determining the orientation averages of single-scattering properties of atmospheric ice crystals

International Nuclear Information System (INIS)

Um, Junshik; McFarquhar, Greg M.

2013-01-01

The optimal orientation averaging scheme (regular lattice grid scheme or quasi Monte Carlo (QMC) method), the minimum number of orientations, and the corresponding computing time required to calculate the average single-scattering properties (i.e., asymmetry parameter (g), single-scattering albedo (ω_o), extinction efficiency (Q_ext), scattering efficiency (Q_sca), absorption efficiency (Q_abs), and scattering phase function at scattering angles of 90° (P_11(90°)) and 180° (P_11(180°))) within a predefined accuracy level (i.e., 1.0%) were determined for four different nonspherical atmospheric ice crystal models (Gaussian random sphere, droxtal, budding Bucky ball, and column) with maximum dimension D = 10 μm using the Amsterdam discrete dipole approximation (ADDA) at λ = 0.55, 3.78, and 11.0 μm. The QMC required fewer orientations and less computing time than the lattice grid. The calculations of P_11(90°) and P_11(180°) required more orientations than the calculations of integrated scattering properties (i.e., g, ω_o, Q_ext, Q_sca, and Q_abs) regardless of the orientation average scheme. The fewest orientations were required for calculating g and ω_o. The minimum number of orientations and the corresponding computing time for single-scattering calculations decreased with an increase of wavelength, whereas they increased with the surface-area ratio that defines particle nonsphericity. Highlights: • The number of orientations required to calculate the average single-scattering properties of nonspherical ice crystals is investigated. • Single-scattering properties of ice crystals are calculated using ADDA. • Quasi Monte Carlo method is more efficient than lattice grid method for scattering calculations. • Single-scattering properties of ice crystals depend on a newly defined parameter called surface area ratio.
A method for using unmanned aerial vehicles for emergency investigation of single geo-hazards and sample applications of this method

Science.gov (United States)

Huang, Haifeng; Long, Jingjing; Yi, Wu; Yi, Qinglin; Zhang, Guodong; Lei, Bangjun

2017-11-01

In recent years, unmanned aerial vehicles (UAVs) have become widely used in emergency investigations of major natural hazards over large areas; however, UAVs are less commonly employed to investigate single geo-hazards. Based on a number of successful investigations in the Three Gorges Reservoir area, China, a complete UAV-based method for performing emergency investigations of single geo-hazards is described. First, a customized UAV system that consists of a multi-rotor UAV subsystem, an aerial photography subsystem, a ground control subsystem and a ground surveillance subsystem is described in detail. The implementation process, which includes four steps, i.e., indoor preparation, site investigation, on-site fast processing and application, and indoor comprehensive processing and application, is then elaborated, and two investigation schemes, automatic and manual, that are used in the site investigation step are put forward. Moreover, some key techniques and methods - e.g., the layout and measurement of ground control points (GCPs), route planning, flight control and image collection, and the Structure from Motion (SfM) photogrammetry processing - are explained. Finally, three applications are given. Experience has shown that using UAVs for emergency investigation of single geo-hazards greatly reduces the time, intensity and risks associated with on-site work and provides valuable, high-accuracy, high-resolution information that supports emergency responses.
Fine mapping of multiple QTL using combined linkage and linkage disequilibrium mapping – A comparison of single QTL and multi QTL methods

Directory of Open Access Journals (Sweden)

Meuwissen Theo HE

2007-04-01

Full Text Available Abstract Two previously described QTL mapping methods, which combine linkage analysis (LA) and linkage disequilibrium analysis (LD), were compared for their ability to detect and map multiple QTL. The methods were tested on five different simulated data sets in which the exact QTL positions were known. Every simulated data set contained two QTL, but the distances between these QTL were varied from 15 to 150 cM. The results show that the single QTL mapping method (LDLA) gave good results as long as the distance between the QTL was large (> 90 cM). When the distance between the QTL was reduced, the single QTL method had problems positioning the two QTL and tended to position only one QTL, i.e. a "ghost" QTL, in between the two real QTL positions. The multi QTL mapping method (MP-LDLA) gave good results for all evaluated distances between the QTL. For the large distances between the QTL (> 90 cM) the single QTL method more often positioned the QTL in the correct marker bracket, but considering the broader likelihood peaks of the single point method it could be argued that the multi QTL method was more precise. When the distances were reduced, the multi QTL method was clearly more accurate than the single QTL method. The two methods combine well, and together provide a good tool to position single or multiple QTL in practical situations, where the number of QTL and their positions are unknown.
Single-slice rebinning method for helical cone-beam CT

International Nuclear Information System (INIS)

Noo, F.; Defrise, M.; Clackdoyle, R.

1999-01-01

In this paper, we present reconstruction results from helical cone-beam CT data, obtained using a simple and fast algorithm, which we call the CB-SSRB algorithm. This algorithm combines the single-slice rebinning method of PET imaging with the weighting schemes of spiral CT algorithms. The reconstruction is approximate but can be performed using 2D multislice fan-beam filtered backprojection. The quality of the results is surprisingly good, and far exceeds what one might expect, even when the pitch of the helix is large. In particular, with this algorithm comparable quality is obtained using helical cone-beam data with a normalized pitch of 10 to that obtained using standard spiral CT reconstruction with a normalized pitch of 2. (author)
Hybrid Control Method for a Single Phase PFC using a Low Cost Microcontroller

DEFF Research Database (Denmark)

Jakobsen, Lars Tønnes; Nielsen, Nils; Wolf, Christian

2005-01-01

This paper presents a hybrid control method for single phase boost PFCs. The high bandwidth current loop is analog while the voltage loop is implemented in an 8-bit microcontroller. The design focuses on minimizing the number of calculations done in the microcontroller. A 1kW prototype has been...
Development of Single Optical Sensor Method for the Measurement Droplet Parameters

Energy Technology Data Exchange (ETDEWEB)

Kim, Tae Ho; Ahn, Tae Hwan; Yun, Byong Jo [Pusan National University, Busan (Korea, Republic of); Bae, Byoung Uhn; Kim, Kyoung Doo [KAERI, Daejeon (Korea, Republic of)

2016-05-15

In this study, we tried to develop a single optical fiber probe (S-TOP) sensor method to measure droplet parameters such as diameter, droplet fraction, and droplet velocity. To calibrate and confirm the optical fiber sensor for those parameters, we conducted visualization experiments using a high speed camera together with the optical sensor. To evaluate the performance of the S-TOP accurately, we repeated calibration experiments at a given droplet flow condition. Figure 3 shows the result of the calibration. In this graph, the x axis is the droplet velocity measured by visualization and the y axis is grd, D which is obtained from S-TOP. In this study, we have developed the single tip optical probe sensor to measure the droplet parameters. From the calibration experiments with the high speed camera, we obtained the calibration curve for the droplet velocity. Additionally, the chord length distribution of droplets is measured by the optical probe.
A new method for testing pile by single-impact energy and P-S curve

Science.gov (United States)

Xu, Zhao-Yong; Duan, Yong-Kang; Wang, Bin; Hu, Yi-Li; Yang, Run-Hai; Xu, Jun; Zhao, Jin-Ming

2004-11-01

By studying the pile-formula and stress-wave methods (e.g., the CASE method), the authors propose a new method for testing piles using the single-impact energy and P-S curves. The vibration and wave figures are recorded, and the dynamic and static displacements are measured by different transducers near the top of the pile when it is impacted by a heavy hammer or micro-rocket. By observing the transformation coefficient of driving energy (total energy), the consumed energy of wave motion and vibration and so on, the vertical bearing capacity of a single pile is measured and calculated. Then, using the vibration wave diagram, the dynamic relation curves between the force (P) and the displacement (S) are calculated and the yield points are determined. Using the static-loading test, the dynamic results are checked and the relative constants of dynamic-static P-S curves are determined. Then the subsidence quantity corresponding to the bearing capacity is determined. Moreover, the shaped quality of the pile body can be judged from the formation of P-S curves.
An Improved Azimuth Angle Estimation Method with a Single Acoustic Vector Sensor Based on an Active Sonar Detection System.

Science.gov (United States)

Zhao, Anbang; Ma, Lin; Ma, Xuefei; Hui, Juan

2017-02-20

In this paper, an improved azimuth angle estimation method with a single acoustic vector sensor (AVS) is proposed based on matched filtering theory. The proposed method is mainly applied in an active sonar detection system. According to the conventional passive method based on complex acoustic intensity measurement, the mathematical and physical model of this proposed method is described in detail. The computer simulation and lake experiment results indicate that this method can estimate the azimuth angle with high precision using only a single AVS. Compared with the conventional method, the proposed method achieves better estimation performance. Moreover, the proposed method does not require complex operations in the frequency domain and reduces computational complexity.
An Improved Azimuth Angle Estimation Method with a Single Acoustic Vector Sensor Based on an Active Sonar Detection System

Directory of Open Access Journals (Sweden)

Anbang Zhao

2017-02-01

In this paper, an improved azimuth angle estimation method with a single acoustic vector sensor (AVS) is proposed based on matched filtering theory. The proposed method is mainly applied in an active sonar detection system. According to the conventional passive method based on complex acoustic intensity measurement, the mathematical and physical model of this proposed method is described in detail. The computer simulation and lake experiment results indicate that this method can estimate the azimuth angle with high precision using only a single AVS. Compared with the conventional method, the proposed method achieves better estimation performance. Moreover, the proposed method does not require complex operations in the frequency domain and reduces computational complexity.
Testing survey-based methods for rapid monitoring of child mortality, with implications for summary birth history data.

Science.gov (United States)

Brady, Eoghan; Hill, Kenneth

2017-01-01

Under-five mortality estimates are increasingly used in low and middle income countries to target interventions and measure performance against global development goals. Two new methods to rapidly estimate under-5 mortality based on Summary Birth Histories (SBH) were described in a previous paper and tested with data available. This analysis tests the methods using data appropriate to each method from 5 countries that lack vital registration systems. SBH data are collected across many countries through censuses and surveys, and indirect methods often rely upon their quality to estimate mortality rates. The Birth History Imputation method imputes data from a recent Full Birth History (FBH) onto the birth, death and age distribution of the SBH to produce estimates based on the resulting distribution of child mortality. DHS FBHs and MICS SBHs are used for all five countries. In the implementation, 43 of 70 estimates are within 20% of validation estimates (61%). Mean Absolute Relative Error is 17.7%. 1 of 7 countries produces acceptable estimates. The Cohort Change method considers the differences in births and deaths between repeated Summary Birth Histories at 1 or 2-year intervals to estimate the mortality rate in that period. SBHs are taken from Brazil's PNAD Surveys 2004-2011 and validated against IGME estimates. 2 of 10 estimates are within 10% of validation estimates. Mean absolute relative error is greater than 100%. Appropriate testing of these new methods demonstrates that they do not produce sufficiently good estimates based on the data available. We conclude this is due to the poor quality of most SBH data included in the study. This has wider implications for the next round of censuses and future household surveys across many low- and middle-income countries.
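The two accuracy summaries quoted in this abstract, the share of estimates falling within a tolerance of the validation value and the mean absolute relative error, can be computed as in the minimal Python sketch below. The function names and the sample numbers are illustrative assumptions and are not data from the study.

```python
def relative_errors(estimates, references):
    """Absolute relative error |estimate - reference| / |reference| per pair."""
    if len(estimates) != len(references):
        raise ValueError("estimates and references must have the same length")
    return [abs(e - r) / abs(r) for e, r in zip(estimates, references)]


def mean_absolute_relative_error(estimates, references):
    """Average of the absolute relative errors (a fraction, not a percentage)."""
    errors = relative_errors(estimates, references)
    return sum(errors) / len(errors)


def share_within(estimates, references, tolerance):
    """Fraction of estimates whose relative error does not exceed `tolerance`."""
    errors = relative_errors(estimates, references)
    return sum(1 for err in errors if err <= tolerance) / len(errors)


# Hypothetical mortality estimates and validation values (deaths per 1000 births).
example_estimates = [50.0, 80.0, 120.0, 30.0]
example_references = [40.0, 100.0, 100.0, 30.0]

mare = mean_absolute_relative_error(example_estimates, example_references)
within_20_percent = share_within(example_estimates, example_references, 0.20)
print(f"MARE = {mare:.2%}, within 20% = {within_20_percent:.0%}")
```

Reporting both numbers is useful because a few very poor estimates can inflate the mean error even when most estimates sit inside the tolerance band.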
Single-photon source engineering using a Modal Method

DEFF Research Database (Denmark)

Gregersen, Niels

Solid-state sources of single indistinguishable photons are of great interest for quantum information applications. The semiconductor quantum dot embedded in a host material represents an attractive platform to realize such a single-photon source (SPS). A near-unity efficiency, defined as the num...... nanowire SPSs...
A Single Sided Edge Marking Method for Detecting Pectoral Muscle in Digital Mammograms

Directory of Open Access Journals (Sweden)

G. Toz

2018-02-01

In the computer-assisted diagnosis of breast cancer, the removal of pectoral muscle from mammograms is very important. In this study, a new method, called the Single-Sided Edge Marking (SSEM) technique, is proposed for the identification of the pectoral muscle border from mammograms. 60 mammograms from the INbreast database were used to test the proposed method. The results obtained were compared for False Positive Rate, False Negative Rate, and Sensitivity using the ground truth values pre-determined by radiologists for the same images. Accordingly, it has been shown that the proposed method can detect the pectoral muscle border with an average of 95.6% sensitivity.
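The three evaluation measures named in this abstract can be illustrated with the standard confusion-matrix definitions. The following Python sketch assumes those textbook definitions, since the abstract does not state exactly how the authors computed them. The class name and the pixel counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Pixel counts from comparing a predicted mask against the ground truth."""
    tp: int  # muscle pixels correctly marked as muscle
    fp: int  # non-muscle pixels wrongly marked as muscle
    tn: int  # non-muscle pixels correctly left unmarked
    fn: int  # muscle pixels that were missed


def sensitivity_of(c: ConfusionCounts) -> float:
    """True positive rate: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def false_negative_rate_of(c: ConfusionCounts) -> float:
    """FN / (TP + FN), which equals 1 - sensitivity."""
    return c.fn / (c.tp + c.fn)


def false_positive_rate_of(c: ConfusionCounts) -> float:
    """FP / (FP + TN)."""
    return c.fp / (c.fp + c.tn)


# Hypothetical counts for one image, not taken from the study.
example_counts = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
print(sensitivity_of(example_counts),
      false_negative_rate_of(example_counts),
      false_positive_rate_of(example_counts))
```

An average sensitivity over a test set, as reported above, would then be the mean of the per-image values.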

A Benchmark of Lidar-Based Single Tree Detection Methods Using Heterogeneous Forest Data from the Alpine Space

Directory of Open Access Journals (Sweden)

Lothar Eysn

2015-05-01

In this study, eight airborne laser scanning (ALS)-based single tree detection methods are benchmarked and investigated. The methods were applied to a unique dataset originating from different regions of the Alpine Space covering different study areas, forest types, and structures. This is the first benchmark ever performed for different forests within the Alps. The evaluation of the detection results was carried out in a reproducible way by automatically matching them to precise in situ forest inventory data using a restricted nearest neighbor detection approach. Quantitative statistical parameters such as percentages of correctly matched trees and omission and commission errors are presented. The proposed automated matching procedure presented herein shows an overall accuracy of 97%. Method-based analysis, investigations per forest type, and an overall benchmark performance are presented. The best matching rate was obtained for single-layered coniferous forests. Dominated trees were challenging for all methods. The overall performance shows a matching rate of 47%, which is comparable to results of other benchmarks performed in the past. The study provides new insight regarding the potential and limits of tree detection with ALS and underlines some key aspects regarding the choice of method when performing single tree detection for the various forest types encountered in alpine regions.
Multiple Imputation of Groundwater Data to Evaluate Spatial and Temporal Anthropogenic Influences on Subsurface Water Fluxes in Los Angeles, CA

Science.gov (United States)

Manago, K. F.; Hogue, T. S.; Hering, A. S.

2014-12-01

In the City of Los Angeles, groundwater accounts for 11% of the total water supply on average, and 30% during drought years. Due to ongoing drought in California, increased reliance on local water supply highlights the need for better understanding of regional groundwater dynamics and estimating sustainable groundwater supply. However, in an urban setting, such as Los Angeles, understanding or modeling groundwater levels is extremely complicated due to various anthropogenic influences such as groundwater pumping, artificial recharge, landscape irrigation, leaking infrastructure, seawater intrusion, and extensive impervious surfaces. This study analyzes anthropogenic effects on groundwater levels using groundwater monitoring well data from the County of Los Angeles Department of Public Works. The groundwater data is irregularly sampled with large gaps between samples, resulting in a sparsely populated dataset. A multiple imputation method is used to fill the missing data, allowing for multiple ensembles and improved error estimates. The filled data is interpolated to create spatial groundwater maps utilizing information from all wells. The groundwater data is evaluated at a monthly time step over the last several decades to analyze the effect of land cover and identify other influencing factors on groundwater levels spatially and temporally. Preliminary results show irrigated parks have the largest influence on groundwater fluctuations, resulting in large seasonal changes, exceeding changes in spreading grounds. It is assumed that these fluctuations are caused by watering practices required to sustain non-native vegetation. Conversely, high intensity urbanized areas resulted in muted groundwater fluctuations and behavior decoupling from climate patterns. The results provide improved understanding of anthropogenic effects on groundwater levels in addition to providing high quality datasets for validation of regional groundwater models.
Single molecule force spectroscopy: methods and applications in biology

International Nuclear Information System (INIS)

Shen Yi; Hu Jun

2012-01-01

Single molecule measurements have transformed our view of biomolecules. Owing to the ability to monitor the activity of individual molecules, we now see them as uniquely structured, fluctuating molecules that stochastically transition between frequently many substates, as no two molecules follow precisely the same trajectory. Indeed, it is this discovery of critical yet short-lived substates that were often missed in ensemble measurements that has perhaps contributed most to the better understanding of biomolecular functioning resulting from single molecule experiments. In this paper, we review the three major techniques of single molecule force spectroscopy and their applications, especially in biology. The single molecule study of biotin-streptavidin interactions is introduced as a successful example. The problems and prospects of single molecule force spectroscopy are also discussed. (authors)
Standard test method for isotopic analysis of uranium hexafluoride by double standard single-collector gas mass spectrometer method

CERN Document Server

American Society for Testing and Materials. Philadelphia

2010-01-01

1.1 This is a quantitative test method applicable to determining the mass percent of uranium isotopes in uranium hexafluoride (UF6) samples with 235U concentrations between 0.1 and 5.0 mass %. 1.2 This test method may be applicable for the entire range of 235U concentrations for which adequate standards are available. 1.3 This test method is for analysis by a gas magnetic sector mass spectrometer with a single collector using interpolation to determine the isotopic concentration of an unknown sample between two characterized UF6 standards. 1.4 This test method is to replace the existing test method currently published in Test Methods C761 and is used in the nuclear fuel cycle for UF6 isotopic analyses. 1.5 The values stated in SI units are to be regarded as standard. No other units of measurement are included in this standard. 1.6 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appro...
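Clause 1.3 describes determining an unknown concentration by interpolation between two characterized standards. The Python sketch below shows plain linear interpolation as one way to picture that idea. It is an assumption for illustration only: the standard's actual calculation is not given in this excerpt, and the function name, the notion of a scalar "response", and all numbers are hypothetical.

```python
def interpolate_concentration(response, low_standard, high_standard):
    """Linearly interpolate a concentration from a measured response.

    Each standard is a (response, concentration) pair. The measured response
    must lie between the two standards' responses (bracketing).
    """
    r_low, c_low = low_standard
    r_high, c_high = high_standard
    if r_low == r_high:
        raise ValueError("the two standards must give different responses")
    if not min(r_low, r_high) <= response <= max(r_low, r_high):
        raise ValueError("sample response is not bracketed by the standards")
    fraction = (response - r_low) / (r_high - r_low)
    return c_low + fraction * (c_high - c_low)


# Hypothetical instrument responses paired with certified 235U mass percents.
low = (0.0300, 3.0)
high = (0.0400, 4.0)
sample_mass_percent = interpolate_concentration(0.0325, low, high)
print(f"estimated 235U content: {sample_mass_percent:.3f} mass %")
```

Rejecting responses outside the bracket keeps the sketch from extrapolating, which matches the wording "between two characterized UF6 standards".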
An Islanding Detection Method by Using Frequency Positive Feedback Based on FLL for Single-Phase Microgrid

DEFF Research Database (Denmark)

Sun, Qinfei; Guerrero, Josep M.; Jing, Tianjun

2017-01-01

An active islanding detection method based on Frequency-Locked Loop (FLL) for constant power controlled inverter in single-phase microgrid is proposed. This method generates a phase shift comparing the instantaneous frequency obtained from FLL unit with the nominal frequency to modify the reference...
A Generic Topology Derivation Method for Single-phase Converters with Active Capacitive DC-links

DEFF Research Database (Denmark)

Wang, Haoran; Wang, Huai; Zhu, Guorong

2016-01-01

capacitive DC-link solutions, but important aspects of the topology assessment, such as the total energy storage, overall capacitive energy buffer ratio, cost, and reliability are still not available. This paper proposes a generic topology derivation method of single-phase power converters...
A New Synchronous Reference Frame-Based Method for Single-Phase Shunt Active Power Filters

DEFF Research Database (Denmark)

Monfared, Mohammad; Golestan, Saeed; Guerrero, Josep M.

2013-01-01

This paper deals with the design of a novel method in the synchronous reference frame (SRF) to extract the reference compensating current for single-phase shunt active power filters (APFs). Unlike previous works in the SRF, the proposed method has an innovative feature that it does not need...... the fictitious current signal. Frequency-independent operation, accurate reference current extraction and relatively fast transient response are other key features of the presented strategy. The effectiveness of the proposed method is investigated by means of detailed mathematical analysis. The results confirm...
Single-pulse and multi-pulse femtosecond laser damage of optical single films

International Nuclear Information System (INIS)

Yuan Lei; Zhao Yuan'an; He Hongbo; Shao Jianda; Fan Zhengxiu

2006-01-01

Laser-induced damage of a single 500 nm HfO2 film and a single 500 nm ZrO2 film was studied with single- and multi-pulse femtosecond lasers. The laser-induced damage thresholds (LIDT) of both samples by the 1-on-1 method and the 1000-on-1 method are reported. It was found that the LIDT of the HfO2 single film was higher than that of the ZrO2 single film by both test methods, which is explained by Keldysh's simple multiphoton ionization theory. The multi-pulse LIDT was lower than the single-pulse LIDT for both samples as a result of the accumulation effect. (authors)
Breast Cancer and Modifiable Lifestyle Factors in Argentinean Women: Addressing Missing Data in a Case-Control Study

Science.gov (United States)

Coquet, Julia Becaria; Tumas, Natalia; Osella, Alberto Ruben; Tanzi, Matteo; Franco, Isabella; Diaz, Maria Del Pilar

2016-01-01

A number of studies have evidenced the effect of modifiable lifestyle factors such as diet, breastfeeding and nutritional status on breast cancer risk. However, none have addressed the missing data problem in nutritional epidemiologic research in South America. Missing data is a frequent problem in breast cancer studies and epidemiological settings in general. Estimates of effect obtained from these studies may be biased, if no appropriate method for handling missing data is applied. We performed Multiple Imputation for missing values on covariates in a breast cancer case-control study of Córdoba (Argentina) to optimize risk estimates. Data was obtained from a breast cancer case control study from 2008 to 2015 (318 cases, 526 controls). Complete case analysis and multiple imputation using chained equations were the methods applied to estimate the effects of a Traditional dietary pattern and other recognized factors associated with breast cancer. Physical activity and socioeconomic status were imputed. Logistic regression models were performed. When complete case analysis was performed only 31% of women were considered. Although a positive association of Traditional dietary pattern and breast cancer was observed from both approaches (complete case analysis OR=1.3, 95%CI=1.0-1.7; multiple imputation OR=1.4, 95%CI=1.2-1.7), effects of other covariates, like BMI and breastfeeding, were only identified when multiple imputation was considered. A Traditional dietary pattern, BMI and breastfeeding are associated with the occurrence of breast cancer in this Argentinean population when multiple imputation is appropriately performed. Multiple Imputation is suggested in Latin America’s epidemiologic studies to optimize effect estimates in the future. PMID:27892664
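When multiple imputation is used as described here, each imputed data set yields its own estimate, and the estimates are then combined. The Python sketch below shows the usual pooling step (Rubin's rules) applied to log odds ratios. The input numbers are hypothetical and are not results from the study. The confidence interval uses a normal approximation (1.96) as a simplifying assumption in place of the t reference distribution with adjusted degrees of freedom.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates with Rubin's rules.

    Returns the pooled estimate and its standard error, where the total
    variance is within + (1 + 1/m) * between for m imputations.
    """
    m = len(estimates)
    if m < 2 or len(std_errors) != m:
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log odds ratios and standard errors from three imputed data sets.
log_odds_ratios = [0.30, 0.34, 0.32]
standard_errors = [0.10, 0.10, 0.10]

pooled_log_or, pooled_se = pool_rubin(log_odds_ratios, standard_errors)
odds_ratio = math.exp(pooled_log_or)
ci_low = math.exp(pooled_log_or - 1.96 * pooled_se)   # normal approximation
ci_high = math.exp(pooled_log_or + 1.96 * pooled_se)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Pooling on the log scale and exponentiating afterwards keeps the interval asymmetric around the odds ratio, as is customary for logistic regression output.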
Optical performance of multifocal soft contact lenses via a single-pass method.

Science.gov (United States)

Bakaraju, Ravi C; Ehrmann, Klaus; Falk, Darrin; Ho, Arthur; Papas, Eric

2012-08-01

A physical model eye capable of carrying soft contact lenses (CLs) was used as a platform to evaluate optical performance of several commercial multifocals (MFCLs) with high- and low-add powers and a single-vision control. Optical performance was evaluated at three pupil sizes, six target vergences, and five CL-correcting positions using a spatially filtered monochromatic (632.8 nm) light source. The various target vergences were achieved by using negative trial lenses. A photosensor in the retinal plane recorded the image point-spread that enabled the computation of visual Strehl ratios. The centration of CLs was monitored by an additional integrated en face camera. Hydration of the correcting lens was maintained using a humidity chamber and repeated instillations of rewetting saline drops. All the MFCLs reduced performance for distance but considerably improved performance along the range of distance to near target vergences, relative to the single-vision CL. Performance was dependent on add power, design, pupil, and centration of the correcting CLs. Proclear (D) design produced good performance for intermediate vision, whereas Proclear (N) design performed well at near vision (p 4 mm in diameter. Acuvue Oasys bifocal produced performance comparable with single-vision CL for most vergences. Direct measurement of single-pass images at the retinal plane of a physical model eye used in conjunction with various MFCLs is demonstrated. This method may have utility in evaluating the relative effectiveness of commercial and prototype designs.
Real-time GPS seismology using a single receiver: method comparison, error analysis and precision validation

Science.gov (United States)

Li, Xingxing

2014-05-01

Earthquake monitoring and early warning systems for hazard assessment and mitigation have traditionally been based on seismic instruments. However, for large seismic events, it is difficult for traditional seismic instruments to produce accurate and reliable displacements because of the saturation of broadband seismometers and problematic integration of strong-motion data. Compared with traditional seismic instruments, GPS can measure arbitrarily large dynamic displacements without saturation, making it particularly valuable in the case of large earthquakes and tsunamis. The GPS relative positioning approach is usually adopted to estimate seismic displacements, since centimeter-level accuracy can be achieved in real time by processing double-differenced carrier-phase observables. However, the relative positioning method requires a local reference station, which might itself be displaced during a large seismic event, resulting in misleading GPS analysis results. Meanwhile, the relative/network approach is time-consuming and particularly difficult for the simultaneous and real-time analysis of GPS data from hundreds or thousands of ground stations. In recent years, several single-receiver approaches for real-time GPS seismology, which can overcome the reference station problem of the relative positioning approach, have been successfully developed and applied to GPS seismology. One available method is real-time precise point positioning (PPP), which relies on precise satellite orbit and clock products. However, real-time PPP needs a long (re)convergence period, of about thirty minutes, to resolve integer phase ambiguities and achieve centimeter-level accuracy. In comparison with PPP, Colosimo et al. (2011) proposed a variometric approach to determine the change of position between two adjacent epochs, and then displacements are obtained by a single integration of the delta positions. This approach does not suffer from a convergence process, but the single integration from delta positions to
Methods for Mediation Analysis with Missing Data

Science.gov (United States)

Zhang, Zhiyong; Wang, Lijuan

2013-01-01

Despite wide applications of both mediation models and missing data techniques, formal discussion of mediation analysis with missing data is still rare. We introduce and compare four approaches to dealing with missing data in mediation analysis including listwise deletion, pairwise deletion, multiple imputation (MI), and a two-stage maximum…
Organometallic halide perovskite single crystals having low deffect density and methods of preparation thereof

KAUST Repository

Bakr, Osman; Shi, Dong

2016-01-01

The present disclosure presents a method of making single-crystal organometallic halide perovskites with the formula AMX3, wherein A is an organic cation and M is selected from the group consisting of Pb, Sn, Cu, Ni, Co, Fe, Mn, Pd, Cd, Ge, and Eu
Four-spacecraft determination of magnetopause orientation, motion and thickness: comparison with results from single-spacecraft methods

Directory of Open Access Journals (Sweden)

S. E. Haaland

2004-04-01

In this paper, we use Cluster data from one magnetopause event on 5 July 2001 to compare predictions from various methods for determination of the velocity, orientation, and thickness of the magnetopause current layer. We employ established as well as new multi-spacecraft techniques, in which time differences between the crossings by the four spacecraft, along with the duration of each crossing, are used to calculate magnetopause speed, normal vector, and width. The timing is based on data from either the Cluster Magnetic Field Experiment (FGM) or the Electric Field Experiment (EFW) instruments. The multi-spacecraft results are compared with those derived from various single-spacecraft techniques, including minimum-variance analysis of the magnetic field and deHoffmann-Teller, as well as Minimum-Faraday-Residue analysis of plasma velocities and magnetic fields measured during the crossings. In order to improve the overall consistency between multi- and single-spacecraft results, we have also explored the use of hybrid techniques, in which timing information from the four spacecraft is combined with certain limited results from single-spacecraft methods, the remaining results being left for consistency checks. The results show good agreement between magnetopause orientations derived from appropriately chosen single-spacecraft techniques and those obtained from multi-spacecraft timing. The agreement between magnetopause speeds derived from single- and multi-spacecraft methods is quantitatively somewhat less good, but it is evident that the speed can change substantially from one crossing to the next within an event. The magnetopause thickness also varied substantially from one crossing to the next within an event. It ranged from 5 to 10 ion gyroradii. The density profile was sharper than the magnetic profile: most of the density change occurred in the earthward half of the magnetopause.

Key words. Magnetospheric physics (magnetopause, cusp and
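The multi-spacecraft timing idea in this abstract can be pictured with the simplest textbook variant, in which a planar boundary moves at constant speed V along its normal n, so that (r_i - r_0) · n = V (t_i - t_0) for each spacecraft i. The Python sketch below solves that system for n and V. It is an illustration under the constant-velocity, planar assumption only and is not the authors' procedure, which also uses crossing durations. The positions and times are synthetic.

```python
import numpy as np


def timing_normal_and_speed(positions, crossing_times):
    """Estimate boundary normal and speed from four crossing times.

    positions: (4, 3) array of spacecraft positions (km).
    crossing_times: length-4 array of crossing times (s).
    Solves D m = dt with m = n / V, where rows of D are separations from
    spacecraft 0 and dt are the corresponding time delays.
    """
    positions = np.asarray(positions, dtype=float)
    crossing_times = np.asarray(crossing_times, dtype=float)
    separations = positions[1:] - positions[0]      # shape (3, 3)
    delays = crossing_times[1:] - crossing_times[0]  # shape (3,)
    slowness = np.linalg.solve(separations, delays)  # m = n / V
    speed = 1.0 / np.linalg.norm(slowness)
    normal = slowness * speed
    return normal, speed


# Synthetic test case: boundary moving at 50 km/s along +x.
sc_positions = np.array([[0.0, 0.0, 0.0],
                         [100.0, 10.0, 0.0],
                         [20.0, 100.0, 0.0],
                         [30.0, 0.0, 100.0]])
true_normal = np.array([1.0, 0.0, 0.0])
true_speed = 50.0
sc_times = sc_positions @ true_normal / true_speed

est_normal, est_speed = timing_normal_and_speed(sc_positions, sc_times)
print(est_normal, est_speed)
```

The linear solve fails when the four spacecraft are nearly coplanar, which is one reason timing results are usually cross-checked against single-spacecraft estimates.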
Comparison of Two Methods for Estimation of Work Limitation Scores from Health Status Measures

DEFF Research Database (Denmark)

Anatchkova, M; Fang, H; Kini, N

2015-01-01

Objectives To compare two methods for estimation of Work Limitations Questionnaire scores (WLQ, 8 items) from the Role Physical (RP, 4 items) and Role Emotional scales (RE, 3 items) of the SF-36 Health survey. These measures assess limitations in role performance attributed to health (emotional...... future data collection strategies. Methods We used data from two independent cross-sectional panel samples (Sample1, n=1382, 51% female, 72% Caucasian, 49% with preselected chronic conditions, 15% with fair/poor health; Sample2, n=301, 45% female, 90% Caucasian, 47% with preselected chronic conditions......, 21% with fair/poor health). Method 1 used previously developed and validated IRT based calibration tables. Method 2 used regression models to develop aggregate imputation weights as described in the literature. We evaluated the agreement of observed and estimated WLQ scale scores from the two methods...
Identification of independent association signals and putative functional variants for breast cancer risk through fine-scale mapping of the 12p11 locus

OpenAIRE

Zeng, Chenjie; Guo, Xingyi; Long, Jirong; Kuchenbaecker, Karoline B.; Droit, Arnaud; Michailidou, Kyriaki; Ghoussaini, Maya; Kar, Siddhartha; Freeman, Adam; Hopper, John L.; Milne, Roger L.; Bolla, Manjeet K.; Wang, Qin; Dennis, Joe; Agata, Simona

2016-01-01

Background: Multiple recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP), rs10771399, at 12p11 that is associated with breast cancer risk. Method: We performed a fine-scale mapping study of a 700 kb region including 441 genotyped and more than 1300 imputed genetic variants in 48,155 cases and 43,612 controls of European descent, 6269 cases and 6624 controls of East Asian descent and 1116 cases and 932 controls of African descent in the Breast C...
Method of bilateral pleural drainage by single Blake drain after esophagectomy.

Science.gov (United States)

Niwa, Yukiko; Koike, Masahiko; Oya, Hisaharu; Iwata, Naoki; Kobayashi, Daisuke; Kanda, Mitsuro; Tanaka, Chie; Yamada, Suguru; Fujii, Tsutomu; Nakayama, Goro; Sugimoto, Hiroyuki; Nomoto, Shuji; Fujiwara, Michitaka; Kodera, Yasuhiro

2015-03-01

Clinicians often encounter left pleural effusion after esophagectomy, which sometimes necessitates thoracentesis. We have introduced a new drainage method, bilateral pleural drainage by single Blake drain (BDSD), which we have been using since April 2013. This study aims to evaluate the performance of the BDSD. The BDSD method employs a 15-F Blake drain inserted from the right thoracic cavity to the left thoracic cavity across the posterior mediastinum. The conventional drain (CD) group consisted of 50 patients with a 19-F Blake drain placed in the right thoracic cavity during the period from April 2012 to March 2013. The BDSD group consisted of 54 patients treated from April 2013 to June 2014. The amount of total drainage in the BDSD group was significantly higher than that in the CD group (P pleural effusion and left lower lobe atelectasis in the BDSD group were significantly lower than those in the CD group (P pleural effusion necessitating thoracentesis drainage in the BDSD group. Compared with the conventional method, BDSD was able to evacuate bilateral pleural effusion more effectively, and the incidences of left pleural effusion and left atelectasis were lower. This method is therefore clinically useful after esophagectomy.
Rapid synthesis of single-phase bismuth ferrite by microwave-assisted hydrothermal method

Energy Technology Data Exchange (ETDEWEB)

Cao, Wenqian [College of Materials Science and Engineering, China Jiliang University, 258 Xueyuan Street, Xiasha Higher Education District, Hangzhou 310018, Zhejiang Province (China); Chen, Zhi, E-mail: zchen0@gmail.com [College of Materials Science and Engineering, China Jiliang University, 258 Xueyuan Street, Xiasha Higher Education District, Hangzhou 310018, Zhejiang Province (China); Gao, Tong; Zhou, Dantong; Leng, Xiaonan; Niu, Feng [College of Materials Science and Engineering, China Jiliang University, 258 Xueyuan Street, Xiasha Higher Education District, Hangzhou 310018, Zhejiang Province (China); Zhu, Yuxiang [College of Materials Science and Engineering, China Jiliang University, 258 Xueyuan Street, Xiasha Higher Education District, Hangzhou 310018, Zhejiang Province (China); Tianjin Key Laboratory of Marine Resources and Chemistry, Tianjin University of Science and Technology, Tianjin (China); Qin, Laishun, E-mail: qinlaishun@yeah.net [College of Materials Science and Engineering, China Jiliang University, 258 Xueyuan Street, Xiasha Higher Education District, Hangzhou 310018, Zhejiang Province (China); Wang, Jiangying; Huang, Yuexiang [College of Materials Science and Engineering, China Jiliang University, 258 Xueyuan Street, Xiasha Higher Education District, Hangzhou 310018, Zhejiang Province (China)

2016-06-01

This paper describes the fast synthesis of bismuth ferrite by a simple microwave-assisted hydrothermal method. The phase transformation and the preferred growth facets during the synthetic process have been investigated by X-ray diffraction. Bismuth ferrite can be quickly prepared by the microwave hydrothermal method by simply controlling the reaction time, which is further confirmed by Fourier transform infrared spectroscopy and magnetic measurement. - Graphical abstract: Single-phase BiFeO3 could be realized at a reaction time as short as 65 min. The reaction time strongly influences the phase transformation and the preferred growth facets. - Highlights: • Rapid synthesis (65 min) of BiFeO3 by microwave-assisted hydrothermal method. • Reaction time influences the purity and preferred growth facets. • FTIR and magnetic measurement further confirm the pure phase.
Rapid synthesis of single-phase bismuth ferrite by microwave-assisted hydrothermal method

International Nuclear Information System (INIS)

Cao, Wenqian; Chen, Zhi; Gao, Tong; Zhou, Dantong; Leng, Xiaonan; Niu, Feng; Zhu, Yuxiang; Qin, Laishun; Wang, Jiangying; Huang, Yuexiang

2016-01-01

This paper describes the fast synthesis of bismuth ferrite by a simple microwave-assisted hydrothermal method. The phase transformation and the preferred growth facets during the synthetic process have been investigated by X-ray diffraction. Bismuth ferrite can be quickly prepared by the microwave hydrothermal method by simply controlling the reaction time, which is further confirmed by Fourier transform infrared spectroscopy and magnetic measurement. - Graphical abstract: Single-phase BiFeO3 could be realized at a reaction time as short as 65 min. The reaction time strongly influences the phase transformation and the preferred growth facets. - Highlights: • Rapid synthesis (65 min) of BiFeO3 by microwave-assisted hydrothermal method. • Reaction time influences the purity and preferred growth facets. • FTIR and magnetic measurement further confirm the pure phase.
Advancing US GHG Inventory by Incorporating Survey Data using Machine-Learning Techniques

Science.gov (United States)

Alsaker, C.; Ogle, S. M.; Breidt, J.

2017-12-01

Crop management data are used in the National Greenhouse Gas Inventory that is compiled annually and reported to the United Nations Framework Convention on Climate Change. Emissions for carbon stock change and N2O emissions for US agricultural soils are estimated using the USDA National Resources Inventory (NRI). NRI provides basic information on land use and cropping histories, but it does not provide much detail on other management practices. In contrast, the Conservation Effects Assessment Project (CEAP) survey collects detailed crop management data that could be used in the GHG Inventory. The survey data were collected from NRI survey locations that are a subset of the NRI every 10 years. Therefore, imputation of the CEAP data is needed to represent the management practices across all NRI survey locations both spatially and temporally. Predictive mean matching and artificial neural network methods have been applied to develop an imputation model under a multiple imputation framework. Temporal imputation involves adjusting the imputation model using state-level USDA Agricultural Resource Management Survey data. Distributional and predictive accuracy is assessed for the imputed data, providing not only management data needed for the inventory but also rigorous estimates of uncertainty.
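Predictive mean matching, one of the two techniques named in this abstract, fills each missing value with an observed value borrowed from a "donor" whose model prediction is close to the prediction for the incomplete record. The Python sketch below is a deliberately simplified single-imputation version using an ordinary least squares predictor. It omits the parameter draws a full multiple imputation procedure would add, and the function name, the choice of k, and the toy data are assumptions for illustration, not the authors' model.

```python
import numpy as np


def pmm_impute(x, y, k=3, rng=None):
    """Fill NaNs in y by predictive mean matching on predictor(s) x.

    A linear model is fitted on the complete cases. For each missing y, the
    k observed cases with the closest predicted values are the donors, and
    one donor's observed y is drawn at random.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    observed = ~missing

    design = np.column_stack([np.ones(len(y)), x])  # intercept + predictors
    coef, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)
    predicted = design @ coef

    donor_predictions = predicted[observed]
    donor_values = y[observed]
    filled = y.copy()
    for i in np.flatnonzero(missing):
        distance = np.abs(donor_predictions - predicted[i])
        nearest = np.argsort(distance)[:k]           # indices of the k closest donors
        filled[i] = donor_values[rng.choice(nearest)]
    return filled


# Toy data: y is missing for two of six records.
toy_x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy_y = np.array([2.0, 4.0, np.nan, 8.0, np.nan, 12.0])
toy_filled = pmm_impute(toy_x, toy_y, k=2, rng=np.random.default_rng(0))
print(toy_filled)
```

Because imputed values are always real observed values, the method cannot produce impossible entries, which is a common reason for preferring it with survey data. Repeating the draw with different random states would give the multiple completed data sets that a multiple imputation framework requires.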

Ultra-fast scintillation properties of β-Ga2O3 single crystals grown by Floating Zone method

Science.gov (United States)

He, Nuotian; Tang, Huili; Liu, Bo; Zhu, Zhichao; Li, Qiu; Guo, Chao; Gu, Mu; Xu, Jun; Liu, Jinliang; Xu, Mengxuan; Chen, Liang; Ouyang, Xiaoping

2018-04-01

In this investigation, β-Ga2O3 single crystals were grown by the Floating Zone method. At room temperature, the X-ray excited emission spectrum includes ultraviolet and blue emission bands. The scintillation light output is comparable to that of the commercial BGO scintillator. The scintillation decay consists of a dominant ultra-fast component of 0.368 ns and minor slower components of 8.2 and 182 ns. Such a fast component is superior to that of most commercial inorganic scintillators. In contrast to most semiconductor crystals prepared by solution methods, such as ZnO, β-Ga2O3 single crystals can be grown by the traditional melt-growth method. Thus large bulk crystals can be obtained easily, which favors mass production.
Bayesian nonparametric generative models for causal inference with missing at random covariates.

Science.gov (United States)

Roy, Jason; Lum, Kirsten J; Zeldow, Bret; Dworkin, Jordan D; Re, Vincent Lo; Daniels, Michael J

2018-03-26

We propose a general Bayesian nonparametric (BNP) approach to causal inference in the point treatment setting. The joint distribution of the observed data (outcome, treatment, and confounders) is modeled using an enriched Dirichlet process. The combination of the observed data model and causal assumptions allows us to identify any type of causal effect-differences, ratios, or quantile effects, either marginally or for subpopulations of interest. The proposed BNP model is well-suited for causal inference problems, as it does not require parametric assumptions about the distribution of confounders and naturally leads to a computationally efficient Gibbs sampling algorithm. By flexibly modeling the joint distribution, we are also able to impute (via data augmentation) values for missing covariates within the algorithm under an assumption of ignorable missingness, obviating the need to create separate imputed data sets. This approach for imputing the missing covariates has the additional advantage of guaranteeing congeniality between the imputation model and the analysis model, and because we use a BNP approach, parametric models are avoided for imputation. The performance of the method is assessed using simulation studies. The method is applied to data from a cohort study of human immunodeficiency virus/hepatitis C virus co-infected patients. © 2018, The International Biometric Society.
Growth of Cd0.96Zn0.04Te single crystals by vapor phase gas transport method

Directory of Open Access Journals (Sweden)

S. H. Tabatabai Yazdi

2006-03-01

Full Text Available Cd0.96Zn0.04Te crystals were grown using the vapor phase gas transport (VPGT) method. The results show that dendritic crystals with grain size up to 3.5 mm can be grown with this technique. X-ray diffraction and Laue back-reflection patterns show that the dendritic crystals are single-phase, with single crystal grains randomly oriented with respect to the gas-transport axis. Electrical measurements, carried out using the Van der Pauw method, show that the as-grown crystals have resistivity of about 104 Ω cm and n-type conductivity.
An equivalent ground thermal test method for single-phase fluid loop space radiator

Directory of Open Access Journals (Sweden)

Xianwen Ning

2015-02-01

Full Text Available The thermal vacuum test is widely used for the ground validation of spacecraft thermal control systems. However, conduction and convection can be simulated completely in a normal ground pressure environment. With the adoption of pumped fluid loop thermal control technology on spacecraft, conduction and convection become the main heat transfer behavior between the radiator and the inside of the cabin. As long as the heat transfer behavior between the radiator and outer space can be equivalently simulated at normal pressure, the thermal vacuum test can be substituted by a normal ground pressure thermal test. In this paper, an equivalent normal pressure thermal test method for the spacecraft single-phase fluid loop radiator is proposed. The heat radiation between the radiator and outer space has been equivalently simulated by a combination of a group of refrigerators and a thermal electrical cooler (TEC) array. By adjusting the heat rejection of each device, the relationship between heat flux and surface temperature of the radiator can be maintained. To verify this method, a validating system has been built and experiments have been carried out. The results indicate that the proposed equivalent ground thermal test method can simulate the heat rejection performance of the radiator correctly, and the temperature error between the in-orbit theoretical value and the experimental result for the radiator is less than 0.5 °C, except for the equipment startup period. This provides a potential method for the thermal test of space systems, especially for extra-large spacecraft that employ a single-phase fluid loop radiator as the thermal control approach.
Support for calcium channel gene defects in autism spectrum disorders

Directory of Open Access Journals (Sweden)

Lu Ake Tzu-Hui

2012-12-01

Full Text Available Abstract Background Alternation of synaptic homeostasis is a biological process whose disruption might predispose children to autism spectrum disorders (ASD). Calcium channel genes (CCG) contribute to modulating neuronal function and evidence implicating CCG in ASD has been accumulating. We conducted a targeted association analysis of CCG using existing genome-wide association study (GWAS) data and imputation methods in a combined sample of parent/affected child trios from two ASD family collections to explore this hypothesis. Methods A total of 2,176 single-nucleotide polymorphisms (SNPs; 703 genotyped and 1,473 imputed) covering the genes that encode the α1 subunit proteins of 10 calcium channels were tested for association with ASD in a combined sample of 2,781 parent/affected child trios from 543 multiplex Caucasian ASD families from the Autism Genetics Resource Exchange (AGRE) and 1,651 multiplex and simplex Caucasian ASD families from the Autism Genome Project (AGP). SNP imputation using IMPUTE2 and a combined reference panel from the HapMap3 and the 1,000 Genomes Project increased coverage density of the CCG. Family-based association was tested using the FBAT software, which controls for population stratification and accounts for the non-independence of siblings within multiplex families. The level of significance for association was set at 2.3E-05, providing a Bonferroni correction for this targeted 10-gene panel. Results Four SNPs in three CCGs were associated with ASD. One, rs10848653, is located in CACNA1C, a gene in which rare de novo mutations are responsible for Timothy syndrome, a Mendelian disorder that features ASD. Two others, rs198538 and rs198545, located in CACNA1G, and a fourth, rs5750860, located in CACNA1I, are in CCGs that encode T-type calcium channels, genes with previous ASD associations. Conclusions These associations support a role for common CCG SNPs in ASD.
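
The significance level quoted in this abstract is consistent with a Bonferroni correction over the 2,176 tested SNPs. The short Python check below reproduces the arithmetic; the family-wise level of 0.05 is an assumption, since the abstract does not state it, and `bonferroni_threshold` is a name introduced here for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# 703 genotyped + 1,473 imputed SNPs = 2,176 tests; alpha = 0.05 is assumed.
threshold = bonferroni_threshold(0.05, 703 + 1473)
print(f"{threshold:.1e}")  # prints 2.3e-05
```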
PbO networks composed of single crystalline nanosheets synthesized by a facile chemical precipitation method

Energy Technology Data Exchange (ETDEWEB)

Samberg, Joshua P. [Department of Materials Science and Engineering, North Carolina State University, 911 Partners Way, Engineering Building I, Raleigh, NC 27695-7907 (United States); Kajbafvala, Amir, E-mail: amir.kajbafvala@gmail.com [Department of Materials Science and Engineering, North Carolina State University, 911 Partners Way, Engineering Building I, Raleigh, NC 27695-7907 (United States); Koolivand, Amir [Department of Chemistry, North Carolina State University, 2620 Yarbrough Drive, Raleigh, NC 27695 (United States)

2014-03-01

Highlights: • Synthesis of PbO networks through a simple chemical precipitation route. • The synthesis method is rapid and low-cost. • Each network is composed of single crystalline PbO nanosheets. • A possible growth mechanism is proposed for synthesized PbO networks. - Abstract: For the field of energy storage, nanostructured lead oxide (PbO) shows immense potential for increased specific energy and deep discharge for lead acid battery technologies. In this work, PbO networks composed of single crystalline nanosheets were synthesized utilizing a simple, low cost and rapid chemical precipitation method. The PbO networks were prepared in a single reaction vessel from starting reagents of lead acetate dehydrate, ammonium hydroxide and deionized water. Lead acetate dehydrate was chosen as a reagent, as opposed to lead nitrate, to eliminate the possibility of nitrate contamination of the final product. X-ray diffraction (XRD) analysis, high resolution scanning electron microscopy (HRSEM) and high resolution transmission electron microscopy (HRTEM) analysis were used to characterize the synthesized PbO networks. The reproducible method described herein synthesized pure β-PbO (massicot) powders, with no byproducts. A possible formation mechanism for these PbO networks is proposed. The growth is found to proceed predominately in the 〈1 1 1〉 and 〈2 0 0〉 directions while being limited in the 〈0 1 1〉 direction.
PbO networks composed of single crystalline nanosheets synthesized by a facile chemical precipitation method

International Nuclear Information System (INIS)

Samberg, Joshua P.; Kajbafvala, Amir; Koolivand, Amir

2014-01-01

Highlights: • Synthesis of PbO networks through a simple chemical precipitation route. • The synthesis method is rapid and low-cost. • Each network is composed of single crystalline PbO nanosheets. • A possible growth mechanism is proposed for synthesized PbO networks. - Abstract: For the field of energy storage, nanostructured lead oxide (PbO) shows immense potential for increased specific energy and deep discharge for lead acid battery technologies. In this work, PbO networks composed of single crystalline nanosheets were synthesized utilizing a simple, low cost and rapid chemical precipitation method. The PbO networks were prepared in a single reaction vessel from starting reagents of lead acetate dehydrate, ammonium hydroxide and deionized water. Lead acetate dehydrate was chosen as a reagent, as opposed to lead nitrate, to eliminate the possibility of nitrate contamination of the final product. X-ray diffraction (XRD) analysis, high resolution scanning electron microscopy (HRSEM) and high resolution transmission electron microscopy (HRTEM) analysis were used to characterize the synthesized PbO networks. The reproducible method described herein synthesized pure β-PbO (massicot) powders, with no byproducts. A possible formation mechanism for these PbO networks is proposed. The growth is found to proceed predominately in the 〈1 1 1〉 and 〈2 0 0〉 directions while being limited in the 〈0 1 1〉 direction.
Radiation Doses to Members of the U.S. Population from Ubiquitous Radionuclides in the Body: Part 2, Methods and Dose Calculations

International Nuclear Information System (INIS)

Watson, David J.; Strom, Daniel J.

2011-01-01

This paper is part two of a three-part series investigating annual effective doses to residents of the United States from intakes of ubiquitous radionuclides, including radionuclides occurring naturally, radionuclides whose concentrations are technologically enhanced, and anthropogenic radionuclides. This series of papers explicitly excludes intakes from inhaling 222Rn, 220Rn, and their short-lived decay products; it also excludes intakes of radionuclides in occupational and medical settings. Part one reviewed, summarized, characterized, and grouped all published and some unpublished data for U.S. residents on ubiquitous radionuclide concentrations in tissues and organs. Assumptions about equilibrium with long-lived parents are made for the 28 other radionuclides in these series lacking data. This paper describes the methods developed to group the collected data into source regions described in the Radiation Dose Assessment Resource (RADAR) dosimetric methodology. Methods for converting the various units of data published over 50 years into a standard form are developed and described. Often, meaningful values of uncertainty of measurements were not published so that variability in data sets is confounded with measurement uncertainty. A description of the methods developed to estimate variability is included in this paper. The data described in part one are grouped by gender and age to match the RADAR dosimetric phantoms. Within these phantoms, concentration values are grouped into source tissue regions by radionuclide, and they are imputed for source regions lacking tissue data. Radionuclide concentrations are then imputed for other phantoms source regions with missing concentration values, and the uncertainties of the imputed values are increased. The content concentrations of hollow organs are calculated, and activities are apportioned to the bone source regions using assumptions about each radionuclide's bone-seeking behavior. The data sets are then ready to be
Organometallic halide perovskite single crystals having low defect density and methods of preparation thereof

KAUST Repository

Bakr, Osman M.

2016-02-18

The present disclosure presents a method of making single-crystal organometallic halide perovskites with the formula AMX3, wherein A is an organic cation, M is selected from the group consisting of Pb, Sn, Cu, Ni, Co, Fe, Mn, Pd, Cd, Ge, and Eu, and X is a halide. The method comprises the use of two reservoirs containing different precursors and allowing vapor diffusion from one reservoir to the other. A solar cell comprising said crystal is also disclosed.
A fast and reliable method for simultaneous waveform, amplitude and latency estimation of single-trial EEG/MEG data.

Directory of Open Access Journals (Sweden)

Wouter D Weeda

Full Text Available The amplitude and latency of single-trial EEG/MEG signals may provide valuable information concerning human brain functioning. In this article we propose a new method to reliably estimate single-trial amplitude and latency of EEG/MEG signals. The advantages of the method are fourfold. First, no a priori specified template function is required. Second, the method allows for multiple signals that may vary independently in amplitude and/or latency. Third, the method is less sensitive to noise as it models data with a parsimonious set of basis functions. Finally, the method is very fast since it is based on an iterative linear least squares algorithm. A simulation study shows that the method yields reliable estimates under different levels of latency variation and signal-to-noise ratios. Furthermore, it shows that the existence of multiple signals can be correctly determined. An application to empirical data from a choice reaction time study indicates that the method describes these data accurately.
Advanced x-ray stress analysis method for a single crystal using different diffraction plane families

International Nuclear Information System (INIS)

Imafuku, Muneyuki; Suzuki, Hiroshi; Sueyoshi, Kazuyuki; Akita, Koichi; Ohya, Shin-ichi

2008-01-01

A generalized formula for x-ray stress analysis of a single crystal with an unknown stress-free lattice parameter was proposed. This method enables us to evaluate plane stress states with any combination of diffraction planes. We can choose and combine the appropriate x-ray sources and diffraction plane families, depending on the sample orientation and the apparatus, whenever the diffraction condition is satisfied. The analysis of plane stress distributions in an iron single crystal was demonstrated by combining the diffraction data for the Fe{211} and Fe{310} plane families.
Accuracy of effective dose estimation in personal dosimetry: a comparison between single-badge and double-badge methods and the MOSFET method.

Science.gov (United States)

Januzis, Natalie; Belley, Matthew D; Nguyen, Giao; Toncheva, Greta; Lowry, Carolyn; Miller, Michael J; Smith, Tony P; Yoshizumi, Terry T

2014-05-01

The purpose of this study was three-fold: (1) to measure the transmission properties of various lead shielding materials, (2) to benchmark the accuracy of commercial film badge readings, and (3) to compare the accuracy of effective dose (ED) conversion factors (CF) of the U.S. Nuclear Regulatory Commission methods to the MOSFET method. The transmission properties of lead aprons and the accuracy of film badges were studied using an ion chamber and monitor. ED was determined using an adult male anthropomorphic phantom that was loaded with 20 diagnostic MOSFET detectors and scanned with a whole body CT protocol at 80, 100, and 120 kVp. One commercial film badge was placed at the collar and one at the waist. Individual organ doses and waist badge readings were corrected for lead apron attenuation. ED was computed using ICRP 103 tissue weighting factors, and ED CFs were calculated by taking the ratio of ED and badge reading. The measured single badge CFs were 0.01 (±14.9%), 0.02 (±9.49%), and 0.04 (±15.7%) for 80, 100, and 120 kVp, respectively. Current regulatory ED CF for the single badge method is 0.3; for the double-badge system, they are 0.04 (collar) and 1.5 (under lead apron at the waist). The double-badge system provides a better coefficient for the collar at 0.04; however, exposure readings under the apron are usually negligible to zero. Based on these findings, the authors recommend the use of ED CF of 0.01 for the single badge system from 80 kVp (effective energy 50.4 keV) data.
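
The calculation outlined in this abstract (effective dose from organ doses with ICRP 103 tissue weighting factors, then a conversion factor as the ratio of effective dose to badge reading) can be sketched in Python as follows. The tissue weights are the published ICRP 103 values; the function names and the example numbers are hypothetical and are not taken from the study.

```python
# ICRP Publication 103 tissue weighting factors (they sum to 1.0).
W_T = {
    "red_bone_marrow": 0.12, "colon": 0.12, "lung": 0.12,
    "stomach": 0.12, "breast": 0.12, "remainder": 0.12,
    "gonads": 0.08,
    "bladder": 0.04, "oesophagus": 0.04, "liver": 0.04, "thyroid": 0.04,
    "bone_surface": 0.01, "brain": 0.01, "salivary_glands": 0.01, "skin": 0.01,
}


def effective_dose(organ_doses):
    """Weighted sum of organ equivalent doses; tissues not supplied count as zero."""
    return sum(w * organ_doses.get(tissue, 0.0) for tissue, w in W_T.items())


def conversion_factor(ed, badge_reading):
    """ED conversion factor: effective dose divided by the badge reading (same units)."""
    if badge_reading <= 0:
        raise ValueError("badge reading must be positive")
    return ed / badge_reading


# Made-up numbers for illustration only: ED of 0.5 mSv and a collar badge reading of 50 mSv.
example_cf = conversion_factor(0.5, 50.0)
```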
Combustion Model and Control Parameter Optimization Methods for Single Cylinder Diesel Engine

Directory of Open Access Journals (Sweden)

Bambang Wahono

2014-01-01

Full Text Available This research presents a method to construct a combustion model and a method to optimize some control parameters of a diesel engine in order to develop a model-based control system. The purpose of the model is to manage control parameters appropriately so as to obtain target values of fuel consumption and emissions as the engine outputs. A stepwise method that accounts for multicollinearity was applied to construct the combustion model as a polynomial model. Using experimental data from a single-cylinder diesel engine, models of power, BSFC, NOx, and soot for multiple-injection diesel engines were built. The proposed method successfully developed a model that describes the control parameters in relation to the engine outputs. Although many control devices can be mounted on a diesel engine, an optimization technique is required to use this method to find optimal engine operating conditions efficiently, alongside the existing development of individual emission control methods. Particle swarm optimization (PSO) was used to calculate control parameters that optimize fuel consumption and emissions based on the model. The proposed method is able to calculate control parameters efficiently to optimize the evaluation items based on the model. Finally, the model combined with PSO was compiled for a microcontroller.
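
For readers unfamiliar with particle swarm optimization, the following Python sketch shows a minimal global-best PSO of the kind this abstract refers to. It is not the authors' implementation: the inertia and acceleration settings are common textbook defaults, and the quadratic `toy_cost` is a made-up stand-in for a fitted polynomial engine model.

```python
import numpy as np


def pso_minimize(cost, lower, upper, n_particles=30, n_iter=200,
                 w=0.73, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best particle swarm optimizer over box bounds."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                  # best position per particle
    pbest_cost = np.array([cost(p) for p in pos])
    g = np.argmin(pbest_cost)
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]  # best position of the swarm

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Inertia + pull toward personal best + pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)
        costs = np.array([cost(p) for p in pos])

        improved = costs < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        g = np.argmin(pbest_cost)
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    return gbest, gbest_cost


def toy_cost(x):
    """Hypothetical objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best, best_cost = pso_minimize(toy_cost, [-5.0, -5.0], [5.0, 5.0])
```

In a setting like the one described, `cost` would combine the model's predicted fuel consumption and emissions into a single score, and the bounds would be the admissible ranges of the control parameters.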
Single-sided NMR

CERN Document Server

Casanova, Federico; Blümich, Bernhard

2011-01-01

Single-Sided NMR describes the design of the first functioning single-sided tomograph, the related measurement methods, and a number of applications. One of the key advantages to this method is the speed at which the images are obtained.
Single-Camera-Based Method for Step Length Symmetry Measurement in Unconstrained Elderly Home Monitoring.

Science.gov (United States)

Cai, Xi; Han, Guang; Song, Xin; Wang, Jinkuan

2017-11-01

Single-camera-based gait monitoring is an unobtrusive, inexpensive, and easy-to-use way to monitor the daily gait of seniors in their homes. However, most studies require subjects to walk perpendicularly to the camera's optical axis or along some specified routes, which limits its application in elderly home monitoring. To build unconstrained monitoring environments, we propose a method to measure the step length symmetry ratio (a useful gait parameter representing gait symmetry without a significant relationship with age) from unconstrained straight walking using a single camera, without strict restrictions on walking directions or routes. According to projective geometry theory, we first develop a calculation formula for the step length ratio for the case of unconstrained straight-line walking. Then, to adapt to general cases, we propose to modify noncollinear footprints, and accordingly provide a general procedure for step length ratio extraction from unconstrained straight walking. Our method achieves a mean absolute percentage error (MAPE) of 1.9547% for 15 subjects' normal and abnormal side-view gaits, and also obtains satisfactory MAPEs for non-side-view gaits (2.4026% for 45°-view gaits and 3.9721% for 30°-view gaits). The performance is much better than that of a well-established monocular gait measurement system suitable only for side-view gaits, which has a MAPE of 3.5538%. Independently of walking directions, our method can accurately estimate step length ratios from unconstrained straight walking. This demonstrates that our method is applicable to elders' daily gait monitoring and can provide valuable information for elderly health care, such as abnormal gait recognition, fall risk assessment, etc.
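
The accuracy figures in this abstract are mean absolute percentage errors. A minimal Python version of that metric is given below; the function name and the sample step length ratios are hypothetical and only illustrate how such a figure is computed.

```python
def mape(true_values, estimates):
    """Mean absolute percentage error, in percent."""
    if len(true_values) != len(estimates) or len(true_values) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((e - t) / t) for t, e in zip(true_values, estimates))
    return 100.0 * total / len(true_values)


# Made-up reference and estimated step length ratios, for illustration only.
reference = [1.00, 1.20, 0.90]
estimated = [1.02, 1.17, 0.90]
error_percent = mape(reference, estimated)
```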
A novel method for preparing vertically grown single-crystalline gold nanowires

International Nuclear Information System (INIS)

Tung, H-T; Nien, Y-T; Chen, I-G; Song, J-M

2008-01-01

A surfactant-free, template-less and seed-less method, namely the thermal-assisted photoreduction (TAP) process, has been developed to synthesize vertically grown Au nanowires (30-80 nm in diameter and about 2 μm in length) on the surface of thin film titanium dioxide (TiO2), which is locally excited by blackbody radiation. The Au nanowires thus produced are single-crystalline with a preferred [11̄0] growth direction. The electrical behavior investigated using a nanomanipulation device indicates that the Au nanowires possess an excellent electrical resistivity of about 3.49 × 10^-8 Ω m.
Single-step electrochemical method for producing very sharp Au scanning tunneling microscopy tips

International Nuclear Information System (INIS)

Gingery, David; Buehlmann, Philippe

2007-01-01

A single-step electrochemical method for making sharp gold scanning tunneling microscopy tips is described. 3.0M NaCl in 1% perchloric acid is compared to several previously reported etchants. The addition of perchloric acid to sodium chloride solutions drastically shortens etching times and is shown by transmission electron microscopy to produce very sharp tips with a mean radius of curvature of 15 nm
Rapid, single-step most-probable-number method for enumerating fecal coliforms in effluents from sewage treatment plants

Science.gov (United States)

Munoz, E. F.; Silverman, M. P.

1979-01-01

A single-step most-probable-number method for determining the number of fecal coliform bacteria present in sewage treatment plant effluents is discussed. A single growth medium based on that of Reasoner et al. (1976) and consisting of 5.0 g proteose peptone, 3.0 g yeast extract, 10.0 g lactose, 7.5 g NaCl, 0.2 g sodium lauryl sulfate, and 0.1 g sodium desoxycholate per liter is used. The pH is adjusted to 6.5, and samples are incubated at 44.5 deg C. Bacterial growth is detected either by measuring the increase with time in the electrical impedance ratio between the inoculated sample vial and an uninoculated reference vial or by visual examination for turbidity. Results obtained by the single-step method for chlorinated and unchlorinated effluent samples are in excellent agreement with those obtained by the standard method. It is suggested that in automated treatment plants impedance ratio data could be automatically matched by computer programs with the appropriate dilution factors and most probable number tables already in the computer memory, with the corresponding result displayed as fecal coliforms per 100 ml of effluent.
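
The most probable number tables mentioned in this abstract are tabulated maximum-likelihood estimates, so the same value can be computed directly. The Python sketch below solves the standard MPN likelihood equation by bisection; the function name, the three-dilution layout, and the example tube counts are assumptions for illustration and are not taken from the paper.

```python
import math


def mpn_per_ml(positives, tubes, volumes_ml):
    """Maximum-likelihood most probable number (organisms per mL).

    positives[i] of tubes[i] vials, each inoculated with volumes_ml[i] of sample,
    showed growth. Solves sum(g*v / (1 - exp(-lam*v))) = sum(n*v) for lam.
    """
    if sum(positives) == 0:
        return 0.0
    if all(g == n for g, n in zip(positives, tubes)):
        raise ValueError("all vials positive: the estimate is unbounded")
    total = sum(n * v for n, v in zip(tubes, volumes_ml))

    def score(lam):
        # Decreasing in lam; positive when lam is too small.
        lhs = sum(g * v / -math.expm1(-lam * v)
                  for g, v in zip(positives, volumes_ml) if g)
        return lhs - total

    lo, hi = 1e-9, 1e6
    for _ in range(200):          # bisection on a logarithmic scale
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical result: 3 vials each at 10, 1 and 0.1 mL, with 3, 1 and 0 positive.
mpn_per_100_ml = 100.0 * mpn_per_ml([3, 1, 0], [3, 3, 3], [10.0, 1.0, 0.1])
```

For this 3-1-0 pattern the estimate comes out close to the value of 43 per 100 mL found in standard three-tube tables.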
Stability analysis of single-phase thermosyphon loops by finite difference numerical methods

International Nuclear Information System (INIS)

Ambrosini, W.

1998-01-01

In this paper, examples of the application of finite difference numerical methods to the stability analysis of single-phase natural circulation loops are reported. The problem is addressed here because of its relevance to thermal-hydraulic system code applications, with the aim of pointing out the effect of truncation error on stability prediction. The methodology adopted for systematically analysing the effect of various finite difference discretizations can be considered the numerical analogue of the usual techniques adopted for PDE stability analysis. Three different single-phase loop configurations are considered, involving various kinds of boundary conditions. In one of these cases, an original dimensionless form of the governing equations is proposed, adopting the Reynolds number as a flow variable. This allows for an appropriate consideration of the transition between laminar and turbulent regimes, which is not possible with other dimensionless forms, thus enlarging the field of validity of the model assumptions. (author). 14 refs., 8 figs
Methods to control for unmeasured confounding in pharmacoepidemiology: an overview.

Science.gov (United States)

Uddin, Md Jamal; Groenwold, Rolf H H; Ali, Mohammed Sanni; de Boer, Anthonius; Roes, Kit C B; Chowdhury, Muhammad A B; Klungel, Olaf H

2016-06-01

Background Unmeasured confounding is one of the principal problems in pharmacoepidemiologic studies. Several methods have been proposed to detect or control for unmeasured confounding either at the study design phase or the data analysis phase. Aim of the Review To provide an overview of commonly used methods to detect or control for unmeasured confounding and to provide recommendations for proper application in pharmacoepidemiology. Methods/Results Methods to control for unmeasured confounding in the design phase of a study are case only designs (e.g., case-crossover, case-time control, self-controlled case series) and the prior event rate ratio adjustment method. Methods that can be applied in the data analysis phase include, negative control method, perturbation variable method, instrumental variable methods, sensitivity analysis, and ecological analysis. A separate group of methods are those in which additional information on confounders is collected from a substudy. The latter group includes external adjustment, propensity score calibration, two-stage sampling, and multiple imputation. Conclusion As the performance and application of the methods to handle unmeasured confounding may differ across studies and across databases, we stress the importance of using both statistical evidence and substantial clinical knowledge for interpretation of the study results.

Multiple-Trait Genomic Selection Methods Increase Genetic Value Prediction Accuracy

Science.gov (United States)

Jia, Yi; Jannink, Jean-Luc

2012-01-01

Genetic correlations between quantitative traits measured in many breeding programs are pervasive. These correlations indicate that measurements of one trait carry information on other traits. Current single-trait (univariate) genomic selection does not take advantage of this information. Multivariate genomic selection on multiple traits could accomplish this but has been little explored and tested in practical breeding programs. In this study, three multivariate linear models (i.e., GBLUP, BayesA, and BayesCπ) were presented and compared to univariate models using simulated and real quantitative traits controlled by different genetic architectures. We also extended BayesA with fixed hyperparameters to a full hierarchical model that estimated hyperparameters and BayesCπ to impute missing phenotypes. We found that optimal marker-effect variance priors depended on the genetic architecture of the trait so that estimating them was beneficial. We showed that the prediction accuracy for a low-heritability trait could be significantly increased by multivariate genomic selection when a correlated high-heritability trait was available. Further, multiple-trait genomic selection had higher prediction accuracy than single-trait genomic selection when phenotypes are not available on all individuals and traits. Additional factors affecting the performance of multiple-trait genomic selection were explored. PMID:23086217
Time Series Forecasting with Missing Values

OpenAIRE

Shin-Fu Wu; Chia-Yung Chang; Shie-Jue Lee

2015-01-01

Time series prediction has become more popular in various kinds of applications such as weather prediction, control engineering, financial analysis, industrial monitoring, etc. To deal with real-world problems, we are often faced with missing values in the data due to sensor malfunctions or human errors. Traditionally, the missing values are simply omitted or replaced by means of imputation methods. However, omitting those missing values may cause temporal discontinuity. Imputation methods, o...
Non-metal single/dual doped carbon quantum dots: a general flame synthetic method and electro-catalytic properties

Science.gov (United States)

Han, Yuzhi; Tang, Di; Yang, Yanmei; Li, Chuanxi; Kong, Weiqian; Huang, Hui; Liu, Yang; Kang, Zhenhui

2015-03-01

A combustion flame method is developed for the convenient and scalable fabrication of single- and dual-doped carbon quantum dots (CQDs) (N-CQDs, B-CQDs, P-CQDs, and S-CQDs and dual-doped B,N-CQDs, P,N-CQDs, and S,N-CQDs), and the doping contents can be easily adjusted by simply changing the concentrations of precursors in ethanol. These single/dual-doped CQDs, especially B,N-CQDs, show high catalytic activities for the oxygen reduction reaction. Electronic supplementary information (ESI) available: TEM images, UV-Vis absorption, PL, Raman, FTIR, XPS, CV, and LSV data of single/dual doped CQDs, a table for the calculated mass concentrations of different atoms in various B, N, P or S containing CQDs and a table for summary of the ORR performance of various catalysts in an O2-saturated 0.1 M KOH solution. See DOI: 10.1039/c4nr07116f
Statistical noise with the weighted backprojection method for single photon emission computed tomography

International Nuclear Information System (INIS)

Murayama, Hideo; Tanaka, Eiichi; Toyama, Hinako.

1985-01-01

The weighted backprojection (WBP) method and the radial post-correction (RPC) method were compared with several other attenuation correction methods for single photon emission computed tomography by computer simulation. These methods are the pre-correction method with arithmetic means of opposing projections, the post-correction method with a correction matrix, and the inverse attenuated Radon transform method. Statistical mean square noise in a reconstructed image was formulated, and was displayed two-dimensionally for typical simulated phantoms. The noise image for the WBP method was dependent on several parameters, namely, the size of the attenuating object, the distribution of activity, the attenuation coefficient, the choice of the reconstruction index k, and the position of the reconstruction origin. The noise image for the WBP method with k=0 was almost the same as that for the RPC method. It has been shown that the position of the reconstruction origin has to be chosen appropriately in order to improve the noise properties of the reconstructed image for the WBP method as well as the RPC method. Comparison of the different attenuation correction methods, using both the reconstructed images and the statistical noise images with the same mathematical phantom and convolving function, led to the conclusion that the WBP method and the RPC method were more amenable to any radioisotope distribution than the other methods, and had the advantage of flexibility in improving image noise at any local position. (author)
Comparison of Three Methods Estimating Baseline Creatinine For Acute Kidney Injury in Hospitalized Patients: a Multicentre Survey in Third-Level Urban Hospitals of China

Directory of Open Access Journals (Sweden)

Xia-bing Lang

2018-02-01

Full Text Available Background/Aims: A lack of baseline serum creatinine (SCr) data leads to underestimation of the burden caused by acute kidney injury (AKI) in developing countries. The goal of this study was to investigate the effects of various baseline SCr analysis methods on the current diagnosis of AKI in hospitalized patients. Methods: Patients with at least one SCr value during their hospital stay between January 1, 2011 and December 31, 2012 were retrospectively included in the study. The baseline SCr was determined either by the minimum SCr (SCrMIN) or the estimated SCr using the MDRD formula (SCrGFR-75). We also used the dynamic baseline SCr (SCrdynamic) in accordance with the 7 day/48 hour time window. AKI was defined based on the KDIGO SCr criteria. Results: Of 562,733 hospitalized patients, 350,458 (62.3%) had at least one SCr determination, and 146,185 (26.0%) had repeat SCr tests. AKI was diagnosed in 13,883 (2.5%) patients using the SCrMIN, 21,281 (3.8%) using the SCrGFR-75 and 9,288 (1.7%) using the SCrdynamic. Compared with the non-AKI patients, AKI patients had a higher in-hospital mortality rate regardless of the baseline SCr analysis method. Conclusions: Because of the scarcity of SCr data, imputation of the baseline SCr is necessary to remedy the missing data. The detection rate of AKI varies depending on the different imputation methods. SCrGFR-75 can identify more AKI cases than the other two methods.
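The SCrGFR-75 baseline mentioned in this record can be illustrated by solving the MDRD equation for SCr at an assumed eGFR of 75 mL/min/1.73 m². The following is only a minimal sketch, not the study's code: the function names are hypothetical, the coefficient 175 is the IDMS-traceable 4-variable MDRD constant (the abstract does not say which MDRD variant was used), and only the KDIGO ratio criterion (SCr at least 1.5 times baseline) is shown, without the 0.3 mg/dL rise or the time windows.

```python
def mdrd_egfr(scr_mg_dl, age_years, female, black=False, coeff=175.0):
    """4-variable MDRD estimate of GFR in mL/min/1.73 m^2.

    coeff=175.0 is an assumption (IDMS-traceable form); the original
    MDRD equation uses 186.
    """
    egfr = coeff * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def baseline_scr_mdrd(age_years, female, black=False, egfr=75.0, coeff=175.0):
    """Back-calculate the SCr (mg/dL) that yields the assumed eGFR."""
    factor = coeff * age_years ** -0.203
    if female:
        factor *= 0.742
    if black:
        factor *= 1.212
    # Invert egfr = factor * scr ** -1.154 for scr.
    return (egfr / factor) ** (-1.0 / 1.154)


def meets_kdigo_ratio_criterion(baseline_scr, observed_scr):
    """KDIGO ratio criterion only: observed SCr >= 1.5 x baseline."""
    return observed_scr >= 1.5 * baseline_scr


if __name__ == "__main__":
    baseline = baseline_scr_mdrd(age_years=60, female=False)
    print(f"imputed baseline SCr: {baseline:.2f} mg/dL")
    print("AKI by ratio criterion:", meets_kdigo_ratio_criterion(baseline, 1.8))
```

Because this baseline is imputed from age and sex alone, it tends to sit below the true baseline of patients with pre-existing kidney disease, which is one plausible reason an approach of this kind flags more AKI cases than the other two definitions.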
Accuracy of Dual-Energy Virtual Monochromatic CT Numbers: Comparison between the Single-Source Projection-Based and Dual-Source Image-Based Methods.

Science.gov (United States)

Ueguchi, Takashi; Ogihara, Ryota; Yamada, Sachiko

2018-03-21

To investigate the accuracy of dual-energy virtual monochromatic computed tomography (CT) numbers obtained by two typical hardware and software implementations: the single-source projection-based method and the dual-source image-based method. A phantom with different tissue equivalent inserts was scanned with both single-source and dual-source scanners. A fast kVp-switching feature was used on the single-source scanner, whereas a tin filter was used on the dual-source scanner. Virtual monochromatic CT images of the phantom at energy levels of 60, 100, and 140 keV were obtained by both projection-based (on the single-source scanner) and image-based (on the dual-source scanner) methods. The accuracy of virtual monochromatic CT numbers for all inserts was assessed by comparing measured values to their corresponding true values. Linear regression analysis was performed to evaluate the dependency of measured CT numbers on tissue attenuation, method, and their interaction. Root mean square values of systematic error over all inserts at 60, 100, and 140 keV were approximately 53, 21, and 29 Hounsfield units (HU) with the single-source projection-based method, and 46, 7, and 6 HU with the dual-source image-based method, respectively. Linear regression analysis revealed that the interaction between the attenuation and the method had a statistically significant effect on the measured CT numbers at 100 and 140 keV. There were attenuation-, method-, and energy level-dependent systematic errors in the measured virtual monochromatic CT numbers. CT number reproducibility was comparable between the two scanners, and CT numbers had better accuracy with the dual-source image-based method at 100 and 140 keV. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
An Improved Fuzzy Based Missing Value Estimation in DNA Microarray Validated by Gene Ranking

Directory of Open Access Journals (Sweden)

Sujay Saha

2016-01-01

Full Text Available Most gene expression data analysis algorithms require the entire gene expression matrix without any missing values. Hence, it is necessary to devise methods that impute missing data values accurately. A number of imputation algorithms exist to estimate those missing values. This work starts with a microarray dataset containing multiple missing values. We first apply a modified version of the existing fuzzy theory based method LRFDVImpute to impute multiple missing values of time series gene expression data and then validate the result of imputation by a genetic algorithm (GA) based gene ranking methodology along with some regular statistical validation techniques, such as the RMSE method. To the best of our knowledge, gene ranking has not yet been used to validate the result of missing value estimation. Firstly, the proposed method has been tested on the very popular Spellman dataset, and results show that error margins have been drastically reduced compared to some previous works, which indirectly validates the statistical significance of the proposed method. Then it has been applied to four other 2-class benchmark datasets, such as the Colorectal Cancer tumours dataset (GDS4382), the Breast Cancer dataset (GSE349-350), the Prostate Cancer dataset, and DLBCL-FL (Leukaemia), for both missing value estimation and ranking the genes, and the results show that the proposed method can reach 100% classification accuracy with very few dominant genes, which indirectly validates the biological significance of the proposed method.
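The RMSE validation mentioned in this record is commonly computed only over the entries that were masked and then imputed. A minimal sketch of that calculation follows, assuming a small complete matrix with a known mask; the function name and the toy values are hypothetical and are not taken from the paper.

```python
import math


def imputation_rmse(truth, imputed, missing_mask):
    """RMSE between true and imputed values, restricted to masked entries.

    truth, imputed: row-major lists of lists with the same shape.
    missing_mask: same shape, True where a value was removed and imputed.
    """
    squared_errors = [
        (t - x) ** 2
        for truth_row, imputed_row, mask_row in zip(truth, imputed, missing_mask)
        for t, x, was_missing in zip(truth_row, imputed_row, mask_row)
        if was_missing
    ]
    if not squared_errors:
        raise ValueError("mask selects no entries")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    truth = [[1.0, 2.0], [3.0, 4.0]]          # hypothetical expression values
    imputed = [[1.0, 2.5], [2.0, 4.0]]        # hypothetical imputation output
    mask = [[False, True], [True, False]]     # entries that were masked
    print(imputation_rmse(truth, imputed, mask))
```

Restricting the error to masked entries keeps the observed values, which every method reproduces exactly, from diluting the comparison between imputation methods.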
Optical properties of Ni-doped MgGa2O4 single crystals grown by floating zone method

International Nuclear Information System (INIS)

Suzuki, Takenobu; Hughes, Mark; Ohishi, Yasutake

2010-01-01

The single crystal growth conditions and spectroscopic characterization of Ni-doped MgGa2O4 of the inverse-spinel crystal family are described. Single crystals of this material have been grown by the floating zone method. Ni-doped MgGa2O4 single crystals have broadband fluorescence in the 1100-1600 nm wavelength range, a 1.6 ms room temperature lifetime, 56% quantum efficiency and a 1.05 × 10⁻²¹ cm² stimulated emission cross section at the emission peak. This new material is very promising for tunable laser applications covering the important optical communication and eye safe wavelength region.
A neutral glyoxal gel electrophoresis method for the detection and semi-quantitation of DNA single-strand breaks.

Science.gov (United States)

Pachkowski, Brian; Nakamura, Jun

2013-01-01

Single-strand breaks are among the most prevalent lesions found in DNA. Traditional electrophoretic methods (e.g., the Comet assay) used for investigating these lesions rely on alkaline conditions to denature DNA prior to electrophoresis. However, the presence of alkali-labile sites in DNA can result in the introduction of additional single-strand breaks upon alkali treatment during DNA sample processing. Herein, we describe a neutral glyoxal gel electrophoresis assay which is based on alkali-free DNA denaturation and is suitable for qualitative and semi-quantitative analyses of single-strand breaks in DNA isolated from different organisms.
Electrical properties of zirconium diselenide single crystals grown by iodine transport method

International Nuclear Information System (INIS)

Patel, S.G.; Agarwal, M.K.; Batra, N.M.; Lakshminarayana, D.

1998-01-01

Single crystals of zirconium diselenide (ZrSe2) were grown by the chemical vapour transport method using iodine as the transporting agent. The crystals were found to exhibit metallic behaviour in the temperature range 77-300 K and semiconducting nature in the 300-443 K range. The measurements of thermoelectric power and conductivity enabled the determination of both carrier mobility and carrier concentration. The variation of carrier mobility and carrier concentration with temperature indicates the presence of deep trapping centres and their reduction with temperature in these crystals. (author)
A robust method to analyze copy number alterations of less than 100 kb in single cells using oligonucleotide array CGH.

Directory of Open Access Journals (Sweden)

Birte Möhlendick

Full Text Available Comprehensive genome-wide analyses of single cells have become increasingly important in cancer research, but remain a technically challenging task. Here, we provide a protocol for array comparative genomic hybridization (aCGH) of single cells. The protocol is based on an established adapter-linker PCR (WGAM) and allowed us to detect copy number alterations as small as 56 kb in single cells. In addition, we report on factors influencing the success of single cell aCGH downstream of the amplification method, including the characteristics of the reference DNA, the labeling technique, the amount of input DNA, reamplification, the aCGH resolution, and data analysis. In comparison with two other commercially available non-linear single cell amplification methods, WGAM showed a very good performance in aCGH experiments. Finally, we demonstrate that cancer cells that were processed and identified by the CellSearch® System and that were subsequently isolated from the CellSearch® cartridge as single cells by fluorescence activated cell sorting (FACS) could be successfully analyzed using our WGAM-aCGH protocol. We believe that even in the era of next-generation sequencing, our single cell aCGH protocol will be a useful and (cost-)effective approach to study copy number alterations in single cells at a resolution comparable to those reported currently for single cell digital karyotyping based on next generation sequencing data.
Single-molecule experiments in biological physics: methods and applications.

Science.gov (United States)

Ritort, F

2006-08-16

I review single-molecule experiments (SMEs) in biological physics. Recent technological developments have provided the tools to design and build scientific instruments of high enough sensitivity and precision to manipulate and visualize individual molecules and measure microscopic forces. Using SMEs it is possible to manipulate molecules one at a time and measure distributions describing molecular properties, characterize the kinetics of biomolecular reactions and detect molecular intermediates. SMEs provide additional information about thermodynamics and kinetics of biomolecular processes. This complements information obtained in traditional bulk assays. In SMEs it is also possible to measure small energies and detect large Brownian deviations in biomolecular reactions, thereby offering new methods and systems to scrutinize the basic foundations of statistical mechanics. This review is written at a very introductory level, emphasizing the importance of SMEs to scientists interested in knowing the common playground of ideas and the interdisciplinary topics accessible by these techniques. The review discusses SMEs from an experimental perspective, first exposing the most common experimental methodologies and later presenting various molecular systems where such techniques have been applied. I briefly discuss experimental techniques such as atomic-force microscopy (AFM), laser optical tweezers (LOTs), magnetic tweezers (MTs), biomembrane force probes (BFPs) and single-molecule fluorescence (SMF). I then present several applications of SME to the study of nucleic acids (DNA, RNA and DNA condensation) and proteins (protein-protein interactions, protein folding and molecular motors). Finally, I discuss applications of SMEs to the study of the nonequilibrium thermodynamics of small systems and the experimental verification of fluctuation theorems. I conclude with a discussion of open questions and future perspectives.
Single-molecule experiments in biological physics: methods and applications

International Nuclear Information System (INIS)

Ritort, F

2006-01-01

I review single-molecule experiments (SMEs) in biological physics. Recent technological developments have provided the tools to design and build scientific instruments of high enough sensitivity and precision to manipulate and visualize individual molecules and measure microscopic forces. Using SMEs it is possible to manipulate molecules one at a time and measure distributions describing molecular properties, characterize the kinetics of biomolecular reactions and detect molecular intermediates. SMEs provide additional information about thermodynamics and kinetics of biomolecular processes. This complements information obtained in traditional bulk assays. In SMEs it is also possible to measure small energies and detect large Brownian deviations in biomolecular reactions, thereby offering new methods and systems to scrutinize the basic foundations of statistical mechanics. This review is written at a very introductory level, emphasizing the importance of SMEs to scientists interested in knowing the common playground of ideas and the interdisciplinary topics accessible by these techniques. The review discusses SMEs from an experimental perspective, first exposing the most common experimental methodologies and later presenting various molecular systems where such techniques have been applied. I briefly discuss experimental techniques such as atomic-force microscopy (AFM), laser optical tweezers (LOTs), magnetic tweezers (MTs), biomembrane force probes (BFPs) and single-molecule fluorescence (SMF). I then present several applications of SME to the study of nucleic acids (DNA, RNA and DNA condensation) and proteins (protein-protein interactions, protein folding and molecular motors). Finally, I discuss applications of SMEs to the study of the nonequilibrium thermodynamics of small systems and the experimental verification of fluctuation theorems. I conclude with a discussion of open questions and future perspectives. (topical review)
Experimental Evaluation of a Method for Turbocharging Four-Stroke, Single Cylinder, Internal Combustion Engines

Science.gov (United States)

Buchman, Michael; Winter, Amos

2015-11-01

Turbocharging an engine increases specific power, improves fuel economy, reduces emissions, and lowers cost compared to a naturally aspirated engine of the same power output. These advantages make turbocharging commonplace for multi-cylinder engines. Single cylinder engines are not commonly turbocharged due to the phase lag between the exhaust stroke, which powers the turbocharger, and the intake stroke, when air is pumped into the engine. Our proposed method of turbocharging single cylinder engines is to add an "air capacitor" to the intake manifold, an additional volume that acts as a buffer to store compressed air between the exhaust and intake strokes, and smooth out the pressure pulses from the turbocharger. This talk presents experimental results from a single cylinder, turbocharged diesel engine fit with various sized air capacitors. Power output from the engine was measured using a dynamometer made from a generator, with the electrical power dissipated with resistive heating elements. We found that intake air density increases with capacitor size as theoretically predicted, ranging from 40 to 60 percent depending on heat transfer. Our experiment was able to produce 29 percent more power compared to using natural aspiration. These results validated that an air capacitor and turbocharger may be a simple, cost effective means of increasing the power density of single cylinder engines.
Single gaze gestures

DEFF Research Database (Denmark)

Møllenbach, Emilie; Lilholm, Martin; Gail, Alastair

2010-01-01

This paper examines gaze gestures and their applicability as a generic selection method for gaze-only controlled interfaces. The method explored here is the Single Gaze Gesture (SGG), i.e. gestures consisting of a single point-to-point eye movement. Horizontal and vertical, long and short SGGs were...
Modeling and E-M estimation of haplotype-specific relative risks from genotype data for a case-control study of unrelated individuals.

Science.gov (United States)

Stram, Daniel O; Leigh Pearce, Celeste; Bretsky, Phillip; Freedman, Matthew; Hirschhorn, Joel N; Altshuler, David; Kolonel, Laurence N; Henderson, Brian E; Thomas, Duncan C

2003-01-01

The US National Cancer Institute has recently sponsored the formation of a Cohort Consortium (http://2002.cancer.gov/scpgenes.htm) to facilitate the pooling of data on very large numbers of people, concerning the effects of genes and environment on cancer incidence. One likely goal of these efforts will be to generate a large population-based case-control series for which a number of candidate genes will be investigated using SNP haplotype as well as genotype analysis. The goal of this paper is to outline the issues involved in choosing a method of estimating haplotype-specific risk estimates for such data that is technically appropriate and yet attractive to epidemiologists who are already comfortable with odds ratios and logistic regression. Our interest is to develop and evaluate extensions of methods, based on haplotype imputation, that have been recently described (Schaid et al., Am J Hum Genet, 2002, and Zaykin et al., Hum Hered, 2002) as providing score tests of the null hypothesis of no effect of SNP haplotypes upon risk, which may be used for more complex tasks, such as providing confidence intervals, and tests of equivalence of haplotype-specific risks in two or more separate populations. In order to do so we (1) develop a cohort approach towards odds ratio analysis by expanding the E-M algorithm to provide maximum likelihood estimates of haplotype-specific odds ratios as well as genotype frequencies; (2) show how to correct the cohort approach, to give essentially unbiased estimates for population-based or nested case-control studies by incorporating the probability of selection as a case or control into the likelihood, based on a simplified model of case and control selection, and (3) finally, in an example data set (CYP17 and breast cancer, from the Multiethnic Cohort Study) we compare likelihood-based confidence interval estimates from the two methods with each other, and with the use of the single-imputation approach of Zaykin et al. applied under both
Lazy collaborative filtering for data sets with missing values.

Science.gov (United States)

Ren, Yongli; Li, Gang; Zhang, Jun; Zhou, Wanlei

2013-12-01

As one of the biggest challenges in research on recommender systems, the data sparsity issue is mainly caused by the fact that users tend to rate a small proportion of items from the huge number of available items. This issue becomes even more problematic for the neighborhood-based collaborative filtering (CF) methods, as there are even lower numbers of ratings available in the neighborhood of the query item. In this paper, we aim to address the data sparsity issue in the context of neighborhood-based CF. For a given query (user, item), a set of key ratings is first identified by taking the historical information of both the user and the item into account. Then, an auto-adaptive imputation (AutAI) method is proposed to impute the missing values in the set of key ratings. We present a theoretical analysis to show that the proposed imputation method effectively improves the performance of the conventional neighborhood-based CF methods. The experimental results show that our new method of CF with AutAI outperforms six existing recommendation methods in terms of accuracy.
High Performance Harmonic Isolation By Means of The Single-phase Series Active Filter Employing The Waveform Reconstruction Method

DEFF Research Database (Denmark)

Senturk, Osman Selcuk; Hava, Ahmet M.

2009-01-01

current sampling delay reduction method (SDRM), a single-phase SAF compensated system provides higher harmonic isolation performance and higher stability margins compared to the system using conventional synchronous reference frame based methods. The analytical, simulation, and experimental studies of a 2...
Performance enhancement of the single-phase series active filter by employing the load voltage waveform reconstruction and line current sampling delay reduction methods

DEFF Research Database (Denmark)

Senturk, O.S.; Hava, A.M.

2011-01-01

This paper proposes the waveform reconstruction method (WRM), which is utilized in the single-phase series active filter's (SAF's) control algorithm, in order to extract the load harmonic voltage component of voltage harmonic type single-phase diode rectifier loads. Employing WRM and the line current sampling delay reduction method, a single-phase SAF compensated system provides higher harmonic isolation performance and higher stability margins compared to the system using conventional synchronous-reference-frame-based methods. The analytical, simulation, and experimental studies of a 2.5 k...
Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

Directory of Open Access Journals (Sweden)

Xiao-Lin Wu

Full Text Available Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus

An in vitro tag-and-modify protein sample generation method for single-molecule fluorescence resonance energy transfer.

Science.gov (United States)

Hamadani, Kambiz M; Howe, Jesse; Jensen, Madeleine K; Wu, Peng; Cate, Jamie H D; Marqusee, Susan

2017-09-22

Biomolecular systems exhibit many dynamic and biologically relevant properties, such as conformational fluctuations, multistep catalysis, transient interactions, folding, and allosteric structural transitions. These properties are challenging to detect and engineer using standard ensemble-based techniques. To address this drawback, single-molecule methods offer a way to access conformational distributions, transient states, and asynchronous dynamics inaccessible to these standard techniques. Fluorescence-based single-molecule approaches are parallelizable and compatible with multiplexed detection; to date, however, they have remained limited to serial screens of small protein libraries. This stems from the current absence of methods for generating either individual dual-labeled protein samples at high throughputs or protein libraries compatible with multiplexed screening platforms. Here, we demonstrate that by combining purified and reconstituted in vitro translation, quantitative unnatural amino acid incorporation via AUG codon reassignment, and copper-catalyzed azide-alkyne cycloaddition, we can overcome these challenges for target proteins that are, or can be, methionine-depleted. We present an in vitro parallelizable approach that does not require laborious target-specific purification to generate dual-labeled proteins and ribosome-nascent chain libraries suitable for single-molecule FRET-based conformational phenotyping. We demonstrate the power of this approach by tracking the effects of mutations, C-terminal extensions, and ribosomal tethering on the structure and stability of three protein model systems: barnase, spectrin, and T4 lysozyme. Importantly, dual-labeled ribosome-nascent chain libraries enable single-molecule co-localization of genotypes with phenotypes, are well suited for multiplexed single-molecule screening of protein libraries, and should enable the in vitro directed evolution of proteins with designer single-molecule conformational
Curing dynamics of photopolymers measured by single-shot heterodyne transient grating method.

Science.gov (United States)

Arai, Mika; Fujii, Tomomi; Inoue, Hayato; Kuwahara, Shota; Katayama, Kenji

2013-01-01

The heterodyne transient grating (HD-TG) method was applied for the first time to the measurement of the curing dynamics of photopolymers. The curing dynamics for various monomers including an initiator (2.5 vol%) was monitored optically via the refractive index change after a single UV pulse irradiation. We could obtain the polymerization time and the final change in the refractive index, and the parameters were correlated with the viscosity, molecular structure, and reaction sites. The longer the polymerization time, the larger the final refractive index change, and the polymerization time was explained in terms of the monomer properties.
A Method for Interactive 3D Reconstruction of Piecewise Planar Objects from Single Images

OpenAIRE

Sturm, Peter; Maybank, Steve

1999-01-01

We present an approach for 3D reconstruction of objects from a single image. Obviously, constraints on the 3D structure are needed to perform this task. Our approach is based on user-provided coplanarity, perpendicularity and parallelism constraints. These are used to calibrate the image and perform 3D reconstruction. The method is described in detail and results are provided.
Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle.

Science.gov (United States)

Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S

2014-11-01

Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
System and method for single-phase, single-stage grid-interactive inverter

Science.gov (United States)

Liu, Liming; Li, Hui

2015-09-01

The present invention provides for the integration of distributed renewable energy sources/storages utilizing a cascaded DC-AC inverter, thereby eliminating the need for a DC-DC converter. The ability to segment the energy sources and energy storages improves the maintenance capability and system reliability of the distributed generation system, and enables wide-range reactive power compensation. In the absence of a DC-DC converter, single stage energy conversion can be achieved to enhance energy conversion efficiency.
Mortality incidence estimation using federal death certificate and natality data with an application to Tay-Sachs disease.

Science.gov (United States)

Jalal, Kabir; Carter, Randy L

2015-09-01

For confidentiality reasons, US federal death certificate data are incomplete with regards to the dates of birth and death for the decedents, making calculation of total lifetime of a decedent impossible and thus estimation of mortality incidence difficult. This paper proposes the use of natality data and an imputation-based method to estimate age-specific mortality incidence rates in the face of this missing information. By utilizing previously determined probabilities of birth, a birth date and death date are imputed for every decedent in the dataset. Thus, the birth cohort of each individual is imputed, and the total on-study time can be calculated. This idea is implemented in two approaches for estimation of mortality incidence rates. The first is an extension of a person-time approach, while the second is an extension of a life table approach. Monte Carlo simulations showed that both approaches perform well in comparison to the ideal complete data methods, but that the person-time method is preferred. An application to Tay-Sachs disease is demonstrated. It is concluded that the imputation methods proposed provide valid estimates of the incidence of death from death certificate data without the need for additional assumptions under which usual mortality rates provide valid estimates. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Growth of high quality Bi2Sr2CaCu2Oy single crystals by the modified vertical Bridgman method

International Nuclear Information System (INIS)

Nagashima, O.; Tanaka, H.; Echizen, Y.; Kishida, S.

2004-01-01

We grew Bi2Sr2CaCu2Oy (Bi-2212) single crystals by the modified vertical Bridgman (VB) method, and investigated their characteristics in order to clarify the optimum growth conditions for obtaining high-quality Bi-2212 single crystals. The Bi-2212 single crystals were grown by changing pulling rates or by using starting materials after pre-treatments. We found that the superconducting critical temperature (Tc) of the single crystal prepared at a slow growth rate of 0.25 mm/h was about 88 K and that the single crystals were a Bi-2212 single phase. Moreover, the single crystals grown using the starting materials pre-treated in Ar and O2 atmospheres had a Tc of about 88 and 86 K, respectively. In addition, both sets of single crystals were Bi-2212 single phase.
Comparison of Three Methods Estimating Baseline Creatinine For Acute Kidney Injury in Hospitalized Patients: a Multicentre Survey in Third-Level Urban Hospitals of China.

Science.gov (United States)

Lang, Xia-Bing; Yang, Yi; Yang, Ju-Rong; Wan, Jian-Xin; Yu, Sheng-Qiang; Cui, Jiong; Tang, Xiao-Jing; Chen, Jianghua

2018-01-01

A lack of baseline serum creatinine (SCr) data leads to underestimation of the burden caused by acute kidney injury (AKI) in developing countries. The goal of this study was to investigate the effects of various baseline SCr analysis methods on the current diagnosis of AKI in hospitalized patients. Patients with at least one SCr value during their hospital stay between January 1, 2011 and December 31, 2012 were retrospectively included in the study. The baseline SCr was determined either by the minimum SCr (SCrMIN) or the estimated SCr using the MDRD formula (SCrGFR-75). We also used the dynamic baseline SCr (SCrdynamic) in accordance with the 7 day/48 hour time window. AKI was defined based on the KDIGO SCr criteria. Of 562,733 hospitalized patients, 350,458 (62.3%) had at least one SCr determination, and 146,185 (26.0%) had repeat SCr tests. AKI was diagnosed in 13,883 (2.5%) patients using the SCrMIN, 21,281 (3.8%) using the SCrGFR-75 and 9,288 (1.7%) using the SCrdynamic. Compared with the non-AKI patients, AKI patients had a higher in-hospital mortality rate regardless of the baseline SCr analysis method. Because of the scarcity of SCr data, imputation of the baseline SCr is necessary to remedy the missing data. The detection rate of AKI varies depending on the different imputation methods. SCrGFR-75 can identify more AKI cases than the other two methods. © 2018 The Author(s). Published by S. Karger AG, Basel.
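The back-calculated baseline described above (SCrGFR-75) can be sketched by solving the MDRD equation for SCr at an assumed eGFR of 75 mL/min/1.73 m^2. This is a minimal illustration, not the study's code: it assumes the 4-variable, IDMS-traceable form of the equation (coefficient 175, SCr in mg/dL), which may differ from the variant the authors applied, and the function names are hypothetical.

```python
def mdrd_egfr(scr_mg_dl, age_years, female, black=False):
    """4-variable MDRD eGFR in mL/min/1.73 m^2 (assumed IDMS-traceable form)."""
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def back_calculated_scr(age_years, female, black=False, target_egfr=75.0):
    """Invert the MDRD equation: the SCr (mg/dL) that yields target_egfr."""
    demographic = age_years ** -0.203
    if female:
        demographic *= 0.742
    if black:
        demographic *= 1.212
    return (target_egfr / (175.0 * demographic)) ** (-1.0 / 1.154)


# Imputed baseline for a hypothetical 60-year-old woman with no prior SCr.
baseline = back_calculated_scr(60, female=True)
print(round(baseline, 2), "mg/dL")
```

Feeding the imputed baseline back into `mdrd_egfr` returns the assumed 75 mL/min/1.73 m^2, which is a quick consistency check on the inversion.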
A Renormalisation Group Method. V. A Single Renormalisation Group Step

Science.gov (United States)

Brydges, David C.; Slade, Gordon

2015-05-01

This paper is the fifth in a series devoted to the development of a rigorous renormalisation group method applicable to lattice field theories containing boson and/or fermion fields, and comprises the core of the method. In the renormalisation group method, increasingly large scales are studied in a progressive manner, with an interaction parametrised by a field polynomial which evolves with the scale under the renormalisation group map. In our context, the progressive analysis is performed via a finite-range covariance decomposition. Perturbative calculations are used to track the flow of the coupling constants of the evolving polynomial, but on their own perturbative calculations are insufficient to control error terms and to obtain mathematically rigorous results. In this paper, we define an additional non-perturbative coordinate, which together with the flow of coupling constants defines the complete evolution of the renormalisation group map. We specify conditions under which the non-perturbative coordinate is contractive under a single renormalisation group step. Our framework is essentially combinatorial, but its implementation relies on analytic results developed earlier in the series of papers. The results of this paper are applied elsewhere to analyse the critical behaviour of the 4-dimensional continuous-time weakly self-avoiding walk and of the 4-dimensional n-component |φ|^4 model. In particular, the existence of a logarithmic correction to mean-field scaling for the susceptibility can be proved for both models, together with other facts about critical exponents and critical behaviour.
Statistical Methods for Single-Particle Electron Cryomicroscopy

DEFF Research Database (Denmark)

Jensen, Katrine Hommelhoff

Electron cryomicroscopy (cryo-EM) is a form of transmission electron microscopy, aimed at reconstructing the 3D structure of a macromolecular complex from a large set of 2D projection images, as they exhibit a very low signal-to-noise ratio (SNR). In the single-particle reconstruction (SPR) probl...
A comparison of selected parametric and non-parametric imputation methods for estimating forest biomass and basal area

Science.gov (United States)

Donald Gagliasso; Susan Hummel; Hailemariam. Temesgen

2014-01-01

Various methods have been used to estimate the amount of above ground forest biomass across landscapes and to create biomass maps for specific stands or pixels across ownership or project areas. Without an accurate estimation method, land managers might end up with incorrect biomass estimate maps, which could lead them to make poorer decisions in their future...
An Analytical Method for Determining the Load Distribution of Single-Column Multibolt Connection

Directory of Open Access Journals (Sweden)

Nirut Konkong

2017-01-01

The purpose of this research was to investigate the effect of geometric variables on the bolt load distributions of a cold-formed steel bolt connection. The study was conducted using an experimental test, finite element analysis, and an analytical method. The experimental study was performed using single-lap shear testing of a concentrically loaded bolt connection fabricated from G550 cold-formed steel. Finite element analysis with shell elements was used to model the cold-formed steel plate, while solid elements were used to model the bolt fastener, in order to study the structural behavior of the bolt connections. Material nonlinearities, contact problems, and a geometric nonlinearity procedure were used to predict the failure behavior of the bolt connections. The analytical method was generated using the spring model. A bolt-plate interaction stiffness was newly proposed and verified against the experiment and the finite element model. It was applied to examine the effect of geometric variables on the single-column multibolt connection. The effects of varying the bolt diameter, the plate thickness, and the plate thickness ratio (t2/t1) on the bolt load distribution were studied. The results of the parametric study showed that the t2/t1 ratio controlled the efficiency of the bolt load distribution more than the other parameters studied.
Building Chaotic Model From Incomplete Time Series

Science.gov (United States)

Siek, Michael; Solomatine, Dimitri

2010-05-01

This paper presents a number of novel techniques for building a predictive chaotic model from incomplete time series. A predictive chaotic model is built by reconstructing the time-delayed phase space from observed time series, and the prediction is made by a global model or adaptive local models based on the dynamical neighbors found in the reconstructed phase space. In general, the building of any data-driven model depends on the completeness and quality of the data itself. However, the completeness of the data cannot always be guaranteed, since measurement or data transmission may intermittently fail. We propose two main solutions for dealing with incomplete time series: imputing and non-imputing methods. For imputing methods, we utilized interpolation methods (weighted sum of linear interpolations, Bayesian principal component analysis and cubic spline interpolation) and predictive models (neural network, kernel machine, chaotic model) for estimating the missing values. After imputing the missing values, the phase space reconstruction and chaotic model prediction are executed as a standard procedure. For non-imputing methods, we reconstructed the time-delayed phase space from observed time series with missing values. This reconstruction results in non-continuous trajectories. However, the local model prediction can still be made from the other dynamical neighbors reconstructed from non-missing values. We implemented and tested these methods to construct a chaotic model for predicting storm surges at Hoek van Holland, at the entrance of Rotterdam Port. The hourly surge time series is available for the period 1990-1996. For measuring the performance of the proposed methods, a synthetic time series is used, created by applying missing values generated from a particular random variable to the original (complete) time series. There exist two main performance measures used in this work: (1) error measures between the actual
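The two routes described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: missing samples are represented as `None`, the imputing route uses plain linear interpolation (only one of the interpolators the abstract lists), and the function names `linear_fill` and `delay_embed` are hypothetical.

```python
def linear_fill(series):
    """Imputing route: linearly interpolate interior gaps (None values).

    Leading and trailing gaps are left as None because they have no
    neighbor on one side.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            filled[i] = series[a] + w * (series[b] - series[a])
    return filled


def delay_embed(series, dim, tau):
    """Non-imputing route: time-delay vectors (x_t, x_{t+tau}, ...).

    Any vector touching a missing sample is skipped, which is why the
    reconstructed trajectory becomes non-continuous.
    Returns a list of (start_index, vector) pairs.
    """
    vectors = []
    span = (dim - 1) * tau
    for t in range(len(series) - span):
        vec = tuple(series[t + k * tau] for k in range(dim))
        if all(v is not None for v in vec):
            vectors.append((t, vec))
    return vectors


surge = [1.0, 2.0, None, 4.0, 5.0, 6.0]
print(delay_embed(surge, dim=2, tau=1))               # gaps break the trajectory
print(delay_embed(linear_fill(surge), dim=2, tau=1))  # continuous after imputing
```

The start indices returned by `delay_embed` make the breaks in the trajectory explicit, so a local model can still search for neighbors among the surviving vectors.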
Improved Design Methods for Robust Single- and Three-Phase ac-dc-ac Power Converters

DEFF Research Database (Denmark)

Qin, Zian

. The approaches for improving their performance, in terms of the voltage stress, efficiency, power density, cost, loss distribution, and temperature, will be studied. The structure of the thesis is as follows, Chapter 1 presents the introduction and motivation of the whole project as well as the background...... becomes an emerging challenge. Accordingly, installation of sustainable power generators like wind turbines and solar panels has experienced a large increase during the last decades. Meanwhile, power electronics converters, as interfaces in electrical system, are delivering approximately 80 % electricity...... back-to-back, and meanwhile improve the harmonics, control flexibility, and thermal distribution between the switches. Afterwards, active power decoupling methods for single-phase inverters or rectifiers that are similar to the single-phase ac-dc-ac converter, are studied in Chapter 4...
A new method for explicit modelling of single failure event within different common cause failure groups

International Nuclear Information System (INIS)

Kančev, Duško; Čepin, Marko

2012-01-01

Redundancy and diversity are the main principles of the safety systems in the nuclear industry. Implementation of safety component redundancy has been acknowledged as an effective approach for assuring high levels of system reliability. The existence of redundant components, identical in most cases, implies a probability of their simultaneous failure due to a shared cause: a common cause failure. This paper presents a new method for explicit modelling of a single component failure event within multiple common cause failure groups simultaneously. The method is based on a modification of the frequently utilised Beta Factor parametric model. The motivation for developing this method lies in the fact that one of the most widespread software packages for fault tree and event tree modelling as part of probabilistic safety assessment does not include the option of simultaneously assigning a single failure event to multiple common cause failure groups. In that sense, the proposed method can be seen as an advantage of the explicit modelling of common cause failures. A standard standby safety system is selected as a case study for application and study of the proposed methodology. The results and insights imply improved, more transparent and more comprehensive models within probabilistic safety assessment.
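For orientation, the standard Beta Factor model splits a component's total failure probability Q_t into an independent part (1 - β)·Q_t and a common cause part β·Q_t. The sketch below shows that split and a naive extension to several groups in which each group i takes a share β_i·Q_t. The multi-group split is an assumption made purely for illustration; the abstract does not state the authors' modification, and the function name is hypothetical.

```python
def split_failure_probability(q_total, betas):
    """Split a total failure probability into independent and CCF parts.

    q_total: total failure probability of one component.
    betas:   one beta factor per common cause failure group the component
             belongs to (a single-element list gives the standard model).
    Returns (q_independent, [q_ccf_group_1, q_ccf_group_2, ...]).

    Assumption: group shares are simply additive, so the betas must sum
    to at most 1. This is NOT taken from the paper.
    """
    if not 0.0 <= q_total <= 1.0:
        raise ValueError("q_total must be a probability")
    if any(b < 0.0 for b in betas) or sum(betas) > 1.0:
        raise ValueError("betas must be non-negative and sum to at most 1")
    q_ccf = [b * q_total for b in betas]
    q_independent = q_total * (1.0 - sum(betas))
    return q_independent, q_ccf


# Standard single-group Beta Factor model.
print(split_failure_probability(1e-3, [0.1]))
# Hypothetical component assigned to two CCF groups at once.
print(split_failure_probability(1e-3, [0.1, 0.05]))
```

Under this additive assumption the independent and common cause contributions always sum back to Q_t, which keeps the component's total failure probability unchanged in the fault tree.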
Analyzing the Impacts of Alternated Number of Iterations in Multiple Imputation Method on Explanatory Factor Analysis

Directory of Open Access Journals (Sweden)

Duygu KOÇAK

2017-11-01

The study aims to identify the effects of the number of iterations used in the multiple imputation method, one of the methods used to cope with missing values, on the results of factor analysis. To this end, artificial datasets of different sample sizes were created. Values missing at random and values missing completely at random were created in various ratios by deleting data. For the missing-at-random data, a second variable was generated at the ordinal scale level, and datasets with different ratios of missing values were obtained based on the levels of this variable. The data were generated using the "psych" package in R, while the "dplyr" package was used to write code that deletes values according to the predetermined conditions of the missing value mechanism. Different datasets were generated by applying different iteration numbers. Explanatory factor analysis was conducted on the completed datasets, and the factors and total explained variances are presented. These values were first evaluated against the number of factors and total variance explained for the complete datasets. The results indicate that the multiple imputation method performs better for values missing at random than for values missing completely at random. It was also found that increasing the number of iterations, under both missing value mechanisms, decreases the difference from the results obtained with the complete datasets.
Estimating range of influence in case of missing spatial data

DEFF Research Database (Denmark)

Bihrmann, Kristine; Ersbøll, Annette Kjær

2015-01-01

BACKGROUND: The range of influence refers to the average distance between locations at which the observed outcome is no longer correlated. In many studies, missing data occur and a popular tool for handling missing data is multiple imputation. The objective of this study was to investigate how...... the estimated range of influence is affected when 1) the outcome is only observed at some of a given set of locations, and 2) multiple imputation is used to impute the outcome at the non-observed locations. METHODS: The study was based on the simulation of missing outcomes in a complete data set. The range...... of influence was estimated from a logistic regression model with a spatially structured random effect, modelled by a Gaussian field. Results were evaluated by comparing estimates obtained from complete, missing, and imputed data. RESULTS: In most simulation scenarios, the range estimates were consistent...
See me, feel me: methods to concurrently visualize and manipulate single DNA molecules and associated proteins

NARCIS (Netherlands)

van Mameren, J.; Peterman, E.J.G.; Wuite, G.J.L.

2008-01-01

Direct visualization of DNA and proteins allows researchers to investigate DNA-protein interactions with great detail. Much progress has been made in this area as a result of increasingly sensitive single-molecule fluorescence techniques. At the same time, methods that control the conformation of
A method for 3D-reconstruction of a muscle thick filament using the tilt series images of a single filament electron tomogram.

Science.gov (United States)

Márquez, G; Pinto, A; Alamo, L; Baumann, B; Ye, F; Winkler, H; Taylor, K; Padrón, R

2014-05-01

Myosin interacting-heads (MIH) motifs are visualized in 3D-reconstructions of thick filaments from striated muscle. These reconstructions are calculated by averaging methods using images from electron micrographs of grids prepared using numerous filament preparations. Here we propose an alternative method to calculate the 3D-reconstruction of a single thick filament using only a tilt series of images recorded by electron tomography. Relaxed thick filaments, prepared from tarantula leg muscle homogenates, were negatively stained. Single-axis tilt series of single isolated thick filaments were obtained with the electron microscope at a low electron dose, and recorded on a CCD camera by electron tomography. An IHRSR 3D-reconstruction was calculated from the tilt series images of a single thick filament. The reconstruction was enhanced by including dual tilt image segments in the search stage, whereas usually only a single tilt along the filament axis is used, and by applying a band pass filter just before the back projection. The reconstruction from a single filament has a 40 Å resolution and clearly shows the presence of MIH motifs. In contrast, the electron tomogram 3D-reconstruction of the same thick filament, calculated without any image averaging and/or imposition of helical symmetry, only reveals MIH motifs infrequently. This is, to our knowledge, the first application of the IHRSR method to calculate a 3D reconstruction from tilt series images. This single filament IHRSR reconstruction method (SF-IHRSR) should provide a new tool to assess structural differences between well-ordered thick (or thin) filaments in a grid by recording separately their electron tomograms. Copyright © 2014 Elsevier Inc. All rights reserved.
Estimation of 131J-Jodohippurateclearance by a simplified method using a single plasma sample

International Nuclear Information System (INIS)

Botsch, H.; Golde, G.; Kampf, D.

1980-01-01

Theoretical volumes calculated from the reciprocal of the plasma concentration of 131J-Jodohippurate were compared in 95 patients with clearance values calculated by the 2-compartment method and in 18 patients with conventional PAH clearance. For estimating hippurate clearance from a single blood sample, the most favorable time is 45 min after injection (r = 0.96; clearance 400/ml/min.: r = 0.98). Clearance values may be derived from the formula: C = 0.4 + 7.26 V - 0.021 V^2 (V = injected activity / activity per litre of plasma taken 45 min after injection). The simplicity, precision and reproducibility of this clearance method are emphasized. (orig.)
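The single-sample formula above is simple enough to evaluate directly. The sketch below only transcribes the quadratic as printed; the abstract does not state the units of V or C, so the units here are an assumption (V in litres, both activities in the same unit), and the function names are hypothetical.

```python
def theoretical_volume(injected_activity, plasma_activity_per_litre):
    """V = injected activity / activity per litre of plasma at 45 min."""
    if plasma_activity_per_litre <= 0:
        raise ValueError("plasma activity must be positive")
    return injected_activity / plasma_activity_per_litre


def hippurate_clearance(volume):
    """Clearance from the printed formula C = 0.4 + 7.26 V - 0.021 V^2."""
    return 0.4 + 7.26 * volume - 0.021 * volume ** 2


# Hypothetical sample: 10 units injected, 1 unit per litre in plasma at 45 min.
v = theoretical_volume(10.0, 1.0)
print(hippurate_clearance(v))
```

Because the V^2 coefficient is negative, the estimate rises more slowly at large theoretical volumes than a purely linear rule would.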

SALP, a new single-stranded DNA library preparation method especially useful for the high-throughput characterization of chromatin openness states.

Science.gov (United States)

Wu, Jian; Dai, Wei; Wu, Lin; Wang, Jinke

2018-02-13

Next-generation sequencing (NGS) is fundamental to current biological and biomedical research. Construction of the sequencing library is a key step of NGS, and various library construction methods have therefore been explored. However, the current methods are still limited by some shortcomings. This study developed a new NGS library construction method, Single strand Adaptor Library Preparation (SALP), by using a novel single strand adaptor (SSA). SSA is a double-stranded oligonucleotide with a 3' overhang of 3 random nucleotides, which can be efficiently ligated to the 3' end of single strand DNA by T4 DNA ligase. SALP can be started with any denatured DNA fragments, such as those sheared by Tn5 tagmentation, enzyme digestion and sonication. When started with Tn5-tagmented chromatin, SALP can overcome a key limitation of ATAC-seq and become a high-throughput NGS library construction method, SALP-seq, which can be used to comparatively characterize the chromatin openness state of multiple cells in an unbiased manner. In this way, this study successfully characterized the comparative chromatin openness states of four different cell lines, including GM12878, HepG2, HeLa and 293T, with SALP-seq. Similarly, this study also successfully characterized the chromatin openness states of HepG2 cells with SALP-seq by using 10^5 to 500 cells. Owing to its unique performance, SALP should have wide applications in the future.
A NEW FRACTIONAL MODEL OF SINGLE DEGREE OF FREEDOM SYSTEM, BY USING GENERALIZED DIFFERENTIAL TRANSFORM METHOD

Directory of Open Access Journals (Sweden)

HASHEM SABERI NAJAFI

2016-07-01

The generalized differential transform method (GDTM) is a powerful method for solving fractional differential equations. In this paper, a new fractional model for systems with a single degree of freedom (SDOF) is presented, using the GDTM. The advantage of this method over some other numerical methods is shown. The analysis of the new approximations, damping and acceleration of the systems is also described. Finally, by reducing the damping and analysing the errors in one of the fractional cases, we show that, in addition to giving a displacement solution close to the exact one, the system gains acceleration once it crosses the equilibrium point.
Simulations of a single vortex ring using an unbounded, regularized particle-mesh based vortex method

DEFF Research Database (Denmark)

Hejlesen, Mads Mølholm; Spietz, Henrik J.; Walther, Jens Honore

2014-01-01

, unbounded particle-mesh based vortex method is used to simulate the instability, transition to turbulence and eventual destruction of a single vortex ring. From the simulation data a novel method for analyzing the dynamics of the enstrophy is presented based on the alignment of the vorticity vector...... with the principal axis of the strain rate tensor. We find that the dynamics of the enstrophy density is dominated by the local flow deformation and axis of rotation, which is used to infer some concrete tendencies related to the topology of the vorticity field....
Detection of Brucella melitensis and Brucella abortus strains using a single-stage PCR method

Directory of Open Access Journals (Sweden)

Alamian, S.

2015-04-01

Brucella melitensis and Brucella abortus are among the most important causes of brucellosis, an infectious disease that is transmitted either directly or indirectly, including through consumption of unpasteurized dairy products. Both strains are considered endemic in Iran. Common diagnostic methods such as bacteriological culture are difficult and time consuming for these bacteria. The aim of this study was to propose a single-stage PCR method using one pair of primers to detect both B. melitensis and B. abortus. The primers were named UF1 and UR1, and the results showed that the final sizes of the PCR products were 84 bp and 99 bp for B. melitensis and B. abortus, respectively. The method could therefore be useful for rapid, simultaneous detection of B. melitensis and B. abortus.
Statistical methods for the analysis of left-censored variables [Statistische Analysemethoden für linkszensierte Variablen und Beobachtungen mit Werten unterhalb einer Bestimmungs- oder Nachweisgrenze]

Directory of Open Access Journals (Sweden)

Pesch, Beate

2013-03-01

In some applications statisticians are confronted with values which are reported to be below a limit of detection or quantitation. These left-censored variables are a challenge in the statistical analysis. In a simulation study, we compare different methods to deal with this type of data in statistical applications. These include measures of location, dispersion, association, and statistical modeling. Our simulation study showed that the multiple imputation approach and the Tobit regression lead to unbiased estimates, whereas the naïve methods including simple substitution of non-detects lead to unreliable estimates. We illustrate the application of the multiple imputation approach and the Tobit regression with an example from occupational epidemiology.
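To make the contrast with simple substitution concrete, the sketch below writes out the log-likelihood that an intercept-only Tobit model maximizes for left-censored, normally distributed data: detected values contribute the normal density, and each non-detect contributes the probability of falling below the limit of detection. It is an illustration under assumptions, not the authors' simulation code: normality is assumed, a single common limit of detection `lod` is assumed, the coarse grid search merely stands in for a proper optimizer, and all names are hypothetical.

```python
import math


def censored_normal_loglik(mu, sigma, observed, n_censored, lod):
    """Log-likelihood of N(mu, sigma^2) data left-censored at lod."""
    ll = 0.0
    for x in observed:
        z = (x - mu) / sigma
        ll += -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
    if n_censored:
        z_lod = (lod - mu) / sigma
        # P(X < lod) from the standard normal CDF.
        p_below = 0.5 * (1.0 + math.erf(z_lod / math.sqrt(2.0)))
        if p_below <= 0.0:
            return float("-inf")
        ll += n_censored * math.log(p_below)
    return ll


def fit_by_grid(observed, n_censored, lod, mu_grid, sigma_grid):
    """Return the (mu, sigma) pair on the grid with the highest likelihood."""
    candidates = [(m, s) for m in mu_grid for s in sigma_grid]
    return max(
        candidates,
        key=lambda p: censored_normal_loglik(p[0], p[1], observed, n_censored, lod),
    )


# Hypothetical sample: four detected values and three non-detects below 1.0.
detected = [1.2, 1.5, 2.1, 3.0]
mu_grid = [i / 10 for i in range(0, 31)]
sigma_grid = [i / 10 for i in range(1, 31)]
print(fit_by_grid(detected, 3, 1.0, mu_grid, sigma_grid))
```

Substituting a fixed value such as lod/2 for every non-detect would instead treat the censored observations as exactly known, which is the source of the unreliable estimates the abstract reports.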
Hydrogen storage in single-walled carbon nanotubes: methods and results

International Nuclear Information System (INIS)

Poirier, E.; Chahine, R.; Tessier, A.; Cossement, D.; Lafi, L.; Bose, T.K.

2004-01-01

We present high-sensitivity gravimetric and volumetric hydrogen sorption measurement systems adapted for in situ conditioning under high temperature and high vacuum. These systems, which allow for precise measurements on small samples and thorough degassing, are used for sorption measurements on carbon nanostructures. We developed one volumetric system for the pressure range 0-1 bar, and two gravimetric systems for 0-1 bar and 0-100 bar. The use of both gravimetric and volumetric methods allows for cross-checking of the results. The accuracy of the systems has been determined from hydrogen absorption measurements on palladium. The accuracies of the 0-1 bar volumetric and gravimetric systems are about 10 μg and 20 μg, respectively. The accuracy of the 0-100 bar gravimetric system is about 20 μg. Hydrogen sorption measurements on single-walled carbon nanotubes (SWNTs) and metal-incorporated SWNTs are presented. (author)
Alternative methods for CYP2D6 phenotyping: comparison of dextromethorphan metabolic ratios from AUC, single point plasma, and urine.

Science.gov (United States)

Chen, Rui; Wang, Haotian; Shi, Jun; Hu, Pei

2016-05-01

CYP2D6 is a highly polymorphic enzyme. Determining its phenotype before treatment with a CYP2D6 substrate can avoid dose-dependent adverse events or therapeutic failures. Alternative phenotyping methods for CYP2D6 were compared to evaluate the appropriate and precise time points for phenotyping after single and multiple doses of 30-mg controlled-release (CR) dextromethorphan (DM) and to explore the antimodes for potential sampling methods. This was an open-label, single- and multiple-dose study. Twenty-one subjects were assigned to receive a single dose of CR DM 30 mg orally, followed by a 3-day washout period prior to oral administration of CR DM 30 mg every 12 hours for 6 days. Metabolic ratios (MRs) from AUC∞ after single dosing and from AUC0-12h at steady state were taken as the gold standard. The correlations of metabolic ratios of DM to dextrorphan (MRDM/DX) values based on different phenotyping methods were assessed. Linear regression formulas were derived to calculate the antimodes for potential sampling methods. In the single-dose part of the study, statistically significant correlations were found between MRDM/DX from AUC∞ and from serial plasma points from 1 to 30 hours or from urine (all p-values < 0.001). In the multiple-dose part, statistically significant correlations were found between MRDM/DX from AUC0-12h on day 6 and MRDM/DX from serial plasma points from 0 to 36 hours after the last dose (all p-values < 0.001). Based on the reported urinary antimode and linear regression analysis, the antimodes of AUC and plasma points were derived to profile the trend of antimodes as the drug concentrations changed. MRDM/DX from plasma points correlated well with MRDM/DX from AUC. Plasma points from 1 to 30 hours after a single dose of 30-mg CR DM and any plasma point at steady state after multiple doses of CR DM could potentially be used for phenotyping of CYP2D6.
Pressure transient analysis in single and two-phase water by finite difference methods

International Nuclear Information System (INIS)

Berry, G.F.; Daley, J.G.

1977-01-01

An important consideration in the design of LMFBR steam generators is the possibility of leakage from a steam generator water tube. The ensuing sodium/water reaction will be largely controlled by the amount of water available at the leak site, thus analysis methods treating this event must have the capability of accurately modeling pressure transients through all states of water occurring in a steam generator, whether single or two-phase. The equation systems of the present model consist of the conservation equations together with an equation of state for one-dimensional homogeneous flow. These equations are then solved using finite difference techniques with phase considerations and non-equilibrium effects being treated through the equation of state. The basis for water property computation is Keenan's 'fundamental equation of state' which is applicable to single-phase water at pressures less than 1000 bars and temperatures less than 1300 °C. This provides formulations allowing computation of any water property to any desired precision. Two-phase properties are constructed from values on the saturation line. The use of formulations permits the direct calculation of any thermodynamic property (or property derivative) to great precision while requiring very little computer storage, but does involve considerable computation time. For this reason an optional calculation scheme based on the method of 'transfinite interpolation' is included to give rapid computation in selected regions with decreased precision. The conservation equations were solved using the second order Lax-Wendroff scheme which includes wall friction, allows the formation of shocks and locally supersonic flow. Computational boundary conditions were found from a method-of-characteristics solution at the reservoir and receiver ends. The local characteristics were used to interpolate data from inside the pipe to the boundary.
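The second-order Lax-Wendroff update mentioned above is easiest to see on a much simpler problem than the one in the report. The sketch below applies it to scalar linear advection, u_t + a·u_x = 0, on a periodic grid. This stand-in problem, the periodic boundaries, and the function name are assumptions chosen for illustration; the report itself solves the full one-dimensional homogeneous-flow conservation equations with wall friction and method-of-characteristics boundaries.

```python
def lax_wendroff_step(u, courant):
    """One Lax-Wendroff step for u_t + a u_x = 0 on a periodic grid.

    courant is the Courant number c = a * dt / dx; the scheme is stable
    for |c| <= 1.
    """
    n = len(u)
    new_u = []
    for j in range(n):
        left = u[(j - 1) % n]
        right = u[(j + 1) % n]
        new_u.append(
            u[j]
            - 0.5 * courant * (right - left)                    # centered advection
            + 0.5 * courant ** 2 * (right - 2.0 * u[j] + left)  # second-order correction
        )
    return new_u


# A small pulse advected for a few steps at Courant number 0.5.
profile = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
for _ in range(4):
    profile = lax_wendroff_step(profile, 0.5)
print([round(v, 3) for v in profile])
```

At a Courant number of exactly 1 the update reduces to a pure shift by one cell, which is a convenient check that the stencil is coded correctly.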
Genome-wide association study with 1000 genomes imputation identifies signals for nine sex hormone-related phenotypes.

Science.gov (United States)

Ruth, Katherine S; Campbell, Purdey J; Chew, Shelby; Lim, Ee Mun; Hadlow, Narelle; Stuckey, Bronwyn G A; Brown, Suzanne J; Feenstra, Bjarke; Joseph, John; Surdulescu, Gabriela L; Zheng, Hou Feng; Richards, J Brent; Murray, Anna; Spector, Tim D; Wilson, Scott G; Perry, John R B

2016-02-01

Genetic factors contribute strongly to sex hormone levels, yet knowledge of the regulatory mechanisms remains incomplete. Genome-wide association studies (GWAS) have identified only a small number of loci associated with sex hormone levels, with several reproductive hormones yet to be assessed. The aim of the study was to identify novel genetic variants contributing to the regulation of sex hormones. We performed GWAS using genotypes imputed from the 1000 Genomes reference panel. The study used genotype and phenotype data from a UK twin register. We included 2913 individuals (up to 294 males) from the Twins UK study, excluding individuals receiving hormone treatment. Phenotypes were standardised for age, sex, BMI, stage of menstrual cycle and menopausal status. We tested 7,879,351 autosomal SNPs for association with levels of dehydroepiandrosterone sulphate (DHEAS), oestradiol, free androgen index (FAI), follicle-stimulating hormone (FSH), luteinizing hormone (LH), prolactin, progesterone, sex hormone-binding globulin and testosterone. Eight independent genetic variants reached genome-wide significance (P < 5 × 10^-8), with minor allele frequencies of 1.3-23.9%. Novel signals included variants for progesterone (P = 7.68 × 10^-12), oestradiol (P = 1.63 × 10^-8) and FAI (P = 1.50 × 10^-8). A genetic variant near the FSHB gene was identified which influenced both FSH (P = 1.74 × 10^-8) and LH (P = 3.94 × 10^-9) levels. A separate locus on chromosome 7 was associated with both DHEAS (P = 1.82 × 10^-14) and progesterone (P = 6.09 × 10^-14). This study highlights loci that are relevant to reproductive function and suggests overlap in the genetic basis of hormone regulation.
Efficiency of different methods of extra-cavity second harmonic generation of continuous wave single-frequency radiation.

Science.gov (United States)

Khripunov, Sergey; Kobtsev, Sergey; Radnatarov, Daba

2016-01-20

This work presents for the first time to the best of our knowledge a comparative efficiency analysis among various techniques of extra-cavity second harmonic generation (SHG) of continuous-wave single-frequency radiation in nonperiodically poled nonlinear crystals within a broad range of power levels. Efficiency of nonlinear radiation transformation at powers from 1 W to 10 kW was studied in three different configurations: with an external power-enhancement cavity and without the cavity in the case of single and double radiation pass through a nonlinear crystal. It is demonstrated that at power levels exceeding 1 kW, the efficiencies of methods with and without external power-enhancement cavities become comparable, whereas at even higher powers, SHG by a single or double pass through a nonlinear crystal becomes preferable because of the relatively high efficiency of nonlinear transformation and fairly simple implementation.
Spectral methods. Fundamentals in single domains

International Nuclear Information System (INIS)

Canuto, C.

2006-01-01

Since the publication of "Spectral Methods in Fluid Dynamics" (1988), spectral methods have become firmly established as a mainstream tool for scientific and engineering computation. The authors of that book have incorporated into this new edition the many improvements in the algorithms and the theory of spectral methods that have been made since then. This latest book retains the tight integration between the theoretical and practical aspects of spectral methods, and the chapters are enhanced with material on the Galerkin with numerical integration version of spectral methods. The discussion of direct and iterative solution methods is also greatly expanded. (orig.)
Construction Method Study For Installation Of A Large Riser In A Single-Shell Tank

International Nuclear Information System (INIS)

Adkisson, D.A.

2010-01-01

This study evaluates and identifies a construction method for cutting a hole in a single-shell tank dome. This study also identifies and evaluates vendors for performing the cut. Single-shell tanks (SST) in the 241-C tank farm are currently being retrieved using various retrieval technologies (e.g., modified sluicing). The Hanford Federal Facility Agreement and Consent Order requires that the SSTs be retrieved to less than 360 cubic feet of radioactive waste. The current technologies identified and deployed for tank retrieval have not been able to retrieve waste in accordance with the Hanford Federal Facility Agreement and Consent Order. As such, alternative retrieval systems have been proposed and are currently under construction that will have the ability to retrieve waste to this defined level. The proposed retrieval systems will not fit down existing risers. New risers will need to be installed to provide the retrieval systems access to the inside of the SSTs. The purpose of this study is two-fold. The first objective is to identify multiple concrete cutting technologies and perform an initial pre-screening, evaluate the technologies identified for more in-depth analysis, and recommend a technology/methodology for cutting a hole in the tank dome. The identified/pre-screened methods will be evaluated based on the following criteria: (1) Maturity/complexity; (2) Waste generation; (3) Safety; (4) Cost; and (5) Schedule. Once the preferred method is identified to cut the hole in the tank dome, the second objective is to identify, evaluate, and recommend a vendor for the technology selected that will perform the cutting process.
A novel single-step, multipoint calibration method for instrumented Lab-on-Chip systems

DEFF Research Database (Denmark)

Pfreundt, Andrea; Patou, François; Zulfiqar, Azeem

2014-01-01

for instrument-based PoC blood biomarker analysis systems. Motivated by the complexity of associating high-accuracy biosensing using silicon nanowire field effect transistors with ease of use for the PoC system user, we propose a novel one-step, multipoint calibration method for LoC-based systems. Our approach...... specifically addresses the important interfaces between a novel microfluidic unit to integrate the sensor array and a mobile-device hardware accessory. A multi-point calibration curve is obtained by generating a defined set of reference concentrations from a single input. By consecutively splitting the flow...
QED radiative correction for the single-W production using a parton shower method

International Nuclear Information System (INIS)

Kurihara, Y.; Fujimoto, J.; Ishikawa, T.; Shimizu, Y.; Kato, K.; Tobimatsu, K.; Munehisa, T.

2001-01-01

A parton shower method for the photonic radiative correction is applied to single W-boson production processes. The energy scale for the evolution of the parton shower is determined so that the correct soft-photon emission is reproduced. Photon spectra radiated from the partons are compared with those from the exact matrix elements, and show a good agreement. Possible errors due to an inappropriate energy-scale selection or due to the ambiguity of the energy-scale determination are also discussed, particularly for the measurements on triple gauge couplings. (orig.)
Method for the irradiation of single targets

International Nuclear Information System (INIS)

Krimmel, E.; Dullnig, H.

1977-01-01

The invention pertains to a system for the irradiation of single targets with particle beams. The targets all have frames around them. The system consists of an automatic advance leading into a high-vacuum chamber, and a positioning element which guides one target after the other into the irradiation position, at right angles to the automatic advance, and back into the automatic advance after irradiation. (GSCH) [de]
Influence of modulation method on using LC-traps with single-phase voltage source converters

DEFF Research Database (Denmark)

Wang, Xiongfei; Min, Huang; Bai, Haofeng

2015-01-01

The switching-frequency LC-trap filter has recently been employed with high-order passive filters for Voltage Source Inverters (VSIs). This paper investigates the influence of modulation method on using the LC-traps with single-phase VSIs. Two-level (bipolar) and three-level (unipolar) modulations...... that include phase distortion and alternative phase opposition distortion methods are analyzed. Harmonic filtering performances of four LC-trap-based filters with different locations of LC-traps are compared. It is shown that the use of parallel-LC-traps in series with filter inductors, either grid...... or converter side, has a worse harmonic filtering performance than using series-LC-trap in the shunt branch. Simulations and experimental results are presented for verification....
Effects of Singapore's Model Method on Elementary Student Problem Solving Performance: Single Subject Research

Science.gov (United States)

Mahoney, Kevin

2012-01-01

This research investigation examined the effects of Singapore's Model Method, also known as "model drawing" or "bar modeling" on the word problem-solving performance of American third and fourth grade students. Employing a single-case design, a researcher-designed teaching intervention was delivered to a child in third…
Automated bone removal in CT angiography: Comparison of methods based on single energy and dual energy scans

International Nuclear Information System (INIS)

Straten, Marcel van; Schaap, Michiel; Dijkshoorn, Marcel L.; Greuter, Marcel J.; Lugt, Aad van der; Krestin, Gabriel P.; Niessen, Wiro J.

2011-01-01

Purpose: To evaluate dual energy based methods for bone removal in computed tomography angiography (CTA) images and compare these with single energy based methods that use an additional, nonenhanced, CT scan. Methods: Four different bone removal methods were applied to CT scans of an anthropomorphic thorax phantom, acquired with a second generation dual source CT scanner. The methods differed by the way information on the presence of bone was obtained (either by using an additional, nonenhanced scan or by scanning with two tube voltages at the same time) and by the way the bone was removed from the CTA images (either by masking or subtracting the bone). The phantom contained parts which mimic vessels of various diameters in direct contact with bone. Both a quantitative and qualitative analysis of image quality after bone removal was performed. Image quality was quantified by the contrast-to-noise ratio (CNR) normalized to the square root of the dose (CNRD). At locations where vessels touch bone, the quality of the bone removal and the vessel preservation were visually assessed. The dual energy based methods were assessed with and without the addition of a 0.4 mm tin filter to the high voltage x-ray tube filtration. For each bone removal method, the dose required to obtain a certain CNR after bone removal was compared with the dose of a reference scan with the same CNR but without automated bone removal. The CNRD value of the reference scan was maximized by choosing the lowest tube voltage available. Results: All methods removed the bone completely. CNRD values were higher for the masking based methods than for the subtraction based methods. Single energy based methods had a higher CNRD value than the corresponding dual energy based methods. For the subtraction based dual energy method, tin filtration improved the CNRD value by approximately 50%. For the masking based dual energy method, it was easier to differentiate between iodine and bone when tin filtration
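The dose-normalized figure of merit used in the abstract above can be illustrated with a short sketch. It takes CNRD to be CNR divided by the square root of the dose, as the abstract states, and additionally assumes that CNR grows with the square root of the dose (a quantum-noise-limited approximation that the abstract does not state). All function names and numeric inputs are hypothetical.

```python
import math


def cnrd(cnr, dose):
    """Contrast-to-noise ratio normalized to the square root of the dose."""
    if dose <= 0:
        raise ValueError("dose must be positive")
    return cnr / math.sqrt(dose)


def dose_for_target_cnr(target_cnr, cnrd_value):
    """Dose needed to reach target_cnr, ASSUMING CNR scales with sqrt(dose)."""
    if cnrd_value <= 0:
        raise ValueError("CNRD must be positive")
    return (target_cnr / cnrd_value) ** 2


def relative_dose_after_gain(fractional_gain):
    """Relative dose for an unchanged CNR after CNRD improves by fractional_gain.

    Relies on the same sqrt(dose) scaling assumption as dose_for_target_cnr.
    """
    return 1.0 / (1.0 + fractional_gain) ** 2


# Hypothetical numbers, not taken from the study.
example_cnrd = cnrd(cnr=10.0, dose=4.0)
example_dose = dose_for_target_cnr(target_cnr=10.0, cnrd_value=example_cnrd)
print(example_cnrd, example_dose)

# Under the stated assumption, a 50% gain in CNRD would allow the same CNR
# at well under half the original dose.
print(relative_dose_after_gain(0.5))
```

Because dose enters through a square root, a modest gain in CNRD corresponds to a considerably larger saving in dose under this assumption.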
The method of arbitrarily large moments to calculate single scale processes in quantum field theory

Energy Technology Data Exchange (ETDEWEB)

Bluemlein, Johannes [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany); Schneider, Carsten [Johannes Kepler Univ., Linz (Austria). Research Inst. for Symbolic Computation (RISC)

2017-01-15

We devise a new method to calculate a large number of Mellin moments of single scale quantities using the systems of differential and/or difference equations obtained by integration-by-parts identities between the corresponding Feynman integrals of loop corrections to physical quantities. These scalar quantities have a much simpler mathematical structure than the complete quantity. A sufficiently large set of moments may even allow the analytic reconstruction of the whole quantity considered, which holds in the case of first-order factorizing systems. In any case, one may derive highly precise numerical representations in general using this method, which is otherwise completely analytic.
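For readers unfamiliar with the term, the following sketch shows what a Mellin moment is. It is not the authors' method, which works with recurrences derived from integration-by-parts identities. It assumes the common convention M[f](N) = ∫₀¹ x^(N-1) f(x) dx (other conventions shift N by one) and restricts f to a polynomial so that each moment is an exact rational number. The function name `mellin_moment` is hypothetical.

```python
from fractions import Fraction


def mellin_moment(coeffs, n):
    """N-th Mellin moment of f(x) = sum_k coeffs[k] * x**k on [0, 1].

    Uses the assumed convention M[f](N) = integral_0^1 x**(N-1) * f(x) dx,
    so each monomial x**k contributes coeffs[k] / (N + k).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return sum(Fraction(c) / (n + k) for k, c in enumerate(coeffs))


# Toy example: f(x) = 1 - x, whose N-th moment is 1/N - 1/(N + 1).
moments = [mellin_moment([1, -1], n) for n in range(1, 6)]
for n, m in enumerate(moments, start=1):
    print(f"N = {n}: {m}")
```

In this toy case the sequence of moments follows the closed form 1/(N(N+1)), a small-scale analogue of reconstructing a quantity from sufficiently many moments.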
Ionization detector, electrode configuration and single polarity charge detection method

Science.gov (United States)

He, Z.

1998-07-07

An ionization detector, an electrode configuration and a single polarity charge detection method each utilize a boundary electrode which symmetrically surrounds first and second central interlaced and symmetrical electrodes. All of the electrodes are held at a voltage potential of a first polarity type. The first central electrode is held at a higher potential than the second central or boundary electrodes. By forming the first and second central electrodes in a substantially interlaced and symmetrical pattern and forming the boundary electrode symmetrically about the first and second central electrodes, signals generated by charge carriers are substantially of equal strength with respect to both of the central electrodes. The only significant difference in measured signal strength occurs when the charge carriers move to within close proximity of the first central electrode and are received at the first central electrode. The measured signals are then subtracted and compared to quantitatively measure the magnitude of the charge. 10 figs.
